-
Notifications
You must be signed in to change notification settings - Fork 59
alsa: Restore test settings after each test #1279
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
alsa: Restore test settings after each test #1279
Conversation
|
Can one of the admins verify this patch?
|
d4ca717 to
451e887
Compare
test-case/check-alsabat.sh
Outdated
| frames=${OPT_VAL['n']} | ||
|
|
||
| start_test | ||
| save_alsa_state "${0##*/}".state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than have every caller trim and append .state, you can let save_alsa_state do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to separate the concerns of handling ALSA and handling of the arbitrary filename. This approach was used to ensure no collisions between scripts.
case-lib/lib.sh
Outdated
| save_alsa_state() | ||
| { | ||
| dlogi "save_alsa_state called with ${1}" | ||
| trap "restore_alsa_state '${1}'" EXIT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is already an EXIT handler: func_exit_handler(). You can't have both; how did you test this?
You can likely have func_exit_handler() invoke restore_alsa_state when the conditions are right.
|
Isn't this a followup to #1270? |
3fb569a to
5060e2e
Compare
|
test this please |
52064bc to
0edd342
Compare
test-case/check-alsabat.sh
Outdated
| frames=${OPT_VAL['n']} | ||
|
|
||
| start_test | ||
| save_alsa_state "${0##*/}".state |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to separate the concerns of handling ALSA and handling of the arbitrary filename. This approach was used to ensure no collisions between scripts.
case-lib/lib.sh
Outdated
| cur_exit_trap_code=$(trap -p EXIT | awk -F\' '{print $2}') | ||
| if [[ -z "${cur_exit_trap_code// }" ]]; then | ||
| trap "restore_alsa_state ${1}" EXIT | ||
| else | ||
| trap "restore_alsa_state ${1}; ${cur_exit_trap_code}" EXIT | ||
| fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach works perfectly in my local tests, but generates TIMEOUTs in the CI in runs that previously were FAILs.
I didn't want to dump the ALSA traps inside the func_exit_handler, as it's a generic solution for every test and not every test requires ALSA setup and teardown.
I welcome comments as to why this would cause func_exit_handler to not run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't want to dump the ALSA traps inside the func_exit_handler, as it's a generic solution for every test and not every test requires ALSA setup and teardown.
func_exit_handler() easily can be "smart" and run restore_alsa_state() only when save_alsa_state() was run before; this does not have to be "dumped".
I welcome comments as to why this would cause func_exit_handler to not run.
I think my previous "it does not work" comment was on some earlier and quite different version.
But even now, I don't like "stacking" trap functions like this because it scatters the exit handling code across multiple places and it becomes more difficult to see what it does in any given test and situation and in which order. Who says that restore_alsa_state should run first on EXIT? Maybe it shouldn't be first...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No TIMEOUT problems anymore after I've moved the called inside the func_exit_handler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just remembered that the Jenkins runner waits forever for PASS or FAIL or SKIP to be printed, see commit 4395b60 and others. I'm afraid Jenkins does not care about the test process exiting and does not care about the exit value either. That's likely a (very old) design mistake.
This would tend to prove that the earlier version did not restore func_exit_handler() properly.
BTW can you set a trap from a trap? Dunno.
This discussion is obsolete and just for completeness, sorry.
It's connected, but it's not a follow-up per se. It aims to resolve the specific problem of ALSA tests changing machine's ALSA settings and messing with the following test executions. Note that it does nothing to e.g. prevent people from connecting to the machines and messing around in settings on an ad hoc basis. |
Do we really need to strive for ALSA state restore on_exit ? Isn't it easier and just enough for our CI's purposes to restore the default state before at each test case where it is needed ? This way it solves both these cases:
|
Yes, that's the ideal solution, given the assumption that we have a perfect description of the "default state" every time we run a test for every machine we run a test on. However, this quick fix allows us to be sure that tests do not interfere with each other no matter the deficiencies in our defined "default state" for a given machine at a given time. |
right, my assumption (it is already in #1270 comments) that each test plan has the DUT's default ALSA state set at the beginning of its execution, and it is saved for the following test cases to reset from it and have the common ground. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's the ideal solution, given the assumption that we have a perfect description of the "default state" every time we run a test for every machine we run a test on.
That "assumption" is not really optional: if test results depend on some out-of-control configuration, then there is a test setup bug that must be fixed somewhere (BTW see
area:Ansible
Tests cleaning up after themselves does not hurt but it's less important than comprehensive test setup. Also, cleanup must be easy to turn off when the test is failing. Anything that runs after a failure makes debugging harder - whether that's in interactive use or when staring at logs from automated runs.
case-lib/lib.sh
Outdated
| alsactl restore --file /var/tmp/"$1" --pedantic --no-ucm --no-init-fallback || dlogi "alsactl state restoration failed!" | ||
| rm /var/tmp/"$1" || dlogi "Old state file removal failed!" | ||
| fi | ||
| return "$status" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of this... for instance it might print a misleading stack trace if the previous command failed.
case-lib/lib.sh
Outdated
| cur_exit_trap_code=$(trap -p EXIT | awk -F\' '{print $2}') | ||
| if [[ -z "${cur_exit_trap_code// }" ]]; then | ||
| trap "restore_alsa_state ${1}" EXIT | ||
| else | ||
| trap "restore_alsa_state ${1}; ${cur_exit_trap_code}" EXIT | ||
| fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't want to dump the ALSA traps inside the func_exit_handler, as it's a generic solution for every test and not every test requires ALSA setup and teardown.
func_exit_handler() easily can be "smart" and run restore_alsa_state() only when save_alsa_state() was run before; this does not have to be "dumped".
I welcome comments as to why this would cause func_exit_handler to not run.
I think my previous "it does not work" comment was on some earlier and quite different version.
But even now, I don't like "stacking" trap functions like this because it scatters the exit handling code across multiple places and it becomes more difficult to see what it does in any given test and situation and in which order. Who says that restore_alsa_state should run first on EXIT? Maybe it shouldn't be first...
|
Removed the trap stacking and moved the |
case-lib/hijack.sh
Outdated
| } | ||
|
|
||
| # Restore ALSA settings after execution if state file exists | ||
| if [ -f /var/tmp/"${SCRIPT_NAME##*/}".state ]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been wanting to reduce SCRIPT_NAME usage in #546 for a few years :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if [ -f /var/tmp/"${SCRIPT_NAME##*/}".state ]; then | |
| if [ -f /tmp/"${SCRIPT_NAME##*/}".state ]; then |
While different distros do things differently, /tmp is in general on tmpfs ("RAM disk") whereas /var/tmp is persistent across reboots - which I'm pretty sure we do not WANT.
BTW did you consider using the process ID in the name? I mean $$
There is also https://en.wikipedia.org/wiki/TMPDIR, it's complicated :-(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's another idea:
SOF_TEST_ALSA_STATE=/tmp/"${SCRIPT_NAME##*/}".$$.state
... and then at restore time you can test -n SOF_TEST_ALSA_STATE? So the filename is defined in only one place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose /var/tmp/ at first because I remembered that the DUT is rebooted in our CI during testing, but you're right - that reboot occurs before running the test case .sh file, so /tmp should be used instead.
I've applied your last suggestion and squashed the commits.
I've taken a look at the failed shellcheck and it all seemed like problems that already existed in the codebase, rather than something newly introduced.
PR checked in CI run No. 54699
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but you're right - that reboot occurs before running the test case .sh file, so /tmp should be used instead.
There is no reboot test AFAIK right now:
PS: there are some suspend/resume tests but they are not very reliable (see #1017 and internal sof-framework/issues/408).
Currently, some tests change ALSABAT settings during their execution, but do not restore them to their prior state. This causes tests that occur later in the execution order to intermittently fail. To remedy this, all tests that use ALSA now use a common save_alsa_state function, which shall save current ALSA state to a file and then restore that state on the bash script EXIT signal. Signed-off-by: Lukasz Mrugala <lukaszx.mrugala@intel.com>
6a22042 to
0aa7b2d
Compare
|
SOFCI TEST |
1 similar comment
|
SOFCI TEST |
Currently, some tests change ALSABAT settings during their execution, but do not restore them to their prior state.
This causes tests that occur later in the execution order to intermittently fail.
To remedy this, all tests that use ALSA now use a common
save_alsa_statefunction, which shall save current ALSA state to a file and then restore that state on the bash script EXIT signal.Fixes #1275