alsa: Restore test settings after each test #1279

LukaszMrugala · 2025-06-16T20:59:30Z

Currently, some tests change ALSABAT settings during their execution, but do not restore them to their prior state.

This causes tests that occur later in the execution order to intermittently fail.

To remedy this, all tests that use ALSA now call a common save_alsa_state function, which saves the current ALSA state to a file and then restores that state when the bash script exits (via an EXIT trap).

Fixes #1275

sofci · 2025-06-16T21:03:14Z

Can one of the admins verify this patch?

Reply "test this please" to run this test once.

marc-hb · 2025-06-18T00:19:08Z

test-case/check-alsabat.sh

 frames=${OPT_VAL['n']}

 start_test
+save_alsa_state "${0##*/}".state


Rather than have every caller trim and append .state, you can let save_alsa_state do it.
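
As a rough illustration of this suggestion, the trimming and the .state suffix could live in one helper instead of in every caller. The function name alsa_state_name is hypothetical and does not appear in the patch; this is only a sketch of the idea.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the patch): derive the state file name
# from a script path so that callers do not repeat "${0##*/}".state.
alsa_state_name()
{
    local script=${1:-$0}
    # Strip the directory part and append the .state suffix.
    printf '%s.state\n' "${script##*/}"
}

alsa_state_name /home/user/sof-test/test-case/check-alsabat.sh
# prints: check-alsabat.sh.state
```

Because the name is still derived from the calling script, two different test scripts would not collide on the same file.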

marc-hb · 2025-06-18T00:21:45Z

case-lib/lib.sh

+save_alsa_state()
+{
+    dlogi "save_alsa_state called with ${1}"
+    trap "restore_alsa_state '${1}'" EXIT


There is already an EXIT handler: func_exit_handler(). You can't have both; how did you test this?

You can likely have func_exit_handler() invoke restore_alsa_state when the conditions are right.
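
The underlying bash behavior can be checked in isolation: a shell keeps only one EXIT handler, and a later trap call replaces the earlier one. The snippet below is a standalone demonstration, not code from sof-test; the handler texts are made up.

```shell
#!/usr/bin/env bash
# Run the demonstration in a child shell so that its EXIT traps do not
# interfere with the calling shell.
trap_demo_output=$(bash -c '
    trap "echo first handler" EXIT
    trap "echo second handler" EXIT   # replaces the first handler
')

echo "$trap_demo_output"
# prints: second handler
```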

marc-hb · 2025-06-18T00:22:42Z

Isn't this a followup to #1270?

marc-hb · 2025-06-18T11:43:19Z

test this please

LukaszMrugala · 2025-06-18T15:00:40Z

test-case/check-alsabat.sh

 frames=${OPT_VAL['n']}

 start_test
+save_alsa_state "${0##*/}".state


I wanted to separate the concerns of handling ALSA and handling of the arbitrary filename. This approach was used to ensure no collisions between scripts.

LukaszMrugala · 2025-06-18T15:04:39Z

case-lib/lib.sh

+    cur_exit_trap_code=$(trap -p EXIT | awk -F\' '{print $2}')
+    if [[ -z "${cur_exit_trap_code// }" ]]; then
+        trap "restore_alsa_state ${1}" EXIT
+    else
+        trap "restore_alsa_state ${1}; ${cur_exit_trap_code}" EXIT
+    fi


This approach works perfectly in my local tests, but generates TIMEOUTs in the CI in runs that previously were FAILs.
I didn't want to dump the ALSA traps inside the func_exit_handler, as it's a generic solution for every test and not every test requires ALSA setup and teardown.
I welcome comments as to why this would cause func_exit_handler to not run.

No TIMEOUT problems anymore after I've moved the call inside the func_exit_handler.

I just remembered that the Jenkins runner waits forever for PASS or FAIL or SKIP to be printed, see commit 4395b60 and others. I'm afraid Jenkins does not care about the test process exiting and does not care about the exit value either. That's likely a (very old) design mistake.

This would tend to prove that the earlier version did not restore func_exit_handler() properly.

BTW can you set a trap from a trap? Dunno.

This discussion is obsolete and just for completeness, sorry.

LukaszMrugala · 2025-06-18T15:11:01Z

Isn't this a followup to #1270?

It's connected, but it's not a follow-up per se. It aims to resolve the specific problem of ALSA tests changing machine's ALSA settings and messing with the following test executions. Note that it does nothing to e.g. prevent people from connecting to the machines and messing around in settings on an ad hoc basis.

golowanow · 2025-06-18T15:30:44Z

Currently, some tests change ALSABAT settings during their execution, but do not restore them to their prior state.
This causes tests that occur later in the execution order to intermittently fail.

Do we really need to strive for ALSA state restore on_exit ?

Isn't it easier, and good enough for our CI's purposes, to restore the default state before each test case where it is needed?

This way it solves both these cases:

.. the specific problem of ALSA tests changing machine's ALSA settings and messing with the following test executions.

.. people from connecting to the machines and messing around in settings on an ad hoc basis.

LukaszMrugala · 2025-06-18T15:39:53Z

Do we really need to strive for ALSA state restore on_exit ?

Isn't it easier, and good enough for our CI's purposes, to restore the default state before each test case where it is needed?

Yes, that's the ideal solution, given the assumption that we have a perfect description of the "default state" every time we run a test for every machine we run a test on. However, this quick fix allows us to be sure that tests do not interfere with each other no matter the deficiencies in our defined "default state" for a given machine at a given time.

golowanow · 2025-06-18T15:56:06Z

Yes, that's the ideal solution, given the assumption that we have a perfect description of the "default state" every time we run a test for every machine we run a test on.

Right, my assumption (already stated in the #1270 comments) is that each test plan has the DUT's default ALSA state set at the beginning of its execution, and that this state is saved so the following test cases can reset from it and share common ground.
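
A minimal sketch of that alternative, assuming a plan-level hook exists to call the first function: the default state is stored once per test plan and each ALSA-dependent test case resets from it. The function names, the SOF_DEFAULT_ALSA_STATE path and the ALSACTL override are all hypothetical; only alsactl store, alsactl restore and --file are real.

```shell
#!/usr/bin/env bash
# Assumption: ALSACTL lets a dry run substitute another command.
ALSACTL=${ALSACTL:-alsactl}
# Hypothetical location of the plan-wide default state file.
SOF_DEFAULT_ALSA_STATE=${SOF_DEFAULT_ALSA_STATE:-/tmp/sof-default-alsa.state}

# Run once, at the beginning of a test plan.
save_default_alsa_state()
{
    "$ALSACTL" store --file "$SOF_DEFAULT_ALSA_STATE"
}

# Run at the start of every test case that depends on ALSA settings.
reset_to_default_alsa_state()
{
    if [ ! -f "$SOF_DEFAULT_ALSA_STATE" ]; then
        echo "no default ALSA state at $SOF_DEFAULT_ALSA_STATE" >&2
        return 1
    fi
    "$ALSACTL" restore --file "$SOF_DEFAULT_ALSA_STATE"
}
```

With this shape a test starts from a known state regardless of what the previous test, or a person logged into the DUT, left behind.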

marc-hb

Yes, that's the ideal solution, given the assumption that we have a perfect description of the "default state" every time we run a test for every machine we run a test on.

That "assumption" is not really optional: if test results depend on some out-of-control configuration, then there is a test setup bug that must be fixed somewhere (BTW see the area:Ansible label, "Problems that Ansible can help with", and the same label internally).

Tests cleaning up after themselves does not hurt but it's less important than comprehensive test setup. Also, cleanup must be easy to turn off when the test is failing. Anything that runs after a failure makes debugging harder - whether that's in interactive use or when staring at logs from automated runs.
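
One possible way to make the cleanup easy to turn off is a guard that the exit handler consults before restoring anything. SOF_TEST_KEEP_ALSA_STATE and should_restore_alsa_state are hypothetical names chosen for this sketch; they are not existing sof-test variables or functions.

```shell
#!/usr/bin/env bash
# Hypothetical switch: set SOF_TEST_KEEP_ALSA_STATE=1 to leave the ALSA
# settings untouched after a failing test, so they can be inspected.
should_restore_alsa_state()
{
    local exit_status=$1
    if [ "$exit_status" -ne 0 ] && [ "${SOF_TEST_KEEP_ALSA_STATE:-0}" = 1 ]; then
        return 1    # failing test and the user asked to keep the state
    fi
    return 0        # passing test, or no opt-out requested
}

# Example: a failed test (status 1) with the switch enabled skips the restore.
if SOF_TEST_KEEP_ALSA_STATE=1 should_restore_alsa_state 1; then
    echo "would restore"
else
    echo "keeping ALSA state for debugging"
fi
```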

marc-hb · 2025-06-18T16:01:09Z

case-lib/lib.sh

+        alsactl restore --file /var/tmp/"$1" --pedantic --no-ucm --no-init-fallback || dlogi "alsactl state restoration failed!"
+        rm /var/tmp/"$1" || dlogi "Old state file removal failed!"
+    fi
+    return "$status"


I'm not a fan of this... for instance it might print a misleading stack trace if the previous command failed.

marc-hb · 2025-06-18T16:06:22Z

case-lib/lib.sh

+    cur_exit_trap_code=$(trap -p EXIT | awk -F\' '{print $2}')
+    if [[ -z "${cur_exit_trap_code// }" ]]; then
+        trap "restore_alsa_state ${1}" EXIT
+    else
+        trap "restore_alsa_state ${1}; ${cur_exit_trap_code}" EXIT
+    fi


I didn't want to dump the ALSA traps inside the func_exit_handler, as it's a generic solution for every test and not every test requires ALSA setup and teardown.

func_exit_handler() easily can be "smart" and run restore_alsa_state() only when save_alsa_state() was run before; this does not have to be "dumped".

I welcome comments as to why this would cause func_exit_handler to not run.

I think my previous "it does not work" comment was on some earlier and quite different version.

But even now, I don't like "stacking" trap functions like this because it scatters the exit handling code across multiple places and it becomes more difficult to see what it does in any given test and situation and in which order. Who says that restore_alsa_state should run first on EXIT? Maybe it shouldn't be first...
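
To make the alternative concrete, here is a sketch of a single exit handler that owns every teardown step in one visible order and only restores ALSA state when a save happened earlier. The bodies are placeholders and the ALSA_STATE_SAVED flag is hypothetical; the real func_exit_handler in sof-test does much more, and running the generic teardown first is an arbitrary choice made for this example.

```shell
#!/usr/bin/env bash
ALSA_STATE_SAVED=false    # hypothetical flag, set by save_alsa_state

save_alsa_state()
{
    # A real implementation would store the ALSA state here.
    ALSA_STATE_SAVED=true
}

restore_alsa_state()
{
    echo "restoring ALSA state"
}

func_exit_handler()
{
    local exit_status=${1:-0}
    echo "generic teardown"
    # "Smart" part: only restore when save_alsa_state ran before.
    if [ "$ALSA_STATE_SAVED" = true ]; then
        restore_alsa_state
    fi
    return "$exit_status"
}

# Demonstration in a subshell so the caller's own traps stay untouched.
(
    trap 'func_exit_handler $?' EXIT
    save_alsa_state
)
```

All exit-time behavior stays in one function, so the order of the steps can be read, and changed, in a single place.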

LukaszMrugala · 2025-06-18T19:25:16Z

Removed the trap stacking and moved the restore_ call inside the func_exit_handler. I'm using the existence of the .state file as the indicator of whether to run the restore_ or not. Disabling that mechanism should be easy. Removed the filename passing, as I've realised I can use a preexisting variable instead.

marc-hb · 2025-06-18T20:00:21Z

case-lib/hijack.sh

    }

+    # Restore ALSA settings after execution if state file exists
+    if [ -f /var/tmp/"${SCRIPT_NAME##*/}".state ]; then


I've been wanting to reduce SCRIPT_NAME usage in #546 for a few years :-)

Suggested change

-    if [ -f /var/tmp/"${SCRIPT_NAME##*/}".state ]; then
+    if [ -f /tmp/"${SCRIPT_NAME##*/}".state ]; then

While different distros do things differently, /tmp is in general on tmpfs ("RAM disk") whereas /var/tmp is persistent across reboots - which I'm pretty sure we do not WANT.

BTW did you consider using the process ID in the name? I mean $$

There is also https://en.wikipedia.org/wiki/TMPDIR, it's complicated :-(

Here's another idea:

SOF_TEST_ALSA_STATE=/tmp/"${SCRIPT_NAME##*/}".$$.state

... and then at restore time you can test -n "$SOF_TEST_ALSA_STATE"? So the filename is defined in only one place.
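
Put together, the idea might look roughly like the sketch below. SOF_TEST_ALSA_STATE, SCRIPT_NAME and the alsactl restore flags come from this thread; the ALSACTL override, the SCRIPT_NAME fallback and the exact function bodies are assumptions made for illustration, not the merged code.

```shell
#!/usr/bin/env bash
# Assumption: ALSACTL lets a dry run substitute another command.
ALSACTL=${ALSACTL:-alsactl}
# Assumption: fall back to $0 when the framework has not set SCRIPT_NAME.
SCRIPT_NAME=${SCRIPT_NAME:-$0}

save_alsa_state()
{
    # The file name is defined here and nowhere else; $$ keeps it unique
    # per test process.
    SOF_TEST_ALSA_STATE=/tmp/"${SCRIPT_NAME##*/}".$$.state
    "$ALSACTL" store --file "$SOF_TEST_ALSA_STATE"
}

restore_alsa_state()
{
    # Nothing to do when save_alsa_state was never called.
    [ -n "${SOF_TEST_ALSA_STATE:-}" ] || return 0
    "$ALSACTL" restore --file "$SOF_TEST_ALSA_STATE" \
        --pedantic --no-ucm --no-init-fallback ||
        echo "alsactl state restoration failed!" >&2
    rm -f "$SOF_TEST_ALSA_STATE" || echo "Old state file removal failed!" >&2
    SOF_TEST_ALSA_STATE=
}
```

The exit handler can then call restore_alsa_state unconditionally, because the function itself returns early for tests that never saved any state.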

I chose /var/tmp/ at first because I remembered that the DUT is rebooted in our CI during testing, but you're right - that reboot occurs before running the test case .sh file, so /tmp should be used instead.
I've applied your last suggestion and squashed the commits.
I've taken a look at the failed shellcheck and it all seemed like problems that already existed in the codebase, rather than something newly introduced.
PR checked in CI run No. 54699

but you're right - that reboot occurs before running the test case .sh file, so /tmp should be used instead.

There is no reboot test AFAIK right now:

[FEATURE] Support reboot cycles/stress-tests #1038

PS: there are some suspend/resume tests but they are not very reliable (see #1017 and internal sof-framework/issues/408).

Currently, some tests change ALSABAT settings during their execution, but do not restore them to their prior state. This causes tests that occur later in the execution order to intermittently fail. To remedy this, all tests that use ALSA now use a common save_alsa_state function, which shall save current ALSA state to a file and then restore that state on the bash script EXIT signal. Signed-off-by: Lukasz Mrugala <lukaszx.mrugala@intel.com>

redzynix · 2025-07-11T08:26:12Z

SOFCI TEST

LukaszMrugala · 2025-07-11T08:35:08Z

SOFCI TEST

LukaszMrugala force-pushed the alsabat-test-state-restoration-quickfix branch 6 times, most recently from d4ca717 to 451e887 Compare June 17, 2025 15:32

marc-hb reviewed Jun 18, 2025

LukaszMrugala force-pushed the alsabat-test-state-restoration-quickfix branch 4 times, most recently from 3fb569a to 5060e2e Compare June 18, 2025 08:58

LukaszMrugala force-pushed the alsabat-test-state-restoration-quickfix branch 2 times, most recently from 52064bc to 0edd342 Compare June 18, 2025 13:18

LukaszMrugala commented Jun 18, 2025

marc-hb reviewed Jun 18, 2025

LukaszMrugala requested a review from marc-hb June 18, 2025 19:25

LukaszMrugala marked this pull request as ready for review June 18, 2025 19:25

LukaszMrugala requested review from a team, golowanow and lgirdwood as code owners June 18, 2025 19:25

marc-hb reviewed Jun 18, 2025

marc-hb mentioned this pull request Jun 20, 2025

[BUG] Some tests modify machine state #1275

Open

LukaszMrugala force-pushed the alsabat-test-state-restoration-quickfix branch from 6a22042 to 0aa7b2d Compare June 23, 2025 14:27

LukaszMrugala requested a review from marc-hb June 24, 2025 07:50

marc-hb mentioned this pull request Jun 26, 2025

alsactl DUT restoration #1270

Open

KamilxPaszkiet assigned golowanow and unassigned golowanow Jul 14, 2025
