Skip to content

Conversation

@dcpleung
Copy link
Contributor

@dcpleung dcpleung commented Jul 2, 2025

This reworks Zephyr IPC interface to utilize the newly introduced IPC message service, which provides a more generic IPC interface.

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, on MTL/ace15 test results look good, but on LNL/PTL, DSP panics on first IPC sent.
Not sure immediately what can explain this, the code should be the same for all these platforms (aside some missing thing in DT definitions). The app/prj.conf is shared for all.

ipc4_notification_watchdog_init(&notif, cpu_get_id(), true);
intel_adsp_ipc_send_message_emergency(INTEL_ADSP_IPC_HOST_DEV,
notif.primary.dat, notif.extension.dat);
(void)ipc_send_message_emergency(notif.primary.dat, notif.extension.dat);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice - we do have low and high priority messages !

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a new functionality. Its been around for years. But its not about low/high priority. Its a separate emergency ipc sending mechanism that ignores the current communication state and message queue.

@lgirdwood
Copy link
Member

Hmm, on MTL/ace15 test results look good, but on LNL/PTL, DSP panics on first IPC sent. Not sure immediately what can explain this, the code should be the same for all these platforms (aside some missing thing in DT definitions). The app/prj.conf is shared for all.

[  282.845982] kernel: snd_sof:snd_sof_run_firmware: sof-audio-pci-intel-lnl 0000:00:1f.3: booting DSP firmware
[  282.845988] kernel: snd_sof_intel_hda_common:hda_dsp_cl_boot_firmware: sof-audio-pci-intel-lnl 0000:00:1f.3: IMR restore supported, booting from IMR directly
[  282.847178] kernel: snd_sof_pci_intel_mtl:mtl_dsp_core_power_up: sof-audio-pci-intel-lnl 0000:00:1f.3: FW Poll Status: reg[0x178d04]=0x2000101 successful
[  282.847182] kernel: snd_sof_pci_intel_mtl:mtl_dsp_cl_init: sof-audio-pci-intel-lnl 0000:00:1f.3: Primary core power up successful
[  282.847186] kernel: snd_sof_pci_intel_mtl:mtl_dsp_cl_init: sof-audio-pci-intel-lnl 0000:00:1f.3: FW Poll Status: reg[0x73214]=0x80000000 successful
[  282.847206] kernel: snd_sof_pci_intel_mtl:mtl_enable_interrupts: sof-audio-pci-intel-lnl 0000:00:1f.3: FW Poll Status: reg[0x1800]=0x41 successful
[  282.847222] kernel: snd_sof_pci_intel_mtl:mtl_enable_interrupts: sof-audio-pci-intel-lnl 0000:00:1f.3: FW Poll Status: reg[0x1140]=0x1 successful
[  282.847239] kernel: snd_sof_pci_intel_mtl:mtl_dsp_cl_init: sof-audio-pci-intel-lnl 0000:00:1f.3: FW Poll Status: reg[0x160200]=0x50000005 successful
[  282.960273] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc rx      : 0x1b080000|0x0: GLB_NOTIFICATION|FW_READY
[  282.960286] kernel: snd_sof:sof_set_fw_state: sof-audio-pci-intel-lnl 0000:00:1f.3: fw_state change: 3 -> 6
[  282.960295] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc rx done : 0x1b080000|0x0: GLB_NOTIFICATION|FW_READY
[  282.960569] kernel: snd_sof:snd_sof_run_firmware: sof-audio-pci-intel-lnl 0000:00:1f.3: firmware boot complete
[  282.960588] kernel: snd_sof:sof_set_fw_state: sof-audio-pci-intel-lnl 0000:00:1f.3: fw_state change: 6 -> 7
[  282.960616] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx      : 0x44000000|0x31400008: MOD_LARGE_CONFIG_SET [data size: 8]
[  282.960796] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc rx      : 0x1b0a0000|0x0: GLB_NOTIFICATION|EXCEPTION_CAUGHT
[  282.960806] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ------------[ DSP dump start ]------------

@kv2019i do you know if baseFW is the destination for this large module set ? On teh positive side the exception handler is managing to send the exception IPC in part..

@dcpleung

This comment was marked as outdated.

@lyakh
Copy link
Collaborator

lyakh commented Jul 4, 2025

the respective Zephyr PR zephyrproject-rtos/zephyr#91606

@lgirdwood
Copy link
Member

the respective Zephyr PR zephyrproject-rtos/zephyr#91606

Adding @ranj063 as we may have more users.

@dcpleung
Copy link
Contributor Author

I tested it a bit more. It seems like the exception comes when requesting D0->D3 transition more than one time. The test showed that the first D0->D3->D0 worked fine and FW was ready again. However, the second D0->D3 request generated the exception IPC.

@dcpleung
Copy link
Contributor Author

The issue with D0->D3->D0 is fixed so suspend/resume should be working on LNL and PTL.

@lyakh lyakh changed the title ipc: rework Zephyr IPC interfacs ipc: rework Zephyr IPC interfaces Jul 17, 2025
@lyakh
Copy link
Collaborator

lyakh commented Jul 18, 2025

likely unrelated to this PR: a pause-resume failure on PTL similar to the one observed in other PRs. https://sof-ci.01.org/sofpr/PR10089/build14052/devicetest/index.html?model=PTLH_RVP_NOCODEC&testcase=multiple-pause-resume-50 . I'll retrigger tests to see if we can reproduce it

@lyakh
Copy link
Collaborator

lyakh commented Jul 18, 2025

SOFCI TEST

@lgirdwood
Copy link
Member

Adding @kv2019i

@dcpleung
Copy link
Contributor Author

Zephyr side PR has been merged.

(struct intel_adsp_ipc_ept_priv_data *)priv;

k_spinlock_key_t key;
if (len == INTEL_ADSP_IPC_CB_MSG) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size_t len is an enum??... I see that it comes from zephyrproject-rtos/zephyr@2402005 and I don't find that a good idea... Can we change that to a simple (possibly unsigned) int?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The size_t is from the API which I have no control of.

As for the enum, in a sense it is doing some "grouping" of values together. So, this "set" is for sending message, another set is for callback, etc. I can change it there if you insist. This can be done when this is merged and we go back to remove the old driver.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dcpleung hm, but that's an Intel-ADSP private API, so its only user is almost certainly SOF.. Do the same strict policies apply to it making changing it too difficult? At least we should change the name - it isn't really len any more? And changing size_t to a compatible explicit integer type should be doable too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I have renamed the variable to cb_type. But I cannot change the data type since it's in the callback prototype.

/* attach handlers */
intel_adsp_ipc_set_message_handler(INTEL_ADSP_IPC_HOST_DEV, message_handler, ipc);
/* initialize IPC endpoint */
ipc_ept_init(ipc);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in fact, can we not just use Zephyr functions directly here? Would make reading easier by saving a level of indirection

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function also sets a pointer inside the private data to the IPC struct. We can expand both in the two places calling ipc_ept_init(). It's that if we ever have to change the init sequence, we don't have to do it at two places... and to avoid changing one and not the other.

Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I'll save my +1 for later point when we have the Zephyr change available and this can be merged.

app/prj.conf Outdated
CONFIG_HEAP_MEM_POOL_SIZE=2048

# Use the new IPC message service in Zephyr
CONFIG_INTEL_ADSP_IPC_OLD_INTERFACE=n
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit iffy as prj.conf is not INtel specific, but given this is paving way for a vendor neutral IPC interface, I think this is ok here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once this is merged, I'll remove the old driver in Zephyr so this will go away.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... on second thought, I have moved this into Intel ADSP configs. CONFIG_INTEL_ADSP_IPC will be removed from the Zephyr side, and becomes a SOF local kconfig.

@kv2019i
Copy link
Collaborator

kv2019i commented Oct 17, 2025

FYI @dcpleung . Zephyr version in SOF was updated in #10268 , but this does not pull in zephyrproject-rtos/zephyr#96233 yet. There seems to be regressions with newer Zephyr, so the update was delayed (see discussion in #10268 ).

kv2019i added a commit to kv2019i/sof that referenced this pull request Oct 31, 2025
kv2019i added a commit to kv2019i/sof that referenced this pull request Oct 31, 2025
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR migrates the Intel ADSP IPC implementation from the old Zephyr IPC interface to the new IPC service backend. The changes introduce a new IPC endpoint-based architecture that uses callback mechanisms instead of direct handler functions.

Key changes:

  • Refactored IPC message handling to use the new ipc_service_* APIs with endpoint registration
  • Added conditional compilation support for both old and new IPC interfaces via CONFIG_INTEL_ADSP_IPC_OLD_INTERFACE
  • Introduced a new public API ipc_send_message_emergency() for emergency IPC message sending

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/ipc/ipc-zephyr.c Migrated from old intel_adsp_ipc_* APIs to new ipc_service_* endpoint-based APIs with callback architecture
src/platform/intel/ace/lib/watchdog.c Updated to use new ipc_send_message_emergency() function
src/include/sof/ipc/common.h Added declaration for ipc_send_message_emergency() function
app/boards/*.conf Enabled new IPC interface by disabling CONFIG_INTEL_ADSP_IPC_OLD_INTERFACE
app/boards/intel_adsp/Kconfig.defconfig Added dependency on IPC_SERVICE_BACKEND_INTEL_ADSP_HOST_IPC for INTEL_ADSP_IPC

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 1 to 3
# Use the new IPC message service in Zephyr
CONFIG_INTEL_ADSP_IPC_OLD_INTERFACE=n
CONFIG_TIGERLAKE=y
Copy link

Copilot AI Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The configuration CONFIG_INTEL_ADSP_IPC_OLD_INTERFACE=n is duplicated at lines 2 and 65 with the same comment. Remove the duplicate entry at the end of the file.

Copilot uses AI. Check for mistakes.
*/
static uint32_t g_last_data, g_last_ext_data;

struct k_sem *wait_ack_sem;
Copy link

Copilot AI Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The global pointer wait_ack_sem is accessed in both ipc_receive_cb() (potentially in interrupt context) and ipc_platform_wait_ack() (thread context) without synchronization. This creates a race condition where the pointer could be set to NULL in ipc_platform_wait_ack() while ipc_receive_cb() is checking or using it. Consider using atomic operations or protecting access with a spinlock.

Copilot uses AI. Check for mistakes.
Comment on lines +117 to +132
static void ipc_ept_init(struct ipc *ipc)
{
const struct device *ipc_dev = INTEL_ADSP_IPC_HOST_DEV;

ipc_ept_priv_data.priv = ipc;

ipc_service_register_endpoint(ipc_dev, &ipc_ept, &ipc_ept_cfg);
Copy link

Copilot AI Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return value of ipc_service_register_endpoint() is not checked. If endpoint registration fails, subsequent IPC operations will fail silently. Add error handling to check the return value and log an error or return an error code from ipc_ept_init().

Suggested change
static void ipc_ept_init(struct ipc *ipc)
{
const struct device *ipc_dev = INTEL_ADSP_IPC_HOST_DEV;
ipc_ept_priv_data.priv = ipc;
ipc_service_register_endpoint(ipc_dev, &ipc_ept, &ipc_ept_cfg);
static int ipc_ept_init(struct ipc *ipc)
{
const struct device *ipc_dev = INTEL_ADSP_IPC_HOST_DEV;
ipc_ept_priv_data.priv = ipc;
int ret = ipc_service_register_endpoint(ipc_dev, &ipc_ept, &ipc_ept_cfg);
if (ret) {
LOG_ERR("Failed to register IPC endpoint: %d", ret);
return ret;
}
return 0;

Copilot uses AI. Check for mistakes.

static void ipc_complete(void)
{
ipc_service_send(&ipc_ept, NULL, INTEL_ADSP_IPC_SEND_DONE);
Copy link

Copilot AI Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return value of ipc_service_send() is not checked in ipc_complete(). If sending the completion signal fails, the host will not be notified that IPC processing is complete, potentially causing the system to hang. Add error handling to check the return value and log an error.

Suggested change
ipc_service_send(&ipc_ept, NULL, INTEL_ADSP_IPC_SEND_DONE);
int ret = ipc_service_send(&ipc_ept, NULL, INTEL_ADSP_IPC_SEND_DONE);
if (ret) {
LOG_ERR("ipc_service_send(INTEL_ADSP_IPC_SEND_DONE) failed: %d", ret);
}

Copilot uses AI. Check for mistakes.
Comment on lines +143 to +156
void ipc_send_message_emergency(uint32_t data, uint32_t ext_data)
{
struct intel_adsp_ipc_msg msg = {.data = data, .ext_data = ext_data};

ipc_service_send(&ipc_ept, &msg, INTEL_ADSP_IPC_SEND_MSG_EMERGENCY);
Copy link

Copilot AI Nov 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return value of ipc_service_send() is not checked in ipc_send_message_emergency(). In an emergency context (such as watchdog timeout), it's critical to know if the message was sent successfully. Consider checking the return value and logging an error, or changing the function signature to return an int to allow callers to handle failures.

Suggested change
void ipc_send_message_emergency(uint32_t data, uint32_t ext_data)
{
struct intel_adsp_ipc_msg msg = {.data = data, .ext_data = ext_data};
ipc_service_send(&ipc_ept, &msg, INTEL_ADSP_IPC_SEND_MSG_EMERGENCY);
int ipc_send_message_emergency(uint32_t data, uint32_t ext_data)
{
struct intel_adsp_ipc_msg msg = {.data = data, .ext_data = ext_data};
int ret = ipc_service_send(&ipc_ept, &msg, INTEL_ADSP_IPC_SEND_MSG_EMERGENCY);
if (ret) {
LOG_ERR("Emergency IPC send failed: %d", ret);
}
return ret;

Copilot uses AI. Check for mistakes.
struct ipc *ipc = (struct ipc *)arg;
struct intel_adsp_ipc_ept_priv_data *priv_data = priv;

if (cb_type == INTEL_ADSP_IPC_CB_MSG) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switch(cb_type)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to switch.

k_spin_unlock(&ipc->lock, key);

return false;
priv_data->cb_ret = -1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does -1 mean in this context? If it's a negative error, should be an errno code? Or another macro in enum intel_adsp_cb_ret?

Copy link
Contributor Author

@dcpleung dcpleung Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I don't know. It returned false before. So I am just mimicking that behavior here. I guess I can change it to -EBUSY since technically the IPC is still "busy" as it is not done.


static bool ipc_is_complete(void)
{
return ipc_service_send(&ipc_ept, NULL, INTEL_ADSP_IPC_SEND_IS_COMPLETE) == 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh... this looks strange - calling "send" to check the status...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lyakh Yeah, some of this is a bit awkward, but let's not blame the messenger for all of it. This was not our preferred API in Zephyr, but this (the IPC service interface) is what we could get merged now and it's the Zephyr main API we need to ues.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to introduce a new API with a query type call, but that new API was rejected in the arch meeting. So we are stuck with the existing IPC API which only has a send function.

void ipc_platform_complete_cmd(struct ipc *ipc)
{
ARG_UNUSED(ipc);
intel_adsp_ipc_complete(INTEL_ADSP_IPC_HOST_DEV);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this break the "old" API? Shell it then be removed completely or do we need #ifdefs in these calls too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that once this is merged, the old API will be removed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that once this is merged, the old API will be removed.

@dcpleung sure, understand, but isn't there a window when this is merged and the old is still there (and used)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lyakh We have no need to keep this around. E.g. once we merge this, we can immediately submit a PR to remove the old API from Zephyr. Once that is merged, we can clean up SOF. This is really just a quick transition and I appreciate Daniel for doing the transition APIs, so this didn't require a flag day to take into SOF.

kv2019i added a commit to kv2019i/sof that referenced this pull request Nov 4, 2025
This reworks the Zephyr IPC interface to utilize the generic
IPC service and backend. So we no longer have to maintain
a SoC specific IPC driver.

Signed-off-by: Daniel Leung <daniel.leung@intel.com>
Copy link
Collaborator

@lyakh lyakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since most jenkins CI results are missing, an update can be pushed, or we can improve in a follow-up

default y

config INTEL_ADSP_IPC
bool
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this needed? The type is already set where this option is defined?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this is superfluos

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Zephyr no longer has this option. So this is where this kconfig is defined, and thus needing a type.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in a .defconfig?.. Hm, not a blocker, but appears strange to me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would eventually go away too. It was used to enable the IPC driver in Zephyr. Now that the driver is automatically included depending on device tree, so that is no longer the case. There are couple places in SOF using this kconfig and I am not sure about the impact of removing them. So keep it for now.


CONFIG_INTEL_ADSP_IPC=y


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please also remove 2 empty lines here

Copy link
Contributor

@tmleman tmleman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to do this? What drives this strong need to implement the IPC service in this form?

@kv2019i
Copy link
Collaborator

kv2019i commented Nov 5, 2025

@tmleman wrote:

Why do we need to do this? What drives this strong need to implement the IPC service in this form?

This really is to complete the Zephyr transition. I.e. #5794 . The current code has two big problems:

  1. Intel SOF/Zephyr builds use an Intel specific IPC driver interface in Zephyr. This is not good, we should be using a IPC driver interface that can be implemened by multiple drivers for different hardware backends (just like we do for DMA, DAI, timers, etc). This is what this PR does.
  2. The original IPC Zephyr driver for Intel ADSP did not cover the full IPC interface (unlike we had in IPC3 in SOF). To really clean up (1), we need to expand the interface to cover the full IPC send/receive. I'm doing this on top of this PR in IPC payload API abstraction (on top of unmerged PR10089)  #10346

@tmleman
Copy link
Contributor

tmleman commented Nov 5, 2025

Ok, I see how using a more generic interface is beneficial and simply the right approach. However, this interface doesn't fit our hardware and checking a checkbox is a weak justification.

The original IPC Zephyr driver for Intel ADSP did not cover the full IPC interface

What exactly do you mean?

kv2019i
kv2019i previously approved these changes Nov 5, 2025
Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me. In retrospect, we could have used vendor-neutral names on Zephyr side already (this interface is not specific to Intel DSPs, but will be used by all SOF platforms). It's just the specific IPC service backend that is specific to Intel hardware. But this is not related to this PR and can be changed later on Zephyr side.

default y

config INTEL_ADSP_IPC
bool
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this is superfluos

void ipc_platform_complete_cmd(struct ipc *ipc)
{
ARG_UNUSED(ipc);
intel_adsp_ipc_complete(INTEL_ADSP_IPC_HOST_DEV);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lyakh We have no need to keep this around. E.g. once we merge this, we can immediately submit a PR to remove the old API from Zephyr. Once that is merged, we can clean up SOF. This is really just a quick transition and I appreciate Daniel for doing the transition APIs, so this didn't require a flag day to take into SOF.

@kv2019i
Copy link
Collaborator

kv2019i commented Nov 5, 2025

@tmleman wrote:

Ok, I see how using a more generic interface is beneficial and simply the right approach.
However, this interface doesn't fit our hardware and checking a checkbox is a weak justification.

To me this is fixing technical debt we have added. Prior to the Zephyr transaction, SOF had a IPC interface where common and platform logic was separated. With the Zephyr transition, we broke this and ended up having platform specific code end up to common SOF code (especially ipc-zephyr.c). This is now fixing that technical debt.

To me, we are not losing anything in when we use this interface with e.g. Intel DSP hardware.

The original IPC Zephyr driver for Intel ADSP did not cover the full IPC interface
What exactly do you mean?

In short, in around v2.2, ipc_platform_send_msg() sent the whole message (header + payload) and platform drivers implemented this in different ways (on different hardware). The intel_adsp_ipc.h interface that was added to Zephyr (and ipc-zephyr.c to SOF), did not expose the functionality of this interface. The intel_adsp_ipc.h only covers the IPC header (pri+ext words) and the users of intel_adsp_ipc.h need to send/receive the IPC payload via some other interface. With this PR (and my #10346 ), I want to return us to the v2.2 level of semantics, so we have a clear split of IPC common and hw specific code. And as we now have drivers in Zephyr, the hw specific part should be in Zephyr drivers (and the IPC service framework backends provide a framework we can use for this without inventing something new).

@tmleman
Copy link
Contributor

tmleman commented Nov 5, 2025

@kv2019i wrote:

ipc_platform_send_msg() sent the whole message (header + payload) and platform drivers implemented this in different ways (on different hardware).

It seems to me that you are forcibly connecting two different concepts. IPC as a protocol and IPC as a HW feature. If we are not planning to migrate the entire SOF code to Zephyr, then leaving the protocol logic on the SOF side is not necessarily an error. Our specific Intel IPC protocol utilizes two separate HW features: IPC and memory window. Using the memory window is optional. The IPC driver we had in Zephyr almost entirely covered the functionality of this IP.
Nothing I read here convinces me that this is how it should look.

@kv2019i
Copy link
Collaborator

kv2019i commented Nov 6, 2025

@tmleman wrote:

ipc_platform_send_msg() sent the whole message (header + payload) and platform drivers implemented this in different ways (on different hardware).

It seems to me that you are forcibly connecting two different concepts. IPC as a protocol and IPC as a HW feature. If we are not planning to migrate the entire SOF code to Zephyr, then leaving the protocol logic on the SOF side is not necessarily an error. Our specific Intel IPC protocol utilizes two separate HW features: IPC and memory window. Using the memory window is optional. The IPC driver we had in Zephyr almost entirely covered the functionality of this IP. Nothing I read here convinces me that this is how it should look.

It's a fair question what is the correct abstraction level. And there's a reason why IPC has been of the last interfaces to be moved
under generic Zephyr interfaces -- it's not an easy case.

Looking at this from SOF point of view. We have two protocols (IPC3 and IPC4) and dozens of supported hardware targets. IPC3/4 have differences. Both have 1-2 header words (that may be transferred inband with the "control plane" hardware interface) and a message payload (typically put to a fixed shared memory). There are of course many ways to abstract this, but we do have established practice in abstracting this at "struct ipc_msg *msg" level and we have implemented this over dozens of different hw targets in SOF.

There's clear common logic in SOF to send and receive IPC messages. Whether small messages can be sent inband (like IPC4 on Intel DSPs), it's an implementation detail that can be hidden in the driver (like we did in SOF IPC3). During the last few years, we've systematically moved code that varies from hw to another, to Zephyr drivers. Single interface, multiple drivers. In order to do this, we need a common interface for SOF IPC in Zephyr that can be implemented by different hw targets. For this to work, the interface has to be something that is generic enough. In Zephyr, the IPC service framework is the closest matching common interface used by different vendors for this (to send IPC messages between processors). Our SOF IPC3/4 has its quirks, but it seems it is possible to map it over the the generic interface in use.

Currently, for IPC, Intel uses src/ipc/ipc-zephyr.c file which directly talks to a Intel ADSP driver interface of one particular hardware type. Other vendors have IPC drivers implemented on SOF side, using old XTOS IPC driver interface.

We can of course reverse the course and keep the IPC abstraction on SOF side (e.g. move src/ipc/ipc-zephyr.c under sof/src/platform/intel/). This would be first such interface to be kept at SOF, so quite a big change, but certainly something to discuss now. I've been working towards #5794 as our common architectural target and e.g. asked @dcpleung to do this work on Zephyr side.

@kv2019i kv2019i dismissed their stale review November 6, 2025 08:46

Let me clear my +1 until we complete the discussion on whether #5794 is still the target or not.

@tmleman
Copy link
Contributor

tmleman commented Nov 6, 2025

Currently, for IPC, Intel uses src/ipc/ipc-zephyr.c file which directly talks to a Intel ADSP driver interface of one particular hardware type.

I think this implementation doesn't address this problem. How the len parameter is used stems from the hardware architecture, and this driver is still Intel-specific, just less readable. I think we could move most of this logic to a Zephyr IPC backend and use the Intel IPC driver within it. Then here we could utilize the generic Zephyr API that could work with multiple backends (specific to each vendor). Below is a draft diagram illustrating roughly what I mean:

sequenceDiagram
    autonumber
    
    participant HOST as HOST Driver
    participant DRV as Intel ADSP<br/>IPC Driver<br/>(Backend)
    participant SVC as Zephyr<br/>IPC Service
    participant SOF as SOF IPC<br/>(ipc-zephyr.c)
    participant WQ1 as Work Queue 1<br/>(ipc_do_cmd)
    participant WQ2 as Work Queue 2<br/>(ipc_send_response)
    
    Note over HOST,WQ2: HOST sends IPC Command to DSP
    
    HOST->>DRV: Set TDR BUSY + command data: intel_adsp_ipc_isr()
    Note right of HOST: Write to TDR/TDD registers
    activate DRV
    Note right of DRV: ISR Context
    DRV->>SOF: handle_message() schedule IPC process
    activate SOF
    SOF->>SOF: ipc_schedule_process()
    deactivate SOF
    Note right of DRV: ACK to HOST:<br/>DSP received message<br/>(ends interrupt)
    DRV->>HOST: set TDR BUSY
    deactivate DRV

    Note over WQ1: Worker 1: Process Command
    activate WQ1
    WQ1->>SOF: ipc_platform_do_cmd()
    activate SOF
    activate SVC
    SOF->>SVC: cfg.received(data, len, priv)
    Note right of SOF: Backend calls<br/>registered callback
    SVC->>SOF: return data
    deactivate SVC
    SOF->>SOF: ipc_cmd(hdr)
    SOF->>SOF: ipc_prepare_to_send()
    SOF->>SOF: ipc_schedule_process()
    Note right of SOF: Schedule work queue task
    activate SVC
    SOF->>SVC: backend->release_rx_buffer(token, data)
    Note right of SVC: Signal processing complete
    SVC->>DRV: intel_adsp_ipc_complete()
    deactivate SVC
    deactivate SOF
    deactivate WQ1

    Note over WQ2: Worker 2: Send Response
    activate WQ2
    Note right of WQ2: Send response<br/>using IPC service API
    WQ2->>SOF: ipc_platform_send_msg()
    activate SOF
    SOF->>SVC: ipc_service_send(&ept, response, len)
    SVC->>DRV: intel_adsp_ipc_send_message()
    DRV->>HOST: Set IDR BUSY + response data
    Note right of DRV: Write to IDR/IDD<br/>Trigger HOST interrupt
    deactivate SOF
    deactivate WQ2
    
    DRV->>HOST: Interrupt (IDR BUSY set)
    activate HOST
    HOST->>HOST: Read IDR/IDD registers
    HOST->>HOST: Process response
    HOST->>DRV: Set IDA DONE bit
    Note right of HOST: ACK response received
    deactivate HOST

    Note over DRV: ACK received
    activate DRV
    Note right of DRV: intel_adsp_ipc_isr()
    DRV->>DRV: Check IDA DONE bit
    DRV->>DRV: Set IDA DONE bit
    deactivate DRV
Loading

We could even attempt to move the logic with memory window handling to the backend.

@kv2019i
Copy link
Collaborator

kv2019i commented Nov 7, 2025

@tmleman wrote:

I think this implementation doesn't address this problem. How the len parameter is used stems from the hardware architecture, and this driver is still Intel-specific, just less readable.

Thanks, the diagram is very useful. I see we are looking at this somewhat differently. I'm still not fully understanding the problem with 'len'. More specifically, while your diagram has it, we don't have such parameter in existing implementation. And I don't think this is feature of (Intel) hardware but a SW/FW protocol choice that applies to all SOF DSPs. Both IPC3 and IPC4 protocols have no concept of "length" for received messages. Basicly a fixed amount of data (SOF_IPC_MSG_MAX_SIZE) is expected to in the (optional) payload buffer, and it's upto the protocol handlers (ipc3/4-handler) to parse the message.

There was an early attempt (before IPC4) to migrate SOF to a TLV based IPC protocol which would have allowed common code to understand the size of the IPC messages (not a full OSI split, but at least more like in TCP/IP or OSI, where network code can understand size of the messages even if doesn't understand the protocol). Thta did nothappen and what we have now, this is not a Intel hw quirk, or IPC4 quirk, but is there for IPC3 as well and all hw targets SOF supports. The hw specific driver will use some hardware mechanism (ISR, doorbell registers) to wake up on reception of new data, provide upto 64bit of data inline and then a build-time defined set of bytes (SOF_IPC_MSG_MAX_SIZE) will be available to the DSP. Then IPC3/4 protocol code is called to parse the (optional) payload and figure out what data is there.

I don't think this PR changes this and I'd assume all hardware that currently has XTOS/IPC3 drivers, could use ipc-zephyr.c code after this PR.

@tmleman
Copy link
Contributor

tmleman commented Nov 7, 2025

I'm still not fully understanding the problem with 'len'.

@param[in] len Number of bytes to send.

I don't think this PR changes this and I'd assume all hardware that currently has XTOS/IPC3 drivers, could use ipc-zephyr.c code after this PR.

With values like INTEL_ADSP_IPC_SEND_DONE?

The IPC service API doesn't meet our needs at all, why don't we modify it? Why don't we add a new one that would be more useful?

@kv2019i
Copy link
Collaborator

kv2019i commented Nov 7, 2025

@tmleman wrote:

I'm still not fully understanding the problem with 'len'.
@param[in] len Number of bytes to send.

Aa, that 'len'. This has nothing to do with Intel DSP, but is a property of the Zephyr IPC framework.

But right, the naming is confusing, "enum intel_adsp_cb_len" should be something generic "enum ipc_adsp_cb_len". This will be same for all SOF IPC backends and not specific to Intel. This is just used to map SOF style IPC to the Zephyr IPC framework and 'len' is used to make the mapping. This will be same for all DSP hardware and not vendor specific.

The IPC service API doesn't meet our needs at all, why don't we modify it? Why don't we add a new one that would be more useful?

We tried the new one approach first. Daniel implemented, but it was rejected in Zephyr architecture review. Full details at zephyrproject-rtos/zephyr#91606

.. so the IPC service based approach (zephyrproject-rtos/zephyr#96233) is following feedback from Zephyr architecture forum. We of course have the option to not use this, and expose driver-specific interfaces instead and make the abstractions on SOF side (i.e. make ipc-zephyr.c -> ipc-zephyr-intel-adsp.c ).

@tmleman
Copy link
Contributor

tmleman commented Nov 7, 2025

@kv2019i wrote:

We of course have the option to not use this, and expose driver-specific interfaces instead and make the abstractions on SOF side (i.e. make ipc-zephyr.c -> ipc-zephyr-intel-adsp.c ).

I think that would be a better approach.

@kv2019i
Copy link
Collaborator

kv2019i commented Nov 7, 2025

I'm still in favor of this PR. This allows to use existing infra to make new backends in Zephyr, and not invest in making new infrastructure on SOF side to handle multiple IPC hw implementations. But let's give a chance to other reviewers to respond and weigh in.

@softwarecki
Copy link
Collaborator

This change seems unnecessary. We are forcing the use of the Zephyr interface, which does not fit our needs. To achieve this, we misuse it by passing the operation type as message length. This approach does not provide any real benefit from using the existing interface. We cannot swap in another driver anyway, because only ours will be incompatible in correct way to work.

I agree with Tomek that sof currently uses two independent hardware for ipc (ipc and memory window), and merging them into a single driver is not a good idea. I would keep ipc handling in sof as a separate component. This way, if someone wants to implement ipc differently (UART, Ethernet, etc.) they can disable intel-zephyr-ipc in the config and enable their own ipc implementation exposing the same interface.

@softwarecki softwarecki self-requested a review November 12, 2025 15:24
Copy link
Contributor

@tmleman tmleman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking before merge. As we agreed internally, I will prepare a PR with proposed changes in the backend. Integration of these changes would complicate its further development.

@dcpleung dcpleung marked this pull request as draft November 12, 2025 17:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants