Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] sof: replace log calls with zephyr logging api #5411

Closed
wants to merge 2 commits into from

Conversation

aiChaoSONG
Copy link
Collaborator

The log context in zephyr is per file or module.

To use zephyr logging api, LOG_MODULE_REGISTER is used
to register a log context, LOG_MODULE_DECLARE is used
to refer to the registered context.

For function in header file, LOG_MODULE_DECLARE should
be used within that function to avoid context collapse,
a condition one source file have multiple context
registered or declared.

Signed-off-by: Chao Song [email protected]

@aiChaoSONG aiChaoSONG changed the title sof: replace tr_xxx log calls with native zephyr api [ONLY_FOR_REVIEW] sof: replace tr_xxx log calls with native zephyr api Feb 24, 2022
@aiChaoSONG
Copy link
Collaborator Author

aiChaoSONG commented Feb 24, 2022

CI is not ready to use cavstool.py to read logs from shared memory, waiting the readiness of CI.
comp_xxx, pipe_xx log calls still use XTOS api, WIP.
Tested on TGL up extreme board with xcc.

Build failure on NXP SOCs, because log context not registered for NXP specific source files

@aiChaoSONG aiChaoSONG marked this pull request as draft February 24, 2022 03:04
src/audio/buffer.c Outdated Show resolved Hide resolved
src/audio/channel_map.c Outdated Show resolved Hide resolved
@aiChaoSONG aiChaoSONG changed the title [ONLY_FOR_REVIEW] sof: replace tr_xxx log calls with native zephyr api [ONLY_FOR_REVIEW] sof: replace log calls with zephyr logging api Mar 2, 2022
@aiChaoSONG
Copy link
Collaborator Author

aiChaoSONG commented Mar 2, 2022

version 2:
- apply Liam's suggestion, define an empty LOG_MODULE_REGISTER/LOG_MODULE_DECLARE for XTOS, this helps to remove a lot of #ifdef __ZEPHYR__
- switch dai_xxx and pipe_xxx trace calls to zephyr api
- comp_xxx WIP

Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume a subsequent PR will replace usage of tr_info() etc with LOG_INFO() ?

_TRACE_INV_ID, _TRACE_INV_ID, \
fmt, ##__VA_ARGS__)

#ifdef __ZEPHYR__
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, best to have a comment here say this is tmp until all users have been updated.


#define LOG_MODULE_REGISTER(ctx, level)
#define LOG_MODULE_DECLARE(ctx, level)

Copy link
Member

@lgirdwood lgirdwood Mar 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need to define LOG() here for xtos users, but I'm assuming this will be done in a subsequent PR ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I am not invent a new macro, and use it to wrap LOG_XXX(zephyr) and tr_xxx(XTOS). I just reuse those tr_xxx/comp_xxx/pipe_xxx macro, when sof build with zephyr, they are replaced with LOG_XXX, otherwise, they are replaced with xtos logging utilities.

@lgirdwood
Copy link
Member

Btw - I think this is ready for review, if its part 1 of the work.

Copy link
Collaborator

@marc-hb marc-hb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DECLARE_SOF_RT_UUID() macro should be empty when __ZEPHYR__ is defined. Then the compiler should automatically catch any usage of the older logger.

CI is not ready to use cavstool.py to read logs from shared memory, waiting the readiness of CI.

I believe cavstool.py is only compatible with (uncompressed) printk(), I think it's unrelated to Zephyr logging APIs. @andyross ?

Uncompressed logging is costly and won't fit in some products, to measure it try this one-line patch. It grows the sof-apl.ri Zephyr image size by about 30k (It also makes it crash after a short time but that's probably a different issue)

--- a/src/include/sof/trace/trace.h
+++ b/src/include/sof/trace/trace.h
@@ -216,7 +216,7 @@ void mtrace_event(const char *complete_packet, uint32_t length);
 #    define MTRACE_DUPLICATION_LEVEL LOG_LEVEL_DEBUG
 #  endif
 #else /* copy only ERRORS */
-#  define MTRACE_DUPLICATION_LEVEL LOG_LEVEL_WARNING
+#  define MTRACE_DUPLICATION_LEVEL LOG_LEVEL_DEBUG
 #endif /* CONFIG_TRACEM */
 
 /* This function is _not_ passed the format string to save space */

Tested on TGL up extreme board with xcc.

Did you get any logs? Through which kernel driver and userspace?

I assume a subsequent PR will replace usage of tr_info() etc with LOG_INFO() ?

I don't think we should allow a concurrent and inconsistent mix of tr_...() and LOG_...(), trace.h has way too many layers of indirections and #ifdef conditionals already. Once we have the Zephyr side working then we can do a mass tr_ -> LOG_ search/replace to get rid of all the tr_... calls and at the same time a LOG_...() redirection to the internal and XTOS trace_with_whatever(). So there will be a single logging API use at any given time.

/* create a new component in the pipeline */
struct comp_buffer *buffer_new(const struct sof_ipc_buffer *desc)
{
struct comp_buffer *buffer;

tr_info(&buffer_tr, "buffer new size 0x%x id %d.%d flags 0x%x",
tr_info(&ipc_tr, "buffer new size 0x%x id %d.%d flags 0x%x",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a bug fix or something... can you submit it separately?

/* 1547fe68-de0c-11eb-8461-3158a1294853 */
DECLARE_SOF_UUID("zll-schedule", zll_sched_uuid, 0x1547fe68, 0xde0c, 0x11eb,
DECLARE_SOF_UUID("ll-schedule", zll_sched_uuid, 0x1547fe68, 0xde0c, 0x11eb,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was not a typo. There's some conflict with the ll-schedule in XTOS when you do this, check the git and github history.

Copy link
Collaborator Author

@aiChaoSONG aiChaoSONG Mar 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is by mistake, thank you for spotting this. I was intended to declare a logging context ll-schedule both for src/schedule/zephyr_ll.c and src/schedule/ll_schedule.c. because ll_schedule.c is used with SOF+Zephyr on NXP SOCs.

It's OK to define a "zll-schedule" context and a "ll-schedule" context, but that is not consistent between and NXP.

#define tr_info(ctx, fmt, ...) \
trace_event_with_ids(_TRACE_INV_CLASS, ctx, \
_TRACE_INV_ID, _TRACE_INV_ID, fmt, ##__VA_ARGS__)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are all the non-atomic variants being removed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are moved, not removed. check a little below, they are moved to the #else part.

@andyross
Copy link
Contributor

andyross commented Mar 3, 2022

I believe cavstool.py is only compatible with (uncompressed) printk(), I think it's unrelated to Zephyr logging APIs. @andyross ?

Logging works, there is a log_backend_adsp module that forwards to the same intel_adsp_trace_out() utility used underneath printk(). (And of course LOG_PRINTK tends to be defined in such a circumstance, meaning that printk() itself gets expanded into a log call. Zephyr's log/trace/print subsystems all like to use each other as backends via a bunch of non-orthogonal tricks. It's sort of a mess, but in this case it does work.)

@aiChaoSONG
Copy link
Collaborator Author

Btw - I think this is ready for review, if its part 1 of the work.

I'd rather you wait for one or two days, I am working on the comp_xxx part, it is the last part, but somehow it depends on #5459.

With this change, there will be build issue with SOF+Zephyr on NXP SOCs, I can help to fix it, but I am not able to test the fix, can you suggest something here? Or should we involve NXP friends here?

@marc-hb
Copy link
Collaborator

marc-hb commented Mar 3, 2022

Thanks @andyross for correcting me. Summarizing a further private chat: so SOF/Zephyr has already been using the Zephyr logging system for months through printk(). I didn't realize this because the backend already being used seemed "too simple" to me but it is nevertheless a genuine Zephyr logging backend, more precisely this one: https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/logging/log_backend_adsp.c

The part that I understood correctly: neither this backend nor the (literally) corresponding cavstool.py know anything about dictionaries, which means extending its usage to all log messages (beyond tr_err and tr_warn right now) will significantly increase memory usage - about 30k on APL based on my rough experiment above. Fixing this requires either stacking the dictionary "protocol" on top of the existing protocol or switching to a different backend.

Another unknown is the performance impact of dropping DMA logging and switching to shared memory entirely.

@marc-hb
Copy link
Collaborator

marc-hb commented Mar 3, 2022

CI is not ready to use cavstool.py to read logs from shared memory, waiting the readiness of CI.

I think most of the work is in sof-test, CI shouldn't have to change much. We can "steal" the etrace slot - it's the same shared memory after all. The shared memory is already used by Zephyr and it can already be read by cavstool.py; I've been doing that for months. So the sof-test work can be started already, it's not blocked by this.

@aiChaoSONG
Copy link
Collaborator Author

I think most of the work is in sof-test,

@marc-hb Yes, namely, enable cavstool.py in sof-test.

@aiChaoSONG
Copy link
Collaborator Author

aiChaoSONG commented Mar 3, 2022

Version 3:

  • all dai_xxx, pipe_xxx and tr_xxx calls are using zephyr api, and log to shared memory
    Note: support for SOF+Zephyr on NXP SOCs is not included in this PR

Our CI is not use the latest Zephyr, but a tested stable zephyr, so it fails the same way. the same error message I posted here: #5459 (comment), this zephyr PR maybe related: zephyrproject-rtos/zephyr#43174

Version 4:

  • add supoort for NXP SOCs, all done.

@aiChaoSONG aiChaoSONG changed the title [ONLY_FOR_REVIEW] sof: replace log calls with zephyr logging api [WIP] sof: replace log calls with zephyr logging api Mar 4, 2022
@@ -152,6 +152,26 @@ enum {
/** \brief Retrieves subid (comp id) from the component device */
#define trace_comp_get_subid(comp_p) ((comp_p)->ipc_config.id)

#ifdef __ZEPHYR__
/* class (driver) level (no device object) tracing */
#define comp_cl_err(drv_p, __e, ...) LOG_ERR(__e, ##__VA_ARGS__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason for keeping these legacy macros in common code ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to proceed in steps. This first, very big step is switching the "backend" to the Zephyr logging system. Once it's passing some tests we can start some massive search/replace.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its already passing tests as we are using it in the Zephyr wrapper, this is just unnecessary churn. The aim of this PR should be to use the native Zephyr API and move the wrapper to xtos.

Copy link
Collaborator

@marc-hb marc-hb Mar 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a bit of Zephyr logging already but it's not tested at all, not collected and completely ignored.

I don't see any extra churn in this PR, most of it seems final.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine, the Zephyr parts are tested by Zephyr CI. This gets everything ready for the DMA backend which will use the same kernel interface and flow control on current HW, albeit with a different trace tool from Zephyr.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I wasn't very clear sorry: if we merge this now, we lose all firmware logs in all our Zephyr test results.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So in this case the xtos wrapper for LOG would be used until the Zephyr DMA backend is ready. The etrace parts should work.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For zephyr logging, context is define per module/file, when we use zephyr api, we don't need to pass a context pointer to it. but for XTOS logging calls, we need to pass a context to it. That's why I still use the tr_xxx here, for zephyr, it will be replaced with LOG_XXX, and the context pointer is ignored, for XTOS, everything keeps unchanged.

any reason for keeping these legacy macros in common code ?

Directly replace tr_xxx with zephyr LOG_XXX and provide wrapper for XTOS is not OK, because LOG_XXXs don't know how to handle the context pointer. Or I get it wrong here?

@@ -37,6 +37,18 @@ extern struct tr_ctx pipe_tr;
tr_dbg(&pipe_tr, __e, ##__VA_ARGS__)

/* device tracing */
#ifdef __ZEPHYR__

#define pipe_err(pipe_p, __e, ...) LOG_ERR(__e, ##__VA_ARGS__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto, we should replace all the legacy macros with Zephyr versions.

@marc-hb
Copy link
Collaborator

marc-hb commented Mar 5, 2022

neither this backend nor the (literally) corresponding cavstool.py know anything about dictionaries, which means extending its usage to all log messages (beyond tr_err and tr_warn right now) will significantly increase memory usage - about 30k on APL based on my rough experiment above.

BTW @aiChaoSONG how much does this PR make the image grow?

@aiChaoSONG
Copy link
Collaborator Author

aiChaoSONG commented Mar 7, 2022

@lgirdwood @marc-hb There are some issues, summarized below:

  • The PR can only be built with GCC, there is some linking issue with XCC, tgl_zph_error.txt, some logging context is redefined, I tried to provide a fix in [TEST]audio: component: don't inline functions that have logging #5459, but somehow it broke cmocka test.
  • The XCC compiler linking issue is introduced by the comp_xxx part. With tr_xxx/pipe_xxx/dai_xxx replaced, the build is good with both GCC and XCC.
  • I defined a lot of logging contexts to align with current code, Ex, each dai/audio comp/file in src/lib has a context. Look like there is logging performance issue if we do it in this way. I think it is either because we defined to many logging context, and zephyr won't allow it(need to confirm with Andy/Nashif on Tuesday), or because zephyr is still using share memory backend.

@lgirdwood
Copy link
Member

lgirdwood commented Mar 7, 2022

@lgirdwood @marc-hb There are some issues, summarized below:

Ok, but then this PR is closed as its states the new Zephyr fixes the issue. I've no issue making these non inline, but please state the XCC build errors these fix in the PR/commit message.

  • The XCC compiler linking issue is introduced by the comp_xxx part. With tr_xxx/pipe_xxx/dai_xxx replaced, the build is good with both GCC and XCC.
  • I defined a lot of logging contexts to align with current code, Ex, each dai/audio comp/file in src/lib has a context. Look like there is logging performance issue if we do it in this way. I think it is either because we defined to many logging context, and zephyr won't allow it(need to confirm with Andy/Nashif on Tuesday), or because zephyr is still using share memory backend.

Do you mean the logging is slow ?Can you provide more detail on "performance issue".

@aiChaoSONG
Copy link
Collaborator Author

aiChaoSONG commented Mar 8, 2022

Can you provide more detail on "performance issue".

With tr_xxx/pipe_xxx/dai_xx replaced with zephyr api, and set log level to LOG_LEVEL_DBG

  • logs not print out in real-time: zephyr prints load info periodically, if I start to run aplay on hw:0,0, log print stoped, once I exit aplay, logs come together with severe log drops.
  • severe log drops:
logs from cavstool.py --log-only
[00217260] <dbg> pipe: pipeline_comp_copy: pipeline_comp_copy(), current is from another pipeline and can't be scheduled together
--- 5 messages dropped ---
[00217261] <dbg> pipe: pipeline_task: pipeline_task()
[00217261] <dbg> pipe: pipeline_comp_copy: pipeline_comp_copy(), current->comp.id = 3, dir = 1
[00217261] <dbg> pipe: pipeline_comp_copy: pipeline_comp_copy(), current->comp.id = 2, dir = 1
[00217261] <dbg> pipe: pipeline_comp_copy: pipeline_comp_copy(), current->comp.id = 1, dir = 1
[00217261] <dbg> pipe: pipeline_comp_copy: pipeline_comp_copy(), current->comp.id = 65, dir = 1
--- 6 messages dropped ---
[00217262] <dbg> pipe: pipeline_task: pipeline_task()
--- 1 messages dropped ---
[00217262] <dbg> pipe: pipeline_comp_copy: pipeline_comp_copy(), current->comp.id = 2, dir = 1
[00217262] <dbg> pipe: pipeline_comp_copy: pipeline_comp_copy(), current->comp.id = 1, dir = 1
[00217262] <dbg> pipe: pipeline_comp_copy: pipeline_comp_copy(), current->comp.id = 65, dir = 1
[00217262] <dbg> pipe: pipeline_comp_copy: pipeline_comp_copy(), current is from another pipeline and can't be scheduled together
--- 10 messages dropped ---
[00217263] <dbg> pipe: pipeline_task: pipeline_task() sched
--- 6 messages dropped ---
[00217264] <dbg> pipe: pipeline_trigger_run: execute trigger cmd 0 on pipe 30
--- 2 messages dropped ---
[00217264] <dbg> pipe: pipeline_comp_trigger: pipeline_comp_trigger(), current->comp.id = 1, dir = 0
--- 7 messages dropped ---
[00217264] <inf> ll_schedule: num_tasks 3 total_num_tasks 3
--- 4 messages dropped ---
[00217264] <dbg> pipe: pipeline_comp_reset: pipeline_comp_reset(), current->comp.id = 64, dir = 0
--- 2 messages dropped ---
[00217264] <dbg> pipe: pipeline_comp_reset: pipeline_comp_reset(), current->comp.id = 2, dir = 0
--- 4 messages dropped ---
[00217264] <inf> dai: dai_get type 3 index 0 new sref 2
--- 2 messages dropped ---
[00217264] <inf> ipc: ipc_comp_dai_config() dai type = 3 index = 0
--- 4 messages dropped ---
[00217265] <inf> dma: dma_put(), dma = 0x9e065f70, sref = 0
--- 2 messages dropped ---
[00217265] <inf> ipc: ipc: comp 66 -> free
[00217265] <dbg> ipc: ipc_cmd: ipc: last request 0x30000000 returned 0
--- 1 messages dropped ---
[00217265] <dbg> ipc: ipc_cmd: ipc: last request 0x30000000 returned 0
--- 6 messages dropped ---
[00217265] <inf> pipe: pipeline_free()
--- 1 messages dropped ---
[00217265] <inf> ipc: ipc: new cmd 0x30020000
--- 3 messages dropped ---
[00217266] <inf> ipc: ipc: comp 5 -> free
--- 1 messages dropped ---
[00217266] <inf> ipc: ipc: new cmd 0x30020000

With tr_xxx/pipe_xxx/dai_xx replaced with zephyr api, and set log level to LOG_LEVEL_INF

Only very slight log drop, and log print is realtime.

demo video: https://www.youtube.com/watch?v=l8_kSImcC8s

@marc-hb
Copy link
Collaborator

marc-hb commented Mar 8, 2022

https://docs.zephyrproject.org/latest/reference/logging/index.html

Deferred logging reduces the time needed to log a message by shifting time consuming operations to a known context instead of processing and sending the log message when called.

@lgirdwood
Copy link
Member

@aiChaoSONG can you confirm the log timestamps are correct and not delayed ? DBG level is heavy (especially when using mailbox), but we can track this and retest when DMA backend is enabled.

@keqiaozhang
Copy link
Collaborator

SOFCI TEST

1 similar comment
@lgirdwood
Copy link
Member

SOFCI TEST

@lgirdwood
Copy link
Member

looks like CI got stuck, unblocking.

Multiple use of static inline functions across different C
files will result in same symbol names defined in all of the
corresponding object files, because XCC compiler emits the
same symbol names based on the source file for those static
variables inside functions.

If Zephyr logging is used in SOF, we will have log context
redefinition issue with XCC due to above reason.

This patch workarounds the issue by removing the log calls
in static inline functions that are used across multiple C
files if Zephyr is used.

BugLink: zephyrproject-rtos/zephyr#43786

Signed-off-by: Chao Song <[email protected]>
The log context in zephyr is per file or module.

To use zephyr logging api, LOG_MODULE_REGISTER is used
to register a log context, LOG_MODULE_DECLARE is used
to refer to the registered context.

For function in header file, LOG_MODULE_DECLARE should
be used within that function to avoid context collapse,
a condition one source file have multiple context
registered or declared.

Signed-off-by: Chao Song <[email protected]>
@sys-pt1s
Copy link

Can one of the admins verify this patch?

@lgirdwood
Copy link
Member

@aiChaoSONG can you fix conflicts so we can rerun CI. Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants