Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

drivers/syslog: add mutex to syslog_default_write #9404

Merged

Conversation

tmedicci
Copy link
Contributor

@tmedicci tmedicci commented May 26, 2023

Summary

This commit reverts 19f269e and substitutes the semaphore to a mutex, used as a locking mechanism avoid syslog messages being messed.

Considering SMP, threads running on different CPUs could call syslog_default_write simultaneously. It then calls up_nputs that, in most of the implementations, calls up_putc multiple times to write to the serial. So, calling it at the same would cause syslog messages to interfere with each other.

up_nputs and up_putc are low-level functions that could be used during initialization and by the interrupt handler. Hence, it isn't advisable to implement any locking mechanism on them. On the other hand, syslog can also be used from the interrupt: if the interrupt buffer is selected (CONFIG_SYSLOG_INTBUFFER=y), the syslog messages will be flushed when we aren't serving an interrupt. On the contrary, if it isn't set, it would call the registered sc_force callback, which calls syslog_default_putc instead of the lock-protected syslog_default_write.

Also, this PR enables the syslog buffering on xtensa-based devices to ensure concurrent tasks will not mess with the syslog messages.

Impact

Please refer to #3599 (comment): this PR ensures the syslog generation/capture is thread-safe by using syslog buffering. It also ensures that publication is thread-safe by using a lock mechanism in syslog_default_write, protecting up_nputs from being called by other task/thread.

Testing

Internal CI testing + esp32s2-devkit:sta_softap with the following configs added:

CONFIG_DEBUG_ASSERTIONS=y
CONFIG_DEBUG_ASSERTIONS_EXPRESSION=y
CONFIG_DEBUG_ERROR=y
CONFIG_DEBUG_FEATURES=y
CONFIG_DEBUG_FS=y
CONFIG_DEBUG_FS_ERROR=y
CONFIG_DEBUG_FS_INFO=y
CONFIG_DEBUG_FS_WARN=y
CONFIG_DEBUG_INFO=y
CONFIG_DEBUG_NET=y
CONFIG_DEBUG_WARN=y
CONFIG_DEBUG_WIRELESS=y
CONFIG_DEBUG_WIRELESS_ERROR=y
CONFIG_DEBUG_WIRELESS_INFO=y
CONFIG_DEBUG_WIRELESS_WARN=y
CONFIG_ESP_WPA_DEBUG_PRINT=y
CONFIG_SYSLOG_BUFFER=y

which produced the following messages during startup prior to the proposed solution:

[CPU0] esp_event_post: Event: base=WIFI_EVENT id=2 data=0 data_size=0 ticks=4294967295
[CPU1] esp_wifi_sta_start: OK to start Wi-Fi station
[CPU0][ CPesU1p_]e svtpi_wffors_k_sctab:t: W mi-ouFin tpstt=a 0xst3farct9f
368 relp[atCPh=U0w]ap Ii. (co3n05f ) buwiff=0i:x
 fc[9CPe1U0e0]
 Set ps type: 0
[CPU0]

@@ -39,8 +40,14 @@

void up_nputs(const char *str, size_t len)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can up_nputs be used from the interrupt context?

Copy link
Contributor Author

@tmedicci tmedicci May 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't supposed to... I've checked the code for situations and I couldn't find any misuse of it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since up_putc can be called in the interrupt context, and up_nputs is an accelerated version of up_putc, it's better to make up_nputc work in the interrupt handler too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, look like we need patch all arch up_nputs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, look like we need patch all arch up_nputs?

Yes, it's advisable to. Should we do this in this PR? (I can't test other architectures) or submit PRs specific to each arch?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So either (1) up_nputs() must not use semaphores or (2) up_nputs() must not be used prior to calling nxsem_initialize(). There is a global variable g_nx_init_state that can also be used to check where nx_start is in the initialization sequence.

I don't think that it is a issue here, but one other thing that we should all keep in mind: As a general rule, we should not use semaphores or mutexes on the IDLE thread. The IDLE thread is the thread that runs when you boot up the system. All initialization is performed on the IDLE thread, hence, it is sometimes called the initialization thread.

Why not? Because of the result of calling sem_wait() were to cause the IDLE thread to attempt to wait, the system will crash. The IDLE thread can never wait and must always be in the ready-to-run state.

This is not normally a problem during initialization because the system is running single-threaded during most of initialization and sem_wait() will never block while running single-threaded.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @patacongo !

Thank you for your observations. Indeed, if the up_nputs is meant to be used by the interrupt handler, it shouldn't use semaphores.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiaoxiang781216 , I was thinking about 1a048f8 (from #6618):

Could it be reverted (or a mutex be added there)? If syslog is called from the interrupt handler, it won't end up calling syslog_default_write (it'd use the registered sc_force instead, which is syslog_default_putc. By doing that, we can avoid creating a "thread-safe" version of up_nputs.

What do you think about it? It solves this same issue from an upper level.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but it's better to use mutex_t instead semaphore

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR updated!

@tmedicci tmedicci force-pushed the bugfix/concurrent_syslog_msgs branch from 2596543 to cb501e5 Compare May 26, 2023 18:01
@tmedicci tmedicci marked this pull request as draft May 30, 2023 17:26
This commit reverts 19f269e and
substitutes the semaphore to a mutex, used as a locking mechanism
to avoid syslog messages being messed.

Considering SMP, threads running on different CPUs could call
syslog_default_write simultaneously. It then calls `up_nputs` that,
in most of the implementations, calls `up_putc` multiple times to
write to the serial. So, calling it at the same would cause syslog
messages to interfere with each other.

`up_nputs` and `up_putc` are low-level functions that could be used
during initialization and by the interrupt handler. Hence, it isn't
advisable to implement any locking mechanism on them. On the other
hand, syslog can also be used from the interrupt: if the interrupt
buffer is selected (`CONFIG_SYSLOG_INTBUFFER=y`), the syslog
messages will be flushed when we aren't serving an interrupt.
On the contrary, if it isn't set, it would call the registered
`sc_force` callback, which calls `syslog_default_putc` instead of
the lock-protected `syslog_default_write`.
During initialization it isn't possible to use up_puts once it's
protected against concurrent access through a mutex lock. Instead,
using up_putc makes it similar to ESP32S2 and ESP32S3 and perfectly
fits for showprogress usage.
By enabling the syslog buffering on xtensa-based devices, it's
possible to ensure concurrent tasks will not mess with the syslog
messages.
@tmedicci tmedicci force-pushed the bugfix/concurrent_syslog_msgs branch from cb501e5 to 4bfde91 Compare June 1, 2023 18:46
@tmedicci tmedicci changed the title xtensa/nputs: add a mutex to prevent concurrent tasks from writing drivers/syslog: add mutex to syslog_default_write Jun 1, 2023
@tmedicci tmedicci marked this pull request as ready for review June 1, 2023 18:50
@xiaoxiang781216 xiaoxiang781216 merged commit 223c5a3 into apache:master Jun 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants