
Data processing scheduler - SOF tasks as Zephyr preemptive threads #7089

Merged: 5 commits into thesofproject:main, Mar 14, 2023

Conversation

marcinszkudlinski
Contributor

The DP scheduler creates a separate preemptible Zephyr thread for each SOF task.
There's only one instance of DP in the system; however, the threads may run on any core.
Each thread is pinned to a specific core; there's no SMP processing.

Task execution may be delayed, and a task may be re-scheduled periodically.

NOTE:

  • task - means a SOF task
  • thread - means a Zephyr preemptible thread

USE CASES:

  • to run modules whose processing time is longer than 1 LL cycle, e.g. speech recognition, which may run for seconds
  • to create pipelines where some modules are executed on other cores than the pipeline's main LL thread
  • in general, to run a Zephyr thread using the SOF task abstraction (like the IPC thread)

TODOs

EDF:
Threads run at the same priority, lower than the thread running LL tasks. The Zephyr EDF
mechanism is used to decide which thread/task is scheduled next. The DP scheduler calculates
each task's deadline and sets it in the Zephyr thread properties; the final scheduling
decision is made by Zephyr.

task with budgets:
a task starts at high priority; after a given execution time the priority is dropped
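To illustrate the EDF decision described above, here is a minimal host-side sketch (hypothetical names and types, not SOF code; in the actual implementation the DP scheduler only computes and sets each thread's deadline in the Zephyr thread properties, and Zephyr itself makes the pick):

```c
/*
 * Illustrative host-side sketch of the EDF pick: among ready tasks,
 * the one with the earliest absolute deadline runs first. All names
 * here are hypothetical; in SOF the actual selection is delegated to
 * Zephyr (CONFIG_SCHED_DEADLINE).
 */
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

struct dp_task_sketch {
	const char *name;
	uint64_t deadline;	/* absolute deadline, e.g. in scheduler ticks */
	bool ready;
};

/* Return the ready task with the earliest deadline, or NULL if none. */
static struct dp_task_sketch *edf_pick(struct dp_task_sketch *tasks, size_t n)
{
	struct dp_task_sketch *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (!tasks[i].ready)
			continue;
		if (!best || tasks[i].deadline < best->deadline)
			best = &tasks[i];
	}
	return best;
}
```

Note that a task that is not ready is skipped even if its deadline is the earliest; this is exactly why the scheduler only needs to keep deadlines up to date and can leave the run decision to the kernel.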

@iuliana-prodan
Contributor

@marcinszkudlinski what will be the difference between DP scheduling and the DMA domain?
They both use Zephyr threads, but one will be preempted by the DMA IRQ (DMA domain) and the other by the timer (DP scheduling)?

@marcinszkudlinski
Contributor Author

marcinszkudlinski commented Feb 14, 2023

@iuliana-prodan the DMA domain is a DOMAIN; DP is a scheduler that can run in the timer or DMA domain.
The DMA domain is an LL scheduler that uses one Zephyr thread; DP uses a separate thread for each component/module.

With DP each module can have a different tick, you can mix LL (low latency) and DP (data processing) modules, and each module in a single pipeline may work on a separate core,
i.e. you can create a pipeline like:

DMIC (1ms, core 0, LL scheduling) --> processing (10ms, core 1, DP scheduling) --> HDA (1ms, core 0, LL scheduling)
Of course the last module would need to wait 10 ms before the first processed data arrives (or process silence until then).

And as a next step: when there are more DP modules running on a single core, e.g. one with a 1 ms tick and a second with a 2 s tick (like speech recognition), DP will allow processing using EDF scheduling with preemption and avoid starvation of the 1 ms module.
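The startup latency in the pipeline example above can be made concrete with a small sketch (hypothetical helper, not SOF code): a DP module with a 10 ms period fed by a 1 ms LL source must accumulate 10 LL buffers before its first run, which is exactly the 10 ms wait (or silence) mentioned.

```c
/*
 * Hypothetical helper, not SOF code: how many LL cycles must elapse
 * before a DP module has a full input block, i.e. the extra startup
 * latency the downstream LL module sees. Rounds up when the DP period
 * is not an exact multiple of the LL period.
 */
#include <stdint.h>

static uint32_t dp_startup_ll_cycles(uint32_t ll_period_us, uint32_t dp_period_us)
{
	return (dp_period_us + ll_period_us - 1) / ll_period_us;
}
```

For the DMIC (1 ms) -> processing (10 ms) -> HDA pipeline this gives 10 LL cycles, i.e. 10 ms of added latency before the first DP output.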

@kv2019i (Collaborator) left a comment:

Excellent V1 @marcinszkudlinski! I really only have one or two bigger issues in the code. I added a lot of typo/grammar comments (as it was just convenient to add them while reading the documentation parts), but these are really minor; I could understand the concepts.

Review comments on: zephyr/Kconfig, src/include/sof/schedule/dp_schedule.h, src/schedule/zephyr_dp_schedule.c, src/include/sof/audio/component.h, src/audio/component.c
@lyakh (Collaborator) left a comment:

ATM this cannot work with a component on a core other than the pipeline. The problem is accessing buffer object list elements without locking in places like pipeline_for_each_comp(). We should block attempts to specify a core different from the pipeline core for DP components until such support is implemented.

Review comments on: src/ipc/ipc4/helper.c, src/audio/pipeline/pipeline-params.c
*
* Another possible usage of the DP scheduler is to schedule a task with DP_SCHEDULER_RUN_TASK_IMMEDIATELY
* as the start parameter. It will force the task to run without any delays, asynchronously to LL.
* This kind of scheduling may be used for tasks like IPC.
Collaborator:

why don't we just use Zephyr APIs directly for those threads? Looks like it would also simplify the DP implementation. What are the advantages for users to use DP for this instead of Zephyr APIs directly? Besides the IPC thread doesn't need to run periodically, does it?

Contributor Author:

  1. It won't simplify much, just a couple of lines
  2. In the future there's a concept of "tasks with budgets", not supported by Zephyr directly. We discussed it with the Zephyr team and they only provide a callback that makes it possible to implement
  3. I would keep an abstraction for a) porting b) a common API with schedulers like LL that work in a completely different way

Collaborator:

I meant tasks like IPC. IPC with Zephyr currently uses the EDF scheduler, which is just a thin API wrapper around Zephyr work queues, not much more than that.

Contributor Author:

I know about the current zephyr_EDF, but we need IPC to be preemptible as well, as a "task with budget": IPC should be processed very quickly, but if processing takes too long it must yield to LL. The current Zephyr implementation of EDF, as you said, does not do much and is not really EDF, just "start the earliest-deadline task and hope it finishes processing in time".
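The "task with budget" idea being discussed can be sketched on the host like this (hypothetical names and logic, not SOF or Zephyr code): the task keeps a high priority while within its execution budget and is demoted once the budget is spent, so LL work can preempt it.

```c
/*
 * Conceptual sketch of a "task with budget", hypothetical host-side
 * code. While consumed execution time is within the budget the task
 * keeps its boosted priority; once the budget is spent it is demoted.
 * Zephyr convention: a numerically lower priority is more urgent.
 */
#include <stdint.h>

struct budget_task {
	uint32_t budget_us;	/* allowed execution time per period */
	uint32_t consumed_us;	/* execution time used so far */
	int high_prio;		/* priority while within budget */
	int low_prio;		/* priority after budget is exhausted */
};

/* Account for 'ran_us' of execution; return the priority to use now. */
static int budget_task_account(struct budget_task *t, uint32_t ran_us)
{
	t->consumed_us += ran_us;
	return t->consumed_us < t->budget_us ? t->high_prio : t->low_prio;
}

/* At the start of each period the budget is replenished. */
static void budget_task_replenish(struct budget_task *t)
{
	t->consumed_us = 0;
}
```

In a real implementation the accounting would have to be driven by a kernel callback on thread switch-out, which is the hook mentioned in the discussion with the Zephyr team.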

Collaborator:

@marcinszkudlinski @lyakh Let's not detour to IPC too much here. The current IPC with EDF is clearly not optimal and should either be rewritten on top of native work queues, or something else if we need preemption after the budget has expired (not implemented so far in SOF, so this itself is a new feature).

marcinszkudlinski (Contributor Author) commented Feb 21, 2023:

The current "edf" implementation is based on a Zephyr work queue and run by a single Zephyr thread. Right, this thread can be preempted, BUT the tasks themselves are cooperative with each other. All the current edf can do is decide which task from the queue starts first.

Contributor:

Indeed, the Zephyr work queue is an in-order list of items to run; it doesn't handle preemption between tasks (which aren't threads, so they can't be preempted; they're just callbacks that run on the same thread). This isn't really a bug, it's just the way the data structure works.

If you want to preempt something, it has to be running in a thread with its own stack that can be suspended.

Note that there's an alternative, the "p4wq", which is a little more complicated, but based on the idea of running queue items in pooled threads such that they all have their own priorities and can preempt each other, but without the overhead of having to spawn one thread per task. This was actually written with SOF in mind originally, as it happens, though it will likely require some work to enable. See https://github.com/zephyrproject-rtos/zephyr/blob/main/include/zephyr/sys/p4wq.h

Collaborator:

@andyross maybe I misunderstood your "it will likely require some work to enable", but it is already used by SOF for IDC

Contributor:

Sorry, I had read that it was using a k_work_queue? If it's already a p4wq then these tasks are already preemptible (provided the thread pool is large enough) and I'm not sure I see the point of this PR?

Collaborator:

So, yes, as a further optimisation we could in the future consider moving this to p4wq; that would allow us to use threads more optimally, with a common pool of threads instead of a thread per task, right @andyross?

Review comments on: src/include/sof/schedule/dp_schedule.h, src/schedule/zephyr_dp_schedule.c, src/audio/pipeline/pipeline-schedule.c, src/include/sof/audio/component_ext.h
@marcinszkudlinski
Contributor Author

ATM this cannot work with a component on a core other than the pipeline. The problem is accessing buffer object list elements without locking in places like pipeline_for_each_comp(). We should block attempts to specify a core different from the pipeline core for DP components until such support is implemented.

I assure you it can; at least it works in tests now. The whole point of DP is to mix 1) different ticks, 2) different cores.

@marcinszkudlinski
Contributor Author

@lyakh maybe I'm missing something about how the buffers between components and the cache are managed. Let's discuss it in a meeting. As I said, the main reason for DP is to allow offloading time-consuming operations with long buffering / long processing to secondary cores. If it requires some changes in pipeline_for_each_comp(), we need to implement them ASAP and add them to this PR.

@lyakh
Collaborator

lyakh commented Feb 16, 2023

ATM this cannot work with a component on a core other than the pipeline. The problem is accessing buffer object list elements without locking in places like pipeline_for_each_comp(). We should block attempts to specify a core different from the pipeline core for DP components until such support is implemented.

I assure you it can; at least it works in tests now. The whole point of DP is to mix 1) different ticks, 2) different cores.

@marcinszkudlinski passed tests aren't a proof of correctness. We need to design and implement a robust approach to running linked components or pipelines on different cores instead of enabling it and then fixing inevitable bugs. This PR also only disables local interrupts as protection while aiming to support multiple cores. It seems the testing wasn't extensive.

@marcinszkudlinski
Contributor Author

ATM this cannot work with a component on a core other than the pipeline. The problem is accessing buffer object list elements without locking in places like pipeline_for_each_comp(). We should block attempts to specify a core different from the pipeline core for DP components until such support is implemented.

I assure you it can; at least it works in tests now. The whole point of DP is to mix 1) different ticks, 2) different cores.

@marcinszkudlinski passed tests aren't a proof of correctness. We need to design and implement a robust approach to running linked components or pipelines on different cores instead of enabling it and then fixing inevitable bugs. This PR also only disables local interrupts as protection while aiming to support multiple cores. It seems the testing wasn't extensive.

OK, so we need to fix this issue ASAP; secondary-core support is a must

@marcinszkudlinski
Contributor Author

marcinszkudlinski commented Feb 16, 2023

Missing commit added - all "resolved" conversations should now be fixed in the code. Please verify.

@andyross (Contributor) left a comment:

Sorry, I came by this one again and realized I had some unposted comments. Cleaned them up and am submitting them late. Nothing major.

Review comments on: src/schedule/zephyr_dp_schedule.c, src/audio/pipeline/pipeline-schedule.c
@lyakh (Collaborator) left a comment:

Just one comment to clarify

Review comment on: src/include/sof/audio/component_ext.h
@andyross (Contributor) left a comment:

Nitpicks aside, the Zephyr-side usage here all seems fine.

Some platforms don't use Zephyr and therefore can't
use the DP scheduler. Add a config option.

Signed-off-by: Marcin Szkudlinski <[email protected]>
The DP scheduler is a scheduler based on Zephyr preemptible
threads. It starts each SOF task as a separate Zephyr
thread.
In the current implementation the scheduler can trigger each
task/thread periodically or on demand.
TODO: more sophisticated scheduling decisions, with deadline
and task budget calculations.

Signed-off-by: Marcin Szkudlinski <[email protected]>
A component needs to keep information about how it is to be
scheduled - as LowLatency or DataProcessing. The information comes
from the IPC4 init instance message.

Signed-off-by: Marcin Szkudlinski <[email protected]>
Pipeline creation:
 - create a task for each DP module when it is started
   on the primary or a secondary core
 - delete the task for each DP module when it is stopped
 - don't call comp_copy in LL context for DP modules

Signed-off-by: Marcin Szkudlinski <[email protected]>
This commit prevents DP modules from running on different cores than
the pipeline's LL modules. This limitation is enforced because of
possible cache races in pipeline_for_each_comp().
To be removed once a safe implementation is ready.

Signed-off-by: Marcin Szkudlinski <[email protected]>
@marcinszkudlinski
Contributor Author

Cosmetic changes to satisfy checkpatch

@lgirdwood
Member

@keqiaozhang can you check the CI queue, both cavs and ace tests have run, but the parent? job is still showing queuing. Thanks.

@lgirdwood lgirdwood merged commit ada31db into thesofproject:main Mar 14, 2023
@marc-hb
Collaborator

marc-hb commented Mar 14, 2023

This was merged with a clear failure: https://github.com/thesofproject/sof/actions/runs/4413659612/jobs/7734391466

/zep_workspace/sof/src/schedule/zephyr_dp_schedule.c:134:20: warning: incorrect type in argument 1 (different address spaces)
/zep_workspace/sof/src/schedule/zephyr_dp_schedule.c:348:36: warning: incorrect type in assignment (different address spaces)

cc: @lyakh

@keqiaozhang
Collaborator

@keqiaozhang can you check the CI queue, both cavs and ace tests have run, but the parent? job is still showing queuing. Thanks.

Fixed. Please check it in new PRs.

@marc-hb
Collaborator

marc-hb commented Mar 16, 2023

"(different address spaces)" fix submitted by @aiChaoSONG in #7286. thanks!

@marcinszkudlinski marcinszkudlinski deleted the dp_scheduler_public branch June 7, 2023 10:25