Data processing scheduler - SOF tasks as Zephyr preemptive threads #7089
Conversation
Force-pushed from c915609 to e7f2fbf
@marcinszkudlinski what will be the difference between DP scheduling and DMA domain?
@iuliana-prodan DMA domain is a DOMAIN; DP is a scheduler that can run in the Timer or DMA domain. In DP each module can have a different tick, you can mix LL (low latency) and DP (data processing) modules, and each module in a single pipeline may work on a separate core: DMIC (1ms, core 0, LL scheduling) --> processing (10ms, core 1, DP scheduling) --> HDA (1ms, core 0, LL scheduling). And the next step: when there are more DP modules running on a single core, e.g. one with a 1ms tick and a second with a 2-second tick (like speech recognition), DP will allow processing using EDF scheduling with preemption and avoid starvation of the 1ms module.
Force-pushed from c71cd7f to b336bad
Force-pushed from b336bad to 31574c3
Excellent V1 @marcinszkudlinski! I really only have one or two bigger issues in the code. I added a lot of typo/grammar comments (as it was just convenient to add them while reading the documentation parts), but these are really minor; I could understand the concepts.
ATM this cannot work with a component on a core other than the pipeline's. The problem is accessing buffer object list elements without locking in places like pipeline_for_each_comp(). We should block attempts to specify a core different from the pipeline core for DP components until such support is implemented.
* Other possible usage of the DP scheduler is to schedule a task with DP_SCHEDULER_RUN_TASK_IMMEDIATELY
* as the start parameter. It will force the task to run without any delay and async to LL.
* This kind of scheduling may be used for tasks like IPC.
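As a rough illustration of the two modes in the quoted comment, a caller might do something like the sketch below. Only DP_SCHEDULER_RUN_TASK_IMMEDIATELY comes from the quoted text; the schedule_task() call, the header paths, and the period value are assumptions about the surrounding SOF API, not code from this PR.

```c
#include <sof/schedule/task.h>		/* struct task (assumed path) */
#include <sof/schedule/schedule.h>	/* generic SOF scheduling API (assumed path) */

/* Hypothetical usage sketch, not code from this PR. */
static void dp_scheduling_example(struct task *ipc_task, struct task *proc_task)
{
	/* run once, as soon as possible, asynchronously to the LL tick */
	schedule_task(ipc_task, DP_SCHEDULER_RUN_TASK_IMMEDIATELY, 0);

	/* periodic DP task, e.g. a module processing 10 ms of audio
	 * (period assumed to be in microseconds)
	 */
	schedule_task(proc_task, 0, 10000);
}
```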
why don't we just use Zephyr APIs directly for those threads? Looks like it would also simplify the DP implementation. What are the advantages for users of using DP for this instead of the Zephyr APIs directly? Besides, the IPC thread doesn't need to run periodically, does it?
- It won't simplify much, just a couple of lines.
- In the future there's a concept of "tasks with budgets", not supported by Zephyr directly. We discussed it with Zephyr and they will only provide a callback to make it possible to implement.
- I would keep an abstraction for a) porting, b) a common API with schedulers like LL that work in a completely different way.
I meant tasks like IPC. IPC with Zephyr is currently using the EDF scheduler, which is just a thin API wrapper around Zephyr work queues, not much more than that.
I know about the current zephyr_EDF, but we need IPC to be preemptive too, as a "task with budget" - IPC should be processed very quickly, but if processing takes too long it must yield to LL. The current Zephyr implementation of EDF, as you said, does not do much and is not really EDF, just "start the earliest-deadline task and hope it finishes processing in time".
@marcinszkudlinski @lyakh Let's not detour into IPC too much here. The current IPC with EDF is clearly not optimal and should be either rewritten on top of native work queues, or something else if we need preemption after the budget has expired (not implemented in SOF so far, so this itself is a new feature).
current "edf" implementation is based on Zephyr work queue and run by a single Zephyr thread. Right, this thread can be preempted. BUT the tasks themselves are cooperative to each other. All the current edf can do is to decide which task from the queue starts first
Indeed, the Zephyr work queue is an in-order list of items to run, it doesn't handle preemption between tasks (which aren't threads, so they can't be preempted, they're just callbacks that run on the same thread). This isn't really a bug, it's just the way the data structure works.
If you want to preempt something, it has to be running in a thread with its own stack that can be suspended.
Note that there's an alternative, the "p4wq", which is a little more complicated, but based on the idea of running queue items in pooled threads such that they all have their own priorities and can preempt each other, but without the overhead of having to spawn one thread per task. This was actually written with SOF in mind originally, as it happens, though it will likely require some work to enable. See https://github.com/zephyrproject-rtos/zephyr/blob/main/include/zephyr/sys/p4wq.h
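For reference, a minimal p4wq usage sketch (not SOF code; the pool size, stack size, priorities and handler names are made up) showing the idea described above: two items submitted to a shared pool of preemptible worker threads, where the more urgent one can preempt the long-running one.

```c
#include <zephyr/kernel.h>
#include <zephyr/sys/p4wq.h>

/* shared pool of preemptible worker threads (sizes are illustrative) */
K_P4WQ_DEFINE(dp_pool, 2 /* worker threads */, 2048 /* stack size each */);

static void slow_handler(struct k_p4wq_work *work)
{
	k_busy_wait(5000);	/* pretend to do ~5 ms of processing */
}

static void urgent_handler(struct k_p4wq_work *work)
{
	/* short, latency-sensitive work */
}

/* the worker thread runs at the item's Zephyr thread priority while
 * executing it (lower number = more urgent)
 */
static struct k_p4wq_work slow_item = {
	.priority = 5,
	.handler = slow_handler,
};

static struct k_p4wq_work urgent_item = {
	.priority = 1,
	.handler = urgent_handler,
};

void p4wq_submit_example(void)
{
	k_p4wq_submit(&dp_pool, &slow_item);
	k_p4wq_submit(&dp_pool, &urgent_item);
}
```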
@andyross maybe I misunderstood your "it will likely require some work to enable", but it is already used by SOF for IDC
Sorry, I read that it was using a k_work_queue? If it's already a p4wq then these tasks are already preemptible (provided the thread pool is large enough) and I'm not sure I see the point of this PR?
So, yes, as a further optimisation we could in the future consider moving this to p4wq, which would allow us to use threads more optimally - a common pool of threads instead of a thread per task, right @andyross?
I assure you it can - at least it works in tests now. The whole point of DP is to mix 1) different ticks 2) different cores.
@lyakh maybe I'm missing something about how the buffers between components and the cache are managed. Let's discuss it in a meeting. As I said, the main reason for DP is to allow offloading time-consuming operations with long buffering / long processing to secondary cores. If it requires some changes in pipeline_for_each_comp(), we need to implement them ASAP and add them to this PR.
Force-pushed from 31574c3 to 66e3feb
@marcinszkudlinski passed tests aren't proof of correctness. We need to design and implement a robust approach to running linked components or pipelines on different cores, instead of enabling it and then fixing the inevitable bugs. This PR also only disables local interrupts as protection while aiming to support multiple cores. It seems like testing wasn't exhaustive.
OK, so we need to fix this issue ASAP - secondary core support is a must.
Force-pushed from 66e3feb to 627a937
Missing commit added - all "resolved" conversations should now be fixed in the code. Please verify.
Sorry, I came by this one again and realized I had some unposted comments. Cleaned them up and am submitting them late. Nothing major.
Just one comment to clarify
Force-pushed from dbea72d to 97f2320
Nitpicks aside, the Zephyr-side usage here all seems fine.
Force-pushed from 97f2320 to 90c917e
Force-pushed from 90c917e to 34d3e98
Some platforms don't use Zephyr, therefore they can't use the DP scheduler. Add a config option. Signed-off-by: Marcin Szkudlinski <[email protected]>
The DP scheduler is a scheduler based on Zephyr preemptible threads. It starts each SOF task as a separate Zephyr thread. In the current implementation the scheduler can trigger each task/thread periodically or on demand. TODO: more sophisticated scheduling decisions, with deadline and task budget calculations. Signed-off-by: Marcin Szkudlinski <[email protected]>
A component needs to keep information about how it needs to be scheduled - as LowLatency or DataProcessing. The information comes from the IPC4 init instance message. Signed-off-by: Marcin Szkudlinski <[email protected]>
Pipeline creation: create a task for each DP module when started on a primary or secondary core, delete a task for each DP module when stopped, and don't call comp_copy in LL context for DP modules. Signed-off-by: Marcin Szkudlinski <[email protected]>
This commit prevents DP modules from running on different cores than the pipeline's LL modules. This limitation is enforced because of possible cache races in pipeline_for_each_comp(). To be removed when a safe implementation is ready. Signed-off-by: Marcin Szkudlinski <[email protected]>
Force-pushed from 34d3e98 to 3b4f87a
Cosmetic changes to satisfy checkpatch
@keqiaozhang can you check the CI queue, both cavs and ace tests have run, but the parent? job is still showing queuing. Thanks.
This was merged with clear failure: https://github.com/thesofproject/sof/actions/runs/4413659612/jobs/7734391466
cc: @lyakh
Fixed. Please check it in new PRs.
"(different address spaces)" fix submitted by @aiChaoSONG in #7286. Thanks!
DP scheduler is a scheduler that creates a separate preemptible Zephyr thread for each SOF task
There's only one instance of DP in the system; however, the threads may run on any core.
Threads are pinned to a specific core, there's no SMP processing.
Task execution may be delayed, and a task may be re-scheduled periodically.
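A minimal sketch of this thread-per-task model in plain Zephyr terms (illustrative only: the stack size, priority, entry function and pinning flow are assumptions, and a real implementation would allocate one thread object per task rather than a single static one):

```c
#include <zephyr/kernel.h>

#define DP_STACK_SIZE	4096
#define DP_THREAD_PRIO	10	/* preemptible, assumed lower priority than LL */

K_THREAD_STACK_DEFINE(dp_stack, DP_STACK_SIZE);
static struct k_thread dp_thread;

static void dp_thread_fn(void *task, void *unused1, void *unused2)
{
	/* here the DP scheduler would loop: wait for a trigger, run the
	 * SOF task, then block again until the next period or trigger
	 */
}

void dp_task_start(void *task, int core)
{
	k_tid_t tid = k_thread_create(&dp_thread, dp_stack,
				      K_THREAD_STACK_SIZEOF(dp_stack),
				      dp_thread_fn, task, NULL, NULL,
				      DP_THREAD_PRIO, 0, K_FOREVER);

	/* pin the thread to one core: no SMP migration
	 * (needs CONFIG_SCHED_CPU_MASK)
	 */
	k_thread_cpu_pin(tid, core);
	k_thread_start(tid);
}
```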
NOTE:
USECASES:
TODOs
EDF:
Threads run at the same priority, lower than the thread running LL tasks. The Zephyr EDF mechanism
is used to decide which thread/task is scheduled next. The DP scheduler calculates
the task deadline and sets it in the Zephyr thread properties; the final scheduling decision is made
by Zephyr.
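A sketch of that mechanism (the helper name and microsecond parameter are assumptions, not the DP scheduler's actual code): the per-task deadline is handed to Zephyr's EDF support, which then orders the ready threads that share the DP priority.

```c
#include <zephyr/kernel.h>

/* Illustrative only; requires CONFIG_SCHED_DEADLINE. */
static void dp_set_deadline(k_tid_t thread, uint32_t deadline_us)
{
	/* Zephyr EDF deadlines are relative and expressed in hardware cycles */
	k_thread_deadline_set(thread, (int)k_us_to_cyc_ceil32(deadline_us));
}
```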
Tasks with budgets:
a task starts at a high priority; after a given execution time the priority is dropped
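Since "tasks with budgets" are only planned, here is one possible sketch of the idea under Zephyr (all names are hypothetical, not part of this PR): the task's thread starts each cycle at a high priority, and a delayed work item demotes it once the budget elapses, so LL and other urgent work can preempt an overrunning task.

```c
#include <zephyr/kernel.h>
#include <zephyr/sys/util.h>

#define BUDGET_HIGH_PRIO	1	/* assumed: runs before other DP work */
#define BUDGET_LOW_PRIO		12	/* assumed: below LL and urgent work */

struct budget_task {
	k_tid_t thread;
	struct k_work_delayable demote_work;
	k_timeout_t budget;
};

static void budget_expired(struct k_work *work)
{
	struct k_work_delayable *dwork = k_work_delayable_from_work(work);
	struct budget_task *bt = CONTAINER_OF(dwork, struct budget_task,
					      demote_work);

	/* budget used up: demote, so more urgent work runs first */
	k_thread_priority_set(bt->thread, BUDGET_LOW_PRIO);
}

static void budget_cycle_start(struct budget_task *bt)
{
	/* new cycle: restore the high priority and re-arm the budget timeout */
	k_thread_priority_set(bt->thread, BUDGET_HIGH_PRIO);
	k_work_reschedule(&bt->demote_work, bt->budget);
}

static void budget_task_init(struct budget_task *bt, k_tid_t thread,
			     k_timeout_t budget)
{
	bt->thread = thread;
	bt->budget = budget;
	k_work_init_delayable(&bt->demote_work, budget_expired);
}
```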