
[Bug] defer_exec created by another deferred task (that has to cancel itself) gets deleted #22697

Closed
stasmarkin opened this issue Dec 18, 2023 · 6 comments · Fixed by #22722

Comments

@stasmarkin

Describe the Bug

Here is my case.
I create a deferred task that cancels itself (I know it's not strictly necessary, but at runtime there may be other tasks to cancel, so sometimes it ends up cancelling itself). If I then create another deferred task right after the cancellation, that new task gets deleted. Let me show the code:

static deferred_token token = INVALID_DEFERRED_TOKEN;

uint32_t another_task(uint32_t trigger_time, void *cb_arg) {
  // Expected to run 1000 ms after some_task, but never does (see below).
  return 0;
}

uint32_t some_task(uint32_t trigger_time, void *cb_arg) {
  cancel_deferred_exec(token);
  defer_exec(1000, another_task, NULL);
  return 0;
}

void keyboard_post_init_user(void) {
  token = defer_exec(1000, some_task, NULL);
}

In that case some_task is called after 1000 ms, but another_task never starts at the 2000 ms mark.

I believe the bug lives in deferred_exec.c, in the function deferred_exec_advanced_task.
It holds a pointer to the entry and executes the callback: entry->callback(entry->trigger_time, entry->cb_arg);
If you call cancel_deferred_exec within that callback, it erases the current entry's state.
The new callback for another_task is then written into that same (now free) entry.
So, after entry->callback returns, deferred_exec_advanced_task erases the current entry's state again, wiping the freshly registered another_task.
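To illustrate, here is a simplified sketch of that post-callback handling (paraphrased, not the verbatim QMK source), with comments marking where the new task gets wiped:

uint32_t delay_ms = entry->callback(entry->trigger_time, entry->cb_arg);
// Inside the callback above:
//   1. cancel_deferred_exec(token) frees this very entry.
//   2. defer_exec(1000, another_task, NULL) reuses the freed slot,
//      so `entry` now describes another_task.

if (delay_ms > 0) {
    entry->trigger_time += delay_ms;   // re-queue
} else {
    // some_task returned 0, so the slot is cleared...
    // ...but the slot now belongs to another_task, which is silently dropped.
    entry->token        = INVALID_DEFERRED_TOKEN;
    entry->trigger_time = 0;
    entry->callback     = NULL;
    entry->cb_arg       = NULL;
}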

Keyboard Used

any

Link to product page (if applicable)

No response

Operating System

No response

qmk doctor Output

No response

Is AutoHotKey / Karabiner installed

  • AutoHotKey (Windows)
  • Karabiner (macOS)

Other keyboard-related software installed

No response

Additional Context

No response

@elpekenin
Copy link
Contributor

elpekenin commented Dec 18, 2023

The return 0; already cancels the task. As such, you don't need to call cancel_deferred_exec... which, to my understanding, is meant for cancelling tasks based on "outside world" conditions rather than the task's own logic (i.e. no bugs due to using freed/in-use memory).

I might be missing some "weird" use case here, but at first sight the only scenario where I can see this (freeing the slot before the return 0;) being needed is when you already have MAX_DEFERRED_EXECUTORS tasks running and you can't add a new one right now, in which case you may just increase the #define by 1.
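For reference, a minimal sketch of that return-value convention mentioned above (illustrative, not taken from this issue; work_is_done is a made-up condition): returning 0 ends the task and the executor frees the slot itself, while returning a delay re-queues it:

uint32_t my_task(uint32_t trigger_time, void *cb_arg) {
    if (work_is_done()) {   // hypothetical condition
        return 0;           // stop: the executor frees this slot for us
    }
    return 500;             // run again in 500 ms
}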

@stasmarkin
Copy link
Author

I totally agree, you usually don't need to cancel a deferred task from within itself.
But I'm building a library with some complex state and many things happening under the hood. So in my case that static deferred_token token may hold some other task's token, because someone may have put another task there (after the previous one was deferred). So it's not necessarily the case that a task cancels itself.

But in the case where it does cancel itself, I can't create a new deferred task.

Well, I already have a workaround: I can defer the new task first and then cancel, and that works.

static deferred_token token = INVALID_DEFERRED_TOKEN;

uint32_t some_task(uint32_t trigger_time, void *cb_arg) {
  defer_exec(1000, another_task, NULL);  // schedule first, so the new task lands in a different slot
  cancel_deferred_exec(token);           // then free this task's slot
  return 0;
}

void keyboard_post_init_user(void) {
  token = defer_exec(1000, some_task, NULL);
}

But anyway, there still seems to be a bug here.

@elpekenin
Contributor

elpekenin commented Dec 18, 2023

static deferred_token token

IMO it would be much better to use cb_arg to pass the token(s) around instead of having a global variable (you probably thought of that already, but I'm dropping the idea just in case).
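A minimal sketch of what I mean (illustrative only; task_state_t and the wiring are made up for this example): pass a pointer to some state that later holds the token, since the token itself only exists after defer_exec returns:

typedef struct {
    deferred_token token;
} task_state_t;

static task_state_t state;   // could just as well live inside your library's context

uint32_t some_task(uint32_t trigger_time, void *cb_arg) {
    task_state_t *s = (task_state_t *)cb_arg;
    cancel_deferred_exec(s->token);   // token was filled in after defer_exec() returned
    return 0;
}

void keyboard_post_init_user(void) {
    state.token = defer_exec(1000, some_task, &state);
}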

workaround

defer_exec(1000, another_task, NULL);
cancel_deferred_exec(token);

Yeah, this would work... Unless, as said in my other message (sorry, I edited it a couple of times), you are already at the maximum number of concurrent tasks and defer_exec fails.
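That failure is at least detectable; a small sketch (assuming defer_exec returns INVALID_DEFERRED_TOKEN when it cannot schedule the task):

deferred_token new_token = defer_exec(1000, another_task, NULL);
if (new_token == INVALID_DEFERRED_TOKEN) {
    // Table was full (or the arguments were invalid): nothing was scheduled.
}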

bug

I'm not the designer of the API, but to my eye this sounds more like a limitation / unintended usage than a bug. I can't (didn't spend much time on it, mind you) think of a clean way to detect the "cancelled the task and created a new one in the same slot during its callback" scenario.

The best idea I can think of right now would be to have some sort of "admin" task whose cb_arg contains a list of tasks (callback [+ cb_arg]) to be scheduled, such that it retries until defer_exec actually returns a valid token and they get popped from the list, and perhaps a list of tokens to be cancelled too.

...But that sounds like quite an overkill compared to simply increasing the number of executors you can have running at the same time and forgetting about cancelling them from within themselves (doing it with the 0 return instead). Since each one only adds a couple of pointers plus a timer check, it wouldn't even have a big impact on resource usage (see the config.h sketch at the end of this comment).

EDIT: I'm aware both of my proposals would still be problematic... the "admin" task may get its own token cancelled if you aren't careful, and the second one wouldn't have a clean way to cancel a task from another task (but you can maybe handle that elsewhere).
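For completeness, raising the executor limit is just a define in config.h (the value here is only an example):

// config.h
#define MAX_DEFERRED_EXECUTORS 16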

@stasmarkin
Author

I'm not sure if this makes sense to you; I don't know the code style of this project or the best practices in C. But to me it seems reasonable to check whether the entry's state was changed right after the callback finishes, and if it has changed, to just skip the other post-callback actions. So the code would be something like this:

deferred_token current_token = entry->token;

// Invoke the callback and work out if we should be requeued
uint32_t delay_ms = entry->callback(entry->trigger_time, entry->cb_arg);

if (entry->token != current_token) {
    // The state of the currently executed task was changed inside the callback
    continue;
}

// Update the trigger time if we have to repeat, otherwise clear it out
if (delay_ms > 0) {
    // Intentionally add just the delay to the existing trigger time -- this ensures the next
    // invocation is with respect to the previous trigger, rather than when it got to execution. Under
    // normal circumstances this won't cause issue, but if another executor is invoked that takes a
    // considerable length of time, then this ensures best-effort timing between invocations.
    entry->trigger_time += delay_ms;
} else {
    // If it was zero, then the callback is cancelling repeated execution. Free up the slot.
    entry->token        = INVALID_DEFERRED_TOKEN;
    entry->trigger_time = 0;
    entry->callback     = NULL;
    entry->cb_arg       = NULL;
}

Does that make any sense? That would fix both your case of reaching the limit and mine.
Sorry, I can't show my code yet, it's still in progress. I think I will publish it next week.

@elpekenin
Contributor

elpekenin commented Dec 20, 2023

It makes sense, but it can still run into the same issue (less likely, though).

cancel_deferred_exec

Cancels the task, freeing up its position (entry) in the table, namely:

  • token = INVALID
  • callback and arg pointers = NULL
  • trigger_time = 0

defer_exec

To create the new task:

  1. The code iterates over the table looking for an empty slot. NOTE: this may be the one we just freed
  2. Upon finding an empty slot, allocate_token is used to assign a token to the new task. NOTE: it might get the token that was just freed, too
  3. The if (entry->token != current_token) check isn't hit
  4. The trigger_time you've just set is overridden with the return value from the callback

EDIT: Nvm, my bad, I didn't actually take a look at allocate_token; it seems the tokens aren't reused, it's just a counter that keeps growing... So your solution should work :))
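For reference, a rough sketch of that kind of monotonically growing token allocator (illustrative only, not the verbatim QMK allocate_token):

static deferred_token token_counter = 0;

static deferred_token allocate_token(void) {
    // Tokens only ever count upwards, so a freed token is never handed out again
    // and the `entry->token != current_token` check can't be fooled by slot reuse.
    return ++token_counter;
}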

@tzarc
Member

tzarc commented Dec 20, 2023

See if #22722 fixes matters.
