-
Notifications
You must be signed in to change notification settings - Fork 204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CFE_ES_ScanAppTable() possible race conditions #295
Comments
Imported from trac issue 264. Created by jphickey on 2019-03-25T17:01:55, last modified: 2019-06-24T11:44:20 |
CFE_ES_RunLoop, CFE_ES_DeleteApp, and CFE_ES_ReloadApp all lock ES shared data already. |
I've looked into this already and I have a proposal/plan which is mostly implemented but just needs to be tested on all the various platforms. The proposal is to introduce a more consistent notion of a "background job" in ES. We have several of these scattered about, including the Performance Log dump (separate ticket #456) and they are all implemented differently and inconsistently. Worth noting that ScanAppTable has other issues besides the locking, in particular the timers are based on "scan intervals" which are not consistent, as they happen after every message (incl. SCH "send tlm") as well as after a period of message inactivity. At any rate, my proposal is to have ES start a single, low-priority thread for all of these background jobs. The jobs will be implemented as state machines which perform some (limited) work on each invocation. The prototype will be:
The Both the periodic ScanAppTable and the Performance Log file dump fit nicely into this model, as well as (probably) any other longer-running jobs that occur in ES. |
Create a new background job to handle the maintenance tasks that had been performed in the ES main task as part of the CFE_ES_ScanAppTable() routine. All app state changes, including those invoked by messages, are now handled by this job. This also slightly changes the semantics of CFE_ES_RunLoop and CFE_ES_ExitApp. Now, the CFE_ES_RunLoop routine no longer requires a RunStatus buffer. Instead, the only thing that matters is the RunStatus value that is eventually passed to CFE_ES_ExitApp after the shutdown is complete. This should be mostly backward compatible, as the recommended app pattern would pass the same value to both functions. This commit also fixes nasa#480, as the value passed to CFE_ES_ExitApp will not override a request that was already pending.
Fix #295, Resolve app table scanning race conditions
The
CFE_ES_ScanAppTable()
is called from the ES message processing thread, but it does not lock the global ES data structure when reading/writing from the global data.ES software bus command messages are safe because this function is called by the same thread that is processing those messages, therefore concurrency is not possible here.
But other functions, like
CFE_ES_DeleteApp()
andCFE_ES_ReloadApp()
are part of the public API and these also update the same fields within the app state data structure thatCFE_ES_ScanAppTable()
is reading.CFE_ES_RunLoop()
can also modify fields within this structure and this is called by pretty much every app in the system.This issue was noted while reviewing the fix for a similar issue in #279.
The text was updated successfully, but these errors were encountered: