Stack size on POSIX needs to account for internal TCB+TLS #1429

Closed
jphickey opened this issue Dec 1, 2023 · 0 comments · Fixed by #1430

jphickey commented Dec 1, 2023

Describe the bug
On POSIX, we noticed that stack usage seems higher than expected.

This is a concern because typical CFE startup scripts do not budget for it: they account only for the stack needed by the app itself, not the overhead imposed by the OS/library.

To Reproduce
This can be confirmed by checking the stackusage on the OSAL functional test (e.g. osal-core-test).

Partial output from running stackusage ./osal-core-test:

2680330  66  2680396      16384      16384       9320    56      0      0x555747f88172 
2680330  67  2680397      16384      16384       9320    56      0      0x555747f88172 
2680330  68  2680398      16384      16384       9320    56      0      0x555747f88172 
2680330  69  2680399      16384      16384       9320    56      0      0x555747f88172 

The 6th column - 9320 - refers to the maximum amount of stack used by one of the child threads spawned in this test.

The concern here is that these threads do nothing:

void task_generic_no_exit(void)
{
    while (1)
    {
        OS_TaskDelay(100);
    }
}

In other words, a thread that does essentially nothing still uses about 9.3 kB of stack space as a baseline.

System observed on:
Debian

Additional context
This is certainly due to the way the glibc pthreads implementation deals with the thread control block (TCB) and thread-local storage (TLS) structures. In non-main threads on the x86-64 platform, it simply puts these structures at the top of the stack.

An excellent description of the topic is here: https://chao-tic.github.io/blog/2018/12/25/tls

This correlates with what we are observing here, in that the TCB+TLS structures consume almost 10kB of stack space before even getting to the entry point of the task.
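
As a rough cross-check of the above, the sketch below spawns one thread and reports how far its first local variable sits below the top of the stack region reported by pthread_getattr_np(). It assumes a downward-growing stack (as on x86-64) and assumes that the reported region includes the TCB+TLS area, which appears to be the case on glibc but is implementation-specific. The function names are hypothetical and are not part of OSAL. This is not the same measurement that stackusage makes, so the numbers will differ.

```c
#define _GNU_SOURCE /* for pthread_getattr_np(), a non-portable GNU extension */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: bytes between the top of the calling thread's
 * reported stack region and a local variable in this frame.
 * ASSUMPTION: the stack grows downward and the reported region includes
 * whatever the C library places at the top of it (e.g. TCB+TLS).
 */
static size_t stack_bytes_above_local(void)
{
    pthread_attr_t attr;
    void          *stack_base = NULL;
    size_t         stack_size = 0;
    size_t         used       = 0;
    volatile char  marker     = 0; /* address-taken, so it lives on the stack */

    if (pthread_getattr_np(pthread_self(), &attr) != 0)
    {
        return 0;
    }

    if (pthread_attr_getstack(&attr, &stack_base, &stack_size) == 0)
    {
        uintptr_t top = (uintptr_t)stack_base + stack_size;
        used = (size_t)(top - (uintptr_t)&marker);
    }

    pthread_attr_destroy(&attr);
    return used;
}

/* Thread entry point: record the offset seen right at task entry */
static void *probe_entry(void *arg)
{
    assert(arg != NULL);
    *(size_t *)arg = stack_bytes_above_local();
    return NULL;
}

/*
 * Hypothetical helper: create one thread with the given stack size and
 * return the offset it observed, or 0 if the thread could not be created.
 */
size_t measure_thread_overhead(size_t stack_size)
{
    pthread_attr_t attr;
    pthread_t      tid;
    size_t         result = 0;

    if (pthread_attr_init(&attr) != 0)
    {
        return 0;
    }

    if (pthread_attr_setstacksize(&attr, stack_size) == 0 &&
        pthread_create(&tid, &attr, probe_entry, &result) == 0)
    {
        pthread_join(tid, NULL);
    }

    pthread_attr_destroy(&attr);
    return result;
}
```

Calling measure_thread_overhead() with a comfortably large size (e.g. 256 kB) from main() and printing the result gives a per-platform estimate of what is consumed before the task entry point does any real work.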

The real problem occurs if the user has fine-tuned/measured their stack usage on an RTOS like VxWorks. Most CFS apps only really need a few kB of stack space to run, so if a cfe_es_startup.scr file calls for a stack of 8192 bytes (because that's what the app needs) then the stack will end up being too small on POSIX because of this extra usage that hadn't been accounted for.

Reporter Info
Joseph Hickey, Vantage Systems, Inc.
