input: non-thread-safe access around mem_buf_status #3854

Closed
rittneje opened this issue Jul 25, 2021 · 13 comments

@rittneje
Contributor

rittneje commented Jul 25, 2021

The mem_buf_status field of struct flb_input_instance is read and written across two threads without synchronization.

For example, it gets read in one thread here:

    /* Check if the input plugin has been paused */
    if (flb_input_buf_paused(in) == FLB_TRUE) {
        flb_debug("[input chunk] %s is paused, cannot append records",
                  in->name);
        return -1;
    }

and gets written by another thread here:

fluent-bit/src/flb_input.c

Lines 873 to 883 in 5d51473

    mk_list_foreach(head, &config->inputs) {
        in = mk_list_entry(head, struct flb_input_instance, _head);
        if (flb_input_buf_paused(in) == FLB_FALSE) {
            if (in->p->cb_pause && in->context) {
                flb_info("[input] pausing %s", flb_input_name(in));
                in->p->cb_pause(in->context, in->config);
            }
            paused++;
        }
        in->mem_buf_status = FLB_INPUT_PAUSED;
    }

From what I understand, this kind of unsynchronized access is a data race and might not work as expected, particularly if each thread is running on a different CPU, so it does need to be fixed.

One possible solution is to protect that field with a pthread mutex. Another is to use atomic_int or similar, but that relies on C11 <stdatomic.h>, and it sounds like that's not portable to every compiler. A third is to mark the field as volatile, but volatile guarantees neither atomicity nor ordering between threads, so that may not fully resolve the issue.
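To make the first option concrete, here is a minimal sketch of guarding the status field with a pthread mutex. Everything prefixed with demo_ or DEMO_ is hypothetical and exists only for illustration: the struct is a simplified stand-in for struct flb_input_instance, the constants are arbitrary placeholders for FLB_INPUT_RUNNING and FLB_INPUT_PAUSED, and none of this is the project's actual code or a proposed patch.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical placeholder values, not the real FLB_INPUT_* constants. */
enum { DEMO_INPUT_RUNNING = 1, DEMO_INPUT_PAUSED = 0 };

/* Hypothetical, simplified stand-in for struct flb_input_instance. */
struct demo_input {
    pthread_mutex_t status_lock;   /* guards mem_buf_status */
    int mem_buf_status;
};

void demo_input_init(struct demo_input *in)
{
    assert(in != NULL);
    pthread_mutex_init(&in->status_lock, NULL);
    in->mem_buf_status = DEMO_INPUT_RUNNING;
}

void demo_input_destroy(struct demo_input *in)
{
    pthread_mutex_destroy(&in->status_lock);
}

/* Writer side: used in place of a bare store to mem_buf_status. */
void demo_input_set_status(struct demo_input *in, int status)
{
    pthread_mutex_lock(&in->status_lock);
    in->mem_buf_status = status;
    pthread_mutex_unlock(&in->status_lock);
}

/* Reader side: returns 1 if the input is paused, 0 otherwise. */
int demo_input_buf_paused(struct demo_input *in)
{
    int paused;

    pthread_mutex_lock(&in->status_lock);
    paused = (in->mem_buf_status == DEMO_INPUT_PAUSED);
    pthread_mutex_unlock(&in->status_lock);
    return paused;
}

/* Pipeline-like thread: polls until it observes the paused state. */
void *demo_reader(void *arg)
{
    struct demo_input *in = arg;

    while (!demo_input_buf_paused(in)) {
        sched_yield();
    }
    return NULL;
}

/* Returns 1 if the reader thread saw the pause and exited, 0 on error. */
int demo_run(void)
{
    struct demo_input in;
    pthread_t tid;
    int ok;

    demo_input_init(&in);
    if (pthread_create(&tid, NULL, demo_reader, &in) != 0) {
        demo_input_destroy(&in);
        return 0;
    }
    /* The "main thread" pauses the input while the reader is polling. */
    demo_input_set_status(&in, DEMO_INPUT_PAUSED);
    pthread_join(tid, NULL);
    ok = demo_input_buf_paused(&in);
    demo_input_destroy(&in);
    return ok;
}
```

Calling demo_run() from a test and building with -fsanitize=thread should not report a race on the field, because every read and write goes through the same lock. Where <stdatomic.h> is available, an atomic_int with atomic_load and atomic_store would have the same shape without the mutex.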

This race condition was detected by the thread sanitizer while running the filter_grep unit tests.

8: ==================
8: WARNING: ThreadSanitizer: data race (pid=32648)
8: Write of size 4 at 0x7b4c00002580 by main thread:
8: #0 flb_input_pause_all /home/jrittner/fluent-bit/src/flb_input.c:882 (flb-rt-filter_grep+0x68021)
8: #1 flb_engine_exit /home/jrittner/fluent-bit/src/flb_engine.c:779 (flb-rt-filter_grep+0x8493e)
8: #2 flb_stop /home/jrittner/fluent-bit/src/flb_lib.c:719 (flb-rt-filter_grep+0x62be3)
8: #3 flb_test_filter_grep_regex /home/jrittner/fluent-bit/tests/runtime/filter_grep.c:55 (flb-rt-filter_grep+0x5f980)
8: #4 test_do_run_ /home/jrittner/fluent-bit/tests/runtime/../lib/acutest/acutest.h:1007 (flb-rt-filter_grep+0x5d12e)
8: #5 test_run_ /home/jrittner/fluent-bit/tests/runtime/../lib/acutest/acutest.h:1103 (flb-rt-filter_grep+0x5d4ae)
8: #6 main /home/jrittner/fluent-bit/tests/runtime/../lib/acutest/acutest.h:1700 (flb-rt-filter_grep+0x5f06a)
8:
8: Previous read of size 4 at 0x7b4c00002580 by thread T1:
8: #0 flb_input_buf_paused /home/jrittner/fluent-bit/include/fluent-bit/flb_input.h:484 (flb-rt-filter_grep+0xbf9ba)
8: #1 flb_input_chunk_append_raw /home/jrittner/fluent-bit/src/flb_input_chunk.c:888 (flb-rt-filter_grep+0xc2551)
8: #2 in_lib_collect /home/jrittner/fluent-bit/plugins/in_lib/in_lib.c:93 (flb-rt-filter_grep+0x14c293)
8: #3 flb_input_collector_fd /home/jrittner/fluent-bit/src/flb_input.c:1075 (flb-rt-filter_grep+0x6900a)
8: #4 flb_engine_handle_event /home/jrittner/fluent-bit/src/flb_engine.c:377 (flb-rt-filter_grep+0x841df)
8: #5 flb_engine_start /home/jrittner/fluent-bit/src/flb_engine.c:639 (flb-rt-filter_grep+0x841df)
8: #6 flb_lib_worker /home/jrittner/fluent-bit/src/flb_lib.c:628 (flb-rt-filter_grep+0x62663)
8: #7 (libtsan.so.0+0x296ad)
8:
8: As if synchronized via sleep:
8: #0 nanosleep (libtsan.so.0+0x4dac0)
8: #1 flb_time_msleep /home/jrittner/fluent-bit/src/flb_time.c:83 (flb-rt-filter_grep+0x9282b)
8: #2 flb_test_filter_grep_regex /home/jrittner/fluent-bit/tests/runtime/filter_grep.c:53 (flb-rt-filter_grep+0x5f971)
8: #3 test_do_run_ /home/jrittner/fluent-bit/tests/runtime/../lib/acutest/acutest.h:1007 (flb-rt-filter_grep+0x5d12e)
8: #4 test_run_ /home/jrittner/fluent-bit/tests/runtime/../lib/acutest/acutest.h:1103 (flb-rt-filter_grep+0x5d4ae)
8: #5 main /home/jrittner/fluent-bit/tests/runtime/../lib/acutest/acutest.h:1700 (flb-rt-filter_grep+0x5f06a)
8:
8: Location is heap block of size 392 at 0x7b4c000024c0 allocated by main thread:
8: #0 calloc (libtsan.so.0+0x2afc3)
8: #1 flb_calloc /home/jrittner/fluent-bit/include/fluent-bit/flb_mem.h:78 (flb-rt-filter_grep+0x650dc)
8: #2 flb_input_new /home/jrittner/fluent-bit/src/flb_input.c:153 (flb-rt-filter_grep+0x65aed)
8: #3 flb_input /home/jrittner/fluent-bit/src/flb_lib.c:255 (flb-rt-filter_grep+0x610a7)
8: #4 flb_test_filter_grep_regex /home/jrittner/fluent-bit/tests/runtime/filter_grep.c:28 (flb-rt-filter_grep+0x5f62c)
8: #5 test_do_run_ /home/jrittner/fluent-bit/tests/runtime/../lib/acutest/acutest.h:1007 (flb-rt-filter_grep+0x5d12e)
8: #6 test_run_ /home/jrittner/fluent-bit/tests/runtime/../lib/acutest/acutest.h:1103 (flb-rt-filter_grep+0x5d4ae)
8: #7 main /home/jrittner/fluent-bit/tests/runtime/../lib/acutest/acutest.h:1700 (flb-rt-filter_grep+0x5f06a)
8:
8: Thread T1 'flb-pipeline' (tid=32650, running) created by main thread at:
8: #0 pthread_create (libtsan.so.0+0x2bcee)
8: #1 mk_utils_worker_spawn /home/jrittner/fluent-bit/lib/monkey/mk_core/mk_utils.c:284 (flb-rt-filter_grep+0x4bc7f7)
8: #2 flb_test_filter_grep_regex /home/jrittner/fluent-bit/tests/runtime/filter_grep.c:43 (flb-rt-filter_grep+0x5f849)
8: #3 test_do_run_ /home/jrittner/fluent-bit/tests/runtime/../lib/acutest/acutest.h:1007 (flb-rt-filter_grep+0x5d12e)
8: #4 test_run_ /home/jrittner/fluent-bit/tests/runtime/../lib/acutest/acutest.h:1103 (flb-rt-filter_grep+0x5d4ae)
8: #5 main /home/jrittner/fluent-bit/tests/runtime/../lib/acutest/acutest.h:1700 (flb-rt-filter_grep+0x5f06a)
8:
8: SUMMARY: ThreadSanitizer: data race /home/jrittner/fluent-bit/src/flb_input.c:882 in flb_input_pause_all
8: ==================

@github-actions
Contributor

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Aug 25, 2021
@rittneje
Contributor Author

Not stale.

@github-actions github-actions bot removed the Stale label Aug 26, 2021
@github-actions
Contributor

github-actions bot commented Oct 1, 2021

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Oct 1, 2021
@rittneje
Contributor Author

rittneje commented Oct 1, 2021

Not stale.

@github-actions github-actions bot removed the Stale label Oct 5, 2021
@github-actions
Contributor

github-actions bot commented Nov 5, 2021

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Nov 5, 2021
@rittneje
Contributor Author

rittneje commented Nov 5, 2021

Not stale.

@github-actions github-actions bot removed the Stale label Nov 6, 2021
@github-actions
Contributor

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Dec 10, 2021
@rittneje
Contributor Author

Not stale.

@github-actions github-actions bot removed the Stale label Dec 15, 2021
@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

@github-actions github-actions bot added the Stale label Mar 16, 2022
@rittneje
Contributor Author

Not stale.

@github-actions github-actions bot removed the Stale label Mar 17, 2022
@edsiper
Member

edsiper commented May 31, 2022

Both code paths mentioned above run in the same thread. There are no threaded input plugins "yet".

@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

@github-actions github-actions bot added the Stale label Aug 30, 2022
@github-actions
Contributor

github-actions bot commented Sep 5, 2022

This issue was closed because it has been stalled for 5 days with no activity.

@github-actions github-actions bot closed this as completed Sep 5, 2022