
Application(redis client) crashes with SIGSEGV at redisBufferRead at hiredis.c:941 while making a GET call to redis server #1267

Closed
Sukesh0513 opened this issue Jun 24, 2024 · 6 comments

Comments

@Sukesh0513

Sukesh0513 commented Jun 24, 2024

Our application is written in C++, so we use redis++ (1.3.7) and hiredis (1.0.2) for our Redis client implementation. The application makes a GET call to the Redis server; after that call, a chain of redis++ and hiredis functions is invoked one by one, and the application finally crashes with SIGSEGV in redisBufferRead at hiredis.c:941. This is what I could see from backtracing the core file.

Note: our application makes repeated GET calls until it gets the desired value from the Redis server. If the value is not yet configured in the Redis server, the GET returns a null value and we retry until we get a non-null value. In the initial stage the Redis server is not populated with this value, so our application keeps making GET calls, but this issue does not seem to happen on every GET call; it only happens once after many attempts.
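For reference, here is a minimal sketch of that retry pattern with redis++ (the endpoint, key name, and delay are placeholders, not our actual code):

#include <sw/redis++/redis++.h>
#include <chrono>
#include <thread>

int main() {
    sw::redis::Redis redis("tcp://127.0.0.1:6379");  // placeholder endpoint
    sw::redis::OptionalString val;
    // GET returns a null optional until the key is populated on the server,
    // so keep retrying with a small delay.
    while (!(val = redis.get("some_key"))) {
        std::this_thread::sleep_for(std::chrono::milliseconds(500));
    }
    // *val now holds the configured value.
    return 0;
}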

Here is the backtrace from the core file:

Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f6dc0b49a21 in redisBufferRead (c=0x1720680) at hiredis.c:941
941 hiredis.c: No such file or directory.

#0 0x00007f6dc0b49a21 in redisBufferRead (c=0x1720680) at hiredis.c:941
#1 0x00007f6dc0b49d78 in redisGetReply (c=c@entry=0x1720680, reply=reply@entry=0x7f6da0989618) at hiredis.c:1032
#2 0x00007f6dc0b84b66 in redis::Connection::recv (this=this@entry=0x7f6da0989688, handle_error_reply=handle_error_reply@entry=true)
at /tmp/tmp.PiajinzKq4/redis-plus-plus-1.3.7/src/redis++/connection.cpp:212
#3 0x00007f6dc0b9e913 in redis::Redis::_command<void (*)(redis::Connection&, std::basic_string_view<char, std::char_traits > const&), std::basic_string_view<char, std::char_traits > const&> (this=, cmd=0x7f6dc0b9cb40 <redis::cmd::get(redis::Connection&, std::basic_string_view<char, std::char_traits > const&)>, connection=...)
at /tmp/tmp.PiajinzKq4/redis-plus-plus-1.3.7/src/redis++/redis.hpp:1316
#4 redis::Redis::command<void (*)(redis::Connection&, std::basic_string_view<char, std::char_traits > const&), std::basic_string_view<char, std::char_traits > const&> (
this=, cmd=cmd@entry=0x7f6dc0b9cb40 <redis::cmd::get(redis::Connection&, std::basic_string_view<char, std::char_traits > const&)>)
at /tmp/tmp.PiajinzKq4/redis-plus-plus-1.3.7/src/redis++/redis.hpp:47
#5 0x00007f6dc0b9733c in redis::Redis::get[abi:cxx11](std::basic_string_view<char, std::char_traits > const&) (this=, key=...)
at /tmp/tmp.PiajinzKq4/redis-plus-plus-1.3.7/src/redis++/redis.cpp:314

Initially I thought that the redisContext used in redisBufferRead at hiredis.c:941 might be a null pointer or might not be initialized, but when I checked the backtrace and tried to print it, I found that it is not null and is initialized. Looking at the backtrace, I cannot find any reason for redisBufferRead to throw a SIGSEGV. Any idea or insight into why this could happen would be very helpful.

@Sukesh0513 Sukesh0513 changed the title Application(redis client) crashes with SIGSEGV at redisBufferRead at hiredis.c:941 while waking a GET call to redis server Application(redis client) crashes with SIGSEGV at redisBufferRead at hiredis.c:941 while making a GET call to redis server Jun 24, 2024
@michael-grunder
Collaborator

michael-grunder commented Jun 24, 2024

What I would suggest is running your application under valgrind or building it with gcc or clang sanitizers.

Immediately the address 0x1720680 looks wrong compared to the other addresses in the callstack. This could indicate that you have a buffer overrun or similar issue that's overwriting the pointer to your redisContext object.

I can't tell for sure, but you can see that every other address looks more like 0x7f6da0989618, which seems like a valid address.

Valgrind can probably find this for you very quickly, whether it is an issue in your program or in hiredis or redis++.
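For example, one or the other of these (the exact flags, library names, and binary name depend on your build, so treat them as illustrative):

g++ -g -O1 -fno-omit-frame-pointer -fsanitize=address,undefined -o your_app your_app.cpp -lredis++ -lhiredis
valgrind --leak-check=full --track-origins=yes ./your_app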

@michael-grunder
Collaborator

Going to close this because it's unlikely to be a bug in hiredis.

However, if you run your program under valgrind and it does indicate hiredis as the source of the bug, feel free to post the diagnostics and reopen.

@Sukesh0513
Author

Sukesh0513 commented Jun 26, 2024

Hi,

Thanks for the analysis. We did run some unit tests under valgrind but did not find this issue; in our unit tests we usually add a mock value as the response to a GET call from Redis, so we do not hit any issues there. This issue happens on a test server, when we keep the application in an idle state and do not configure anything in the Redis server, so our application makes GET calls indefinitely. We do not see hiredis crash right away; it makes a bunch of GET calls and then, after some time, it crashes.

The crash apparently happens at hiredis.c:941, when it is trying to access the context c:

[screenshot of the hiredis.c source around line 941]

So I checked the core file: c has valid content in it and is not null. The only conclusion I have is that maybe c is freed just before it is accessed at hiredis.c:941, but the address still holds the old data. Also, have you seen any issue of this sort, where hiredis frees the context prematurely or something similar and throws a SIGSEGV after multiple retries?

@michael-grunder
Collaborator

Also, have you seen any issue of this sort, where hiredis frees the context prematurely or something similar and throws a SIGSEGV after multiple retries?

Over the years we have certainly fixed segfaulting bugs, but not in quite some time. It's certainly possible that there's another one in the codebase that we haven't detected yet.

we did run some unit tests under valgrind but did not find this issue

The valgrind log would be interesting to see. It should be able to show the origin of the problem.

Like I said though, the address 0x1720680 for the redisContext looks suspect to me 😄 If the context is just a normal heap allocation, I would expect it to look like the rest of the addresses in your callstack:

0x7f6da0989618
0x7f6da0989688
0x7f6dc0b9cb40
0x00007f6dc0b49a21
0x00007f6dc0b49d78
0x00007f6dc0b84b66
0x00007f6dc0b9733c
0x00007f6dc0b9e913

Those look like heap allocations to me. 0x1720680 looks like it's in your program somewhere.

In addition to the output of valgrind, another thing to try is running your tests under rr. If you can catch the crash, it will allow you to replay it over and over again, meaning that if this is a buffer overrun or similar, you will be able to set a watchpoint on your context and run backwards until it detects the write.

(rr) watch -l <redisContext pointer>
(rr) reverse-continue
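For completeness, the record/replay side would look roughly like this (the binary name is a placeholder):

rr record ./your_app
rr replay

Once the replay hits the crash, the watchpoint and reverse-continue commands above will walk you back to the offending write.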

@Sukesh0513
Author

Hi again,

Sorry for the delay. We have reproduced the issue under valgrind, and here is what I can see from the valgrind backtrace. This crash is not exactly the same as the previous one, which happens at redisBufferRead (c=0x1720680) at hiredis.c:941; we do see a crash here as well, but it is happening at a different place. Please take a look and let me know if this rings any bells.

==4595== 1,532 bytes in 1 blocks are possibly lost in loss record 1,258 of 1,290
==4595== at 0x483D85F: realloc (vg_replace_malloc.c:1451)
==4595== by 0x4928B49: hi_realloc (alloc.h:71)
==4595== by 0x4928B49: sdsMakeRoomFor (sds.c:226)
==4595== by 0x49293AB: sdscatlen (sds.c:384)
==4595== by 0x492EA44: redisReaderFeed (read.c:724)
==4595== by 0x4927ECD: redisBufferRead (hiredis.c:985)
==4595== by 0x4928227: redisGetReply (hiredis.c:1081)
==4595== by 0x48B75B5: redis::Connection::recv(bool) (connection.cpp:241)
==4595== by 0x48D3702: _command<void (*)(redis::Connection&, const std::basic_string_view&), const std::basic_string_view<char, std::char_traits >&> (redis.hpp:1316)
==4595== by 0x48D3702: std::enable_if<!std::is_convertible<void (*)(redis::Connection&, std::basic_string_view<char, std::char_traits > const&), std::basic_string_view<char, std::char_traits > >::value, std::unique_ptr<redisReply, redis::ReplyDeleter> >::type redis::Redis::command<void (*)(redis::Connection&, std::basic_string_view<char, std::char_traits > const&), std::basic_string_view<char, std::char_traits > const&>(void (*)(redis::Connection&, std::basic_string_view<char, std::char_traits > const&), std::basic_string_view<char, std::char_traits > const&) (redis.hpp:47)
==4595== by 0x48CBF7B: redis::Redis::get[abi:cxx11](std::basic_string_view<char, std::char_traits > const&) (redis.cpp:314)

@michael-grunder
Collaborator

This looks like partial output from valgrind where it's describing a memory leak.

You can capture the full valgrind output with --log-file=/tmp/some-log-file.txt. You can also specify --track-origins=yes, which will display not only the invalid read/write but also where the memory came from, if valgrind can determine that.
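Something like this (the binary name is a placeholder):

valgrind --leak-check=full --track-origins=yes --log-file=/tmp/some-log-file.txt ./your_app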
