Segmentation fault while freeing memory #4
Comments
Which platform, version and plugin binary version is that? Can you please add a |
Here is the info:
Dist info:
libs:
I'm using the latest version of the plugin (1.4). |
@mxhash can you take a deeper look please? I haven't been involved in development here, nor understand the specific code parts where this could be triggered. I've just updated the README for getting things to compile easier. |
I agree that net-snmp seems to be the issue. Executing the plugin in a Debian Stretch environment with libsnmp v5.6.3 does not fix the problem. So it might not have been fixed in a newer version yet. |
I think I have the same problem. While executing the check, a segmentation fault occurred.
Version:
Sometimes the error occurs, sometimes not... |
I'll give some tests a try. Are there any steps to reproduce the issue? |
We only have problems with checks on Nexus7700. But sometimes it works, sometimes not.... |
Ah ok ;-) That is hard for me to test... Does this problem also occur when querying with the snmp utils? |
Thank you for following up on this issue! |
I also did a snmpwalk over all OIDs of this device and didn't get an error |
Thank you guys. I'll give it a try. |
All right, I did some testing and had no chance to reproduce the issue you had. But we do not have this kind of hardware. I tried with 3com and juniper in default mode. The stack trace also looks like an error in the libsnmp functions. Maybe you can get more info out of it (different version, debug with symbols, etc.). At the moment I have no clue ;-) Cheers, |
Hi, I possibly have the same problem with Dell S4048-ON switches.
If I run the check with sudo, the message changes to:
Debug with gdb and symbols:
The output from gdb with elevated rights reads:
I tried to find a fix for this and came to the lines 1488, 1499 and 1500 in snmp_bulkget.c.
I think it would be fine if it were used by a GETBULK request. In this case, however, I think it is used by a GET request, and the values for "errstat" and "errindex" are populated with "8" and "0". The error code "wrongLength (The set value has an illegal length from what the agent expects)" can be found in the net-snmp file snmp_client.c, and the number 8 seems to match the error message in line 1204.
If I change the value of "non_repeaters" to something between 1 and 19, the returned error message changes accordingly. If I change it to 0, the check runs fine and no error messages are shown.
My biggest question here is: why do requests to other devices not show this problem? Please let me know if this was helpful or if I am wrong. Cheers, |
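One possible explanation for the observation above, offered as a sketch rather than a confirmed diagnosis: as far as I know, the net-snmp headers define `non_repeaters` and `max_repetitions` as preprocessor aliases for the `errstat` and `errindex` fields of the PDU struct, because a GETBULK request reuses those two slots. If that holds, writing `non_repeaters` on a plain GET PDU writes the error status directly, which would match the error message following the chosen value. The struct, constants, and function names below are hypothetical stand-ins and not the plugin's or net-snmp's actual code; only the aliasing idea and the value 8 for wrongLength are taken from SNMP and the comment above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of an SNMP PDU. This is NOT the real
 * net-snmp struct; it only mirrors the two fields discussed above. */
enum { MODEL_MSG_GET = 0, MODEL_MSG_GETBULK = 1 };

/* SNMP error-status value for wrongLength. */
#define MODEL_ERR_WRONGLENGTH 8

struct model_pdu {
    int  command;   /* request type: GET or GETBULK */
    long errstat;   /* error status; reused as non_repeaters in GETBULK */
    long errindex;  /* error index; reused as max_repetitions in GETBULK */
};

/* Aliases modelled on the ones assumed to exist in the net-snmp headers. */
#define non_repeaters   errstat
#define max_repetitions errindex

/* Unguarded variant: always writes the bulk parameters. On a GET PDU
 * this silently pre-populates errstat and errindex. */
void model_set_bulk_unguarded(struct model_pdu *pdu, long nonrep, long maxrep)
{
    assert(pdu != NULL);
    pdu->non_repeaters = nonrep;    /* expands to pdu->errstat */
    pdu->max_repetitions = maxrep;  /* expands to pdu->errindex */
}

/* Guarded variant: only touches the shared fields for GETBULK PDUs. */
void model_set_bulk_guarded(struct model_pdu *pdu, long nonrep, long maxrep)
{
    assert(pdu != NULL);
    if (pdu->command == MODEL_MSG_GETBULK) {
        pdu->non_repeaters = nonrep;
        pdu->max_repetitions = maxrep;
    }
}
```

In this model, calling the unguarded setter on a GET PDU with a first argument of 8 leaves `errstat` equal to the wrongLength code, while the guarded setter leaves a GET PDU untouched. Whether the plugin actually takes this path, and why only some devices react to it, would still need to be confirmed against snmp_bulkget.c and a packet capture.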
Same issue on Red Hat Enterprise Linux 8; the backtrace looks similar:
|
Sorry to bump an old thread - we're seeing this happen across a number of devices at different sites, from lots of different vendors. I've noticed that if I add --mode nonbulk the check works correctly. |
Sorry to bring this issue back to the top. As my colleague @phil-or wrote in 2019, we sometimes had trouble with the Nexus 7700 series. We have installed several of them. We are using VDC on every Nexus, which means that more than one virtual router is "installed" on every physical device. They are configured in the same way on every piece of hardware. Including the VLAN interfaces, the Nexus has 74 ports to monitor. The interesting thing is that some time after my colleague wrote here, everything worked fine. But a few weeks ago the problem suddenly came back, and only with one virtual router! First I tried "strace". This is the anonymized output after loading net-snmp:
After that I tried "ltrace". This is the shortened and anonymized output starting with the function "snmp_pdu_create":
The next debug tool I used was "valgrind". This is the anonymized output:
The last tool I used is "dmesg". This one seems to confirm what "valgrind" is printing:
I also ran other tests with different parameters, such as the "--mode nonbulk" that @0xliam mentioned. With this one the segfault also happens on the problematic virtual router on the Nexus 7700. I hope this information helps to debug this better. |
Any news on this issue? It seems there is no solution for this problem yet? |
ref/NC/761567 |
Hey,
I am encountering a problem using the program to query a device. The program terminates with a seg fault.
Maybe the following debugging output helps you to track down the issue: