Random integers in SNMP Input Plugin OID output #7929
Comments
What am I supposed to be looking at? Can you point out specifically what you're seeing and what you're expecting to see?
The second line is what the output should look like, with the expected values for each field and tag. The first and third lines are the output I'm getting for some hosts: the values for the OIDs come through as 903i, 904i, 905i and so on.
Hi @ssoroka, anything on this?
I haven't seen this behavior before. Could you make a packet capture of telegraf getting these values from the snmp agents and attach it to the issue along with the output of telegraf and the telegraf config you're using?
Maybe also show the relevant part of your config.
Sorry for the delay in replying. I managed to figure out what the issue was. I guess that is not the right behaviour.
I'm also seeing this. Some of these OIDs I have enabled
Any idea where these values are coming from, and why an error isn't reported?
@DouglasHeriot That seems a totally different issue, could you create a new one for this please? @surbhim7 As your issue seems to be resolved, can you close it?
@Hipska I'm pretty sure it's the same issue - I've just run into a different consequence of it. With this basic input config:
[[inputs.snmp]]
agents = [
"10.1.1.1",
"10.2.2.2"
]
version = 3
sec_name = "user"
auth_protocol = "SHA"
auth_password = "password"
sec_level = "authPriv"
context_name = ""
priv_protocol = "AES"
priv_password = "password"
interval = "20s"
[[inputs.snmp.field]]
name = "snmp_hostname"
oid = "RFC1213-MIB::sysName.0"
is_tag = true
[[inputs.snmp.field]]
name = "uptime"
oid = "RFC1213-MIB::sysUpTime.0"
Returns this result:
Example switch
When testing using
This leads to "runaway cardinality" on snmp fields with
The expected outcome is that an error is logged saying that authentication failed, instead of rubbish data being published.
Correct, that is indeed the same issue, with the same root cause. I was confused because you initially only mentioned an InfluxDB error/problem. The plugin should indeed give a warning/error if the agent does not return useful data.
I'm having a little look into this - it appears it may be an issue in the upstream gosnmp project? Where
…a#7929) gosnmp does not return any error when an SNMP get request fails due to authentication. In this case the SNMP agent returns a "report" PDU instead of a "get-response" PDU. A check has been added to verify that the packet's PDUType is not "Report".
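The check described in that commit message can be sketched roughly as follows. This is only an illustration, not the code from the pull request: the `PDUType` type, the `checkResponse` function and the `ErrReportPDU` error are hypothetical names defined locally for this sketch rather than imported from gosnmp. The two tag values (0xa2 for a get-response, 0xa8 for a report) are the standard SNMP PDU tags.

```go
package main

import (
	"errors"
	"fmt"
)

// PDUType mirrors the BER tag of an SNMP PDU. The type and constants are
// defined locally for illustration; they are not imported from gosnmp.
type PDUType byte

const (
	GetResponse PDUType = 0xa2 // normal reply to a get request
	Report      PDUType = 0xa8 // SNMPv3 engine-to-engine report
)

// ErrReportPDU is a hypothetical sentinel error used only in this sketch.
var ErrReportPDU = errors.New("agent answered with a Report PDU (possible SNMPv3 authentication or privacy failure)")

// checkResponse rejects any reply that is not a get-response, so the
// variable bindings of a Report PDU are never stored as field values.
func checkResponse(t PDUType) error {
	switch t {
	case GetResponse:
		return nil
	case Report:
		return ErrReportPDU
	default:
		return fmt.Errorf("unexpected PDU type 0x%x", byte(t))
	}
}

func main() {
	for _, t := range []PDUType{GetResponse, Report} {
		if err := checkResponse(t); err != nil {
			fmt.Println("discarding reply:", err)
			continue
		}
		fmt.Println("reply accepted")
	}
}
```

Returning an error instead of silently accepting the packet is what lets the plugin log a message about authentication rather than publish the report's contents as metrics.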
I think this only applies to SNMP V3 - with SNMP V2, if the community string is incorrect, there is simply no response.
Error packet capture (note telegraf is set to retry 3 times)
RFC 3411 says "Report-PDU" is of the "Internal Class", which I guess means valid messages shouldn't use this - my testing and packet captures show valid requests receive a "Response-PDU".
gosnmp
I have created a draft pull request #8215 with a fix for this that logs reports as an error that may be due to authentication. I have not yet added any tests for this. However, I'm not an expert on the SNMP protocol - I'm not sure if it makes more sense for this to be handled within the gosnmp library. I'm not sure if gosnmp/gosnmp#172 is relevant to this or not - I think it shows you can also check the received report message for authentication flags? As shown in my comment above, the
Hi @DouglasHeriot, I have devices which do not work with the Telegraf SNMP plugin, as the requests fail with the following error: "Incoming packet is not authentic, discarding". The device answers the first initial packet with an SNMP Report PDU, and that message triggers the error and Telegraf stops at that point. As you mentioned, tools like snmpget and snmpwalk work correctly. I have tested your pull request #8215, but I don't see any change in the behavior:
Any ideas how this issue can be fixed?
Hi @DouglasHeriot, if you are interested in the problem described in my previous comment, please take a look at: #3788 (comment)
@dpajin thanks for the info! In my case I just fixed the authentication config on our switches to resolve the error. However I'm still interested in seeing this fixed as my Influx gets full of random integers when new switches are added that are not correctly configured. Will you make a pull request to update
@DouglasHeriot, yes I will try to make a pull request to update
@DouglasHeriot, I made a pull request for gosnmp update: #8588
Since gosnmp has been updated, does this issue still occur?
Yes I can confirm it happens:
snmpwalk -v 3 -u user -l AuthPriv -x aes -X privpass -a sha -A authpass 10.10.10.10
telegraf --test --config asav.conf
[[inputs.snmp.field]]
[[inputs.snmp.field]]
Okay, I see that some problems are solved with the new gosnmp version, but there is still no check on the response.
The mystery numbers are some of the values under 1.3.6.1.6.3.15.1.1. They are counters of how many auth problems the system has observed. I will have a look at your original PR #8215.
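To make the point about 1.3.6.1.6.3.15.1.1 concrete: that subtree is usmStats from the SNMP User-based Security Model MIB (RFC 3414), and each counter increments whenever the agent rejects a request, which is consistent with the sequential values (903i, 904i, 905i) reported at the top of this issue. The following is a minimal sketch for recognising such OIDs in a reply. The function `usmStatsName` and the lookup table are hypothetical helpers written for this illustration; they are not part of telegraf or gosnmp.

```go
package main

import (
	"fmt"
	"strings"
)

// usmStatsPrefix is the usmStats subtree mentioned in the thread.
const usmStatsPrefix = "1.3.6.1.6.3.15.1.1."

// usmStatsNames maps the sub-identifier below usmStats to the counter
// name defined in RFC 3414. Helper table written for this sketch only.
var usmStatsNames = map[string]string{
	"1": "usmStatsUnsupportedSecLevels",
	"2": "usmStatsNotInTimeWindows",
	"3": "usmStatsUnknownUserNames",
	"4": "usmStatsUnknownEngineIDs",
	"5": "usmStatsWrongDigests",
	"6": "usmStatsDecryptionErrors",
}

// usmStatsName reports whether oid is one of the usmStats counters and,
// if so, returns its name. A leading dot on the OID is tolerated.
func usmStatsName(oid string) (string, bool) {
	oid = strings.TrimPrefix(oid, ".")
	if !strings.HasPrefix(oid, usmStatsPrefix) {
		return "", false
	}
	rest := strings.TrimPrefix(oid, usmStatsPrefix)
	sub := strings.SplitN(rest, ".", 2)[0]
	name, ok := usmStatsNames[sub]
	return name, ok
}

func main() {
	for _, oid := range []string{".1.3.6.1.6.3.15.1.1.5.0", ".1.3.6.1.2.1.1.3.0"} {
		if name, ok := usmStatsName(oid); ok {
			fmt.Printf("%s is %s: a report counter, not the requested data\n", oid, name)
		} else {
			fmt.Printf("%s is not a usmStats counter\n", oid)
		}
	}
}
```

A lookup like this is mainly useful for producing a readable log message; rejecting the Report PDU itself is still the more direct fix.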
I don't think there's a mystery, I explained in #7746 what was going on here (SNMP target returns a
Indeed, that's what I'm saying. See also the linked PR #8215 which will fix this issue.
Hi all, please check out telegraf 1.18.2 which has a final fix for this. Feedback is welcome.
Works as expected, as far as I can gather - no more false entries sent to the InfluxDB output, and understandable errors in telegraf.log
I'm using the snmp input plugin to poll certain OIDs.
Every time I run telegraf, I get sequential integers as OID values in the output for random hosts, as can be seen below: