Incorrect SNMPv3 password returns garbage field values but no error #7746

Closed
jjamesclt opened this issue Jun 25, 2020 · 9 comments
Labels
area/snmp bug unexpected problem or unintended behavior

Comments

@jjamesclt

jjamesclt commented Jun 25, 2020

I am attempting to obtain a hostname from SNMP but the results continue to be returned as numbers.

Config file:

[[inputs.snmp]]
agents = [ "10.4.0.16" ]
interval = "60s"
timeout = "5s"
retries = 3
version = 3
sec_name = "UID"
auth_protocol = "SHA"
auth_password = "PWD"
sec_level = "authNoPriv"
 
  [[inputs.snmp.field]]
    name = "hostname"
    oid = ".1.3.6.1.2.1.1.5.0"
    is_tag = true
 
  [[inputs.snmp.field]]
    name = "uptime"
    oid = "1.3.6.1.2.1.1.3.0"
 
  [[inputs.snmp.field]]
    name = "cpmCPUTotal1min"
    oid = ".1.3.6.1.4.1.9.9.109.1.1.1.1.4.7"

Results from telegraf:

telegraf --test --config /etc/telegraf/telegraf.d/snmp_metrics.conf
2020-06-25T17:36:16Z I! Starting Telegraf 1.14.4
> snmp,agent_host=10.4.0.16,host=server01.domain.com,hostname=369185 cpmCPUTotal1min=369187i,uptime=369186i 1593106579000000000

I have tested with SNMPGET, SNMPWALK, SNMPTRANSLATE and cannot find any correlation to the number I get back from Telegraf.

snmpget -v3 -l authNoPriv -u UID -a SHA -A "PWD" 10.4.0.16 .1.3.6.1.2.1.1.5.0
SNMPv2-MIB::sysName.0 = STRING: SWITCH01.domain.com

snmptranslate .1.3.6.1.2.1.1.5.0
SNMPv2-MIB::sysName.0
 
snmpwalk -v3 -l authNoPriv -u UID -a SHA -A "PWD" 10.4.0.16 | grep SNMPv2-MIB::sysName.0
SNMPv2-MIB::sysName.0 = STRING: SWITCH01.domain.com

I have tried to update my SNMP OID to SNMPv2-MIB::sysName, SNMPv2-MIB::sysName.0, and RFC1213-MIB::sysName.0 as shown in the original example that had me building this solution:
https://techexpert.tips/grafana/grafana-monitoring-snmp-devices/

I have also tried this after installing many Cisco MIBs, placing them in /usr/shared/snmp/mibs with the others, and adding them to /usr/shared/snmp.conf. I have tried rebuilding a CentOS 7 server from scratch with only Telegraf installed (the other server has the full TIG stack), but telegraf --test gives the same results.

I expect to see hostname=SWITCH01 or hostname=SWITCH.domain.com, as in so many other examples on the web. I'm sure I did something wrong somewhere or missed a key configuration point. I do have the whole TIG stack working for Linux stats and for JSON-based ACI switch stats. I have tried a new database and new users for InfluxDB, but this problem seems to happen well before the data gets that far.

Thanks!

@danielnelson
Contributor

Any chance you could temporarily configure the device to use SNMP v2 and see if the problem persists? It will be much easier to debug without the encryption.

After that, could you also collect packet captures with tcpdump as described in the troubleshooting section of the readme?

@danielnelson danielnelson added area/snmp bug unexpected problem or unintended behavior labels Jun 25, 2020
@jjamesclt
Author

> Any chance you could temporarily configure the device to use SNMP v2 and see if the problem persists? It will be much easier to debug without the encryption.
>
> After that, could you also collect packet captures with tcpdump as described in the troubleshooting section of the readme?

It works fine with SNMPv2. I had assumed the version would make no difference, because the v2 and v3 examples I found all returned proper values, unlike what I was getting.

telegraf --test --config /etc/telegraf/telegraf.d/config_file.conf
2020-06-25T18:49:35Z I! Starting Telegraf 1.14.4
> snmp,agent_host= 10.4.0.16,host=server01.domain.com,hostname=SWITCH01.domain.com uptime=2441887095i 1593110977000000000

I suppose I'll go try a capture now, thanks.

@danielnelson
Contributor

Are all the values incorrect, or is it just the hostname? I notice from the example output above the values are a bit strange, all sequential integers:

hostname=369185 cpmCPUTotal1min=369187i,uptime=369186i
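
The "sequential integers" symptom can be checked mechanically. The following is a minimal sketch, not part of Telegraf: the function names are hypothetical, and the parsing is deliberately naive (it assumes no escaped spaces or commas in the line protocol). It collects integer-looking tag and field values from one line of `telegraf --test` output and flags the case where they form a run of consecutive numbers.

```python
import re


def numeric_values(line):
    """Collect integer-looking tag and field values from one line of output.

    Naive parser (assumption): tags, fields and timestamp are separated by
    single spaces and contain no escaped characters.
    """
    body = line.lstrip("> ").strip()
    parts = body.split(" ")
    tags_part, fields_part = parts[0], parts[1]
    values = []
    # Skip the measurement name, then walk every key=value pair.
    for pair in tags_part.split(",")[1:] + fields_part.split(","):
        _key, _, value = pair.partition("=")
        match = re.fullmatch(r"(-?\d+)i?", value)
        if match:
            values.append(int(match.group(1)))
    return values


def looks_like_consecutive_run(values):
    """True when two or more values are consecutive integers once sorted."""
    ordered = sorted(values)
    return len(ordered) >= 2 and all(
        b - a == 1 for a, b in zip(ordered, ordered[1:])
    )


suspect = ("> snmp,agent_host=10.4.0.16,host=server01.domain.com,"
           "hostname=369185 cpmCPUTotal1min=369187i,uptime=369186i "
           "1593106579000000000")
print(looks_like_consecutive_run(numeric_values(suspect)))
```

A consecutive run across unrelated OIDs is only a hint that every request is reading the same incrementing counter; it does not prove an authentication problem on its own.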

@jjamesclt
Author

> Are all the values incorrect, or is it just the hostname? I notice from the example output above the values are a bit strange, all sequential integers:
>
> hostname=369185 cpmCPUTotal1min=369187i,uptime=369186i

All of them are way wrong, all over the place. I wonder if you were about to tell me what I just stumbled on. SNMPv3, I feel, is still new even to some seasoned network professionals. (I hope others are okay with that assumption.)

This thread led me right to the problem:
https://community.checkpoint.com/t5/General-Topics/Monitoring-Check-Point-via-SNMPv3-and-telegraf/td-p/84555

I, too, am quite ashamed. Being a single character off on the SHA key (or hash) apparently garbles the data instead of producing any sort of error. I did find something in the capture that made me suspicious, though.

Thank you for walking me through a couple troubleshooting steps and getting me here so quickly!

@danielnelson
Contributor

Wow, I had no idea it would give you no error and wrong data. Unfortunately, I don't know enough about SNMPv3 to say whether that can be improved upon, but I'll look into it.

@jjamesclt
Author

> Wow, I had no idea it would give you no error but the data would be wrong. Unfortunately, I don't know enough about SNMPv3 to say if that can be improved upon, I'll look into it though.

Perhaps it can: snmpget reports an error when I try an incorrect password:

snmpget -v3 -l authNoPriv -u UID -a SHA -A "WRONG_PWD" HOST SNMPv2-MIB::sysName.0
snmpget: Authentication failure (incorrect password, community or key)
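
Since snmpget does report the failure, one workaround is to verify the credentials with it before trusting the plugin output. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the snmpget flags are the ones used in this thread, and the check simply looks for the "Authentication failure" text shown above in the command output.

```python
import subprocess

# Marker text taken from the snmpget error message quoted in this thread.
AUTH_FAILURE_MARKER = "Authentication failure"


def is_auth_failure(output_text):
    """True when snmpget output contains the authentication failure message."""
    return AUTH_FAILURE_MARKER in output_text


def credentials_ok(host, user, password, oid="SNMPv2-MIB::sysName.0"):
    """Run snmpget with authNoPriv/SHA and report whether the agent accepted it.

    Assumes net-snmp's snmpget is installed and on PATH.
    """
    cmd = ["snmpget", "-v3", "-l", "authNoPriv", "-u", user,
           "-a", "SHA", "-A", password, host, oid]
    result = subprocess.run(cmd, capture_output=True, text=True, timeout=15)
    # Check both streams so the result does not depend on where the
    # message is written.
    combined = result.stdout + result.stderr
    return result.returncode == 0 and not is_auth_failure(combined)
```

Calling `credentials_ok("10.4.0.16", "UID", "PWD")` before enabling the input would surface a mistyped password that the plugin would otherwise hide.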

@danielnelson
Contributor

Let's reopen this issue as a bug report about error reporting.

@danielnelson danielnelson reopened this Jun 26, 2020
@danielnelson danielnelson changed the title from 'Unable to retrieve textual input from SNMPv3 for hostname "is_tag"' to 'Incorrect SNMPv3 password returns garbage field values but no error' Jun 26, 2020
@hackery
Contributor

hackery commented Jul 23, 2020

This sounds very much like the issue I discovered yesterday. In our case, a firmware upgrade on an SNMP target (a Juniper firewall) changed the authentication requirements, so our Telegraf config was sending a sec_level, sec_name, and auth_password combination that was no longer accepted.

tcpdump gives a clue about what's going on.

Where authentication fails (and perhaps in other situations) the SNMP target returns a Report rather than a GetResponse:

12:40:50.019291 IP mgmt-0.46703 > x.x.x.111.snmp:  F=ar U="xxx" E=... C="" GetRequest(36)  E:2636.3.39.1.12.1.1.1.4.0
12:40:50.043677 IP x.x.x.111.snmp > mgmt-0.46703:  F= U="xxx" E=... C="" Report(33)  S:snmpMPDMIB.snmpMPDMIBObjects.snmpMPDStats.snmpUnknownPDUHandlers.0=84847

That 84847 is what then appears as the result value for the metric emitted by the SNMP input plugin. I can watch these incrementing values shown by tcpdump and see them recorded in Influx. (snmpUnknownPDUHandlers.0 is a value in the MIB which increments for each "bad request" of this type).

I've not gone into the code there but I presume that the response handler (a) doesn't discriminate between GetResponse and Report, and (b) assumes that the OID in the response is the one that was requested, and takes its value.
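
The two checks presumed missing in (a) and (b) can be sketched as follows. This is not the plugin's code: the `Response` and `VarBind` types and the `value_for` helper are hypothetical stand-ins for whatever the SNMP library returns. The PDU type constants are the standard SNMP tags for a response (0xA2) and a Report (0xA8), and the Report OID in the usage lines is, to my understanding, the numeric form of snmpUnknownPDUHandlers.0.

```python
from dataclasses import dataclass

# Standard SNMP PDU type tags.
GET_RESPONSE = 0xA2
REPORT = 0xA8


@dataclass
class VarBind:          # hypothetical stand-in for a library varbind
    oid: str
    value: object


@dataclass
class Response:         # hypothetical stand-in for a decoded SNMP packet
    pdu_type: int
    varbinds: list


class SnmpReportError(Exception):
    """Raised when the agent did not answer with a normal response."""


def value_for(requested_oid, response):
    """Return the value for requested_oid, refusing Reports and foreign OIDs."""
    if response.pdu_type == REPORT:
        reported = ", ".join(vb.oid for vb in response.varbinds)
        raise SnmpReportError(f"agent sent a Report instead of a response: {reported}")
    if response.pdu_type != GET_RESPONSE:
        raise SnmpReportError(f"unexpected PDU type 0x{response.pdu_type:X}")
    wanted = requested_oid.lstrip(".")
    for vb in response.varbinds:
        if vb.oid.lstrip(".") == wanted:
            return vb.value
    raise KeyError(f"response did not contain {requested_oid}")


# Usage: the Report from the capture above is rejected instead of being
# recorded as the metric value.
report = Response(REPORT, [VarBind("1.3.6.1.6.3.11.2.1.3.0", 84847)])
try:
    value_for(".1.3.6.1.4.1.2636.3.39.1.12.1.1.1.4.0", report)
except SnmpReportError as err:
    print(err)
```

Matching on the OID as well as the PDU type keeps a counter from an unrelated varbind from being stored under the requested field name.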

@stephanie-engel
Contributor

Closing because this seems to be a duplicate of #7512
