feat: collect SSD endurance information where available in smartctl #11391
Thanks so much for the pull request!
!signed-cla
@bentasker for some reason CircleCI tests didn't get triggered, mind rebasing on master to see if it would trigger them? Thanks!
…where it's present (related to influxdata#8701)
There isn't a standardised attribute ID (or name) for SSD endurance information in SMART data. Some vendors use ID 202, others use 177, and others use different IDs again. The following two entries indicate the remaining lifetime of two different SSDs:
    202 Percent_Lifetime_Remain P---CK 094 094 000 - 6
    177 Wear_Leveling_Count ------ 100 100 050 - 432
Because manufacturers that *don't* use `202` for endurance might use `202` for other information, we cannot safely rely on the ID to identify the information.
This commit introduces `deviceFieldNames`, which is a map of attribute names (such as `Percent_Lifetime_Remain`). When the SMART data is iterated over in `gatherDisk()`, each entry's name is checked against `deviceFieldNames` and a field is created where a match is made.
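As a rough illustration of the name-based matching described in this commit, the sketch below looks up each attribute row by name instead of ID. It is not the plugin's code: `matchByName` is a hypothetical helper, the pairing of attribute names to output fields is inferred from the names listed in the PR description, reading the normalised VALUE column (the fourth column) is an assumption, and the lowercasing follows the later commit in this PR.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// deviceFieldNames maps a lowercased SMART attribute name to an output field.
// Assumption: the name-to-field pairing is inferred, not copied from the plugin.
var deviceFieldNames = map[string]string{
	"percent_lifetime_remain": "endurance_remain_perc",
	"wear_leveling_count":     "endurance_wear_levelling",
	"media_wearout_indicator": "endurance_media_wearout",
}

// matchByName is a hypothetical helper: it splits one attribute row into
// whitespace-separated columns (ID, NAME, FLAGS, VALUE, ...) and returns the
// output field and normalised value when the NAME column is a known name.
func matchByName(row string) (field string, value int64, ok bool) {
	cols := strings.Fields(row)
	if len(cols) < 4 {
		return "", 0, false
	}
	// Match on the attribute name (column 2), never on the ID (column 1).
	field, ok = deviceFieldNames[strings.ToLower(cols[1])]
	if !ok {
		return "", 0, false
	}
	// Assumption: the normalised VALUE column is the one of interest.
	v, err := strconv.ParseInt(cols[3], 10, 64)
	if err != nil {
		return "", 0, false
	}
	return field, v, true
}

func main() {
	rows := []string{
		"202 Percent_Lifetime_Remain P---CK 094 094 000 - 6",
		"177 Wear_Leveling_Count ------ 100 100 050 - 432",
	}
	for _, row := range rows {
		if field, value, ok := matchByName(row); ok {
			fmt.Printf("%s=%d\n", field, value)
		}
	}
}
```

Because the lookup key is the name, a drive that reports some unrelated attribute under ID 202 simply produces no endurance field.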
…bute (see influxdata#8701) This wording is used in the SMART attributes on WD HGST SSDs. It is functionally identical to the "Percent Used" attribute seen on various other brands.
The different values are assumed to mean the same thing, but because their implementation is vendor specific, that assumption probably isn't safe.
…used:" Adds a new SMART snippet (as none of the existing examples had it) and adds a test
The earlier commits introduced new fields, so update the expected result counts to reflect this
b0d3144 to 41fee36
@sspaink sure, done
Other fields seem to be emitted using lowercase, so this follows that convention and keeps the processing of `deviceFieldNames` in line with the processing applied to the existing ID-based matching.
Required for all PRs:
Adjusts the `smart` input plugin to collect SSD endurance information in more situations (for example, resolves #8701).
The `smart` plugin already collects endurance information if `attributes` is `true` and the NVMe log contains the attribute `Percentage Used`.
However, this PR adjusts the plugin to also collect endurance information if it's available in the following locations:
- `Percentage used endurance indicator`
- `Percent_Lifetime_Remain`, `Wear_Leveling_Count` or `Media_Wearout_Indicator`
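For the first location, a minimal sketch of pulling the number out of a `Percentage used endurance indicator` line might look like the following. The helper name `parsePercentUsed` and the regular expression are hypothetical, and the exact spacing of the real smartctl line is an assumption; the plugin's own pattern may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// percentUsedRe matches a line such as
// "Percentage used endurance indicator: 3%".
// Assumption: the value is an integer directly followed by a percent sign.
var percentUsedRe = regexp.MustCompile(`^\s*Percentage used endurance indicator:\s*(\d+)%`)

// parsePercentUsed is a hypothetical helper that returns the percentage
// reported on the line, or false when the line is not an endurance line.
func parsePercentUsed(line string) (int64, bool) {
	m := percentUsedRe.FindStringSubmatch(line)
	if m == nil {
		return 0, false
	}
	v, err := strconv.ParseInt(m[1], 10, 64)
	if err != nil {
		return 0, false
	}
	return v, true
}

func main() {
	if v, ok := parsePercentUsed("Percentage used endurance indicator: 3%"); ok {
		fmt.Println("percent used:", v)
	}
}
```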
Because the SMART data fields are vendor specific, the ID cannot be relied on, so this PR introduces the ability to match rows by name rather than ID.
Additionally, for safety, none of these values is assumed to be interchangeable with another or allowed to overwrite it, so three new output fields are created:
- `endurance_remain_perc`
- `endurance_wear_levelling`
- `endurance_media_wearout`
For the majority of vendors, these values start at 100 and count down - it is expected that there will be exceptions, but this PR doesn't attempt to address those as it seems better to let the user handle those as they see fit at query time.