Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature request] ZFS Zpool properties on Linux #2616

Closed
tomtastic opened this issue Apr 3, 2017 · 57 comments
Closed

[Feature request] ZFS Zpool properties on Linux #2616

tomtastic opened this issue Apr 3, 2017 · 57 comments

Comments

@tomtastic
Copy link

tomtastic commented Apr 3, 2017

Feature Request

Currently, only FreeBSD has a zfs input plugin which reports zpool list properties. Linux should have this too ideally.

Proposal:

Using the existing inputs.zfs 'poolMetrics = true||false' option, to decide whether or not to try executing : '/bin/zpool list -Hp' and parsing the output like the FreeBSD code does.

Current behavior:

Only kstats are queried on Linux platforms, whereas FreeBSD queries both kstats and the userspace
'zpool list -Hp' command.

Desired behavior:

telegraf allowed to run 'zpool list' and collection of stats enabled :

Apr 03 20:55:40 nibbler telegraf : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/sbin/zpool list -Hp
> zfs_pool,host=nibbler,pool=rpool allocated=3695163326464i,capacity=92i,dedupratio=1,fragmentation=51i,free=290566324224i,health="ONLINE",size=3985729650688i 1527000570000000000

Use case:

Monitoring zpool health and fragmentation is a really useful metric for ZFS users,
as it can alert administrators to problems requiring urgent attention.

@tomtastic
Copy link
Author

tomtastic commented Apr 3, 2017

I think this only requires a minor re-write of the zfs_linux.go plugin code, and an explaination in the telegraf.conf inputs.zfs stanza

I have some working code currently which I will submit as a PR soon, does influxdata prefer single (merged) commits for PRs ?

@danielnelson
Copy link
Contributor

For commits you can leave them separate, when we merge PRs we generally squash them.

@tomtastic
Copy link
Author

Thanks @danielnelson.
I'm having a problem with another PR, circle-test.sh timing out whilst pulling docker.io/stealthly/docker-riemann (#2623)
Something I've done?

@danielnelson
Copy link
Contributor

I started a rebuild for it, we need to improve our test script to not pull the docker image all the time. If something like this happens again just leave a comment on the PR.

@rwarren
Copy link

rwarren commented Jun 30, 2017

Any update on your work, @tomtastic?

I've found this issue while trying to figure out if any of the detailed parameters that are collected by the existing zfs plugin actually yield (even if indirectly) any of the significantly more useful parameters like:

  • health -- ie: do I need to replace a hard drive?
    • not sure how you would report this out via telegraf, anyway?
  • size -- current size of pool
  • alloc -- amount of space used in pool
  • free -- amount of space free in pool
  • frag -- a rough measure of how fragmented the free space is
  • cap -- capacity percentage that is used (ie: 0% = empty)

Several of these are what you are addressing with this issue (and your patch work), and I think they are significantly more useful than the low level stats that exist right now!

On the health reporting point... is there any way to get telegraf to report stats like this, or does it need to be a number? Simply enumerating the health states (ONLINE, DEGRADED, FAULTED,OFFLINE, UNAVAIL, REMOVED) could work and allow for simple alarms at the telegraf output destination, but I'm wondering what the norm is. End objective (for health) is really "which telegraf'd stat can I use to figure out if a device may need replacing in my pool"?

@tomtastic
Copy link
Author

tomtastic commented Jul 14, 2017

Hi @rwarren, sorry for the very slow reply.
I have it working and reporting all the parameters you list above which I mostly view in grafana.
zpool_grafana

I'll pull my finger out and commit the code very shortly.

@rwarren
Copy link

rwarren commented Jul 14, 2017

@tomtastic -- awesome, thanks! Grafana is what I'm using for visuals as well, and that looks good. Hopefully your PR will be accepted.

@tomtastic
Copy link
Author

tomtastic commented Jul 14, 2017

You can test it from here : https://github.com/tomtastic/telegraf/tree/better_zpool_linux

I'm reminded looking at the code, that 'size, alloc, free, dedup' aren't presented currently as ZFSonLinux doesn't get the syntax for 'zpool list -Hp' (p for parsable) until version 0.7.0.
But we've the useful bits like fragmentation and health!

I need to write some unit tests for it all before I submit the PR.

@tomtastic
Copy link
Author

tomtastic commented Jul 15, 2017

Having trouble getting the unit tests to build, I can run 'make test-short', but it only indicates the zfs plugin fails to build, no useful errors.

@danielnelson Is there anyone I could ask for help with the unit-tests here?

@danielnelson
Copy link
Contributor

I can try to help, can you paste the output here?

@tomtastic
Copy link
Author

tomtastic commented Jul 19, 2017

Its quite likely the code for the unit tests is garbage, the required changes are small but I don't have enough knowledge of golang to make it any better.

ok      github.com/influxdata/telegraf/plugins/inputs/webhooks/papertrail       0.022s
ok      github.com/influxdata/telegraf/plugins/inputs/webhooks/rollbar  0.019s
?       github.com/influxdata/telegraf/plugins/inputs/win_perf_counters [no test files]
--- FAIL: TestZfsPoolMetrics (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x65af8e]

goroutine 5 [running]:
testing.tRunner.func1(0xc42006ad00)
        /home/trcm/src/umake/go/go-lang/src/testing/testing.go:622 +0x29d
panic(0x6aa500, 0x852330)
        /home/trcm/src/umake/go/go-lang/src/runtime/panic.go:489 +0x2cf
github.com/influxdata/telegraf/plugins/inputs/zfs.(*Zfs).gatherPoolStats(0xc420045f58, 0x830dc0, 0xc420068ea0, 0x0, 0xc420015b00, 0x200000003, 0xc4200f61a0)
        /home/trcm/src/test/src/github.com/influxdata/telegraf/plugins/inputs/zfs/zfs_linux.go:19 +0x3e
github.com/influxdata/telegraf/plugins/inputs/zfs.(*Zfs).Gather(0xc420055f58, 0x830dc0, 0xc420068ea0, 0xc420085a58, 0xc420015ad0)
        /home/trcm/src/test/src/github.com/influxdata/telegraf/plugins/inputs/zfs/zfs_linux.go:101 +0x130
github.com/influxdata/telegraf/plugins/inputs/zfs.TestZfsPoolMetrics(0xc42006ad00)
        /home/trcm/src/test/src/github.com/influxdata/telegraf/plugins/inputs/zfs/zfs_linux_test.go:159 +0xe5a
testing.tRunner(0xc42006ad00, 0x70ec38)
        /home/trcm/src/umake/go/go-lang/src/testing/testing.go:657 +0x96
created by testing.(*T).Run
        /home/trcm/src/umake/go/go-lang/src/testing/testing.go:697 +0x2ca
FAIL    github.com/influxdata/telegraf/plugins/inputs/zfs       0.032s
ok      github.com/influxdata/telegraf/plugins/inputs/zookeeper 0.043s

@danielnelson
Copy link
Contributor

Looks like Zfs.zpool is nil, but is being called on line 19 of gatherPoolStats.

@mazhead
Copy link

mazhead commented Oct 5, 2017

@tomtastic Any chance to get this moving again? ZFS could be really useful.

@tomtastic
Copy link
Author

I don't know enough Golang currently to fix the unit test code, sorry @mazhead.

But if you take the zfs plugin code from my branch https://github.com/tomtastic/telegraf/tree/better_zpool_linux you can compile up a working version, it's just the unit-tests I can't sort out for submitting the PR to get this merged.

@richardelling
Copy link
Contributor

Parsing zpool output approach won't work well in practice. Also, it includes only a small fraction of the interesting info. The approach I've started taking that does work better is a zpool replacement that directly calls zfs ioctl and gets the information as nvlists. These are then written out as influxdb line protocol. This is called as an inputs.exec, but could also be a self-timed daemon.

For both zpool command and a zfs ioctl consumer, there is a blocking situation that can occur and must be dealt with: when the going gets tough, zpool command blocks. Timeouts already exist for inputs.exec and it is unclear to me if other telegraf agents inherit the timeouts feature in such a way that blocking in the zfs ioctl is properly handled.

Architecturally, is feels cumbersome to include a zfs ioctl caller as native golang agent. The ioctl interface is not a stable interface and will cause integration pain forever. By contrast, the zpool replacement can be distributed with the ZFSonLinux code, thus keeping up with the zfs ioctl changes thus keeping the interface boundary at the influxdb line protocol layer.

Thoughts?

@richardelling
Copy link
Contributor

To spur the conversation, here's an example of the output of scan statistics collector:

zscanstat,function=scrub,pool=zwimming,state=finished end_ts=1510250221i,errors=0i,examined=115474432i,function="scrub",pass_examined=1i,pause_ts=0i,paused_t=0i,pct_done=100.03,processed=0i,rate=1i,remaining_t=0i,start_ts=1510250217i,state="finished",to_examine=115437568i,to_process=0i 1513214163000000000

it is simply not possible to get that data from screen-scraping zpool status output.

Of more interest to me personally, the performance histograms are stored as nvlist arrays.

@danielnelson
Copy link
Contributor

Sounds great, have you already proposed the addition of your tool to the ZFSonLinux team? Will it output in line protocol directly?

@richardelling
Copy link
Contributor

@danielnelson the tool outputs line protocol directly. I have not yet proposed it upstream, but it slides into the source tree nicely as a new command, so there aren't any conflicts with existing. To properly complete it, there is more work to complete and automated tests to be built.

Think of it as a goal of "zpool status + zpool iostat + zpool get" all rolled into one and output as influxdb line protocol.

@danielnelson
Copy link
Contributor

That would be great for us, when you feel it is ready we can link to it from the zfs README as well.

@tomtastic
Copy link
Author

tomtastic commented May 18, 2018

@richardelling I pretty much gave up on finishing the unit-test for my PR when you announced your work on this.
Do you have time to complete your tool and automated tests?

@richardelling
Copy link
Contributor

Hi @tomtastic I have made some progress, but it is not clear that operating as a telegraf plugin is the best approach. Why? Because if there is a failure in the zfs.ko module or a pool becomes suspended then the data collection will hang indefinitely. This is also true for zpool status so you can't get around it by another method. So, rather than hang inside telegraf, it should still remain an external command writing in line protocol. I'll be able to get back on this RSN.

@tomtastic
Copy link
Author

It's been just over one year since I had code to support getting better zpool properties on Linux and (now that Ubuntu and others are distributing zfsonlinux 0.7.0+) that code has zpool-property-feature-parity with the FreeBSD code.
If you don't think you're likely to have your external tools ready in the next three to six months, I'll dedicate some time to finishing the last bits necessary in zfs_linux_test.go, as it's about time this issue was closed off in my humble opinion.

@richardelling
Copy link
Contributor

richardelling commented May 30, 2018

@tomtastic I've put an initial commit for zpool_influxdb collector to be used with telegraf's inputs.exec plugin. More testing required, of course.

I'm still concerned about how this will hang during some pool suspension events and how that could affect execs over prolonged time. This should be similar to hangs when you exec the zpool command. More testing under failure cases needed.

https://github.com/richardelling/zfs-linux-tools/tree/influxdb-wip/telegraf

@rwarren
Copy link

rwarren commented Jul 18, 2018

@tomtastic: Somewhat unrelated to this PR, but... what plugin did you use for making that great "zpool status" table in Grafana? I'm trying to replicate it with no luck yet. I can't get rid of the Time column with the Table plugin, nor can I get the rows right.

@rwarren
Copy link

rwarren commented Jul 18, 2018

@tomtastic Never mind... I finally figured it out as a combo of many things (using last() and an alias() for each metric, hiding the time column, etc).

I also mapped my zpool health to integers in rough order of severity so I could use thresholding and get proper alerting. This conversion of health to an integer is really useful should probably be done in this plugin as well!

My current panel state:
image

And here's how I did the health state integer mapping in order to enable alerts and colorizing (green for ONLINE, orange for DEGRADED, red for worse):

image

The generated zpool_health_int field is also easy to set up an alert for when the value differs from 0, as well as providing a clean historical chart to see when deviations happened and for how long. Not so easy with string fields. FWIW, I'm sending a value of 10 for unrecognized text values to future-proof it a bit.

@richardelling
Copy link
Contributor

@rwarren the "health" you see in zpool command output is actually derived from the internal
integers. But it isn't a direct value, there are too many conditions to make it an obvious result.
That said, do you think it would be beneficial for the ZFS collector to provide the raw numbers, too?
Or perhaps a syslog-like priority + severity tuple?

@rwarren
Copy link

rwarren commented Jul 18, 2018 via email

@tomtastic
Copy link
Author

tomtastic commented Jul 18, 2018

Thanks rwarren, cool use of Grafana, could you pop your panel JSON for that into a gist for me ?
Or drop me an email, whatever is easier.

@ddimick
Copy link

ddimick commented May 3, 2019

@nyanloutre Thank you, exactly what I was looking for.

yvasiyarov added a commit to yvasiyarov/telegraf that referenced this issue Nov 25, 2019
yvasiyarov added a commit to yvasiyarov/telegraf that referenced this issue Nov 28, 2019
yvasiyarov added a commit to yvasiyarov/telegraf that referenced this issue Nov 28, 2019
yvasiyarov added a commit to yvasiyarov/telegraf that referenced this issue Nov 28, 2019
yvasiyarov added a commit to yvasiyarov/telegraf that referenced this issue Dec 19, 2019
Add freeing and leaked options collection
@yvasiyarov
Copy link

@danielnelson can you please merge #6724 which will close this feature request ?

@tomtastic
Copy link
Author

Thank you for this @yvasiyarov, I'm very happy you've closed this issue :)

I might add it's a shame that rather than receiving any help/collaboration 2.5 years ago with the unit tests on my (otherwise working) code to close this issue, we ended up waiting for someone to re-write it in a way they preferred, but seemingly never released.
This wasn't helpful in making me want to continue contributing to Influxdata,... and probably has that affect to others too.

@RafaelSchridi
Copy link

After such a long time it's probably easier to rewrite than to update.

But now that's done can we close this and merge the other pull-request? I've been waiting for this for almost 3 years now 😄.

@tomtastic
Copy link
Author

Happily closing this!

@yvasiyarov
Copy link

@tomtastic please do not close the future request until PR is get merged.
Its not yet merged as of now. We need help from @danielnelson here

@peterdk
Copy link

peterdk commented Feb 1, 2020

Looking forward to this.

@lastb0isct
Copy link

I see that this is closed, but this still doesn't not seem to have been merged, has it?!

@richardelling
Copy link
Contributor

FYI, a zpool_influxdb command has been added to the OpenZFS upstream master branch. It is integrated into the CI/CD system and should be available in future builds automatically. See https://github.com/openzfs/zfs/tree/master/cmd/zpool_influxdb

@iwvelando
Copy link

iwvelando commented Feb 6, 2021

In case it benefits anyone else while waiting for this PR to be sorted out, I put together a Go utility to be invoked by a telegraf exec plugin that parses both zpool list -H -p and zpool status -s -p; it's been tested on Ubuntu 20.04 ZFS 0.8.3. https://github.com/iwvelando/telegraf-exec-zpool-status

@thoernle
Copy link

Are there any updates on this?

@tomtastic
Copy link
Author

Richards' zpool_influxdb tool is available in the zfs-2.1.0 release.

@thoernle
Copy link

Richards' zpool_influxdb tool is available in the zfs-2.1.0 release.

Yes, and I am really looking forward to this but since this is not implemented in telegraf we have to upgrade our whole system to accomplish that.

@tomtastic
Copy link
Author

You simply use it with telegraf like this :

[[inputs.execd]]
  ## Program to run as daemon
  command = ["/usr/local/bin/zpool_influxdb", "--execd"]

  ## Define how the process is signaled on each collection interval.
  ## Valid values are:
  ##   "none"   : Do not signal anything.
  ##              The process must output metrics by itself.
  ##   "STDIN"   : Send a newline on STDIN.
  ##   "SIGHUP"  : Send a HUP signal. Not available on Windows.
  ##   "SIGUSR1" : Send a USR1 signal. Not available on Windows.
  ##   "SIGUSR2" : Send a USR2 signal. Not available on Windows.
  signal = "STDIN"

  ## Delay before the process is restarted after an unexpected termination
  restart_delay = "30s"

  ## Data format to consume.
  ## Each data format has its own unique set of configuration options, read
  ## more about them here:
  ## https://github.com/influxdata/telegraf/blob/master/docs/DATA_FORMATS_INPUT.md
  data_format = "influx"

@tomtastic
Copy link
Author

tomtastic commented Jul 30, 2021

You will still need to muck about with value mappings for the health of a pool like @rwarren's example years earlier in this thread, and I've no idea how to extract capacity as a percentage ?!

@richardelling
Copy link
Contributor

Richards' zpool_influxdb tool is available in the zfs-2.1.0 release.

Yes, and I am really looking forward to this but since this is not implemented in telegraf we have to upgrade our whole system to accomplish that.

Unfortunately, this will never happen. The interfaces used to get pool information are not stable across releases. So you'd have to update telegraf every time ZFS changed... ugly. It is more stable to have changes in ZFS reflected in the zpool_influxdb command as a more stable interface to telegraf. A bit unfortunate, but it is the state of the world. See also https://github.com/openzfs/zfs/tree/master/cmd/zpool_influxdb#caveat-emptor

@richardelling
Copy link
Contributor

You will still need to muck about with value mappings for the health of a pool like @rwarren's example years earlier in this thread, and I've no idea how to extract capacity as a percentage ?!

zpool_influxdb puts health as a tag, does that suffice?

With regards to a capacity %, that is easy to do in a query. If you'd like to see it from zpool_influxdb then please file an issue there and ping me.

@timdonovanuk
Copy link

timdonovanuk commented Feb 9, 2022

Capacity would be great and would align with the standard [[inputs.disk]] metric (used_percent) captured by telegraf.

With regards to a capacity %, that is easy to do in a query

The whole reason I'm here is it's literally not easy to do in a query ;) Flux is not a simple query language, and while after a bit of hacking I managed it in a query, translating that again into a query suitable for an influxdb alert was not possible.

As @rwarren mentioned 5 years ago, while we all appreciate every little esoteric arc value is shipped by telegraf for monitoring, the most objectively useful one not available.

Edit:

I guess this is the best solution right now:

<telegraf.conf>

[[inputs.exec]]

  commands = ["zpool list -H -o capacity -p"]
  name_override = "zpool_used_percent"
  data_format = "value"
  data_type = "integer" # required

@richardelling
Copy link
Contributor

Capacity for mounted datasets is available in the inputs.disk data for Linux. Are you looking for capacity of zvols?

Unfortunately, the interface for the zfs command isn't stable (as in interface stability). However, the FreeBSD telegraf zfs plugin code does "screen scrape" the zfs command for capacity and historically, zfs list is more stable than pool properties. @timdonovanuk suppose we expand that to other OSes? If so, file a new issue because it is a dataset property and not a pool property: basically remove the "BSD only" comment on the zfs_dataset metric.

@tsquillario
Copy link

This worked for me:

#!/usr/bin/python3

import json
from subprocess import check_output

columns = ["NAME", "SIZE", "ALLOC", "FREE", "CKPOINT", "EXPANDSZ", "FRAG", "CAP", "DEDUP", "HEALTH", "ALTROOT"]
health = {'ONLINE':0, 'DEGRADED':11, 'OFFLINE':21, 'UNAVAIL':22, 'FAULTED':23, 'REMOVED':24}

stdout = check_output(["zpool", "list", "-Hp"],encoding='UTF-8').split('\n')
parsed_stdout = list(map(lambda x: dict(zip(columns,x.split('\t'))), stdout))[:-1]

for pool in parsed_stdout:
    for item in pool:
        if item in ["SIZE", "ALLOC", "FREE", "FRAG", "CAP"]:
            pool[item] = int(pool[item])
        if item in ["DEDUP"]:
            pool[item] = float(pool[item])
        if item == "HEALTH":
            pool[item] = health[pool[item]]

print(json.dumps(parsed_stdout))
[[inputs.exec]]
  commands = ["/app/scripts/zfs-health-check.py"]
  name_override = "zfs"
  data_format = "json"
  timeout = "10s"
  json_string_fields = ["SIZE", "ALLOC", "FREE", "FRAG", "CAP", "DEDUP", "HEALTH"]
  tag_keys = ["NAME"]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests