Inconsistent disk plugin behavior using autofs with different types of maps #3430

Closed
oplehto opened this issue Nov 6, 2017 · 8 comments · Fixed by #3440

@oplehto
Contributor

oplehto commented Nov 6, 2017

Directions

Bug report

We moved from autofs NFS mounts defined in NIS maps to ones defined in regular text files. All of a sudden, all of our servers started mounting every filesystem defined in the autofs files. We tracked this down to Telegraf, which started triggering all of our autofs mounts after the change.

I found an (undocumented) ignore_fs parameter which can be used to fix this:

 [[inputs.disk]]
   interval = "300s"
   ignore_fs = ["autofs"]

What made this challenging to debug was that we had been running with the old Telegraf config for ages with Telegraf happily ignoring the NIS mapped autofs filesystems. Thus it was not the most obvious suspect.

This solves the immediate issue, but the underlying bug (inconsistent autofs behavior based on map) should be addressed and documentation improved.

System info:

Telegraf v1.4.1 9cf19df
RHEL 7.3

Steps to reproduce:

  1. Set up an autofs configuration for NFS mounts using NIS maps and regular text file maps
  2. Run telegraf with the inputs.disk plugin and see if the autofs mounts are triggered in either or both cases

Expected behavior:

Telegraf should consistently either mount or not mount autofs mounts, regardless of how the mounts are mapped (NIS, LDAP, file, etc.).

It should be clearly stated that autofs mounts may be triggered by the disk plugin unless they are ignored with the ignore_fs option (which itself should be documented!).

I feel that unmounted autofs filesystems should not be touched by the collector at all unless explicitly defined to do so. However I understand that changing this might cause issues for other users.

Actual behavior:

Telegraf disk plugin ignores autofs mounts when they are mapped via NIS but then triggers them when they are mapped in a regular file.

Additional info:

@oplehto oplehto changed the title from "Disk plugin behavior with automount" to "Inconsistent disk plugin behavior using autofs with different types of maps" Nov 6, 2017
@danielnelson
Contributor

I think with NIS it probably could not find the mounts at all. I agree that it would be good to not trigger autofs filesystems if possible.

Can you add the contents of /proc/filesystems and /proc/self/mounts? If possible it would be nice to see them in both the NIS and plain text configurations.

@oplehto
Contributor Author

oplehto commented Nov 7, 2017

Here are the key portions of /proc/self/mounts with one filesystem ( /xxx/it/foo4 ) automounted. The notable difference is that in the NIS case /proc/self/mounts does not list the mountpoints for unmounted filesystems.

Plain text map:

/etc/auto.master.d/auto.it /xxx/it/foo1 autofs rw,relatime,fd=7,pgrp=62536,timeout=300,minproto=5,maxproto=5,direct 0 0
/etc/auto.master.d/auto.it /xxx/it/foo2 autofs rw,relatime,fd=7,pgrp=62536,timeout=300,minproto=5,maxproto=5,direct 0 0
/etc/auto.master.d/auto.it /xxx/it/foo3 autofs rw,relatime,fd=7,pgrp=62536,timeout=300,minproto=5,maxproto=5,direct 0 0
/etc/auto.master.d/auto.it /xxx/it/foo4 autofs rw,relatime,fd=7,pgrp=62536,timeout=300,minproto=5,maxproto=5,direct 0 0
/etc/auto.master.d/auto.it /xxx/it/foo5 autofs rw,relatime,fd=7,pgrp=62536,timeout=300,minproto=5,maxproto=5,direct 0 0
/etc/auto.master.d/auto.net /xxx/net/foo6 autofs rw,relatime,fd=7,pgrp=62536,timeout=300,minproto=5,maxproto=5,direct 0 0
/etc/auto.master.d/auto.net /xxx/net/foo7 autofs rw,relatime,fd=7,pgrp=62536,timeout=300,minproto=5,maxproto=5,direct 0 0
nas21:/foo4 /xxx/it/foo4 nfs rw,relatime,vers=3,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=3.43.12.12,mountvers=3,mountport=1234,mountproto=udp,local_lock=none,addr=3.43.12.12 0 0

NIS map:

auto.it /xxx/it autofs rw,relatime,fd=13,pgrp=1085,timeout=300,minproto=5,maxproto=5,indirect 0 0
auto.net /xxx/net autofs rw,relatime,fd=19,pgrp=1085,timeout=300,minproto=5,maxproto=5,indirect 0 0
binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0
tmpfs /run/user/0 tmpfs rw,nosuid,nodev,relatime,size=385776k,mode=700 0 0
nas21:/foo4 /xxx/it/foo4 nfs rw,relatime,vers=3,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=3.43.12.12,mountvers=3,mountport=1234,mountproto=udp,local_lock=none,addr=3.43.12.12 0 0
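
To make the difference between the two listings concrete, here is a minimal Python sketch that parses mount-table text and collects the autofs mountpoints by map style. It relies only on what is visible above: the plain text map entries carry the `direct` option and list every mountpoint, while the NIS entries carry `indirect` and list only the parent directory. The names `MountEntry`, `parse_mounts`, and `autofs_mountpoints` are hypothetical, and the sample lines are shortened versions of the ones above.

```python
from typing import NamedTuple


class MountEntry(NamedTuple):
    device: str
    mountpoint: str
    fstype: str
    options: tuple


def parse_mounts(text):
    """Parse /proc/self/mounts-style text into MountEntry records."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        entries.append(
            MountEntry(device, mountpoint, fstype, tuple(options.split(",")))
        )
    return entries


def autofs_mountpoints(entries, kind):
    """Return mountpoints of autofs entries whose options include `kind`.

    `kind` is "direct" or "indirect"; options are compared as whole
    words, so "indirect" never matches a request for "direct".
    """
    return [
        e.mountpoint
        for e in entries
        if e.fstype == "autofs" and kind in e.options
    ]


# Shortened samples based on the listings above (most options dropped).
PLAIN_TEXT_MAP = """\
/etc/auto.master.d/auto.it /xxx/it/foo1 autofs rw,relatime,direct 0 0
/etc/auto.master.d/auto.it /xxx/it/foo4 autofs rw,relatime,direct 0 0
nas21:/foo4 /xxx/it/foo4 nfs rw,relatime,vers=3 0 0
"""

NIS_MAP = """\
auto.it /xxx/it autofs rw,relatime,indirect 0 0
nas21:/foo4 /xxx/it/foo4 nfs rw,relatime,vers=3 0 0
"""

direct = autofs_mountpoints(parse_mounts(PLAIN_TEXT_MAP), "direct")
indirect = autofs_mountpoints(parse_mounts(NIS_MAP), "indirect")
print(direct)    # ['/xxx/it/foo1', '/xxx/it/foo4']
print(indirect)  # ['/xxx/it']
```

With the direct-style entries, anything that walks the mount table sees each individual mountpoint, including the unmounted ones; with the indirect-style entries it only sees the parent directory.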

No difference in the /proc/filesystems:

nodev	sysfs
nodev	rootfs
nodev	ramfs
nodev	bdev
nodev	proc
nodev	cgroup
nodev	cpuset
nodev	tmpfs
nodev	devtmpfs
nodev	debugfs
nodev	securityfs
nodev	sockfs
nodev	pipefs
nodev	anon_inodefs
nodev	configfs
nodev	devpts
nodev	hugetlbfs
nodev	autofs
nodev	pstore
nodev	mqueue
	ext3
	ext2
	ext4
nodev	rpc_pipefs
nodev	nfsd
nodev	binfmt_misc
nodev	nfs
nodev	nfs4

@danielnelson
Contributor

Since the actual NFS mount is also listed when mounted, I guess we should just always skip autofs filesystems. I can't think of any reason one would want to include them.
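
As an illustration of that idea, the following Python sketch drops every mount-table entry whose filesystem type is `autofs` and keeps the rest. It is not Telegraf's actual implementation (Telegraf is written in Go); the function name `skip_autofs` and its `ignore_fs` parameter are hypothetical, chosen only to mirror the config option mentioned earlier in this thread.

```python
def skip_autofs(mount_lines, ignore_fs=("autofs",)):
    """Yield (device, mountpoint, fstype) for each mount-table line whose
    filesystem type is not listed in ignore_fs."""
    for line in mount_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype = fields[:3]
        if fstype in ignore_fs:
            continue  # never touch the automount trigger itself
        yield device, mountpoint, fstype


# Shortened lines based on the plain text map listing above.
SAMPLE = [
    "/etc/auto.master.d/auto.it /xxx/it/foo3 autofs rw,relatime,direct 0 0",
    "/etc/auto.master.d/auto.it /xxx/it/foo4 autofs rw,relatime,direct 0 0",
    "nas21:/foo4 /xxx/it/foo4 nfs rw,relatime,vers=3 0 0",
]

kept = list(skip_autofs(SAMPLE))
print(kept)  # [('nas21:/foo4', '/xxx/it/foo4', 'nfs')]
```

The unmounted /xxx/it/foo3 disappears from the list entirely, while /xxx/it/foo4 is still reported through its real NFS entry.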

@danielnelson
Contributor

I'm going to put the fix in 1.5 since there is an available workaround.

@halsafar

halsafar commented Sep 6, 2019

So I actually spun up telegraf to monitor some autofs mounts.

Is there a way to re-enable this?

@danielnelson
Contributor

Is there a way to re-enable this?

No, but if you did, Telegraf would always keep the filesystem mounted. I suspect you would rather have Telegraf not mount the filesystems but report on them only if they are mounted?

@halsafar

halsafar commented Sep 6, 2019

Actually in this case we expect the filesystem to be always mounted. Corporate infrastructure just has it first mount via AutoFS. We use the mounts so often they are always alive.

I might be able to manually mount everything via NFS (no AutoFS involved). I expect this will only temporarily work as this is a data center and the machines exporting a drive sometimes change (why we use AutoFS).

I will probably explore the PR and manually undo it for my needs. I might make a PR for adding an option to allow monitoring AutoFS.

To answer your question directly: Yes, I would rather have Telegraf only report on them if they are mounted. Currently it doesn't report on them at all.

@tjb36

tjb36 commented Sep 2, 2024

@danielnelson Can you confirm for me the default behaviour in Telegraf regarding this issue?

I have an NFS share which is configured with x-systemd.automount in the fstab file. Since I also have the parameter x-systemd.idle-timeout=10 in fstab, I expect my drive to unmount automatically again until needed. I want to use the inputs.disk plugin to monitor the disk usage of this NFS drive. Should I expect the inputs.disk plugin to be able to remount the drive in order to collect the metric?

In the "Bug fixes" for V1.5 here, I see "Always ignore autofs filesystems in disk input." Is this relevant for what I am asking, or is this simply about ignoring already-mounted drives which have the tag "aufs"?

Please advise. Thanks
