-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
multipathd: Implement shutdownifnomultipath
#78
Conversation
I would prefer to see this broken up into two commits. One to add the functionality to multipathd, and the other to add the new service. Also, since you don't have DefaultDependencies=no in multipathd/multipathd-shutdown-no-multipath.service, it should already be starting after basic.target, so your current "After" dependencies are unnecessary. But like Martin mentioned before, you might want to wait even longer for this to run, perhaps "After=multi-user.target". This would give devices unnecessary in boot as much time as possible to appear and get multipathed before you check if any multipath devices exist. |
Sure thing. I'll work on splitting the changes soon.
Interesting, indeed I thought about using Thanks. |
3127117
to
c70c19e
Compare
Some remarks: IIRC, the first step for fixing this issue should be to disable multipathd socket activation; at least nobody has contradicted my statement saying so in #76. Secondly, I dislike the "shutdownifnomultipath" verb. multipathd supports multi-word commands, so we should be able to implement something like Wrt waiting after boot, we could even combine the auto-shutdown service with a systemd timer. Basically what we could do is implement a timer that waits for (say) 60s after activation, and runs "shutdown if idle" after that. Such a timer could even be combined with a setup where |
I don't have much time to reply right now, but IIRC leaving the socket activation enabled has the benefit of covering the case when the user adds a device to a live system. Maybe I'm misremembering, though.
I'll take a look at implementing the multi-word command.
A systemd timer works, too. I like the idea of having multipathd shutting itself off even after the addition of new devices to a live system. |
I still think this is a corner case, as I said before ("We are now discussing automation of 4. to avoid 5."). In general I think users you have multipath hardware should generally enable multipathd, and users who don't should not. Is that asking too much from users? But of course, if you want a fully automated setup that works for everyone without investing thought in their setup, we may want to go down that route.
Instead of creating just
The timer would work for both boot and and device addition. By setting We should also be aware of the All this said, if you have patches, please submit them to the dm-devel mailing list as described in README.md. |
Currently, socket activation doesn't work properly with multipath-tools (see opensvc#76 for a lengthy discussion about the issue). Due to the way multipath-tools and udev interact, it is not possible to guarantee that the daemon will be stopped even if there are no multipath devices/maps present in the system. In order to workaround that limitation, this commit proposes a new daemon command called "shutdown if-idle". What it does is check if any multipath maps have been found during system initialization. If none exist, then the daemon will shut itself down. Closes: opensvc#76 Signed-off-by: Sergio Durigan Junior <[email protected]>
…down if-idle" On top of the new "shutdown if-idle" command, we use a oneshot (actually "exec") systemd service that will be executed 120 seconds after booting up the system or after multipathd.service is active, to send the command mentioned above to the daemon. Note that this approach is still not perfect: if new devices are added to the system while it's running, multipathd will be started, even if no multipath'ing is done. But that's better than the current state where the daemon stays on unconditionally. Signed-off-by: Sergio Durigan Junior <[email protected]>
Signed-off-by: Sergio Durigan Junior <[email protected]>
c70c19e
to
b2e1023
Compare
Sorry, this fell between the cracks... I've now updated the branch following the suggestions from @mwilck. I haven't tested it fully yet, but would appreciate if you could take another look meanwhile and check if it follows what you had in mind. As for the |
A command that gets sent some fixed time later sounds like a good place to introduce race conditions under load. Can this not be a config option that is always active on distros choosing such a setup? |
I am not sure I understand what you're asking for. What config option do you want, and how should its semantics be? I figure that both you and @sergiodj are working for the Debian project, so perhaps the two of you want to find an agreement on the desired behavior? Wrt the race condition, this is a valid concern. Something like the following could happen:
We could avoid this scenario by doing the equivalent of an "udevadm settle" (i.e. wait for pending uevents to finish, and make sure multipathd has processed all uevents it is aware of, i.e. multipathd's internal uevent queue is empty) when multipathd receives an "shutdown if-idle" command. I think if we did this, we'd be safe. I can't perceive other race conditions. If the timer expired before 3. (or 2.) above, "multipath -u" would be called while processing the udev rules, and start another multipathd instance. Anyway, @sergiodj, I need to ask you to send this to [email protected] (with @bmarzins and myself on CC). I think this needs a broader audience. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted, this needs to be discussed on the dm-devel mailing list.
@@ -224,6 +224,7 @@ load_keys (void) | |||
r += add_key(keys, "setmarginal", VRB_SETMARGINAL, 0); | |||
r += add_key(keys, "unsetmarginal", VRB_UNSETMARGINAL, 0); | |||
r += add_key(keys, "all", KEY_ALL, 0); | |||
r += add_key(keys, "if-idle", KEY_IF_IDLE, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: as noted before, I would prefer the more natural shutdown if idle
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't understand that that was your preference; I thought you were OK with either. I'll adjust the code accordingly.
@@ -6,6 +6,8 @@ Wants=systemd-udevd-kernel.socket @MODPROBE_UNIT@ | |||
After=systemd-udevd-kernel.socket @MODPROBE_UNIT@ | |||
After=multipathd.socket systemd-remount-fs.service | |||
Before=initrd-cleanup.service | |||
Wants=multipathd-shutdown-if-idle.timer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't do this. The timer should be independently enabled by either the admin or the distro.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK. I don't remember why I put that there, but I'll remove it.
@@ -0,0 +1,10 @@ | |||
[Unit] | |||
Description=Device-Mapper Multipath Conditional Shutdown Trigger | |||
After=multi-user.target |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be certain, we should als add "After=multipathd.service" here IMO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK.
condlog(3, "shutdown if no multipath map found (operator)"); | ||
|
||
if (VECTOR_SIZE (vecs->mpvec) == 0) | ||
exit_daemon(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted by @zeha, this is race-prone. We need to make sure that there are no uevents pending, neither in udev itself nor in multipathd's own queues, before actually exiting.
@zeha and I talked in private yesterday and I believe we've reached a consensus on how to incorporate these changes into Debian.
OK, I'll see about implementing this change. I'll have to discover how to access multipathd's internal uevent queue and check if it's empty.
Will do. Thanks. |
BTW, I'd appreciate some pointers here. My experience dealing with libudev is very limited, and I'm no expert on multipath-tools either. I know programs aren't supposed to use |
Right. On the multipathd side, you will find the relevant code in
I need to research this myself. There seems to be no official API for this. A naïve implementation might look roughly like this: int _is_udevq_idle() {
int st;
if (is_uevent_busy())
return false;
st = system("/usr/bin udevadm settle -t 0")
if (!WIFEXITED(st) || WEXITSTATUS(st))
return false;
return true;
}
int is_udevq_idle() {
if (!_is_udevq_idle())
return false;
// give uevent listener has a chance to run
// listener should have higher prio than this thread to be certain
sched_yield();
// then check again
return _is_udevq_idle();
} This isn't completely safe because new uevents can arrive at any time, but I suppose it's safe enough for practical purposes. The question is what we should do if this returns false (just fail, wait for some time, or call This begs the question, again, whether the "automatic exit" idea makes sense at all. It's a lot of complexity added for a minor improvement of user experience (see my previous breakdown). |
My idea would basically be: New config option, "multipathd_no_devices_found". (Upstream) defaults to "stay_running", keeping today's behaviour. But this is an idle idea, without having looked at the multipathd startup code. |
Hi, Sorry for the delay in replying. I've been doing this work as a side task, but $DAYJOB is now a bit more hectic than expected. Mainly for this reason, but also because I agree with the assessment that this is a lot of work/complexity for a relatively minor gain, I do not intend to continue working on this feature. Thank you all for the invaluable help and interesting discussions that made me understand more about the world of multipath'ed devices. @zeha, I will send a Merge Request against Debian's multipath-tools package in order to revert part or all of the changes that I'd suggested a while ago regarding the socket activation feature. This whole "side project" made me realize that the changes don't work as expected, and as such should not be advertised in the |
Currently, socket activation doesn't work properly with
multipath-tools (see #76 for a lengthy discussion about the issue).
Due to the way multipath-tools and udev interact, it is not possible
to guarantee that the daemon will be stopped even if there are no
multipath devices/maps present in the system.
In order to workaround that limitation, this commit proposes a new
daemon command called
shutdownifnomultipath
. It does what it sayson the tin: if no multipath maps have been found during the system
initialization, the daemon will shut itself down. On top of that, we
use a oneshot (actually "exec") systemd service that will be executed
only once, after everything's been set up, to send the command
mentioned above to the daemon.
Note that this approach is still not perfect: if new devices are added
to the system while it's running, multipathd will be started, even if
no multipath'ing is done. But that's better than the current state
where the daemon stays on unconditionally.