flux-exec: support guest exec via flux-shell #2298

Closed
garlick opened this issue Aug 10, 2019 · 7 comments


@garlick
Member

garlick commented Aug 10, 2019

Could the flux-shell call flux_subprocess_server_start() and thereby offer the job owner a way to launch arbitrary tasks alongside their job, and inside any container set up by the IMP, for debugging, monitoring, etc.?

Maybe the flux exec front end could then be modified to optionally accept a jobid, and then interpret the rank idset as shell ranks? Then you could do stuff like

$ flux exec --jobid 234234234 -r all ps -fu $(id -u)

If eventually we had pty support in flux exec you could do something like

$ flux exec --jobid 234344343 -r 0 top

It might be a little tricky to map the jobid + shell rank to the broker rank that the shell has registered its job-<id> service on (I think R would need to be parsed by flux exec). Other than that, if the "subprocess server" is ready to be embedded in the shell, it seems like this feature would mostly reuse existing work...
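A rough sketch of the rank-mapping step described above, assuming (the thread does not confirm this) that shell ranks are numbered in the order broker ranks appear in the rank idset of R. The helper name `idset_nth` is hypothetical and is not part of flux-core. It handles only plain ids and `lo-hi` ranges, and a real implementation would more likely use an existing idset or R parsing library:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: return the n-th (0-based) id in an idset string
 * such as "2-3,7", or -1 if n is out of range or the string is malformed.
 * Under the assumption stated above, n is the shell rank and the result
 * is the broker rank hosting that shell. */
static long idset_nth(const char *s, long n)
{
    assert(s != NULL);
    if (n < 0)
        return -1;
    while (*s) {
        char *end;
        long lo = strtol(s, &end, 10);
        if (end == s || lo < 0)
            return -1;              /* no digits, or a negative id */
        long hi = lo;
        if (*end == '-') {          /* range of the form lo-hi */
            const char *p = end + 1;
            hi = strtol(p, &end, 10);
            if (end == p || hi < lo)
                return -1;
        }
        long count = hi - lo + 1;
        if (n < count)
            return lo + n;          /* n falls inside this range */
        n -= count;                 /* skip this range and continue */
        if (*end == ',')
            end++;
        else if (*end != '\0')
            return -1;              /* unexpected character */
        s = end;
    }
    return -1;                      /* n is past the end of the set */
}
```

For example, with a rank idset of "2-3,7", shell rank 2 would map to broker rank 7 under this assumption.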

@grondo
Contributor

grondo commented Aug 10, 2019

Yes, I had been thinking something very similar to this. It might even be required to allow new processes to enter the same job container.

@grondo
Contributor

grondo commented Aug 10, 2019

BTW, when we had discussed this same idea before, it was yet another discussion that led to "flux broker == flux shell". I wonder if that idea requires another look before we keep getting ideas to add broker features to the shell.

@garlick
Member Author

garlick commented Aug 10, 2019

These things worry me about that idea:

  • broker is structured fundamentally to be the primary message router for an instance
  • broker code presumes it runs as the instance owner and supports multi-user, while shell presumes it runs as guest. Dual role might get confusing?
  • broker doesn't presume it has access to enclosing instance services, unlike shell (--standalone notwithstanding).
  • broker is old and would look different if we wrote it now, yet it is stable and we are successfully building lots of things on top of it

In short, I think the broker would need surgery to make it serve both roles, and in doing so, we'd likely find ourselves wanting to modernize it as well. We might also create something that is harder to maintain because of the different roles it must run in.

I think it may be more expedient in our current situation to factor out areas where we have duplicate code rather than try to develop one executable that works in both contexts. Happily libsubprocess is one place where such code is already abstracted into a library.

@grondo
Contributor

grondo commented Aug 10, 2019 via email

@grondo
Contributor

grondo commented Aug 10, 2019

> It might be a little tricky to map the jobid + shell rank to the broker rank that the shell has registered its job-<id> service on (I think R would need to be parsed by flux exec).

For debugger support we need to build the job process table that maps hostnames to pids and taskids (#2163, flux-framework/rfc#187). Maybe we could generalize this and include broker ranks as well (or perhaps more generically, service address). In the rare case a tool or user wants to use the shell exec service, it would first request generation of the mpir proctable. (Just another idea. It might be more generically useful to have an R parsing library)

@grondo
Contributor

grondo commented Aug 3, 2020

Over the weekend I coded up a proof-of-concept implementation of a shell "exec" plugin using a subprocess server. It was actually fairly straightforward. Main issues were:

  • libsubprocess hardcodes the "remote" subprocess service prefix name to cmb.rexec. I ended up adding an optional command option to set an alternate service endpoint that would be used by flux_rexec(3), e.g. flux_cmd_setopt (cmd, "service", "shell-123456.rexec").

  • flux_subprocess_server_start(3) registers its message handlers internally with rolemask == 0, so I believe only FLUX_ROLE_OWNER messages could be handled by the subprocess server, defeating the goal here. We could allow the flags to be set on initialization by the caller, but unfortunately FLUX_ROLE_USER lets in all users. What I ended up doing in the proof of concept was setting FLUX_ROLE_USER by default, but changing all message handlers to reject any message without FLUX_ROLE_OWNER or a userid matching the current user.

  • The shell guest exec plugin may want to add to the environment of spawned processes, but this isn't currently possible. Instead the environment would be set by the env key of the request. I haven't addressed this one yet, but it would seem a callback that could be registered with the subprocess server would work here (and the hard-coded local_uri member of the server struct could then be dropped).

As an aside to the 2nd bullet, it occurred to me that it would be very convenient if there were some kind of rolemask like FLUX_ROLE_USER_ONLY which only allowed messages from the current uid and FLUX_ROLE_OWNER. This would be useful for user-registered services as in the shell (and might prevent accidental security holes). Essentially we already have this in the shell as implemented in flux_shell_service_register(3); however, it is not possible to use this function call for the subprocess server, which registers message handlers internally.

@grondo
Contributor

grondo commented May 15, 2024

Fixed by 99aa6be

@grondo grondo closed this as completed May 15, 2024