Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

flux-exec: Error: rank 0: cat: Value too large for defined data type #4572

Closed
garlick opened this issue Sep 15, 2022 · 16 comments
Closed

flux-exec: Error: rank 0: cat: Value too large for defined data type #4572

garlick opened this issue Sep 15, 2022 · 16 comments

Comments

@garlick
Copy link
Member

garlick commented Sep 15, 2022

Redirecting input via flux exec works in the shell, but when launched inside CTI, I'm getting the errors

2022-09-15T14:06:03.224712Z broker.err[0]: channel buffer error: rank = 0 pid = 13014, stream = stdin, len = 1048576: Success
flux-exec: Error: rank 0: cat: Value too large for defined data type
2022-09-15T14:06:03.255490Z broker.err[0]: server_write_cb: lookup_pid: No such file or directory

I'm using Flux 0.40.0-15, it happens with cat, sed, and a minimal C program that redirects input. Haven't seen this before in CTI when launching other programs, but it could be something with the input redirection.

Originally posted by @ardangelo in #3631 (comment)

@garlick
Copy link
Member Author

garlick commented Sep 15, 2022

Based on a cursory look - we have a 4MB buffer in the broker and we treat filling it up as a fatal error. This probably wants end to end flow control but for now I wonder if we can handle the "buffer full" write error by backing off and retrying?

@ardangelo
Copy link

Update on this, I have been trying this approach again to ship files,
For small files it usually completes successfully. However, with larger files (100+ MB), the first invocation will often fail, but an immediate retry will work.

The error is different in this case,

cat /tmp/cti-adangelo/cti_daemonrW8WUO1.tar | flux exec -r 0 sed -n 'w /tmp/flux-71aFeA/jobtmp-0-ƒ9HCP58r2B/cti_daemonrW8WUO1.tar'
May 11 14:32:40.908869 broker.err[0]: Error writing 65536 bytes to subprocess pid 17760 stdin
May 11 14:32:40.911301 broker.err[0]: Error writing 65536 bytes to subprocess pid 17760 stdin: unknown pid
May 11 14:32:40.913095 broker.err[0]: Error writing 65536 bytes to subprocess pid 17760 stdin: unknown pid
(Repeated)

@grondo
Copy link
Contributor

grondo commented May 11, 2023

Unrelated to the actual bug discussed in this issue, I'll note that @garlick developed a better method for shipping files via flux-filemap(1).

This is integrated into a stage-in job shell option if that works in your use case. See the flux-shell(1) manpage for a description of the options.

Edit: though I didn't find any examples in the documentation of steps required to use to the stage-in plugin. We may want to add that. For now feel free to ask questions where things are not self-explanatory!

Edit2: There are some examples in the flux-filemap(1) manpage, but they do not include use of the stage-in job shell option.

@ardangelo
Copy link

We're using flux-filemap to ship files from the broker node to the non-broker nodes, but we still needed a way to get the file from the frontend where we're running our debugger tools to the broker node.

Although currently, we only are supporting running our tools from inside the flux start shell. Could we add files to the filemap directly in that case without worrying that the broker would be running somewhere else?

@grondo
Copy link
Contributor

grondo commented May 11, 2023

I've posted your question as the beginning of a Discussion thread here: #5168

I'm pretty confident flux-filemap(1) will handle your use case, but since it isn't clear, we can use the discussion in the Q&A thread to perhaps improve documentation or add a FAQ. It might help to give more specifics of how you're trying to use flux-filemap over in that issue. Thanks!

@chu11
Copy link
Member

chu11 commented Sep 17, 2024

Doing a simple

cat /tmp/achu/100mfile | flux exec -r 0 sed -n 'w /tmp/achu/foo.out'

eventually gets us

Sep 17 10:10:46.299969 PDT broker.err[0]: Error writing 131072 bytes to subprocess stdin
flux-exec: Error: rank 0: sed: internal fatal error: No space left on device

Adding some debug, I verified that we are writing way more than 4megs to stdin, but eventually the buffer is not being emptied faster than it is being filled.

Right now flux_subprocess_write() will simply send an RPC to the subprocess server and forget about it. So that leaves us little ability to do some flow control.

The devil is in the details, but I think we could easily respond to that write RPC. Therefore allowing us to "flow control" a bit via a flux_subprocess_write_async() and/or flux_subprocess_write_sync() function.

the sync() equivalent would make changes to flux-exec and flux-shell super easy but obviously make things slower. the async() would lead to a lot of changes in those places. And i'm not 100% sure, but I think it'd likely end up behaving similarly to sync() (i.e. wait until all respond before sending next chunk).

just some random brainstorms

  • support user config of larger stdin buffers
  • support user control to do sync on large stdin?
  • Edit: do sync() after 4 megs?

will ponder more.

@garlick
Copy link
Member Author

garlick commented Sep 17, 2024

Hmm yeah, to avoid blocking the receiver in whatever operation empties the buffer (which could be a stuck subprocess - causing deadlock), some kind of protocol change to RFC 42 seems like it is required.

Rather than adding write responses, some kind of credit based flow control scheme seems appropriate here. For example (just brainstorming):

  • change the sender so that it only sends up to credits bytes per channel before pausing (stopping for example the stdin fd watcher in flux-exec. This probably requires a callback to expose credits to the user? Like on_credit (int bytes)?
  • add a new exec response type for issuing credits for a given stream
  • each stream's receive buffer sends a credit type response initially for the full buffer size
  • then each time some data is taken out of the buffer, send some credits
  • enhancement: only send credit when a low water mark is reached

That should work for any buffer size and in fact would let us drop the default buffer to something more reasonable than 4MB.

@chu11
Copy link
Member

chu11 commented Sep 18, 2024

Rather than adding write responses, some kind of credit based flow control scheme seems appropriate here.

I like the idea and design in principle seems like it would work, but ...

stopping for example the stdin fd watcher in flux-exec

my immediate thought went to the KVS stdin. IIRC when we do cat foobar | flux run ... the data goes to the KVS guest.input first? The shell and flux job attach have changed a lot, so it's not 100% clear to me if this is still the default case.

Assuming it still is, I don't think at the moment there is a way to "pause" the KVS eventlog watcher. If hypothetically we just stop the watcher, we could restart it but we'd need some kind of "offset" or "seek" mechanism.

That was just my immediate thought.

@garlick
Copy link
Member Author

garlick commented Sep 19, 2024

Yeah that looks right (thanks for the shell input refactor @grondo). Data passes through the input eventlog unless input is being read from a file.

I suppose nothing changes if the subprocess user doesn't implement flow control. E.g. if they don't register a callback for receiving credits and just send data whenever. So we could implement flow control and get flux exec working properly, and then look at how best to deal with job input. Maybe job-info.eventlog-watch needs a size and offset parameter, so it has to be reissued when size (credits) runs out.

@chu11
Copy link
Member

chu11 commented Sep 19, 2024

(oops, I didn't hit comment last night before the comment just posted)

Assuming it still is, I don't think at the moment there is a way to "pause" the KVS eventlog watcher. If hypothetically we just stop the watcher, we could restart it but we'd need some kind of "offset" or "seek" mechanism.

Just a random thought, we could also do "credits" with the kvs-watch module. Or perhaps a more simple mechanism, effectively tell the watch to pause temporarily. A medium solution would be to say "give me another" to kvs-watch when user is ready. Depending on implementation, could have to deal with raciness where user has to cache "one round" of data from the kvs-watch.

(Edit: now that I think about it a "give me another" is basically making it synchronous ... that's probably not good)

Also, similarly to #6274, perhaps it'd be worthwhile to give a warning to users to use filemap or another mechanism for stdin if the size gets big.

@chu11
Copy link
Member

chu11 commented Sep 19, 2024

Maybe job-info.eventlog-watch needs a size and offset parameter, so it has to be reissued when size (credits) runs out.

it occurs to me that for truly correct implementation, a user has to buffer some data, b/c a single "data" entry from the KVS could be larger than the stdin buffer size. Just more work to do ....

@garlick
Copy link
Member Author

garlick commented Sep 19, 2024

Maybe job-info.eventlog-watch needs a size and offset parameter, so it has to be reissued when size (credits) runs out.

it occurs to me that for truly correct implementation, a user has to buffer some data, b/c a single "data" entry from the KVS could be larger than the stdin buffer size. Just more work to do ....

A size parameter would probably need to refer to the eventlog size not the data size, since job-info and kvs don't know about stdio event content. So if the credit value were passed in as the size, there would be wiggle room equal to the eventlog encoding overhead. (Corner case: ensure max event size placed in the KVS is < the buffer size).

@garlick
Copy link
Member Author

garlick commented Sep 19, 2024

Another possibility is we redo the input mechanism so that the KVS is not in the middle of everything and is instead just cc'ed for the record unless bypassed.

Edit: but my point earlier was I don't think job input would be affected if it just ignores credits for now. We could deal with this part later assuming we have some feasible options.

@chu11
Copy link
Member

chu11 commented Sep 19, 2024

Corner case: ensure max event size placed in the KVS is < the buffer size

Oh good point, that shouldn't be too hard. Lets do that first.

@garlick
Copy link
Member Author

garlick commented Sep 19, 2024

Oh good point, that shouldn't be too hard. Lets do that first.

First? I was just referring to a corner case to avoid if we decide to go with a size, offset argument for eventlog-watch. It returns whole events so it wouldn't do to have an event that could not be received in one go. But I was thinking flow control first, shell input later?

@chu11
Copy link
Member

chu11 commented Sep 19, 2024

First? I was just referring to a corner case to avoid if we decide to go with a size, offset argument for eventlog-watch. It returns whole events so it wouldn't do to have an event that could not be received in one go. But I was thinking flow control first, shell input later?

Yeah, I realized it after I thought about it. I think in my mind it was super simple to implement, therefore it could be done first. However, without the other stuff first, there's no ability to test it.

chu11 added a commit to chu11/flux-core that referenced this issue Oct 9, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
chu11 added a commit to chu11/flux-core that referenced this issue Oct 11, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
chu11 added a commit to chu11/flux-core that referenced this issue Oct 15, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
chu11 added a commit to chu11/flux-core that referenced this issue Oct 15, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
chu11 added a commit to chu11/flux-core that referenced this issue Oct 15, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
chu11 added a commit to chu11/flux-core that referenced this issue Oct 15, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
chu11 added a commit to chu11/flux-core that referenced this issue Oct 17, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
chu11 added a commit to chu11/flux-core that referenced this issue Oct 22, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
chu11 added a commit to chu11/flux-core that referenced this issue Oct 24, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
chu11 added a commit to chu11/flux-core that referenced this issue Oct 24, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
chu11 added a commit to chu11/flux-core that referenced this issue Oct 29, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
chu11 added a commit to chu11/flux-core that referenced this issue Oct 29, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
chu11 added a commit to chu11/flux-core that referenced this issue Oct 29, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
chu11 added a commit to chu11/flux-core that referenced this issue Oct 30, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
chu11 added a commit to chu11/flux-core that referenced this issue Oct 31, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
chu11 added a commit to chu11/flux-core that referenced this issue Oct 31, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
chu11 added a commit to chu11/flux-core that referenced this issue Nov 1, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
chu11 added a commit to chu11/flux-core that referenced this issue Nov 1, 2024
Problem: libsubprocess now supports stdin flow control via credits,
but that is not used in flux-exec.

Support credits and flow control in flux-exec to avoid overflowing
the stdin buffer.

Fixes flux-framework#4572
@mergify mergify bot closed this as completed in 2b403a9 Nov 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants