flux-exec: Error: rank 0: cat: Value too large for defined data type #4572
Comments
Based on a cursory look: we have a 4MB buffer in the broker and we treat filling it up as a fatal error. This probably wants end-to-end flow control, but for now I wonder if we can handle the "buffer full" write error by backing off and retrying? |
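To make the back-off-and-retry idea concrete, here is a minimal sketch in Python. It is not the broker code: `BoundedBuffer`, `BufferFull`, and `write_with_retry` are hypothetical names invented for illustration, and the buffer is a stand-in for the 4MB stdin buffer. It only shows the shape of the approach: treat "buffer full" as a temporary condition, wait, and try again a bounded number of times.

```python
import time
from collections import deque


class BufferFull(Exception):
    """Raised when a write would exceed the buffer capacity (hypothetical)."""


class BoundedBuffer:
    """Stand-in for a fixed-size stdin buffer; not the real broker buffer."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.chunks = deque()

    def write(self, data):
        # Reject the whole write if it does not fit, like a fatal "buffer full".
        if self.used + len(data) > self.capacity:
            raise BufferFull(f"need {len(data)} bytes, {self.capacity - self.used} free")
        self.chunks.append(data)
        self.used += len(data)

    def drain(self, nbytes):
        """Consume up to nbytes, as a reading subprocess would. Returns bytes removed."""
        removed = 0
        while self.chunks and removed < nbytes:
            chunk = self.chunks.popleft()
            take = min(len(chunk), nbytes - removed)
            if take < len(chunk):
                # Put the unread tail back at the front of the queue.
                self.chunks.appendleft(chunk[take:])
            removed += take
        self.used -= removed
        return removed


def write_with_retry(buf, data, retries=5, delay=0.001, on_wait=None):
    """Retry a write with exponential backoff. Returns the number of failed attempts."""
    for attempt in range(retries):
        try:
            buf.write(data)
            return attempt
        except BufferFull:
            if on_wait is not None:
                on_wait()  # hook that lets a test or event loop make progress
            time.sleep(delay)
            delay *= 2
    raise BufferFull(f"gave up after {retries} attempts")
```

The retry count is bounded on purpose: if the reader never drains the buffer (for example a stuck subprocess), retrying forever would just hang the writer.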
Update on this: I have been trying this approach again to ship files. The error is different in this case:
|
Unrelated to the actual bug discussed in this issue, I'll note that @garlick developed a better method for shipping files via flux-filemap(1). This is integrated into a
Edit: though I didn't find any examples in the documentation of the steps required to use the stage-in plugin. We may want to add that. For now feel free to ask questions where things are not self-explanatory!
Edit2: There are some examples in the |
We're using Although currently, we only are supporting running our tools from inside the |
I've posted your question as the beginning of a Discussion thread here: #5168 I'm pretty confident |
Doing a simple
eventually gets us
Adding some debug, I verified that we are writing way more than 4 MB to stdin, but eventually the buffer is being filled faster than it is being emptied. Right now
The devil is in the details, but I think we could easily respond to that write RPC. That would allow us to "flow control" a bit via a the
just some random brainstorms
will ponder more. |
Hmm yeah, to avoid blocking the receiver in whatever operation empties the buffer (which could be a stuck subprocess, causing deadlock), some kind of protocol change to RFC 42 seems to be required. Rather than adding write responses, some kind of credit-based flow control scheme seems appropriate here. For example (just brainstorming):
That should work for any buffer size and in fact would let us drop the default buffer to something more reasonable than 4MB. |
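For readers unfamiliar with credit-based flow control, a generic sketch of the idea follows. This is not the RFC 42 protocol and not the libsubprocess API: the `Receiver` and `Sender` classes and their methods are hypothetical, and real credits would travel in messages rather than direct method calls. The assumption illustrated is only that the receiver grants credits equal to its free buffer space, the sender never writes more bytes than it holds credits for, and credits are returned as the subprocess consumes data.

```python
class Receiver:
    """Owns a fixed-size stdin buffer and hands out credits for free space."""

    def __init__(self, bufsize):
        self.bufsize = bufsize
        self.buffer = bytearray()

    def initial_credits(self):
        # Initially the whole buffer is free.
        return self.bufsize

    def write(self, data):
        # A well-behaved sender can never trigger this.
        if len(self.buffer) + len(data) > self.bufsize:
            raise OverflowError("sender exceeded its credits")
        self.buffer += data

    def consume(self, nbytes):
        """Subprocess reads up to nbytes; the freed space is returned as credits."""
        n = min(nbytes, len(self.buffer))
        del self.buffer[:n]
        return n


class Sender:
    """Sends a payload, never writing more bytes than it holds credits for."""

    def __init__(self, receiver, payload):
        self.receiver = receiver
        self.pending = payload
        self.credits = receiver.initial_credits()
        self.sent = 0

    def add_credits(self, n):
        # Called when the receiver reports freed buffer space.
        self.credits += n
        self.pump()

    def pump(self):
        # Send as much as current credits allow, possibly nothing.
        n = min(self.credits, len(self.pending))
        if n:
            self.receiver.write(self.pending[:n])
            self.pending = self.pending[n:]
            self.credits -= n
            self.sent += n
```

A small driver shows that a 10-byte payload passes through a 4-byte buffer without overflow:

```python
rx = Receiver(bufsize=4)
tx = Sender(rx, b"0123456789")
tx.pump()
while tx.pending:
    tx.add_credits(rx.consume(3))
```

Because the sender is bounded by credits rather than by the buffer size itself, the same logic works for any buffer size.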
I like the idea, and the design in principle seems like it would work, but ...
my immediate thought went to the KVS stdin. IIRC when we do Assuming it still is, I don't think at the moment there is a way to "pause" the KVS eventlog watcher. If hypothetically we just stop the watcher, we could restart it but we'd need some kind of "offset" or "seek" mechanism. That was just my immediate thought. |
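To illustrate what an "offset" or "seek" mechanism could mean here, a toy sketch follows. It is not the KVS or kvs-watch API: `EventLog` and `read_from` are hypothetical names, and the log is just an in-memory list. The point is only that a consumer that remembers the offset of the next unread entry can stop reading and later resume without missing or repeating entries.

```python
class EventLog:
    """Toy append-only log; a stand-in for an eventlog, not the real KVS."""

    def __init__(self):
        self.entries = []

    def append(self, entry):
        self.entries.append(entry)

    def read_from(self, offset, limit=None):
        """Return (entries, next_offset) starting at offset, at most limit entries."""
        end = len(self.entries)
        if limit is not None:
            end = min(end, offset + limit)
        # next_offset never moves backwards, even if offset is past the end.
        return self.entries[offset:end], max(offset, end)
```

A consumer that "pauses" simply stops calling `read_from` and keeps the returned offset; entries appended in the meantime are picked up on the next call.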
Yeah that looks right (thanks for the shell input refactor @grondo). Data passes through the input eventlog unless input is being read from a file. I suppose nothing changes if the subprocess user doesn't implement flow control. E.g. if they don't register a callback for receiving credits and just send data whenever. So we could implement flow control and get |
(oops, I didn't hit comment last night before the comment just posted)
Just a random thought: we could also do "credits" with the kvs-watch module. Or perhaps a simpler mechanism: effectively tell the watch to pause temporarily. A medium solution would be to say "give me another" to kvs-watch when the user is ready. Depending on the implementation, we could have to deal with raciness where the user has to cache "one round" of data from the kvs-watch. (Edit: now that I think about it, a "give me another" is basically making it synchronous ... that's probably not good) Also, similarly to #6274, perhaps it'd be worthwhile to give a warning to users to use filemap or another mechanism for stdin if the size gets big. |
It occurs to me that for a truly correct implementation, a user has to buffer some data, because a single "data" entry from the KVS could be larger than the stdin buffer size. Just more work to do .... |
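A rough sketch of that user-side buffering, under assumptions: `PendingInput` is a hypothetical helper, not part of flux-core, and entries are plain byte strings. Entries of any size are queued as they arrive, and each time credits become available only that many bytes are taken, splitting an entry and keeping the remainder when it does not fit.

```python
from collections import deque


class PendingInput:
    """Queues input entries and releases them in credit-sized slices (hypothetical)."""

    def __init__(self):
        self.queue = deque()

    def push(self, entry):
        # Entries may be larger than the remote stdin buffer; just queue them.
        self.queue.append(entry)

    def take(self, credits):
        """Remove and return up to `credits` bytes, splitting an entry if needed."""
        out = bytearray()
        while self.queue and len(out) < credits:
            entry = self.queue.popleft()
            room = credits - len(out)
            out += entry[:room]
            if len(entry) > room:
                # Keep the part that did not fit for the next round of credits.
                self.queue.appendleft(entry[room:])
        return bytes(out)
```

With a 4-byte credit grant, a 10-byte entry followed by a 2-byte entry drains in three rounds, and a fourth call returns nothing.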
A |
Another possibility is we redo the input mechanism so that the KVS is not in the middle of everything and is instead just cc'ed for the record unless bypassed. Edit: but my point earlier was I don't think job input would be affected if it just ignores credits for now. We could deal with this part later assuming we have some feasible options. |
Oh good point, that shouldn't be too hard. Let's do that first. |
First? I was just referring to a corner case to avoid if we decide to go with a size, offset argument for |
Yeah, I realized it after I thought about it. I think in my mind it was super simple to implement, therefore it could be done first. However, without the other stuff first, there's no ability to test it. |
Problem: libsubprocess now supports stdin flow control via credits, but that is not used in flux-exec. Support credits and flow control in flux-exec to avoid overflowing the stdin buffer. Fixes flux-framework#4572
Redirecting input via flux exec works in the shell, but when launched inside CTI, I'm getting the errors
I'm using Flux 0.40.0-15; it happens with cat, sed, and a minimal C program that redirects input. Haven't seen this before in CTI when launching other programs, but it could be something with the input redirection.
Originally posted by @ardangelo in #3631 (comment)