
flux-job: flux_kvs_lookup_get: Value too large for defined data type #6256

Closed
grondo opened this issue Sep 4, 2024 · 13 comments · Fixed by #6268

@grondo
Contributor

grondo commented Sep 4, 2024

A user is seeing this error from flux run in a subinstance with a job that may generate a lot of output. Not sure we've seen this one before.

job-info.err[0]: info_lookup_continuation: flux_kvs_lookup_get: Value too large for defined data type
job-info.err[0]: main_namespace_lookup_continuation: flux_rpc_get_unpack: Value too large for defined data type
flux-job: flux_job_event_watch_get: Value too large for defined data type
@chu11
Member

chu11 commented Sep 4, 2024

Was this a flux job info lookup on stdin/stdout? I'm wondering if stdin/stdout is pretty big and somewhere along the way some 2G boundary is crossed.

@grondo
Contributor Author

grondo commented Sep 4, 2024

This was likely flux job attach since it was in the output of flux run. According to the user, --output and --error were used, but I'm still trying to verify that.

@chu11
Member

chu11 commented Sep 5, 2024

So I'm guessing this user produced a bunch of stdout into the KVS, and once it was > 2G, EOVERFLOW occurred because the data returned would be larger than an int can represent. Reproduced via:

>flux submit t/shell/lptest 1000 5000000
ƒA6i9Muy

<wait some time while the job produces a bunch of stdout>

>flux jobs
       JOBID USER     NAME       ST NTASKS NNODES     TIME INFO
    ƒA6i9Muy achu     lptest      R      1      1   5.095m corona212

> flux job attach ƒA6i9Muy
Sep 05 15:36:14.999036 PDT job-info.err[0]: info_lookup_continuation: flux_kvs_lookup_get: Value too large for defined data type
Sep 05 15:36:14.999190 PDT job-info.err[0]: main_namespace_lookup_continuation: flux_rpc_get_unpack: Value too large for defined data type
flux-job: flux_job_event_watch_get: Value too large for defined data type

I'm not really sure how to fix / improve this without major re-engineering (i.e. reading KVS values with offsets / seeking, supporting > INT_MAX data). That may require RFC changes; I don't know off the top of my head where we define things as ints only.

For the time being ...

Better error message?
Return truncated output? (as an option?)

@chu11
Member

chu11 commented Sep 5, 2024

To my surprise we don't define things as int in the KVS treeobj format; we just have specific API functions that use int, such as treeobj_create_val(). I suppose we could technically support larger values by updating everything to use size_t (or unsigned int), but I don't know if we want to go down that path.
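
For reference, a minimal sketch of what that path could look like (hypothetical; the size_t-based variants below do not exist in flux-core):

    #include <stddef.h>
    #include <jansson.h>

    /* Existing helper, roughly: the length is an int, which caps values
     * near INT_MAX. */
    json_t *treeobj_create_val (const void *data, int len);

    /* Hypothetical size_t-based variants that would lift that cap. */
    json_t *treeobj_create_val_sized (const void *data, size_t len);
    int treeobj_decode_val_sized (const json_t *obj, void **data, size_t *len);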

@garlick
Member

garlick commented Sep 5, 2024

Well it is a bit silly to be failing for that reason, although fetching a > 2G message in a KVS get may fail for other reasons. It's certainly going to be a self-DoS for a while.

@chu11
Member

chu11 commented Sep 6, 2024

Just brainstorming here: we could support some type of flux_kvs_lookup_stream() that returns each blobref in the valref array one at a time?
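
A rough sketch of how a consumer might use such an API (hypothetical: flux_kvs_lookup_stream() and its _get accessor are invented here; only the streaming-future pattern with flux_future_reset() exists today):

    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <flux/core.h>

    /* Hypothetical continuation: each response would carry one blob from the
     * valref's blobref array, so no single message exceeds INT_MAX. */
    static void stream_cb (flux_future_t *f, void *arg)
    {
        const void *buf;
        size_t len;

        if (flux_kvs_lookup_stream_get (f, &buf, &len) < 0) {   /* hypothetical */
            if (errno != ENODATA)
                fprintf (stderr, "lookup failed: %s\n", strerror (errno));
            flux_future_destroy (f);    /* ENODATA just means end of stream */
            return;
        }
        /* consume this chunk of data, then re-arm the future for the next blob */
        flux_future_reset (f);
    }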

@chu11
Member

chu11 commented Sep 6, 2024

A few brainstorms from this morning:

  • It just occurred to me that because this is going through kvs-watch, an "offset" is already being tracked, so hypothetically we could pass that into the KVS lookup easily.
  • For appended eventlogs, nothing stops us from returning less than 2G plus some flag that says "truncated", so that (combined with the above) kvs-watch knows to ask for more.
  • We are already using the special kvs.lookup-plus from within kvs-watch, so it's pretty easy to modify the protocol there.

@garlick
Member

garlick commented Sep 6, 2024

I wonder if it would be easier to implement a limit on storing a large value in the first place?

I hate to suggest modifying RFC 11 but we could maybe consider adding an optional size to val and valref?
(optional in the sense that it is allowed to be missing from older metadata)

If we have a running total size for a kvs value, then we have the means to reject an append that would cause it to exceed some maximum.
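
A sketch of the metadata shape that implies (hypothetical: the "size" key is the proposed RFC 11 addition; the ver/type/data keys follow the existing valref layout):

    #include <jansson.h>

    /* Build a valref treeobj carrying an optional running total "size",
     * which would let the KVS reject an append that pushes a value past
     * some configured maximum. */
    json_t *make_valref_with_size (json_t *blobrefs, json_int_t total_size)
    {
        return json_pack ("{s:i s:s s:O s:I}",
                          "ver", 1,
                          "type", "valref",
                          "data", blobrefs,        /* array of blobref strings */
                          "size", total_size);     /* proposed optional field */
    }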

@chu11
Member

chu11 commented Sep 6, 2024

I hate to suggest modifying RFC 11 but we could maybe consider adding an optional size to val and valref?
(optional in the sense that it is allowed to be missing from older metadata)
If we have a running total size for a kvs value, then we have the means to reject an append that would cause it to exceed some maximum.

That seems like a good idea for a full on correct solution in the KVS.

But if we're going down the path of just limiting appends, a quicker solution for this case might be to abuse the OUTPUT_LIMIT support in the shell to limit stdout/stderr. I see this in the shell's code:

    /*  Set default to unlimited (0) for single-user instances,
     *  O/w use the default multiuser output limit:
     */
    if (out->shell->broker_owner == getuid())
        out->kvs_limit_string = "0";
    else
        out->kvs_limit_string = MULTIUSER_OUTPUT_LIMIT;

if we just change that "0" to "2G" (or perhaps less if we want to be conservative), I think stdout/stderr will just be capped for single-user instances.
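
For illustration, the tweak would look something like this (the exact limit is a judgment call; the commits referenced later in this thread ended up using a 1G default):

    /*  Cap output stored in the KVS for single-user instances too,
     *  rather than leaving it unlimited:
     */
    if (out->shell->broker_owner == getuid())
        out->kvs_limit_string = "2G";    /* was "0" (unlimited) */
    else
        out->kvs_limit_string = MULTIUSER_OUTPUT_LIMIT;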

@garlick
Member

garlick commented Sep 6, 2024

Great point!

@chu11
Member

chu11 commented Sep 6, 2024

OK, let's go with that quick and dirty solution in the shell. The KVS change you mention above is the longer-term, generalized solution, because users might still abuse this if they write their own data to the KVS.

@garlick
Member

garlick commented Sep 6, 2024

OK, I'll open an issue on the long term one. It requires more thought/discussion/spelunking I think.

@grondo
Contributor Author

grondo commented Sep 6, 2024

We've had #5148 open since the original problem.

chu11 added a commit to chu11/flux-core that referenced this issue Sep 6, 2024
Problem: The KVS has a size limit of INT_MAX for when returning kvs
values.  This limit can be exceeded by a job's standard output because
it is continually appended and the total size is not yet tracked by
the KVS.  When reading the output later, such as via `flux job attach`,
this can lead to EOVERFLOW errors.

Solution: For a single user instance, default to a maximum standard
output of 1G instead of "unlimited".  1G should provide a practical
maximum for most users and encourage them to send standard output
to a file if they want to save excess standard output.  If desired,
the value can still be overwritten via the "output.limit" setting.

Fixes flux-framework#6256
chu11 added a commit to chu11/flux-core that referenced this issue Sep 9, 2024
Problem: The KVS has a size limit of INT_MAX for when returning kvs
values.  This limit can be exceeded by a job's standard output because
it is continually appended and the total size is not yet tracked by
the KVS.  When reading the output later, such as via `flux job attach`,
this can lead to EOVERFLOW errors.

Solution: For a single user instance, default to a maximum standard
output of 1G instead of "unlimited".  1G should provide a practical
maximum for most users and encourage them to send standard output to
a file if they want to save excess standard output.

Do not allow configuration larger than this.  As a consequence the
configuration of "unlimited" is no longer allowed.

Fixes flux-framework#6256
mergify bot closed this as completed in #6268 on Sep 9, 2024