Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for RFC19 F58 encoded JOBIDs #3045

Merged
merged 19 commits into from
Jul 24, 2020
Merged

Conversation

grondo
Copy link
Contributor

@grondo grondo commented Jul 14, 2020

This PR adds support for the proposed RFC19 F58 encoded JOBIDs.

It is a WIP because the final set of testing is not in place yet, in case the result is not acceptable for merging.

At its core, this PR adds a new FLUID string type FLUID_STRING_F58, which represents the Base58 encoding of FLUIDS (and therefore Flux JOBIDs).

Since all supported FLUID string representations are somewhat unambiguous, a fluid_parse() function is introduced which can automatically parse a FLUID from a string in any supported encoding.

The fluid_parse() function is then used to implement a flux_job_id_parse() to the public API (FLUIDs do not seem to be directly exposed in libflux API). flux_job_id_parse() is then used throughout flux-job.c to allow input of a JOBID in any supported encoding, instead of just decimal integers.

This also allowed dropping the need for the flux job id --from option, since the encoding of input is now auto-detected, e.g.

$ flux job id ƒNH
1234
$ flux job id --to words ƒNH
turtle-academy-academy--academy-academy-academy
$ flux job id turtle-academy-academy--academy-academy-academy
1234

The JOBID conversion functions are then added to python bindings, and JobID class introduced, which allows easy conversion to/from the different JOBID representations.

Finally, the JobID class is introduced into flux-jobs, which allows various output fields like id.hex id.dothex id.words. id.f58 is then made the default:

$ flux jobs -a -c 10
       JOBID USER     NAME       STATUS NTASKS NNODES  RUNTIME RANKS
  ƒQsm1jrTzf grondo   hostname       CD      1      1   0.072s 0
  ƒC7RLWTWPq grondo   hostname       CD      1      1   0.080s 0
  ƒ8jd8DVmwM grondo   sleep          CD      1      1   2.079s 0
  ƒ8fSahkXpB grondo   hostname       CD      1      1   0.074s 0
  ƒ8fPukZbVq grondo   hostname       CD      1      1   0.073s 0
  ƒ8fLxAEij1 grondo   hostnaem        F      1      1   0.078s 0
  ƒ8e2dCmHDy grondo   hostname       CD      1      1   0.084s 0

@lgtm-com
Copy link

lgtm-com bot commented Jul 14, 2020

This pull request introduces 1 alert when merging ccb02cc into 4ebcc86 - view on LGTM.com

new alerts:

  • 1 for Module is imported with 'import' and 'import from'

@grondo
Copy link
Contributor Author

grondo commented Jul 14, 2020

If it wasn't clear from above, most flux job commands now take JOBID in any acceptable format, e.g.

$ flux job cancel 3057525595308032 
$ flux job cancel ƒQsm1jrTzf
$ flux job cancel hand-rose-baby--crack-matrix-academy

work and are all equivalent.

@garlick
Copy link
Member

garlick commented Jul 14, 2020

I think this is really neat.

I was noticing that some of the output from flux job (like submit) still uses decimal jobid's but mabye that's OK since jobid input is now "smart" :-)

The fluid_parse() function is then used to implement a flux_job_id_parse() to the public API (FLUIDs do not seem to be directly exposed in libflux API).

Should we open an issue to export the FLUID interface? I think it's not part of the public API because early on I wasn't sure the 64 bit job IDs were going to work out, but I think those days are behind us.

@grondo
Copy link
Contributor Author

grondo commented Jul 14, 2020

I was noticing that some of the output from flux job (like submit) still uses decimal jobid's but mabye that's OK since jobid input is now "smart" :-)

I tried that at first. However, there are assumptions of integer JOBIDs all over the place in the testsuite. That would need to be cleaned up first. I was going to ask the question if I should attempt that. I also didn't update flux-mini.py to emit F58 encoded JOBIDs by default. Similar reasons, though I don't think it is quite as bad.

Should we open an issue to export the FLUID interface?

I wasn't sure. Currently flux_jobid_t == fluid_t, do we want to keep flux_jobid_t as the "public" interface to FLUIDs. I guess the only other reason to export FLUID interface via libflux would be if 3rd party modules wanted to generate FLUIDs... It didn't seem necessary at this point though.

@grondo
Copy link
Contributor Author

grondo commented Jul 14, 2020

BTW, I did verify that RH backported Python Issue 28180 to the RHEL8 version of Python 3.6. i.e. in CentOS 8, this works:

$  LANG=C python3 -c 'print("ƒQsm1jrTzf")'
ƒQsm1jrTzf

Whereas in ubuntu 18.04 and RHEL7 it fails with an exception:

$  LANG=C python3 -c 'print("ƒQsm1jrTzf")'
Unable to decode the command from the command line:
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 7-8: surrogates not allowed

@grondo grondo force-pushed the f58 branch 2 times, most recently from 84bb84a to 3f7fc8f Compare July 14, 2020 22:04
@grondo
Copy link
Contributor Author

grondo commented Jul 14, 2020

Rebased. Also made a small fix that was preventing flux jobs JOBIDS... from working for JOBIDS anything other than int.

I take lack of objections as assent to move forward. So I'll go ahead an make some tests for new flux-jobs behavior.

class JobID(int):
"""Class used to represent a Flux JOBID

JobID is a subclass of `int`, so may be used in place of integer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment, but I'm not sure I follow the subclassing of int. What scenarios do you have in mind where you would want to use a JobID in place of an integer? Does this help with the auto-generated bindings which are expecting an int for the id_parse and id_encode functions?

My main concern is that the __add__, __sub__, __mul__, and __div__ methods that are inherited from int don't really make sense in the context of a FLUID. I suspect the only int-like methods we care about are ordering related, and we can get that with very little code using functools total_ordering decorator.

Copy link
Member

@SteVwonder SteVwonder Jul 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS - I was originally thinking that int was the class we were advised not to subclass in the python course (for reasons that escape me right now), but it was actually list that we were warned about with UserList being the preferred thing to subclass.

Copy link
Contributor Author

@grondo grondo Jul 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main concern was that JOBIDs were treated explicitly as int already in the code, and I just wanted to extend that to allow the other representations, instead of adding a swath of functions to convert to/from int and the other forms. We were already treating JOBIDs as integers and the int methods didn't seem to get in the way, so I actually thought this was the least change for the least effort.

However, I'm not a good Python programmer (clearly), what would be proper to do here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this also follows the C API, where we have typedef uint64_t fluid_t.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, one more thing. I remembered what I was worried about.

flux-jobs currently stores JOBIDs as int, and this value is passed to the {id} format key as int and therefore allows all integer format specifiers, presentations, etc. Therefore the JobID class needs to present itself as an int, but allow the other representations as properties so we can use {id.f58} {id.words} etc. Is there a better way to accomplish this than subclassing int?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main concern was that JOBIDs were treated explicitly as int already in the code, and I just wanted to extend that to allow the other representations, instead of adding a swath of functions to convert to/from int and the other forms.

Gotcha! That makes total sense to me. My concern about the arithmetic methods is relatively minor, and I see how adding calls to int() all over the place would be a lot, especially if we miss something that is behind a conditional (darn dynamically-typed languages!).

flux-jobs currently stores JOBIDs as int, and this value is passed to the {id} format key as int and therefore allows all integer format specifiers, presentations, etc.

That makes a ton of sense too. I don't know of an easy way to support all of the int format specifiers other than subclassing. The hard way would be to use/re-implement the int's __format__ method or implement a .int() method and then do {id.int:...} in all the different formats that we do.

However, I'm not a good Python programmer

This is the only thing you said that I disagree with 😄

@garlick
Copy link
Member

garlick commented Jul 24, 2020

Once all these unicode obstacles are resolved, we can really have fun with this! Was just thinking a 🌱 for jobs with a self-imposed power cap would be cool!

@grondo
Copy link
Contributor Author

grondo commented Jul 24, 2020

Once all these unicode obstacles are resolved, we can really have fun with this! Was just thinking a  for jobs with a self-imposed power cap would be cool!

Or for @SteVwonder's sake a flux job add-reaction 👍

I think this one's ready for a full review. For those that missed my slack conversation with myself, I had to make one change to the flux PythonFunctionWrapper class to avoid unnecessary UnicodeDecode exceptions from Python when non-ascii strings are passed through a python commandline. (See the commit description in f8a3ce8 for more information)

diff --git a/src/bindings/python/flux/wrapper.py b/src/bindings/python/flux/wrapper.py
index 13632f914..fa444164d 100644
--- a/src/bindings/python/flux/wrapper.py
+++ b/src/bindings/python/flux/wrapper.py
@@ -190,7 +190,7 @@ class FunctionWrapper(object):
                 # Unpack wrapper objects
                 args[i] = args[i].handle
             elif isinstance(args[i], six.text_type):
-                args[i] = args[i].encode("utf-8")
+                args[i] = args[i].encode("utf-8", errors="surrogateescape")
 
         try:
             result = self.fun(*args)

I also held off on manpage updates to avoid conflicts with #3033

@grondo grondo changed the title WIP: Add support for RFC19 F58 encoded JOBIDs Add support for RFC19 F58 encoded JOBIDs Jul 24, 2020
@grondo
Copy link
Contributor Author

grondo commented Jul 24, 2020

Oh, I also should mention I threw on a change for {status_abbrev} to use 2 characters in the output rather than 6. The heading is now ST, and default format is {status_abbrev:>2.2}, e.g.

bash-4.4$ flux jobs -a -c 10
       JOBID USER     NAME       ST NTASKS NNODES  RUNTIME RANKS
   ƒ2dfddAMM grondo   sleep      PD     16      -        - -
   ƒ2VW4WaJj grondo   sleep       R     16      4   19.97s [0-3]
   ƒ2YKw7ByR grondo   hostname    F     24      -        - -
   ƒ2HJuu975 grondo   hostname    F     24      -        - -
   ƒ2Dv9GTkb grondo   false       F      1      1   0.090s 0
   ƒ2CRuAEw9 grondo   flase       F      1      1   0.085s 0
   ƒ2AxkoVEX grondo   hostname   CD      1      1   0.094s 0
    ƒeEcGpLp grondo   hostname   CD      1      1   0.015s 0
    ƒeEcGpLo grondo   hostname   CD      1      1   0.014s 3
    ƒeEanq4V grondo   hostname   CD      1      1   0.015s 2

This gives a bit more room to add another field to default output in near future (e.g. sched.queue or sched.reason)

Copy link
Member

@garlick garlick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only got minor suggestions, so approving. I love the end result. I don't know if everyone will agree but let's see :-)

Prob would be good to get some python review also, as I mainly focused on the C bits.

int index = 0;
memset (buf, 0, bufsz);
if (id == 0) {
buf[0] = b58digits[0]; return 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new line for the return 1; statement?

return -1;
}
if (bufsz <= strlen (f58_prefix) + 1) {
errno = ENOSPC;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've used ENOSPC (No space left on device) in this case, but then somebody came up with EOVERFLOW (Value too large for defined data type) in some of our code and I thought that was more appropriate? Either is fine but I thought I'd throw that out there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think I like EOVERFLOW as well. Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to jump in without having to review the entire PR.

I have been using ERANGE as well and I think ERANGE or EDOM could be another choice.

https://en.wikipedia.org/wiki/Errno.h

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, now that I think about it, in this particular case, EOVERFLOW would be best since this is a buffer being passed, not a number being passed. Sorry for the noise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, thanks for the feedback!

}

if ((count = b58revenc (b58reverse, sizeof (b58reverse), id)) < 0)
return -1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set errno

for (int i = len - 1; i >= 0; i--) {
int8_t c = b58digits_map[(int8_t)str[i]];
if (c == -1) /* invalid base58 digit */
return -1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set errno

case FLUID_STRING_F58:
if (fluid_f58_decode (&fluid, s) < 0)
return -1;
break;
default:
return -1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not from this PR, but since you're here, this needs errno set, and also the mn_decode() != 8 case (my bad)

if (type == NULL || strcasecmp (type, "dec") == 0) {
int len = snprintf (buf, bufsz, "%ju", (uintmax_t) id);
if (len >= bufsz) {
errno = ENOSPC;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: EOVERFLOW?

@@ -244,12 +243,8 @@ static struct optparse_option status_opts[] = {
};

static struct optparse_option id_opts[] = {
{ .name = "from", .key = 'f', .has_arg = 1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

else if (!strcmp (to, "hex")) {
if (fluid_encode (dst, dstsz, id, FLUID_STRING_DOTHEX) < 0)
log_msg_exit ("error encoding id");
if (flux_job_id_encode (id, to, dst, dstsz) < 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great cleanup.

grondo added 9 commits July 24, 2020 18:44
Problem: A couple corner cases in fluid_decode(3) do not set an
errno on error return.

Add errnos of EINVAL in these cases.
Support RFC18 defined F58 (prefixed base58) encoding for FLUIDs.
This representation encodes 64bit FLUIDs int Base58 using the
Bitcoin digits:

 123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz

In order to disambiguate an F58 encoded FLUID from decimal or hex
encoding, a special character, the Unicode script f (U+0192) is
prepended to the base58 encoding.

Add a FLUID_STRING_F58 type for manual conversion to/from the F58
representation with fluid_encode(3) and fluid_decode(3).
Add unit tests for fluid_encode/decode with F58 strings.
Problem: fluid_decode(3) requires a priori knowledge of the encoding
of the source string, and doesn't support integer representations of a
FLUID as decimal or hex. This makes usage of the function multi-step
for utilities that want to allow FLUID input from any representation.

Add a function to auto-detect the encoding type of a FLUID string
representation, along with a new `fluid_parse()` function that
decodes from any supported representation.
Add unit tests for the generic fluid_parse() function.
Problem: FLUID string encoding is not exported via a public API,
but this functionality may be useful and/or necessary for callers,
especially for the case of converting JOBIDs between different
encodings.

Add flux_job_id_parse() and flux_jobid_encode() to libjob to allow
Flux API users to easily parse a Flux JOBID from any supported
string representation, or encode a `flux_jobid_t` to any supported
representation. The supported representations currently include
all standard FLUID encodings:

 - decimal integer
 - hexidecimal integer, prefixed with "0x"
 - dotted hex format, "XXXX.XXXX.XXXX.XXXX"
 - KVS path, "job.XXXX.XXXX.XXXX.XXXX"
 - Words, "X-X-X-X--X-X-X-X"
 - RFC 19 F58 encoding, prefixed with ƒ or f

Addition of these functions to the libflux API will also make it
easy to add support to bindings for encoding and decoding JOBIDs.

The functions are named `flux_job_id_*` instead of `flux_jobid_*`
to keep consistent naming with the other libjob functions, as well
as to make it easy to call the functions from Python cffi.
The flux-job id --from option will be deprecated, and the "hex"
encoding will change names. Remove these options from tests
to avoid breakage during these changes.
Update the `flux-job id` utility to use the new flux_job_id_parse()
and flux_job_id_encode() functions for parsing JOBIDs from the command
line, and for conversion to the requested representation.

The --from=TYPE option is no longer required, since flux_job_id_parse
can unambiguously decode from all supported FLUID/JOBID encodings.
This option is dropped.

This change results in a renaming of the "hex" encoding to "dothex".
A "hex" representation is still supported, but is encoded as hexidecimal
with a "0x" prefix.
grondo added 10 commits July 24, 2020 18:44
Problem: Flux JOBIDs support multiple representations as described
in RFC 19, but many flux-job subcommands take input in only decimal
integer format.

Use flux_job_id_parse() instead of parse_arg_unsigned() specifically for
jobids to support JOBID input in any supported format.
Update tests for `flux-job id` in t2201-job-cmd.t to reflect recent
updates in this command, including F58 JOBID encoding, and rename
of "hex" to "dothex" with introduction of new "hex" representation.
Problem: When passing commandline arguments (i.e. sys.argv) from
Python scripts to C API functions via the wrapper.py class, encoding
errors are possible if a utf-8 string is supplied in argv if the
current environment does not have a UTF-8 LANG or LC_ALL environemnt
set. This seems to be because Python encodes sys.argv to `str` with
the current encoding, and (at least as of Python 3.6) there is no
way to get the actual bytes given in argv in this environment without
throwing a UnicodeEncode error.

Fortunately this has been a big enough problem that PEP383[1] was
devised, introducing the 'surrogatescape' error handler for encode(),
which allows the so-called "smuggling" bytes in character strings.

Since the alternative is to possibly throw UnicodeEncode errors,
add `errors='surrogateescape'` to the wrapper.py __call__() method,
when it encodes string arguments to pass into C. There is likely no
downside here, since there is presumably no existing usage depending
on the current behavior.

[1] https://www.python.org/dev/peps/pep-0383/
Add a JobID class for working with multiple representations of Flux JobIDs.

The JobID class inherints from Python int class, but its constructor
takes a JobID in any valid representation using `flux_job_id_parse()`
under the covers.

The JobID class also provides an `encode()` method to encode a JobID
to any supported representation, along with convenience named properties,
e.g.

 jobid = JobID(arg)
 print (jobid.f58)

will emit a F58 encoded jobid.
Add tests for the JobID class and its supported encodings.
Store JOBIDs as JobID objects in flux-jobs. Support all JobID
format "properties", e.g. id.f58, id.hex, etc.
Change the default flux-jobs output format to use id.f58 instead of
just id. This results in JOBIDs taking up less horizontal space in
the default output - 12 columns instead of 20.
Add tests for JOBID encodings in output and input for flux-jobs.
Problem: The status_abbrev field only takes up two characters
of width, but the field has to be six chars wide due to the default
heading of STATUS.

Change the heading for {status_abbrev} to "ST" and shorten the
default width to 2.
Check for the correct default heading for status and status_abbrev.
@grondo
Copy link
Contributor Author

grondo commented Jul 24, 2020

Ok, I think addressed all your comments @garlick, squashed and pushed the result.

@garlick
Copy link
Member

garlick commented Jul 24, 2020

Great! I only skimmed through the python bits, but I see you did go back and forth with @SteVwonder so maybe that's good to go? Up to you - I'm fine with MWP.

@grondo
Copy link
Contributor Author

grondo commented Jul 24, 2020

Let's do it!

Edit: It will be easy to roll back the flux-jobs default output if that turns out to be a problem.

@codecov-commenter
Copy link

Codecov Report

Merging #3045 into master will increase coverage by 0.06%.
The diff coverage is 89.32%.

@@            Coverage Diff             @@
##           master    #3045      +/-   ##
==========================================
+ Coverage   81.13%   81.19%   +0.06%     
==========================================
  Files         285      285              
  Lines       44063    44229     +166     
==========================================
+ Hits        35750    35912     +162     
- Misses       8313     8317       +4     
Impacted Files Coverage Δ
src/common/libjob/job.c 80.91% <85.00%> (+0.67%) ⬆️
src/bindings/python/flux/job.py 84.71% <86.11%> (+0.09%) ⬆️
src/cmd/flux-job.c 85.12% <90.90%> (+0.03%) ⬆️
src/common/libutil/fluid.c 93.64% <91.42%> (-6.36%) ⬇️
src/bindings/python/flux/wrapper.py 93.40% <100.00%> (ø)
src/cmd/flux-jobs.py 92.50% <100.00%> (+0.02%) ⬆️
src/modules/resource/acquire.c 65.06% <0.00%> (-0.69%) ⬇️
src/broker/broker.c 75.00% <0.00%> (+0.10%) ⬆️
src/broker/runat.c 84.61% <0.00%> (+0.40%) ⬆️
... and 6 more

@mergify mergify bot merged commit 7953a1a into flux-framework:master Jul 24, 2020
@grondo grondo deleted the f58 branch July 24, 2020 20:23
@grondo
Copy link
Contributor Author

grondo commented Jul 24, 2020

Thanks!

@SteVwonder
Copy link
Member

Yeah, the Python bits looked good to me when I reviewed them a week or two ago. Just learned about the surrogateencoding functionality. Thanks for the nice commit message on that @grondo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants