Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON serialization and schema generation #414

Merged
merged 53 commits into from
Jul 13, 2020
Merged

Conversation

jlstevens
Copy link
Contributor

@jlstevens jlstevens commented Jun 10, 2020

This PR outlines another attempt at serializing/deserializing to JSON (for reference see
#153 for earlier attempts at tackling this problem). The design used in this PR aims to try to keep things simpler but also adds the ability to generate JSON Schemas which should help with testing as well as specification of allowable parameter values (e.g this could be used to specify suitable widgets in panel).

This PR is based off the ideas suggested in #382.

TODO:

  • The main open problem is about finding the appropriate API and JSON (and schema) representation for entire parameterized objects (i.e including the class identity as well as all the parameters).

@coveralls
Copy link

coveralls commented Jun 10, 2020

Coverage Status

Coverage decreased (-0.005%) to 76.447% when pulling 8c1adb9 on json_serialization into a8e34b2 on master.

@philippjfr
Copy link
Member

So after a brief look I guess what I expected to see here is the serialize/deserialize and maybe schema implementations to live on the actual Parameters rather than the serialization class so that the approach can be easily extended. I'm happy to review further but that seemed like a pretty core part of the design we were talking about before so I'd like to hear about your reasoning here before diving deeper.

@jbednar
Copy link
Member

jbednar commented Jun 10, 2020

Philipp, Jean-Luc, and I just met to discuss this approach, with some notes:

  • swapping serializers vs. adding Parameters -- which is more common?

    • adding Parameters seems more important
  • serialize to string immediately (as now), or with an intermediate Python representation (as for bokeh properties)?

    • Python IR:
      • Define a base set of supported Python types (int, float, bool, string, datetime64)
      • Build up a Python-based representation
      • Individual Parameters only know how to work with the IR
      • Specific serializers know how to map from IR<->textfile (JSON, XML, scriptrepr, etc.)
  • deserialization?

    1. result is a bag of Parameters, right now?
    2. result is an instance of a generic Parameterized (no Parameters initially, just has whatever was deserialized, but otherwise can be treated like any other Parameterized, e.g. for pn.pane.Param purposes.
    3. deserializing to a specific Parameterized, setting its parameters
    4. creating an arbitrary collection of nested Parameterized objects de novo

Actual IR->JSON should presumably be done using a JSONEncoder subclass (https://docs.python.org/3/library/json.html#json.JSONEncoder).

@philippjfr
Copy link
Member

philippjfr commented Jun 10, 2020

Python IR Support

Just wanted to start making a list of types that I think the IR and therefore every serializer should (eventually) support.

  • boolean
  • int
  • float
  • str
  • complex
  • bytes
  • list
  • dict
  • np.datetime64
  • datetime.datetime
  • datetime.date
  • np.array
  • pd.DataFrame

Feel free to edit and expand.

@jlstevens
Copy link
Contributor Author

I agree with the above list. It is also keeping #382 as a reminder of what should be possible with JSON serialization (which I suspect will be the baseline for all serialization formats).

param/serializer.py Outdated Show resolved Hide resolved
param/serializer.py Outdated Show resolved Hide resolved
param/serializer.py Outdated Show resolved Hide resolved
param/serializer.py Outdated Show resolved Hide resolved
@philippjfr
Copy link
Member

Can you also make sure you handle ObjectSelector/Selector with enums in your initial PR?

@philippjfr
Copy link
Member

Also encode the label attribute as the 'title' property in the schema.

@philippjfr
Copy link
Member

philippjfr commented Jun 11, 2020

Oh also ListSelector should be {'type': 'array', 'items': {'enum': [...]}}

@jlstevens
Copy link
Contributor Author

All good suggestions, thanks! I won't implement the regexp attribute for strings though as there is no easy/meaningful way to try to convert Python style regular expressions to equivalent javascript.

@jlstevens
Copy link
Contributor Author

Still more work to be done, but the suggested refactor and all of Philipp's suggestions are now implemented.

param/__init__.py Outdated Show resolved Hide resolved
param/__init__.py Outdated Show resolved Hide resolved
@jlstevens
Copy link
Contributor Author

jlstevens commented Jul 9, 2020

@jbednar @philippjfr Coverage is very slightly down (-0.005%) but otherwise tests are passing. Other than the ir_serialize versus serialize renaming (and the same for deserialize), I think this might be a good place to merge. The subsequent PR I am planning would address serialization of entire parameterized objects but I think this first step of serializing parameters (with schemas) is already valuable.

param/__init__.py Outdated Show resolved Hide resolved
Co-authored-by: Philipp Rudiger <[email protected]>
@jlstevens
Copy link
Contributor Author

@philippjfr @jbednar I think this PR is now in a useful state to merge. I'll go ahead and do that shortly unless you have any other suggestions or objections..

@philippjfr
Copy link
Member

Looks good. Merge away.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants