Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(neon): Serde implementation optimized with JSON #953

Closed
wants to merge 9 commits into from

Conversation

kjvalencik
Copy link
Member

@kjvalencik kjvalencik commented Jan 6, 2023

Based on #701, but with the following changes:

  • Defers to JSON.stringify, JSON.parse and serde_json on non-primitive values
  • Adds cx.deserialize_args for getting a tuple from arguments

Questions

  • Are the lossy as casts acceptable or should we bring in a crate to more accurately capture failures?
  • Single element tuples are awkward in Rust (let (n,) = (u64,);). Should we include a deserialize_arg for unary methods to simplify things?
  • Is the very targeted FromArg acceptable in the near term or should we try to make a more generic TryFromJs? Defer for the future. FromArg may be implemented in the future in terms of the other trait.
  • Users should not use FromArg or FromArgs. They are currently private, but that also hides the docs and make it unclear what types implement them. Should we make them available like Arguments? Should we make them public with the docsrs flag only? Expose them just like Arguments with the methods hidden.
  • Is there a way we can deserialize directly to T without a special method (deserialize_arg) and without conflicting with T: DeserializeOwned? Maybe with some wrapper type? let Arg(name): Arg<String> = cx.deserialize_args()? The type system is actually telling us [correctly] that the API is ambiguous. What if the function does take a single argument and that argument is a tuple?

*For future consideration
Now that GAT exist, what might it look like to be able to export function that automatically deserialize arguments and serialize return values?

cx.export_function("greet", |mut cx, (name,): (String,)| Ok(format!("Hello, {name}!")));

@kjvalencik kjvalencik marked this pull request as ready for review January 9, 2023 18:45
) -> [Handle<'b, JsValue>; N] {
use std::ptr;

let mut argv = [JsValue::new_internal(ptr::null_mut()); N];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's worth adding a comment here that Node-API fills slots with undefined if the number of actual arguments is smaller than the length of argv.

Copy link
Member Author

@kjvalencik kjvalencik Jan 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should probably also document that the safety here depends on Handle<JsValue> being a transparent wrapper for a pointer.

Either that or accept that argv_exact is a leaky abstraction, mark it unsafe and return the raw pointers directly. There's an opportunity to individually wrap them with JsValue::new_internal when deserializing.

Copy link
Collaborator

@dherman dherman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. I made a few suggestions for clarity, but you can make your own call and then I don't think I need to re-review unless you want.


match T::deserialize(unsafe { Deserializer::new(env, v) }) {
Err(Error::FallbackJson) => {}
res => return res,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though return can be used as an expression, stylistically I find it less distracting (easier to understand we're not doing anything fancy) to just use it like a statement:

res => { return res; }

Especially since here the whole match is just being used as a statement.

}
}

#[derive(Debug, Copy, Clone)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could use a purpose statement comment.

visitor.visit_string(unsafe { sys::get_value_string(self.env, self.value)? })
}

fn deserialize_bytes<V>(self, visitor: V) -> Result<V::Value, Self::Error>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's more work but I think it would really help make the code understandable if each method like this has a comment explaining the mapping from the Serde IR type to the Rust type we've chosen (e.g. bytes -> ByteBuf). Maybe even showing an example like:

let data: ByteBuf = cx.deserialize_arg()?;

and

example(new ArrayBuffer(128))

types::{JsString, Value},
};

pub(super) fn deserialize<'cx, T, V, C>(cx: &mut C, value: Handle<V>) -> Result<T, Error>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the heart of the deserialization algorithm is:

DESERIALIZE(v: JavaScript value)
1. If v is a primitive type, use Deserializer
2. If v is a compound type, call JSON.stringify(v) and use serde_json

Ok(serde_json::from_str(&s)?)
}

struct Deserializer {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should call this PrimitiveDeserializer or NonRecursiveSerializer or BaseCaseSerializer or something like that, to make it more obvious what it's for.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FlatDeserializer? SimpleDeserializer?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so many possible colors for the bike shed

static PARSE: LocalKey<Root<JsFunction>> = LocalKey::new();

PARSE
.get_or_try_init(cx, |cx| Ok(parse(cx)?.root(cx)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: It ever so slightly bugs me that we read JSON.parse lazily, which means if someone ever mutates global.JSON the program behavior is unpredictable. But I don't think we have any module loading lifecycle hooks that would allow decentralized code to register the creation and initialization of LocalKey values at module load time. Maybe an interesting idea for a future feature of LocalKey where you could write this code like this:

#[cfg(feature = "napi-6")]
{
    static PARSE: LocalKey<Root<JsFunction>> = LocalKey::init_on_load(cx, |cx| Ok(parse(cx)?.root(cx));

    Ok(PARSE.get().unwrap().to_inner(cx))
}

const MAX_SAFE_INTEGER: u64 = 9_007_199_254_740_991;
const MIN_SAFE_INTEGER: i64 = -9_007_199_254_740_991;

pub(super) struct Serializer {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, maybe a name that indicates this is the non-recursive serializer only?

de::deserialize(cx, v).or_else(|err| cx.throw_error(err.to_string()))
}

/// Attempts to write Rust data into a JavaScript value using serde
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the basic algorithm here is:

SERIALIZE(x: Rust value)
1. If Serializer can serialize it to a base case JS value, use that
2. Otherwise use serde_json to create a string and call JSON.parse() on that

@antonok-edm
Copy link
Contributor

@dherman @kjvalencik is there anything blocking here that I can help with? I've been suddenly running into a really weird issue that seems related to neon_serde; I could spend my time and effort debugging and working around that issue but I'd prefer to invest it towards something more useful here if possible.

@kjvalencik
Copy link
Member Author

@antonok-edm that does look like a really strange bug! I may look into it if I get a chance.

As for this PR, I'm going to close it. I've been deliberating over this for a long time and decided that quietly switching to JSON in some circumstances is not the right approach. As demonstrated in your PR, it's already pretty easy for users to use JSON if that's their preference.

I'm going to open a new PR that is direct transcoding (closer to neon-serde) and document the caveats that it may be slower than JSON. Then if optimizations are made available to Node-API (currently, object related ones are private V8 APIs that only built-in JSON can leverage), we can make it faster without changing behavior.

There may still be some benefits of building JSON in, since it's such a common pattern. Do you have any thoughts here? Perhaps something like:

let (Something, Other, String) = cx.args_from_json()?;

@kjvalencik kjvalencik closed this Aug 11, 2023
@antonok-edm
Copy link
Contributor

There may still be some benefits of building JSON in, since it's such a common pattern. Do you have any thoughts here? Perhaps something like:

let (Something, Other, String) = cx.args_from_json()?;

It doesn't seem like that'd support overloaded signatures (which could be fine, I guess). I think having an equivalent of what I wrote in the json_ffi mod here would be great though.

@kjvalencik
Copy link
Member Author

kjvalencik commented Aug 17, 2023

We discussed this in our last meeting and the plan is to use a newtype approach similar to axum extractors. Something like:

let (a_string, a_number, Json(other_json)): (String, f64, Json<MyStruct>) = cx.args()?;

You could also have extractors that cover overloaded signatures cases like Either<String, f64>.

This covers the case where the value is already serialized JSON, but I see in your example it's calling JSON.stringify first. We could have something that did that, although I'm not sure what to call it (ViaJson?).

My plan is to first implement this and then bring back serde as an extractor type (and without the hacky fallback to JSON).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants