Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improving op_crate API and registering ops #9738

Closed
AaronO opened this issue Mar 9, 2021 · 4 comments
Closed

Improving op_crate API and registering ops #9738

AaronO opened this issue Mar 9, 2021 · 4 comments
Labels
deno_core Changes in "deno_core" crate are needed feat new feature (which has been agreed to/accepted)

Comments

@AaronO
Copy link
Contributor

AaronO commented Mar 9, 2021

Sharing some thoughts about improvements to op_crates, some previously shared on Discord.

Goals

  1. Make op_crates plug-and-play / self-contained, reduce complexity leaked to consumers.
  2. Streamline lifecycles of declared resources without consumer having to coordinate between snapshot or not scenarios
  3. Extend what op_crates can do, e.g: provide their own v8 funcs, external refs, ...

Current state

op_crates have a convenience init function to execute all JS in the runtime, all ops are exported individually as op_* funcs.

So their main "API" by convention today is ::init() + N ::op_*

Problems

In short: op_crates aren't plug-and-play, op-middleware is not reusable and no-access to v8 for optimisations, etc...

  • init is somewhat of a misnomer since it doesn't init the entire module, it only inits the JS (ops need to be registered separately)
  • ops need to be registered individually by the consumer (like deno does in runtime/*), leaking complexity to the consumer
    • Prevents op_crates from being "plug-and-play"
    • Adding new ops in a crate without coordinating with the consumer breaks the consumer (same with removing/renaming)
    • The main pro of the current approach is that it's granular and gives the consumer control
    • Reading the code it appears that the current approach is mainly motivated by a need for metrics_op to wrap op setup
  • There's currently no convention for exposing v8 external refs (e.g: for optimize(core): move internal js src off heap #9713)
  • There's currently no convention for adding non-op native v8 functions (which is intentionally discouraged by design but can be limiting in some cases). It wouldn't be best-practice (lose inspectability of ops, metrics ...) but I think it wouldn't be a huge cost to allow it following certain conventions. A few use-cases:
    • Could allow moving TextEncoder's encoding indexes off the heap and out of JS code, it currently consumes ~500kb per isolate between the two. If v8 bindings were allowed, it could be done via external arrays, preserving logic in JS and simply moving the arrays to rust instead of the entire implementation
    • Functions like .heapStats() are inherently low-level and need access to the isolate. This requires them to go into core/ or limits consumers from adding their own
    • v8 bindings could help provide optimized versions of URL and other core elements (see refactor(op_crates/web): Move URL parsing to Rust #9276 (comment) for perf gap)
    • Other cases with lazy accessors like rusty_v8/examples/process.rs
  • op "middleware" like metrics_op aren't really reusable by 3rd-parties and adding new op middleware is tedious

Important gotcha: js/ops/v8 elements have somewhat different lifecycles due to snapshots, which is why having only a naive init_all() wouldn't work

Suggestions

  • init_js(js_runtime): rename the current init to init_js
  • init_ops(registrer, crate_specific_args): to register all of crate's ops
    • registrer could be a Trait impl allowing composable op-middleware to hook-in and wrap ops (or filter them etc...), similar to plugin_api::Interface
    • Making metrics_op reusable and allowing for new middleware like op-tracing, that could easily be enabled/disabled
  • init_v8(scope, crate_specific_args): to register v8 bindings
  • V8_EXT_REFS: exposing a list of v8::ExternalReference
  • init(args?) -> ModuleInit: this could allow for a single external API, that would return a ModuleInit struct of optional rust callbacks (of the above functions essentially) & data (v8 ext refs).
    • This could allow the JsRuntime to consume "modules" with a very simple API, e.g:
    JsRuntime::new(RuntimeOptions {
      modules: Some([deno_crypto::init(), deno_fetch::init(), deno_webgpu::init(), ...  ]),
      ..Default::default()
    });
    
    • The JsRuntime could then handle lifecycles appropriately (when snapshotting, etc...) and transparently for implementors, providing a much greater "plug-and-play" experience
    • This unified init provides consistency whilst allowing gradual increase in complexity depending on feature use in modules
    • Overall this moves the setup of op_crates from "compile time" by the dev to runtime
  • bootstrap_js(js_runtime): for situations like core's shared queue, where some code should not be snapshotted

Notes

  • Increasing the number of op_crates (as per chore: split web op crate #9635) could be great for modularity, but could increase fragmentation without init_ops and other conventions discussed above
  • core already provides init_v8 and init_js conceptually, bindings::initialize_context and JsRuntime::js_init respectively
  • I understand and agree with the rationale of "ops first" and whilst ops should soon get faster, I think allowing non-op v8 implementations for edge-cases or performance improvements for central parts (like URL) should be considered
@lucacasonato
Copy link
Member

I agree with the above, except for the init_v8. The op crates do not get direct access to the V8 isolate, and I think we will keep it this way for the foreseeable future.

I do like the idea of an easy plug and play way of adding modules to JsRuntime. I was thinking something like this:

trait JsRuntimeModule {
  /// This function returns JS source code to be loaded into the isolate (either at snapshotting, or at startup).
  /// The function should return a vector of a tuple of the file name, and the source code. 
  fn init_js(&self): -> Result<Vec<(&'static string, &'static string)>, AnyError> {
    // default implementation of `init_js` is to load no runtime code
    Ok(vec![])
  }

  /// This function can set up the initial op state of an isolate at startup. 
  fn init_state(&self, state: &mut OpState) -> Result<(), AnyError> {
    // default implementation of `init_js` is to not mutate the state
    Ok(())
  }

  /// This function lets you middleware the op registrations. This function gets called before this module's init_ops.
  fn init_op_registrar_middleware(&self, registrar: Box<dyn OpRegistrar>) -> Box<dyn OpRegistrar> {
    // default implementation is to not change the registrar
    registrar
  }

  /// This function gets called at startup to initialize the ops in the isolate.
  fn init_ops(&self, registrar: &mut Box<dyn OpRegistrar>) -> Result<(), AnyError> {
    // default implementation of `init_ops` is to load no runtime code
    Ok(())
  }
}

trait OpRegistrar {
  register_minimal_op_sync(...)
  register_minimal_op_async(...)
  register_json_op_sync(...)
  register_json_op_async(...)
}

@ry
Copy link
Member

ry commented Mar 10, 2021

Having a plug-n-play interface has always been the goal - we've just been concerned about making it general enough to handle all of the corner cases (Like the FetchPermissions trait). At this point we have quite a few op crates, and I'm relatively confident that if a system is made that works with our current inventory, that it will be sufficiently general for future needs.

Luca's proposal seems ok but the OpRegistrar seems to break abstractions. Ultimately all ops have the same signature - the JSON/Minimal are just wrappers on top of the ops - so a single registration function is sufficient

pub fn register_op<F>(&mut self, name: &str, op_fn: F) -> OpId

@AaronO
Copy link
Contributor Author

AaronO commented Mar 10, 2021

@ry I agree with what you said and was going to make the same remark regarding OpRegistrar, since json_op_sync etc... lift those ops to have OpFn signatures. But it could be useful for an op-tracing "middleware" that would want to inspect inputs/outputs and handle them differently depending on the op sub-type.

@lucacasonato I know the v8-bindings is a sensitive topic so I'll drop it here, it can always be revisited later with specific use-cases, but I personally think it's worth considering and in the future it can be streamlined into this JsRuntimeModule interface. Given that this proposal grants the JsRuntime authoritative control over JsRuntimeModule setup, allow_v8_bindings could be a JsRuntime option, so it could be disabled for deno but allowed for 3rd-parties building off deno_core and the op_crates

It could also make sense to provide a few default implementations of JsRuntimeModule, like PureJSModule or SimpleOpModule for those simpler cases, but that's simply a matter of convenience.

Regarding op-middleware, I think it would be great to have a few solid ideas/use-cases besides OpMetrics before standardizing it as an abstraction, here are a few others that I can imagine:

  • OpTracing: so deno could have an --optrace flag similar to strace for debugging (this could in theory be done in JS by wrapping parts of Deno.core so it wouldn't necessarily need to be op-middleware)
  • OpDisabler: could take a predicate matching names of ops to disable with NOP error-throwing versions

@kitsonk kitsonk added deno_core Changes in "deno_core" crate are needed feat new feature (which has been agreed to/accepted) labels Mar 14, 2021
@AaronO
Copy link
Contributor Author

AaronO commented Apr 28, 2021

Fixed by #9800

@AaronO AaronO closed this as completed Apr 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deno_core Changes in "deno_core" crate are needed feat new feature (which has been agreed to/accepted)
Projects
None yet
Development

No branches or pull requests

4 participants