Improving op_crate API and registering ops #9738

AaronO · 2021-03-09T22:48:39Z

Sharing some thoughts about improvements to op_crates, some previously shared on Discord.

Goals

Make op_crates plug-and-play / self-contained, reduce complexity leaked to consumers.
Streamline lifecycles of declared resources without consumer having to coordinate between snapshot or not scenarios
Extend what op_crates can do, e.g: provide their own v8 funcs, external refs, ...

Current state

op_crates have a convenience init function to execute all JS in the runtime, all ops are exported individually as op_* funcs.

So their main "API" by convention today is ::init() + N ::op_*

Problems

In short: op_crates aren't plug-and-play, op-middleware is not reusable and no-access to v8 for optimisations, etc...

init is somewhat of a misnomer since it doesn't init the entire module, it only inits the JS (ops need to be registered separately)
ops need to be registered individually by the consumer (like deno does in runtime/*), leaking complexity to the consumer
- Prevents op_crates from being "plug-and-play"
- Adding new ops in a crate without coordinating with the consumer breaks the consumer (same with removing/renaming)
- The main pro of the current approach is that it's granular and gives the consumer control
- Reading the code it appears that the current approach is mainly motivated by a need for metrics_op to wrap op setup
There's currently no convention for exposing v8 external refs (e.g: for optimize(core): move internal js src off heap #9713)
There's currently no convention for adding non-op native v8 functions (which is intentionally discouraged by design but can be limiting in some cases). It wouldn't be best-practice (lose inspectability of ops, metrics ...) but I think it wouldn't be a huge cost to allow it following certain conventions. A few use-cases:
- Could allow moving TextEncoder's encoding indexes off the heap and out of JS code, it currently consumes ~500kb per isolate between the two. If v8 bindings were allowed, it could be done via external arrays, preserving logic in JS and simply moving the arrays to rust instead of the entire implementation
- Functions like .heapStats() are inherently low-level and need access to the isolate. This requires them to go into core/ or limits consumers from adding their own
- v8 bindings could help provide optimized versions of URL and other core elements (see refactor(op_crates/web): Move URL parsing to Rust #9276 (comment) for perf gap)
- Other cases with lazy accessors like rusty_v8/examples/process.rs
op "middleware" like metrics_op aren't really reusable by 3rd-parties and adding new op middleware is tedious

Important gotcha: js/ops/v8 elements have somewhat different lifecycles due to snapshots, which is why having only a naive init_all() wouldn't work

Suggestions

init_js(js_runtime): rename the current init to init_js
init_ops(registrer, crate_specific_args): to register all of crate's ops
- registrer could be a Trait impl allowing composable op-middleware to hook-in and wrap ops (or filter them etc...), similar to plugin_api::Interface
- Making metrics_op reusable and allowing for new middleware like op-tracing, that could easily be enabled/disabled
init_v8(scope, crate_specific_args): to register v8 bindings
V8_EXT_REFS: exposing a list of v8::ExternalReference
init(args?) -> ModuleInit: this could allow for a single external API, that would return a ModuleInit struct of optional rust callbacks (of the above functions essentially) & data (v8 ext refs).
- This could allow the JsRuntime to consume "modules" with a very simple API, e.g:
```
JsRuntime::new(RuntimeOptions {
  modules: Some([deno_crypto::init(), deno_fetch::init(), deno_webgpu::init(), ...  ]),
  ..Default::default()
});
```
- The JsRuntime could then handle lifecycles appropriately (when snapshotting, etc...) and transparently for implementors, providing a much greater "plug-and-play" experience
- This unified init provides consistency whilst allowing gradual increase in complexity depending on feature use in modules
- Overall this moves the setup of op_crates from "compile time" by the dev to runtime
bootstrap_js(js_runtime): for situations like core's shared queue, where some code should not be snapshotted

Notes

Increasing the number of op_crates (as per chore: split web op crate #9635) could be great for modularity, but could increase fragmentation without init_ops and other conventions discussed above
core already provides init_v8 and init_js conceptually, bindings::initialize_context and JsRuntime::js_init respectively
I understand and agree with the rationale of "ops first" and whilst ops should soon get faster, I think allowing non-op v8 implementations for edge-cases or performance improvements for central parts (like URL) should be considered

The text was updated successfully, but these errors were encountered:

lucacasonato · 2021-03-10T10:59:52Z

I agree with the above, except for the init_v8. The op crates do not get direct access to the V8 isolate, and I think we will keep it this way for the foreseeable future.

I do like the idea of an easy plug and play way of adding modules to JsRuntime. I was thinking something like this:

trait JsRuntimeModule {
  /// This function returns JS source code to be loaded into the isolate (either at snapshotting, or at startup).
  /// The function should return a vector of a tuple of the file name, and the source code. 
  fn init_js(&self): -> Result<Vec<(&'static string, &'static string)>, AnyError> {
    // default implementation of `init_js` is to load no runtime code
    Ok(vec![])
  }

  /// This function can set up the initial op state of an isolate at startup. 
  fn init_state(&self, state: &mut OpState) -> Result<(), AnyError> {
    // default implementation of `init_js` is to not mutate the state
    Ok(())
  }

  /// This function lets you middleware the op registrations. This function gets called before this module's init_ops.
  fn init_op_registrar_middleware(&self, registrar: Box<dyn OpRegistrar>) -> Box<dyn OpRegistrar> {
    // default implementation is to not change the registrar
    registrar
  }

  /// This function gets called at startup to initialize the ops in the isolate.
  fn init_ops(&self, registrar: &mut Box<dyn OpRegistrar>) -> Result<(), AnyError> {
    // default implementation of `init_ops` is to load no runtime code
    Ok(())
  }
}

trait OpRegistrar {
  register_minimal_op_sync(...)
  register_minimal_op_async(...)
  register_json_op_sync(...)
  register_json_op_async(...)
}

ry · 2021-03-10T13:46:45Z

Having a plug-n-play interface has always been the goal - we've just been concerned about making it general enough to handle all of the corner cases (Like the FetchPermissions trait). At this point we have quite a few op crates, and I'm relatively confident that if a system is made that works with our current inventory, that it will be sufficiently general for future needs.

Luca's proposal seems ok but the OpRegistrar seems to break abstractions. Ultimately all ops have the same signature - the JSON/Minimal are just wrappers on top of the ops - so a single registration function is sufficient

deno/core/runtime.rs

Line 450 in 3ab4886

pub fn register_op<F>(&mut self, name: &str, op_fn: F) -> OpId

AaronO · 2021-03-10T14:07:55Z

@ry I agree with what you said and was going to make the same remark regarding OpRegistrar, since json_op_sync etc... lift those ops to have OpFn signatures. But it could be useful for an op-tracing "middleware" that would want to inspect inputs/outputs and handle them differently depending on the op sub-type.

@lucacasonato I know the v8-bindings is a sensitive topic so I'll drop it here, it can always be revisited later with specific use-cases, but I personally think it's worth considering and in the future it can be streamlined into this JsRuntimeModule interface. Given that this proposal grants the JsRuntime authoritative control over JsRuntimeModule setup, allow_v8_bindings could be a JsRuntime option, so it could be disabled for deno but allowed for 3rd-parties building off deno_core and the op_crates

It could also make sense to provide a few default implementations of JsRuntimeModule, like PureJSModule or SimpleOpModule for those simpler cases, but that's simply a matter of convenience.

Regarding op-middleware, I think it would be great to have a few solid ideas/use-cases besides OpMetrics before standardizing it as an abstraction, here are a few others that I can imagine:

OpTracing: so deno could have an --optrace flag similar to strace for debugging (this could in theory be done in JS by wrapping parts of Deno.core so it wouldn't necessarily need to be op-middleware)
OpDisabler: could take a predicate matching names of ops to disable with NOP error-throwing versions

AaronO · 2021-04-28T16:43:40Z

Fixed by #9800

kitsonk added deno_core Changes in "deno_core" crate are needed feat new feature (which has been agreed to/accepted) labels Mar 14, 2021

This was referenced Mar 15, 2021

core: introduce extensions #9800

Merged

Make op_crates truly self-contained with inter-crate deps #9803

Closed

AaronO closed this as completed Apr 28, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improving op_crate API and registering ops #9738

Improving op_crate API and registering ops #9738

AaronO commented Mar 9, 2021 •

edited

Loading

lucacasonato commented Mar 10, 2021

ry commented Mar 10, 2021 •

edited

Loading

AaronO commented Mar 10, 2021 •

edited

Loading

AaronO commented Apr 28, 2021

Improving op_crate API and registering ops #9738

Improving op_crate API and registering ops #9738

Comments

AaronO commented Mar 9, 2021 • edited Loading

Goals

Current state

Problems

Suggestions

Notes

lucacasonato commented Mar 10, 2021

ry commented Mar 10, 2021 • edited Loading

AaronO commented Mar 10, 2021 • edited Loading

AaronO commented Apr 28, 2021

AaronO commented Mar 9, 2021 •

edited

Loading

ry commented Mar 10, 2021 •

edited

Loading

AaronO commented Mar 10, 2021 •

edited

Loading