Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dispatch and native plugins #2987

Closed
bartlomieju opened this issue Sep 19, 2019 · 6 comments
Closed

Dispatch and native plugins #2987

bartlomieju opened this issue Sep 19, 2019 · 6 comments

Comments

@bartlomieju
Copy link
Member

bartlomieju commented Sep 19, 2019

Deno op dispatch and native plugins

Overview

Deno dispatch system is based on

Deno.core.send(opId: number, control: null | ArrayBufferView, data?: ArrayBufferView)
  • a function that passes messages from JS to Rust and executes function described with opId.

Message passing is done using so-called "shared queue", which for now has served it's purpose well.

The opId which identifies the op and its dispatcher is a number that is shared between TypeScript and Rust.

As of Deno v0.18.0 all ops dispatching logic is hardcoded.

Problem

Current solution is very simple and quite efficient, however has several shortcomings that need to be addressed
to expose native plugins API to users;

  1. OpId has to be manually synchronized between TS and Rust. In current situation, where we have about 60 ops, it's not that bad, but surely we'll be cumbersome as number of ops grows.
...
// Warning! These values are duplicated in the TypeScript code (js/dispatch.ts),
// update with care.
pub const OP_READ: OpId = 1;
pub const OP_WRITE: OpId = 2;
...
  1. There is no way to add new op programatically. Op has be to registered in code manually before compilation, which prevents users from adding new ops from their' code.
let op = match op_id {
    OP_READ => {
      dispatch_minimal::dispatch(io::op_read, state, control, zero_copy)
    }
    OP_WRITE => {
      dispatch_minimal::dispatch(io::op_write, state, control, zero_copy)
    }
    ...
}
  1. Op and its dispatcher are very tightly coupled. There still be tight coupling between op and it's dispatcher (you can't easily write op that will work with both "minimal" and "json" dispatcher), but it should be done using a simple API instead of manually writing closures.
// see example above
  1. Code for ops lives in 2 different places. TS code for ops currently lives in //js/ while Rust code lives in //cli/ops/. Ideally we'd like to have //ops directory that will contain both TS and Rust code (eg. //ops/fs/main.ts and //ops/fs/mod.ts as entry points).

Nomenclature

  • dispatch - main method used to call into Rust code from JS. See Isolate::set_dispatch method.
  • dispatcher - an abstract construct that is called from dispatch. It's aware of serialization method ("minimal", "json") and sets contract for op handler.
  • op/op handler - a function that performs some actual work, eg. "read"/"write"/"dial". Op has to be used with a dispatcher that knows how to transform result of that op into CoreOp.
  • CoreOp - fundamental type that describes how Rust code can return value to JS.

Requirements

  1. Minimal or no changes to Deno.core and shared queue infrastructure.
  2. New system must be FAST (at least as fast as current solution, preferably faster).
  3. Ops should have access to permission APIs (TODO: can permissions be forced upon plugins/ops?)
  4. Op handler signature shouldn't depend on ThreadSafeState because it's a deno_cli concept.
  5. Ability to register new dispatcher with custom serialization method (eg. "msgpack").
  6. Ability to register new op statically (before build).
  7. No hardcoded opIds - after registration of an op, during startup Rust sends an "op map" that is a dictonary mapping op names to op ids.
  8. Ability to register new op dynamically (from JS code), loading some Rust code (as dynamic library?). This functionality can be exposed as another op - op_register_new_op.
  9. Ability to register an op multiple times. Say I have some native plugin that requires "read" op, but I'm not sure that host will have "read" available. So I register the op in my library setup code (potentially with namespace of my plugin).
  10. Both static and dynamic op registration should be as simple as possible.

Proposed dispatch implementation

Prototype by @afinch7

In #2840 there's a complete solution.

  • "op listeners"

After registration of a new op a bunch of listener methods is called to reset the ids of all ops.
I believe this can be refactored to encapsulate this logic in JsonOp (presented in "pros" section) and ultimately
should be handled by separate op that calls "callback" function on dispatch to register op ids that are sent from Rust.

This is only done when using set_dispatch_registry, which can be combined with set_dispatch. It's functionality is equivalent
to proposed setOpMap below.

  • dispatcher registry
pub trait OpDispatcher: Send + Sync {
  fn dispatch(&self, args: &[u8], buf: Option<PinnedBuf>) -> CoreOp;
}

type OpDispatcherRegistry = Vec<Option<Arc<Box<dyn OpDispatcher>>>>;

pub struct OpDisReg {
  // Quick lookups by unique "op id"/"resource id"
  // The main goal of op_dis_registry is to perform lookups as fast
  // as possible at all times.
  op_dis_registry: RwLock<OpDispatcherRegistry>,
  next_op_dis_id: AtomicU32,
  // Serves as "phone book" for op_dis_registry
  // This should only be referenced for initial lookups. It isn't
  // possible to achieve the level of perfromance we want if we
  // have to query this for every dispatch, but it may be needed
  // to build caches for subseqent lookups.
  op_dis_id_registry: RwLock<HashMap<String, HashMap<&'static str, OpId>>>,
  notifier_reg: RwLock<NotifyerReg>,
}

notifier_reg is used to trigger "op listeners" described above.

Pros:

  • elegant API for registration of ops in TS
const OP_CHDIR = new JsonOp(opNamespace, "chdir");

export function chdir(directory: string): void {
  OP_CHDIR.sendSync({ directory });
}

It might need a little tweaking as each op would create a closure, but that has to verified experimentally.

  • nice code organization

All code (both Rust and TS) related to fs lives inside //ops/fs crate.

Cons:

  • each op is a trait

This leads for a lot of boilerplate code, where we essentialy have an empty structure that must implement a trait.

pub struct OpChmod;

#[derive(Deserialize)]
#[serde(rename_all = "camelCase")]
struct ChmodArgs {
  promise_id: Option<u64>,
  path: String,
  mode: u32,
}

impl FsOpDispatcher for OpChmod {
  fn dispatch(
    &self,
    state: &TSFsOpsState,
    control: &[u8],
    buf: Option<PinnedBuf>,
  ) -> CoreOp {
    wrap_json_op(
      move |args, _zero_copy| {
        let args: ChmodArgs = serde_json::from_value(args)?;
        let (path, path_) = deno_fs::resolve_from_cwd(args.path.as_ref())?;

        state.check_write(&path_)?;

        let is_sync = args.promise_id.is_none();
        blocking_json(is_sync, move || {
          debug!("op_chmod {}", &path_);
          // Still check file/dir exists on windows
          let _metadata = fs::metadata(&path).map_err(FsOpError::from)?;
          #[cfg(any(unix))]
          {
            let mut permissions = _metadata.permissions();
            permissions.set_mode(args.mode);
            fs::set_permissions(&path, permissions).map_err(FsOpError::from)?;
          }
          Ok(json!({}))
        })
      },
      control,
      buf,
    )
  }

  const NAME: &'static str = "chmod";
}
  • a bit magical loading of TS code

A concept of "import map" (not import maps we have implemented in Deno tho!) is used to map bare specifier to path of .ts file inside the crate. Eg. "deno_op_fs" is mapped to "bundle/main.ts" in "ops_fs" crate.

Prototype by @bartlomieju

Taking a lot of inspiration from @afinch7 I came up with a bit different solution that allows to avoid problem of initial double lookup.

// NOTE: this can be merged into `ThreadSafeState` or `Isolate`
pub struct DispatchManager {
  op_registry: RwLock<Vec<OpHandler>>
  // TODO: HashMap signature should be changed to support namespacing
  op_phone_book: RwLock<HashMap<String, (OpId, Dispatcher, Op)>>
}

In this case each registered op is placed in the op_phone_book structure to identify it by it's namespace and name and OpId is an offset in the op_registry vector obtained from length of the vector.

In the example above OpHandler type can be roughly described as:

// TODO: this has to be changed; we want to get rid of `state` and also add permission API
type OpHandler = Fn(state: &ThreadSafeState, control: &[u8], zero_copy: Option<PinnedBuf>) -> CoreOp);

Which is equivalent of wrapping any op with its designated dispatcher.

let read_handler = dispatch_minimal(op_read);
let resources_handler = dispatch_json(op_read);

// And registration...
isolate.register_op("read", read_handler);
isolate.register_op("resources", resources_handler);
// or
isolate.register_minimal_op("read", op_read);
isolate.register_json_op("resources", op_resources);

Then on startup a map of all registered ops can be sent to JS to synchronize JS counterparts functions:

// Construct an op map that looks like this:
// {
//    "minimal": {
//       "read": 1,
//       "write": 2,
//    },
//    "json": {
//       "resources": 3,
//       "fs": {
//         "chown": 4,
//         "chmod": 5,
//         ...
//       }
//    }
// }
let op_map = dispatch_manager.get_map()

isolate.execute(format!("denoMain({})", json!(op_map)));
function denoMain(opMao: OpMap, ...args): void {
  dispatch.setOpMap(opMaps);
  os.start(true, "TS");	  os.start(true, "TS");
}

// dispatch.ts
export function setOpMap(opMap: OpMap): void {
    // this can be further abstracted away
    dispatch_minimal.setOpMap(opMap.minimal);
    dispatch_json.setOpMap(opMap.json);
}

// dispatch_json.ts
class JsonOp {
    namespace?: string;
    name: string;
    opId: number;
    
    constructor(namespace: string, name?: string) {
        if (!this.name) {
            this.name = namespace;
        } else {
            this.namespace = namespace;
            this.name = name;
        }

        registeredOp.add(this);
    }

    register(opId: number) {
        this.opId = opId;
    }

    sendSync(...) {}

    sendAsync(...) {}
}

const registeredOps = [
    // store all instances of `JsonOp`
]

function setOpMap(opMap: opMap) {
    for (const registeredOp in registeredOps) {
        let opId;

        let map = opMap;

        if (registeredOp.namespace) {
            map = opMap[registeredOp.namespace]
        }

        const opId = map[registeredOp.name];

        if (!opId) {
            throw new Error(`Unregistered op: ${registeredOp.name}`);
        }

        op.register(opId);
    }
}

That way we get free validation that all ops are properly registered. Additional check can be added to ensure
that all ops sent from Rust are registered as well.

Upon dynamic registration of op we can do the same thing synchronously. The only constraint is that ops couldn't be unregistered
unless some synchronisation mechanism is added.

Questions/TODOs

  1. Where should op be registered? @ry suggested doing it on Isolate which means ops become core concept (which in turn will require permissions to become core concept).

Proposed native plugins implementation

TODO

Starting this issue for further discussion.

This is at least partially 1.0 blocker (#2473)

Related issues:

Related PRs:

CC @afinch7 @ry @piscisaureus @kevinkassimo

@ry
Copy link
Member

ry commented Sep 20, 2019

Discussed offline - the API looks fine, but we should have a Isolate::register_op in core first.

@kevinkassimo
Copy link
Contributor

kevinkassimo commented Sep 21, 2019

Also looks good to me. Maybe we should also take some time to reorganize denoMain() args since it is getting messy.

register_op in core sounds reasonable, but I also do not think moving permission to core is the right idea (not just permission, but potentially everything hosted under ThreadSafeState in the future). Maybe just somehow bind state to op invocation would work for us in cli, like isolate.register_op("my_op", with_state(state.clone(), f)) and mostly keep our current cli op handler types.

@bartlomieju
Copy link
Member Author

Also looks good to me. Maybe we should also take some time to reorganize denoMain() args since it is getting messy.

register_op in core sounds reasonable, but I also do not think moving permission to core is the right idea (not just permission, but potentially everything hosted under ThreadSafeState in the future). Maybe just somehow bind state to op invocation would work for us in cli, like isolate.register_op("my_op", with_state(state.clone(), f)) and mostly keep our current cli op handler types.

That's a good idea @kevinkassimo actually new prototype allows to keep CLI ops as they were (taking state as argument), please see #3004

@bartlomieju
Copy link
Member Author

I'm closing this issue because we introduced all changes described in this PR. Native plugins are still TODO, but we still need a few prerequisites before it can be tackled.

@rektide
Copy link

rektide commented Mar 9, 2020

How much of this discussion is still relevant & interesting now that #3372 has landed? How much of native plugins was implemented alike or as per these ideas? Which parts of this issue are n/a with the native plugins we have?

@kevinkassimo
Copy link
Contributor

@rektide I think that implementation mostly follows what was discussed here. Built-in ops are now registered in similar ways (with 0 reserved to retrieve other op name to id mapping), and external ops still cannot access CLI state directly yet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants