Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rust Oak Runtime ownership of Nodes and Channels #633

Closed
tiziano88 opened this issue Feb 24, 2020 · 9 comments
Closed

Rust Oak Runtime ownership of Nodes and Channels #633

tiziano88 opened this issue Feb 24, 2020 · 9 comments
Assignees

Comments

@tiziano88
Copy link
Collaborator

The Rust Oak Runtime should fully own Nodes and Channels, in such a way that it should be possible to "ask" it to produce a list of them and some related information (e.g. stats or errors).

This may be useful for debugging Oak Applications, but I also think it will simplify the implementation of labels and information flow control, since all the rules relative to what is allowed and what not would be clearly expressed in the Runtime itself, which would become the sole mediator of all communication between Nodes and Channels.

Currently the Rust Oak Runtime is responsible for creating Nodes, but it only keeps around a JoinHandle per Node instance, and also does not have visibility into Channels that are created by that Node.

impl RuntimeRef {
/// Thread safe method that attempts to create a node within the `Runtime` corresponding to a
/// given module name and entrypoint. The `reader: ChannelReader` is passed to the newly
/// created node.
///
/// `node_create` is a method of `RuntimeRef` and not `Runtime`, so that the underlying
/// `Arc<Runtime>` can be passed to `conf.new_instance` and given to a new node thread.
pub fn node_create(
&self,
module_name: &str,
entrypoint: &str,
reader: ChannelReader,
) -> Result<(), OakStatus> {
if self.is_terminating() {
return Err(OakStatus::ERR_TERMINATED);
}
let mut node_threads = self.node_threads.lock().unwrap();
if self.is_terminating() {
return Err(OakStatus::ERR_TERMINATED);
}
let join_handle = self
.configurations
.get(module_name)
.ok_or(OakStatus::ERR_INVALID_ARGS)
.and_then(|conf| {
conf.new_instance(module_name, self.clone(), entrypoint.to_owned(), reader)
})?;
node_threads.push(join_handle);
Ok(())
}
}

Blocks #630 .

@blaxill assigning to you so I don't get in your way of any refactoring / extensions to the Rust Runtime, but let me know if you would prefer me to look into it (and when).

@blaxill
Copy link
Contributor

blaxill commented Feb 24, 2020

Am I right in thinking you need the runtime to have a hashmap from nodes to some data (which in the future will include labels)?

I'm thinking add said hashmap to the runtime, keeping the channel logic in channel.rs, but add a method to runtime to allow channel.rs functions to inspect properties of the calling node?

@tiziano88
Copy link
Collaborator Author

Am I right in thinking you need the runtime to have a hashmap from nodes to some data (which in the future will include labels)?

I was thinking it would have a map from some unique per-node id to the actual node instance, and similarly for channels.

I'm thinking add said hashmap to the runtime, keeping the channel logic in channel.rs, but add a method to runtime to allow channel.rs functions to inspect properties of the calling node?

I think that would technically work, but the logic for IFC would end up being spread across channels and nodes. I guess it may be centralized into channels only, if they know a lot about the receiving nodes, but to me that seems to make the channel abstraction too "heavy". WDYT?

@blaxill
Copy link
Contributor

blaxill commented Feb 24, 2020

I was thinking it would have a map from some unique per-node id to the actual node instance, and similarly for channels.

Currently the node instances are just threads that have their own private data. I don't think we necessarily want to expose their data to the runtime (for separation of concerns not security). Perhaps just node-id to struct { _: joinhandle, _: label, _: Vec<Weak<Channels>> }. I may add channels references like this for now and then think about simplifying it in a future PR. That is, if we have the runtime knowledgable of channels, then it can probably do the reference counting and we can drop the Arcs for something non-rc.

On the topic of querying nodes from the runtime: Is this for debugging stats/errors or more? I imagine the labels always being enforced on channel operations, in which case I see as the runtime enforcing this, but "runtime.enforce(...) -> Result<(), OakStatus/PrivError>" being called from e.g ChannelWriter.write, rather than WasmNode calling e.g. runtime.channel_write_with_priv directly.

I think that would technically work, but the logic for IFC would end up being spread across channels and nodes. I guess it may be centralized into channels only, if they know a lot about the receiving nodes, but to me that seems to make the channel abstraction too "heavy". WDYT?

I think it depends on how the extra functionality works. If it's as simple as enforce_label(from: Node, to: Node), i'd put that in the runtime, and call it from channel code when adding messages, but no strong opinion either way.

Part of this is I'm thinking of the oak_runtime as a whole as the OS, and the runtime impl more of a node scheduler and now policy enforcer. And channels as their own abstraction, rather than a thin PODish type. I think both approaches have their benefits though, this was just how I was implementing it so far.

@tiziano88
Copy link
Collaborator Author

I was thinking it would have a map from some unique per-node id to the actual node instance, and similarly for channels.

Currently the node instances are just threads that have their own private data. I don't think we necessarily want to expose their data to the runtime (for separation of concerns not security). Perhaps just node-id to struct { _: joinhandle, _: label, _: Vec<Weak<Channels>> }. I may add channels references like this for now and then think about simplifying it in a future PR. That is, if we have the runtime knowledgable of channels, then it can probably do the reference counting and we can drop the Arcs for something non-rc.

Sure, I was not suggesting to expose the entire node internals directly to the Runtime anyways, but ideally there would be some abstract trait that may be implemented by all node types, with methods to access some internal information, e.g. label, status, handle table etc. Also a trait seems more scalable than yet another struct that needs to be kept in sync in some other way (e.g. what happens when new channels are created / destroyed? what is responsible to sync that back to the struct?)

Also, I think that having channels owned by the creator node is problematic from a resource management point of view: what if node A creates a channel, and it passes its respective handles to nodes B and C, and subsequently A dies? In principle we would like B and C to keep using the channel (or alternatively we need to add ownership annotations to channel handles, but that does not seem worth it, at least at this stage).

On the topic of querying nodes from the runtime: Is this for debugging stats/errors or more? I imagine the labels always being enforced on channel operations, in which case I see as the runtime enforcing this, but "runtime.enforce(...) -> Result<(), OakStatus/PrivError>" being called from e.g ChannelWriter.write, rather than WasmNode calling e.g. runtime.channel_write_with_priv directly.

This seems to add coupling though, now you need to make sure that before doing an actual write, a channel checks with the runtime to see whether that's allowed. Sure, in the end these things are equivalent, but there are more chances that someone may introduce or refactor code that forgets to perform the proper check. There may also be a time-of-check vs time-of-use issue, since the runtime state may change after the call to enforce but before the actual write (but perhaps this may also be a problem in the runtime-owned approach). Also it seems "bad" for channels to hide information from the runtime in general, in such a way that the runtime would not be able to know immediately how many channels have been created overall, and what nodes have read or write access to a given channel.

I think that would technically work, but the logic for IFC would end up being spread across channels and nodes. I guess it may be centralized into channels only, if they know a lot about the receiving nodes, but to me that seems to make the channel abstraction too "heavy". WDYT?

I think it depends on how the extra functionality works. If it's as simple as enforce_label(from: Node, to: Node), i'd put that in the runtime, and call it from channel code when adding messages, but no strong opinion either way.

Part of this is I'm thinking of the oak_runtime as a whole as the OS, and the runtime impl more of a node scheduler and now policy enforcer. And channels as their own abstraction, rather than a thin PODish type. I think both approaches have their benefits though, this was just how I was implementing it so far.

Yes I see your point, my personal view is that I'd rather have "thin" channel abstractions, and have more logic in the runtime itself, because that would allow the runtime to make decisions based on non-local state more easily, while in the "thick" channel abstraction, each channel would have a much more limited view, and would still end up having to be extremely coupled to the runtime in order to figure out all the details (e.g. what node am I sending data to? what Wasm module does it have loaded? what label does it have? can it remove this label?, etc.), and therefore any hope of ending up with a nice self-contained channel abstraction would also be quite slim.

@blaxill
Copy link
Contributor

blaxill commented Feb 25, 2020

Also, I think that having channels owned by the creator node is problematic from a resource management point of view: what if node A creates a channel, and it passes its respective handles to nodes B and C, and subsequently A dies?

This works currently with local only nodes, definitely a lot less clear with remote nodes.

I'm happy to go forward with your approach, I'll start looking into moving ownership of channels into the runtime first.

@tiziano88
Copy link
Collaborator Author

@blaxill for nodes, are you expecting to have some identifier or reference mapping to a generic node instance? I think we will need something like that when implementing label tracking, so that the runtime is aware of the node that is invoking a particular function.

e.g.

pub fn channel_write(&self, channel: &ChannelWriter, msg: Message) -> Result<(), OakStatus> {

could be changed into

 pub fn channel_write(&self, caller: NodeRef, channel: &ChannelWriter, msg: Message) -> Result<(), OakStatus> { 

@blaxill
Copy link
Contributor

blaxill commented Mar 9, 2020

Since nodes are 1-to-1 with threads, I was thinking of having the runtime track it internally in a thread local variable, but yeah having some node reference available to the runtime. Do you have a preference?

@tiziano88
Copy link
Collaborator Author

nodes are 1-to-1 with threads

I don't think it's necessarily the case, couldn't there be a node with multiple threads? e.g. the wasm node will likely need to support multiple threads at some point.

I think there needs to be a node identifier of some sort; either a numeric id that the runtime can use to look up node-specific information (e.g. labels), or a (possibly weak) reference to some internal data structure owned by the runtime that encapsulates such state.

@tiziano88 tiziano88 added this to the MVP milestone Mar 11, 2020
@blaxill blaxill mentioned this issue Mar 23, 2020
19 tasks
@blaxill
Copy link
Contributor

blaxill commented Mar 31, 2020

Closing via #690 #680 #728 #768.
More general Rust status at #540

@blaxill blaxill closed this as completed Mar 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants