-
Notifications
You must be signed in to change notification settings - Fork 212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement virtual objects kept in vat secondary storage #1907
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Glad to see progress.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think virtualObjectManager.js
could be a lot simpler. I've got a sketch of a different approach, which also fixes the information leakage of the shared cache innerSelf
objects. I'll attach it below. Let's chat about this before moving forward and see if you agree.
@@ -11,6 +11,9 @@ import { assert, details } from '@agoric/assert'; | |||
import { isPromise } from '@agoric/promise-kit'; | |||
import { insistVatType, makeVatSlot, parseVatSlot } from '../parseVatSlots'; | |||
import { insistCapData } from '../capdata'; | |||
import { makeVirtualObjectManager } from './virtualObjectManager'; | |||
|
|||
const VIRTUAL_OBJECT_CACHE_SIZE = 3; // XXX ridiculously small value to force churn for testing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good technique. Maybe make it an option to makeLiveslots
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably. I haven't yet taken the time to work out the right way to inject it, but since it's set on a per-vat basis I was figuring it should be a creation parameter that can be specified in the config file.
const vatGlobals = harden({ | ||
makeWeakStore, | ||
makeKind, | ||
flushCache, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think flushCache
should be available to the vat code. Maybe as a vatPower
, but I expect this is only for testing, and we should look for a different way to expose it to tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I put it in specifically for testing. I originally had a comment to the effect of "remove after debugging" or some such, but after further thought concluded that it did no harm being there (aside from being a way to make one's vat run slower) so I took the comment out. But I have no problem removing it or making it a special power of some kind, as in normal use it doesn't do anything actually useful either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we talked today about moving flushCache
out to a return value, where unit tests could reach it, rather than making it a vat global.
Looking at the return value down at the new line 643, I suspect we don't need to avoid returning m
that list. So if we add flushCache
to the object that build
returns (next to vatGlobals
and dispatch
, then we can change line 643 to just return build(syscall, ...
.
id = Nat(Number(idSuffix)); | ||
} | ||
|
||
return { type, allocatedByVat, virtual, id, subid }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was some discussion in #455 about allowing these to be an arbitrary hierarchy, in which case subid
is more likely to be an Array
of Nats (or id
is always an array, and we change the callers to insist length===1
and extract just id[0]
). But it's probably not worth adding that now, I'd wait until we have a clear use case for three-element IDs first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point I can't even imagine what an arbitrary hierarchy might be used for, so it would definitely be engineering in advance of need.
size = saveSize; | ||
}, | ||
remember(innerObj) { | ||
if (liveTable.has(innerObj.instanceKey)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit weirded out by the objects being passed into this structure being modified by it, I think I'd prefer the cache produce it's own boxes around the user-supplied values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a similar sense, but since the inner object and the cache are already necessarily deep in each other's business anyway there didn't seem to be much payoff to the added complexity. Keep in mind that the cache isn't dealing with "user-supplied values", it's only dealing with internal objects.
So, here's a sketch of a simpler approach. Limitations so far:
My function makeCache(size, load, store) {
const values = new Map(); // key -> { dirty, value }
const keys = []; // TODO replace wth more-efficient linked list
// but build our own boxes, don't impose anything upon the caller-supplied
// 'value'
function evict(key) {
const { dirty, value } = values.get(key);
if (dirty) {
store(key, value);
}
values.delete(key);
}
// make room for one entry
function makeRoom() {
while (keys.length && keys.length > size - 1) {
const key = keys.pop();
evict(key);
}
}
function promote(key) {
assert(values.has(key));
const where = keys.indexOf(key);
keys.splice(where, 1);
keys.unshift(key);
}
function init(key, value) {
assert(!values.has(key));
makeRoom();
values.set(key, { dirty: true, value });
keys.unshift(key);
}
function get(key) {
if (values.has(key)) {
promote(key);
return values.get(key).value;
}
makeRoom();
const value = load(key);
values.set(key, { dirty: false, value });
keys.unshift(key);
return value;
}
function set(key, value) {
if (values.has(key)) {
promote(key);
values.get(key) = { dirty: true, value };
} else {
makeRoom();
values.set(key, { dirty: true, value });
keys.unshift(key);
}
}
const cache = harden({ init, get, set });
return cache;
}
export function makeVirtualObjectManager(
syscall,
allocateExportID,
valToSlotTable,
m,
cacheSize,
) {
function load(instanceKey) {
return JSON.parse(syscall.vatstoreGet(instanceKey));
}
function store(instanceKey, rawData) {
syscall.vatstoreSet(instanceKey, JSON.stringify(rawData));
}
const cache = makeCache(cacheSize, load, store);
function makeKind(instanceMaker) {
const kindID = `${allocateExportID()}`;
let nextInstanceID = 1;
function makeRepresentative(instanceKey) {
const state = {};
const names = Object.getOwnPropertyNames(cache.get(instanceKey));
for (const prop of names) {
Object.defineProperty(state, prop, {
get: () => {
const rawState = cache.get(instanceKey);
return m.unserialize(rawState[prop]);
},
set: value => {
// serialize first, since 'value' might have getters and user
// code could provoke cache activity during serialization
const serialized = m.serialize(value);
const rawState = cache.get(instanceKey);
rawState[prop] = serialized;
cache.set(instanceKey, rawState);
},
});
}
harden(state);
const representative = instanceMaker(state);
delete representative.initialize;
harden(representative);
valToSlotTable.set(representative, instanceKey);
return representative;
}
function reanimate(instanceKey) {
return makeRepresentative(instanceKey);
}
kindTable.set(kindID, reanimate);
function makeNewInstance(...initArgs) {
const instanceKey = `o+${kindID}/${nextInstanceID}`;
nextInstanceID += 1;
// build one to get the initial state
const initialState = {};
const tempInstance = instanceMaker(initialState);
tempInstance.initialize(...initArgs);
const rawState = {};
for (const prop of Object.getOwnPropertyNames(initialState)) {
rawState[prop] = m.serialize(initialState[prop]);
}
cache.init(instanceKey, rawState);
// then throw it away and build the first representative from the cache
return makeRepresentative(instanceKey);
}
return makeNewInstance;
}
} |
a68bdc2
to
5f3366b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One item that needs to be fixed for safety, and a few that are mostly cosmetic. We need tests before landing, but the code looks good. Nice!
const vatGlobals = harden({ | ||
makeWeakStore, | ||
makeKind, | ||
flushCache, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we talked today about moving flushCache
out to a return value, where unit tests could reach it, rather than making it a vat global.
Looking at the return value down at the new line 643, I suspect we don't need to avoid returning m
that list. So if we add flushCache
to the object that build
returns (next to vatGlobals
and dispatch
, then we can change line 643 to just return build(syscall, ...
.
|
||
const initializationData = {}; | ||
const tempInstance = instanceMaker(initializationData); | ||
tempInstance.initialize(...args); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to do valToSlotTable.set(tempInstance, instanceKey)
before calling tempInstance.initialize
. The user-supplied initialize
function is likely to interact with other representatives and WeakStores, and it might pass itself to one of those for use as a key: imagine the Purse/Payment/DepositFacet, and a case where the deposit facet does a depositLedger.init(deposit, purse)
during initialization. We need to get our instanceKey
associated with this instance early enough to allow those calls to work.
It's not trivial, though, because those same calls might also read from a WeakStore, and we won't be prepared to build a new Representative until we've saved the data from initialize
. I think we can land this before solving the problem, but we absolutely must build a unit test that exercises it so we know what more needs to be done.
So, actually let's put a comment here reminding us to add that valToSlot
addition, but let's not implement it yet, because I think doing it halfway might mask the kind of bug I'm anticipating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added #1938 to explore this further
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm. This one is tricky, since initialize
is called before the state has been wrapped (since the wrapping needs the result of initialize
to act upon). If initialize
passes the instance to somebody else, they might call one of the other methods, which would then operate on the unwrapped state directly. This would be OK inside the call stack beneath initialize
, since the net result would basically just be a more complicated initialization. However, if one of those callees were to squirrel the "instance" away somewhere and invoke it later, it would be operating on a disconnected, non-persistent, non-consistent private copy of the state, which would be bad and wrong. I think this means that the wrapping operation needs to modify the state object in place (which you previously talked me into not doing -- in fact, now that I think about it this may have been the reason why I originally wrote the code to work that way).
In any case, some kind of verification that this works OK needs to be added to the unit tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In writing up #1938, I concluded our Purse example won't work work at all until we add the tempInstance
to valToSlot
before calling initialize
. We should add that Purse example to a unit test, and either include that in this PR (necessitating the valToSlot
addition in the PR too), or shortly after landing (precluding this feature's use in ERTP until we get #1938 fixed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this means that the wrapping operation needs to modify the state object in place (which you previously talked me into not doing -- in fact, now that I think about it this may have been the reason why I originally wrote the code to work that way).
Eeeeinteresting... and even more subtle than what I was investigating. Yeah, sorry if I undid that. Definitely needs some tests either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest version copes with having the representative being handed to somebody by initialize
. I believe the fix addresses the issues you describe in #1938 as well, but I'm working on a unit test to verify this for sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good, thanks. I'm still nervous about this case, the full test would be if that "somebody" uses the representative as the value in a weakstore (serializing it into just the ID), and then reads that value back out again (creating a new representative), and then calls methods on the new representative.
And I strongly suspect that the presence of any code after initialize()
returns means that the first and second representatives are not functionally equivalent. Or at least I am sure that the lack of any code after that point means they are equivalent, which may or may not imply the contrapositive.
Mind you, this is a subtle point, and I think I'd be ok with landing this even if that bug remained, as long as #1938 remains to track it, and we treat any potential illicit communication channel as a release blocker.
2d90c4a
to
7713aab
Compare
…able cache size as config param
848eda7
to
9473ac0
Compare
Still lacking unit tests but putting it up for review.