-
-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
always try to read ahead by at least 5 blocks in the PBDagReader #5162
Conversation
License: MIT Signed-off-by: Steven Allen <[email protected]>
Good: If a previous read is canceled, we cancel the preloads that the read triggered. Bad: Future reads at that point will fail. This fixes that issue. License: MIT Signed-off-by: Steven Allen <[email protected]>
This should reduce stuttering when streaming. License: MIT Signed-off-by: Steven Allen <[email protected]>
License: MIT Signed-off-by: Steven Allen <[email protected]>
This gives a >10% speedup when catting a large file with a network latency of 20ms. |
return err | ||
} | ||
dr.promises[dr.linkPosition] = nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Always remove the promise. There's no reason to leave this around.
@Stebalien Could I hijack this PR (and abuse your readiness to explain IPFS to me) to use it as an opportunity to learn about the concept (which I always had trouble understanding) of a "promise" in the DAG reader? I can offer you a (probably defective) review in exchange :) |
Ask away (just don't touch the code). Note: "Promise" here is just a standard promise. We use it to turn the out-of-order GetMany API into an in-order array of future values. |
@schomatis to elaborate a little bit more on what @Stebalien said. The underlying method in bitswap for 'getting a bunch of blocks' returns the blocks in any order. But when streaming a file, we need them in the right order. To do that, we add a shim API on top of the out of order |
Oh, I think I get it now, thanks for the clarification. I'll take a closer look at the code then. |
if dr.promises[dr.linkPosition] == nil { | ||
dr.preloadNextNodes(ctx) | ||
// If we drop to <= preloadSize/2 preloading nodes, preload the next 10. | ||
for i := dr.linkPosition; i < dr.linkPosition+preloadSize/2 && i < len(dr.promises); i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Apologies in advanced for the noise if I'm misunderstanding what this code does.)
So, if I'm understanding this correctly, right now the number of preloaded nodes available in advance goes like 10, 9, 8, ..., 0, all of a sudden the algorithm realizes it's out of nodes and has to load 10 more, causing the stuttering.
In that case, could we just move the for
logic inside preload
, calling it here every time without checking if promises[dr.linkPosition]
is nil
, so as to say: "I've consumed one node, hey preload()
, make sure I still have preloadSize
nodes available in front of me, you figure out what to do".
Maybe preload
wouldn't even need the for
, in the first call, dr.promises[0] == nil
, load preloadSize
nodes altogether, and after that, every time it's called (because a node has been consumed), preload the node dr.promises[dr.linkPosition + preloadSize]
(assuming here that every node before dr.linkPosition + preloadSize
is already loaded).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just realized that all the time I was talking about nodes I should have been talking about node promises, I can't know if there's a node until I call Get()
right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"I've consumed one node, hey preload(), make sure I still have preloadSize nodes available in front of me, you figure out what to do".
Sounds reasonable.
assuming here that every node before dr.linkPosition + preloadSize is already loaded
Technically, you can seek around so I'd rather be robust (the check is cheap).
I've just realized that all the time I was talking about nodes I should have been talking about node promises, I can't know if there's a node until I call Get() right?
No, but you can know if we've made a request for the node (i.e., the promise exists).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it looks like making that change makes the other call to preload trickier. That second call currently overwrites any existing promises assuming that if the first preload has been canceled, the later ones probably have been as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that fetching multiple blocks at once is more efficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that fetching multiple blocks at once is more efficient.
Oh, so maybe another constant N should be added that would indicate to call preload only if there are less than N available promises, to avoid calling it every time a node is consumed, e.g., if N is 5 then the number of available promises in advance would go like 10, 9, ..., 5, only now call preload, 15, 14, ..., 5, call preload again, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That second call currently overwrites any existing promises assuming that if the first preload has been canceled, the later ones probably have been as well.
Maybe preload()
could be extended to also check if the context of dr.promises[i]
has been cancelled, not only that it's nil
. (Another argument could be added to preload()
to indicate if we want to overwrite the promises, but the other solution sound more correct to me.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, so maybe another constant N should be added that would indicate to call preload only if there are less than N available promises, to avoid calling it every time a node is consumed, e.g., if N is 5 then the number of available promises in advance would go like 10, 9, ..., 5, only now call preload, 15, 14, ..., 5, call preload again, etc.
That's what I currently do. If we have fewer than half preloadSize
loaded, I load the next preloadSize
. That means we vary between 5 and 15.
Maybe preload() could be extended to also check if the context of dr.promises[i] has been cancelled, not only that it's nil. (Another argument could be added to preload() to indicate if we want to overwrite the promises, but the other solution sound more correct to me.)
That's what the TODO is about. However, that requires a change to the promises. See: ipfs/go-ipld-format#34
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks for the clarification, then maybe we could add that extra argument to indicate that the current promises are cancelled and preload
should request them again (overwriting the previous ones), if you think it's worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could but that doesn't seem any better to me. It just moves code from one place to another and then puts it behind a condition. Once we merge that linked PR, definitely.
return ctx.Err() | ||
} | ||
// In this case, the context used to *preload* the node has been canceled. | ||
// We need to retry the load with our context and we might as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we elaborate a bit more on this? I had the impression we're always using the same Context
from the PBDagReader
structure. In which case the local ctx
won't be cancelled but the one from the NodePromise
will be? The context from the promise comes from this function, maybe in another call (at it may have been preloaded before), but at which point can the global context from the DAG reader change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're using the context from the call Read call. IMO, that's the correct context as it allows us to cancel reads and seek elsewhere (canceling any associated preloading as well).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, scratch that. If the user is using Read
, we use the reader's cotnext. If the user is manually calling CtxReadFull
(a context aware read), we use the supplied context. I don't really think any additional comments will help as the context is just whatever gets passed in (and listing every possible context is just going to lead to comment rot when that changes).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(and listing every possible context is just going to lead to comment rot when that changes).
It seems to me there are only (or mainly) two possibilities. The most common (by far I think) is that the reader's context is used everywhere, so in fact there is only one context (and this part of the code isn't executed). The second, and less common,
If the user is manually calling CtxReadFull
which is when this scenario takes place (if I understand correctly).
Clarifying that distinction seems worth it, IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough.
License: MIT Signed-off-by: Steven Allen <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This should reduce stuttering when streaming.
Also, fix an unlikely bug where canceling a context from one read could fail a future read.