Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

always try to read ahead by at least 5 blocks in the PBDagReader #5162

Merged
merged 5 commits into from
Jul 17, 2018

Conversation

Stebalien
Copy link
Member

@Stebalien Stebalien commented Jun 28, 2018

This should reduce stuttering when streaming.

Also, fix an unlikely bug where canceling a context from one read could fail a future read.

License: MIT
Signed-off-by: Steven Allen <[email protected]>
Good: If a previous read is canceled, we cancel the preloads that the read
triggered.
Bad: Future reads at that point will fail.

This fixes that issue.

License: MIT
Signed-off-by: Steven Allen <[email protected]>
This should reduce stuttering when streaming.

License: MIT
Signed-off-by: Steven Allen <[email protected]>
License: MIT
Signed-off-by: Steven Allen <[email protected]>
@Stebalien
Copy link
Member Author

This gives a >10% speedup when catting a large file with a network latency of 20ms.

@Stebalien Stebalien added need/review Needs a review and removed status/in-progress In progress labels Jun 28, 2018
return err
}
dr.promises[dr.linkPosition] = nil
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always remove the promise. There's no reason to leave this around.

@schomatis
Copy link
Contributor

@Stebalien Could I hijack this PR (and abuse your readiness to explain IPFS to me) to use it as an opportunity to learn about the concept (which I always had trouble understanding) of a "promise" in the DAG reader? I can offer you a (probably defective) review in exchange :)

@Stebalien
Copy link
Member Author

Ask away (just don't touch the code). Note: "Promise" here is just a standard promise. We use it to turn the out-of-order GetMany API into an in-order array of future values.

@whyrusleeping
Copy link
Member

@schomatis to elaborate a little bit more on what @Stebalien said. The underlying method in bitswap for 'getting a bunch of blocks' returns the blocks in any order. But when streaming a file, we need them in the right order. To do that, we add a shim API on top of the out of order GetMany API that immediately returns these promise objects, and then in the background fulfills each promise as the block for it is received.
Then, the consumer simply calls Get on each promise, in order, and they get a nicely ordered stream of blocks.

@schomatis
Copy link
Contributor

Oh, I think I get it now, thanks for the clarification. I'll take a closer look at the code then.

if dr.promises[dr.linkPosition] == nil {
dr.preloadNextNodes(ctx)
// If we drop to <= preloadSize/2 preloading nodes, preload the next 10.
for i := dr.linkPosition; i < dr.linkPosition+preloadSize/2 && i < len(dr.promises); i++ {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Apologies in advanced for the noise if I'm misunderstanding what this code does.)

So, if I'm understanding this correctly, right now the number of preloaded nodes available in advance goes like 10, 9, 8, ..., 0, all of a sudden the algorithm realizes it's out of nodes and has to load 10 more, causing the stuttering.

In that case, could we just move the for logic inside preload, calling it here every time without checking if promises[dr.linkPosition] is nil, so as to say: "I've consumed one node, hey preload(), make sure I still have preloadSize nodes available in front of me, you figure out what to do".

Maybe preload wouldn't even need the for, in the first call, dr.promises[0] == nil, load preloadSize nodes altogether, and after that, every time it's called (because a node has been consumed), preload the node dr.promises[dr.linkPosition + preloadSize] (assuming here that every node before dr.linkPosition + preloadSize is already loaded).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just realized that all the time I was talking about nodes I should have been talking about node promises, I can't know if there's a node until I call Get() right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"I've consumed one node, hey preload(), make sure I still have preloadSize nodes available in front of me, you figure out what to do".

Sounds reasonable.

assuming here that every node before dr.linkPosition + preloadSize is already loaded

Technically, you can seek around so I'd rather be robust (the check is cheap).

I've just realized that all the time I was talking about nodes I should have been talking about node promises, I can't know if there's a node until I call Get() right?

No, but you can know if we've made a request for the node (i.e., the promise exists).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it looks like making that change makes the other call to preload trickier. That second call currently overwrites any existing promises assuming that if the first preload has been canceled, the later ones probably have been as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that fetching multiple blocks at once is more efficient.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that fetching multiple blocks at once is more efficient.

Oh, so maybe another constant N should be added that would indicate to call preload only if there are less than N available promises, to avoid calling it every time a node is consumed, e.g., if N is 5 then the number of available promises in advance would go like 10, 9, ..., 5, only now call preload, 15, 14, ..., 5, call preload again, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That second call currently overwrites any existing promises assuming that if the first preload has been canceled, the later ones probably have been as well.

Maybe preload() could be extended to also check if the context of dr.promises[i] has been cancelled, not only that it's nil. (Another argument could be added to preload() to indicate if we want to overwrite the promises, but the other solution sound more correct to me.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, so maybe another constant N should be added that would indicate to call preload only if there are less than N available promises, to avoid calling it every time a node is consumed, e.g., if N is 5 then the number of available promises in advance would go like 10, 9, ..., 5, only now call preload, 15, 14, ..., 5, call preload again, etc.

That's what I currently do. If we have fewer than half preloadSize loaded, I load the next preloadSize. That means we vary between 5 and 15.

Maybe preload() could be extended to also check if the context of dr.promises[i] has been cancelled, not only that it's nil. (Another argument could be added to preload() to indicate if we want to overwrite the promises, but the other solution sound more correct to me.)

That's what the TODO is about. However, that requires a change to the promises. See: ipfs/go-ipld-format#34

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks for the clarification, then maybe we could add that extra argument to indicate that the current promises are cancelled and preload should request them again (overwriting the previous ones), if you think it's worth it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could but that doesn't seem any better to me. It just moves code from one place to another and then puts it behind a condition. Once we merge that linked PR, definitely.

return ctx.Err()
}
// In this case, the context used to *preload* the node has been canceled.
// We need to retry the load with our context and we might as
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we elaborate a bit more on this? I had the impression we're always using the same Context from the PBDagReader structure. In which case the local ctx won't be cancelled but the one from the NodePromise will be? The context from the promise comes from this function, maybe in another call (at it may have been preloaded before), but at which point can the global context from the DAG reader change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're using the context from the call Read call. IMO, that's the correct context as it allows us to cancel reads and seek elsewhere (canceling any associated preloading as well).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, scratch that. If the user is using Read, we use the reader's cotnext. If the user is manually calling CtxReadFull (a context aware read), we use the supplied context. I don't really think any additional comments will help as the context is just whatever gets passed in (and listing every possible context is just going to lead to comment rot when that changes).

Copy link
Contributor

@schomatis schomatis Jul 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(and listing every possible context is just going to lead to comment rot when that changes).

It seems to me there are only (or mainly) two possibilities. The most common (by far I think) is that the reader's context is used everywhere, so in fact there is only one context (and this part of the code isn't executed). The second, and less common,

If the user is manually calling CtxReadFull

which is when this scenario takes place (if I understand correctly).

Clarifying that distinction seems worth it, IMO.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough.

@ghost ghost added the status/in-progress In progress label Jul 6, 2018
@Stebalien Stebalien removed the status/in-progress In progress label Jul 6, 2018
Copy link
Contributor

@schomatis schomatis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Stebalien Stebalien added RFM and removed need/review Needs a review labels Jul 17, 2018
@whyrusleeping whyrusleeping merged commit 3218703 into master Jul 17, 2018
@whyrusleeping whyrusleeping deleted the feat/improve-preload branch July 17, 2018 04:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants