
Two stage cache for IPRS/IPNS #74

Kubuxu opened this issue Feb 11, 2016 · 4 comments

Comments

Kubuxu (Member) commented Feb 11, 2016

What is a two-stage cache?

It is a cache defined by two times instead of one. The first defines how long the entry can be cached at all; the second defines how long the entry is "safe", meaning how long we can use it without trying to validate that it is still current.

When we try to get a cached entry, a few cases can happen (a Go sketch of this logic follows the list):

  • no entry is available in the cache, or it is older than the first time => resolve the entry, store it, and return it
  • an entry is available and younger than the "safe" time => return the entry
  • an entry is available, older than the "safe" time but younger than the first time => return the entry and, if one is not already running, start a resolution in the background; once it finishes, the result is reinserted into the cache
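
A minimal sketch of that lookup logic, assuming a simple string-to-string resolver. All names here (twoStageCache, entry, newTwoStageCache, Get) are illustrative and not the actual go-ipfs namesys code:

```go
package namecache

import (
	"sync"
	"time"
)

// entry is a cached resolution result together with the time it was resolved.
type entry struct {
	value    string    // resolved target, e.g. an /ipfs/ path
	resolved time.Time // when value was obtained
}

// twoStageCache serves entries for up to cacheTime, and starts a background
// re-resolution once an entry is older than safeTime.
type twoStageCache struct {
	mu        sync.Mutex
	entries   map[string]entry
	inFlight  map[string]bool
	cacheTime time.Duration            // first time: how long an entry may be served at all
	safeTime  time.Duration            // second time: how long it is served without re-resolving
	resolve   func(name string) string // the (expensive) resolver, e.g. a DHT lookup
}

func newTwoStageCache(cacheTime, safeTime time.Duration, resolve func(string) string) *twoStageCache {
	return &twoStageCache{
		entries:   make(map[string]entry),
		inFlight:  make(map[string]bool),
		cacheTime: cacheTime,
		safeTime:  safeTime,
		resolve:   resolve,
	}
}

// Get implements the three cases described in the list above.
func (c *twoStageCache) Get(name string) string {
	c.mu.Lock()
	e, ok := c.entries[name]
	age := time.Since(e.resolved)

	// Case 1: no entry, or older than cacheTime -> resolve now, store, return.
	if !ok || age > c.cacheTime {
		c.mu.Unlock()
		v := c.resolve(name)
		c.mu.Lock()
		c.entries[name] = entry{value: v, resolved: time.Now()}
		c.mu.Unlock()
		return v
	}

	// Case 2: younger than safeTime -> just return the cached value.
	if age <= c.safeTime {
		c.mu.Unlock()
		return e.value
	}

	// Case 3: older than safeTime but younger than cacheTime -> return the
	// cached value and, if not already running, re-resolve in the background.
	if !c.inFlight[name] {
		c.inFlight[name] = true
		go func() {
			v := c.resolve(name)
			c.mu.Lock()
			c.entries[name] = entry{value: v, resolved: time.Now()}
			c.inFlight[name] = false
			c.mu.Unlock()
		}()
	}
	c.mu.Unlock()
	return e.value
}
```

The intent is that a regularly requested name keeps refreshing in the background, so readers rarely block on a full resolution.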

Why is it better?

A normal cache (with one time) has a problem: if, say, the cache time is 1 minute, we get a cache miss every minute. That is currently the case with IPNS, and it ruins UX.

How does it compare with the current caching strategy of IPNS?

Currently entries are cached for 60 seconds in a normal cache. That is equivalent to a two-stage cache with both times equal to 60 seconds, (60, 60): the cache is valid for 60 seconds, and we re-resolve after 60 seconds.

What can it improve?

UX could be directly improved by changing it to (60, 30). This means that if a user uses an entry more than 30 seconds after it was resolved, re-resolution starts in the background. This also improves consistency, because resolutions run more frequently.

We could also improve consistency without changing UX, at the cost of more load on the network, by making it (60, 0). This means that every time an entry is read from the cache, a re-resolution starts.
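
For illustration, those configurations could be written with the hypothetical newTwoStageCache constructor from the sketch above (same assumed package; none of this is real go-ipfs API):

```go
package namecache

import "time"

// resolveIPNS is a stand-in for the real, expensive IPNS resolver.
func resolveIPNS(name string) string { return "/ipfs/<resolved-hash>" }

var (
	currentBehaviour = newTwoStageCache(60*time.Second, 60*time.Second, resolveIPNS) // (60, 60): today's single-time cache
	betterUX         = newTwoStageCache(60*time.Second, 30*time.Second, resolveIPNS) // (60, 30): background refresh after 30s
	maxConsistency   = newTwoStageCache(60*time.Second, 0, resolveIPNS)              // (60, 0): refresh on every cache hit
)
```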

Because there are so many possible configurations, I suggest we allow IPRS/IPNS entries to set their cache times in the records themselves, defaulting to (60, 30), which would be better in both UX and consistency than the current (60, 60).
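
A hypothetical sketch of per-record cache times with the (60, 30) default. The CacheTime and SafeTime fields are not part of the real IPNS record format; they only illustrate the proposal:

```go
package namecache

import "time"

// record is a stand-in for an IPRS/IPNS record that optionally carries its
// own cache times (hypothetical fields, not the real record format).
type record struct {
	Value     string
	CacheTime time.Duration // 0 means "not set"
	SafeTime  time.Duration // 0 means "not set"
}

// cachePolicy is the (cacheTime, safeTime) pair used by the two-stage cache.
type cachePolicy struct {
	CacheTime time.Duration
	SafeTime  time.Duration
}

// defaultPolicy is the proposed (60, 30) default.
var defaultPolicy = cachePolicy{CacheTime: 60 * time.Second, SafeTime: 30 * time.Second}

// policyFor uses the record's own times when present, otherwise the default.
func policyFor(rec *record) cachePolicy {
	if rec != nil && rec.CacheTime > 0 {
		return cachePolicy{CacheTime: rec.CacheTime, SafeTime: rec.SafeTime}
	}
	return defaultPolicy
}
```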

cc @jbenet @whyrusleeping @noffle

jbenet (Member) commented Feb 11, 2016

I like the thoughts, but I don't think this is needed exactly like this. Maybe a tweak: we can achieve the same thing by just having our tooling re-resolve names in the background some time before they expire (set timeouts to re-resolve when now == EOL - resolving_delay), and maybe skip that only if the load would be huge (tons of names) or the names are not used much. (A sketch of this timer approach follows the list below.)

Resolving much ahead of the validity time should be useless; it is only useful ~1 (maybe N=2) resolving_delay before it.

An important definition difference; I think it should be:

  • (first time) validity time - this is the "how long it is valid (i.e. safe to use)" one.
  • (second time) re-resolve timeout
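
A minimal sketch of that tweak, assuming a fixed resolvingDelay as a rough upper bound on resolution time: arm a timer that fires at EOL - resolving_delay and re-resolves in the background. The names scheduleReresolve and resolvingDelay are illustrative, not real go-ipfs API:

```go
package namecache

import "time"

// resolvingDelay is an assumed upper bound on how long one resolution takes.
const resolvingDelay = 30 * time.Second

// scheduleReresolve arms a timer that fires at EOL - resolvingDelay and
// re-resolves the name in the background, so the cached entry can be
// refreshed before it expires.
func scheduleReresolve(name string, eol time.Time, resolve func(string)) *time.Timer {
	delay := time.Until(eol.Add(-resolvingDelay))
	if delay < 0 {
		delay = 0 // already inside the window: re-resolve right away
	}
	return time.AfterFunc(delay, func() { resolve(name) })
}
```

As the next comment points out, the cost of this approach is that the timers keep firing for names nobody is reading, whereas the two-stage cache only re-resolves on an actual cache hit.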

Kubuxu (Member Author) commented Feb 11, 2016

The problem with automatic re-resolving when the timeout is approaching is that it would further increase passive network usage, which is already quite high (the internal connection stats of IPFS are broken). I don't want to increase it further, as it sometimes already hurts my connection performance (I have only 50 KiB/s upload, and passive work sometimes eats half of that). Name resolutions are quite expensive operations, as we actively try to get 16 confirmations for our record.

Doing it in the background when the user requests an entry and the cache is hit saves us the trouble of journaling which entries are used, checking that we are not hurting system performance, and so on.

Resolving much ahead of the validity time should be useless; it is only useful ~1 (maybe N=2) resolving_delay before it.

Resolution time is hard to estimate, as it depends on timeouts, already-connected peers, network performance, and so on.
Currently I see it vary between 5 and 30 seconds with the KValue patch that was recently introduced.

The option to always resolve, but in the background, i.e. a two-stage cache of (60, 0), would be beneficial when we want to be sure UX stays optimal while updates keep coming in: maximum possible consistency with the rest of the network, while not using any extra resources after the user stops using the record.

whyrusleeping (Member) commented

@Kubuxu when testing with the KValue patch, did you publish from a node that had the patch? Or just resolve from one?

Kubuxu (Member Author) commented Feb 11, 2016

I've upgraded both nodes and republished it.
