Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gateway: Cache-Control, max-age=ttl, ETag for /ipns requests #329

Open
ion1 opened this issue Oct 9, 2015 · 17 comments · Fixed by #459
Open

Gateway: Cache-Control, max-age=ttl, ETag for /ipns requests #329

ion1 opened this issue Oct 9, 2015 · 17 comments · Fixed by #459
Labels
help wanted Seeking public contribution on this issue kind/bug A bug in existing code (including security flaws) topic/gateway Issues related to HTTP Gateway

Comments

@ion1
Copy link

ion1 commented Oct 9, 2015

The gateway does not seem to provide a Cache-Control/max-age header or an ETag for /ipns URLs.

It might be useful to assume some reasonable non-zero max-age for content on IPNS. A minute? Fifteen seconds?

It would also be good to serve an ETag corresponding to whatever IPFS hash the IPNS path resolved to.

@jbenet
Copy link
Member

jbenet commented Oct 9, 2015

It might be useful to assume some reasonable non-zero max-age for content
on IPNS. A minute? Fifteen seconds?

It could be determined by the IPNS record -- some will have a calculable
TTL.

@ghost
Copy link

ghost commented Oct 9, 2015

Until then, we could set must-revalidate and a quasi infinite max-age

@ion1
Copy link
Author

ion1 commented Oct 9, 2015

I’m not sure if I have understood this correctly, but I’m under the impression that must-revalidate only applies if the content has already expired. http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.4

@ghost
Copy link

ghost commented Oct 9, 2015

You're right, thanks -- so that means must-revalidate + max-age=0 = cache-indefinitely-unless-updated

@ghost
Copy link

ghost commented Oct 9, 2015

Note that for /ipfs/, we can still set a quasi-infinite max-age.

@ion1
Copy link
Author

ion1 commented Oct 9, 2015

It seems to me that must-revalidate means that given an object that expired according to max-age, a cache must return an error instead of a stale copy if it fails to connect to the origin server. I’m not sure must-revalidate is of any use here. Unless I’m misinterpreting it, of course.

@whyrusleeping whyrusleeping added the topic/gateway Issues related to HTTP Gateway label Oct 11, 2015
@ion1
Copy link
Author

ion1 commented Oct 14, 2015

DNS lets you say explicitly for how long a record can be cached. Perhaps we should do that with IPNS names? ipfs name publish --ttl=3h which would be taken advantage of by IPFS nodes who have resolved the name as well as by the gateway in the Cache-Control header.

@ion1
Copy link
Author

ion1 commented Oct 28, 2015

It was mentioned on IRC that resolve does not provide TTL information to be used for max-age.

resolveOnce could be changed to return a TTL value. resolve could start at infinity-ish and compute the minimum of the previous TTL and what resolveOnce returned at each iteration.

Given that the DNS lookup functions in the net package for Go do not seem to provide TTL information at the moment, the DNS resolveOnce could return a reasonable guess like one minute. If a DNS record points to /ipns/<keyhash>, that IPNS record can still decrement the TTL according to the iteration above if a low cache time is desired.

If resolve returned a TTL, the if strings.HasPrefix(urlPath, ipfsPathPrefix) check in gateway_handler.go could go away and the header be set unconditionally.

@ion1
Copy link
Author

ion1 commented Oct 29, 2015

Would this be a reasonable starting point?

diff --git a/namesys/interface.go b/namesys/interface.go
index 09c296c..b0b740d 100644
--- a/namesys/interface.go
+++ b/namesys/interface.go
@@ -47,6 +47,14 @@ const (
    // trust resolution to eventually complete and can't put an upper
    // limit on how many steps it will take.
    UnlimitedDepth = 0
+
+   // ImmutableTTL is the time-to-live value to be reported for immutable
+   // objects.
+   ImmutableTTL = 10 * 365 * 24 * time.Hour
+
+   // UnknownTTL is the time-to-live value to be reported for mutable
+   // paths when accurate information is not available.
+   UnknownTTL = time.Minute
 )

 // ErrResolveFailed signals an error when attempting to resolve.
@@ -88,7 +96,10 @@ type Resolver interface {
    // There is a default depth-limit to avoid infinite recursion.  Most
    // users will be fine with this default limit, but if you need to
    // adjust the limit you can use ResolveN.
-   Resolve(ctx context.Context, name string) (value path.Path, err error)
+   //
+   // Resolve also returns a time-to-live value which indicates the
+   // maximum amount of time the result may be cached.
+   Resolve(ctx context.Context, name string) (value path.Path, ttl time.Duration, err error)

    // ResolveN performs a recursive lookup, returning the dereferenced
    // path.  The only difference from Resolve is that the depth limit
@@ -97,7 +108,7 @@ type Resolver interface {
    //
    // Most users should use Resolve, since the default limit works well
    // in most real-world situations.
-   ResolveN(ctx context.Context, name string, depth int) (value path.Path, err error)
+   ResolveN(ctx context.Context, name string, depth int) (value path.Path, ttl time.Duration, err error)
 }

 // Publisher is an object capable of publishing particular names.
diff --git a/namesys/base.go b/namesys/base.go
index e552fce..29ef997 100644
--- a/namesys/base.go
+++ b/namesys/base.go
@@ -2,6 +2,7 @@ package namesys

 import (
    "strings"
+   "time"

    context "github.com/ipfs/go-ipfs/Godeps/_workspace/src/golang.org/x/net/context"

@@ -9,27 +10,36 @@ import (
 )

 type resolver interface {
-   // resolveOnce looks up a name once (without recursion).
-   resolveOnce(ctx context.Context, name string) (value path.Path, err error)
+   // resolveOnce looks up a name once (without recursion).  It also
+   // returns a time-to-live value which indicates the maximum amount of
+   // time the result may be cached.
+   resolveOnce(ctx context.Context, name string) (value path.Path, ttl time.Duration, err error)
 }

 // resolve is a helper for implementing Resolver.ResolveN using resolveOnce.
-func resolve(ctx context.Context, r resolver, name string, depth int, prefixes ...string) (path.Path, error) {
+func resolve(ctx context.Context, r resolver, name string, depth int, prefixes ...string) (path.Path, time.Duration, error) {
+   // Start with a long TTL.
+   ttl := ImmutableTTL
+
    for {
-       p, err := r.resolveOnce(ctx, name)
+       p, ttlOnce, err := r.resolveOnce(ctx, name)
+       // Use the lowest TTL reported by the resolveOnce invocations.
+       if ttlOnce < ttl {
+           ttl = ttlOnce
+       }
        if err != nil {
            log.Warningf("Could not resolve %s", name)
-           return "", err
+           return "", ttl, err
        }
-       log.Debugf("Resolved %s to %s", name, p.String())
+       log.Debugf("Resolved %s to %s (TTL %v -> %v)", name, p.String(), ttlOnce, ttl)

        if strings.HasPrefix(p.String(), "/ipfs/") {
            // we've bottomed out with an IPFS path
-           return p, nil
+           return p, ttl, nil
        }

        if depth == 1 {
-           return p, ErrResolveRecursion
+           return p, ttl, ErrResolveRecursion
        }

        matched := false
@@ -44,7 +54,7 @@ func resolve(ctx context.Context, r resolver, name string, depth int, prefixes .
        }

        if !matched {
-           return p, nil
+           return p, ttl, nil
        }

        if depth > 1 {

@whyrusleeping
Copy link
Member

@ion1 yeah, we could return a ttl with every resolution... I'm generally against more return types than 1+error, but this is likely a cleaner approach than others

@ion1
Copy link
Author

ion1 commented Oct 31, 2015

<jbenet> […] i'm overall -1 on returning the TTL on every Resolve call. i think that should be a separate function call.
<jbenet> most uses of Resolve do not care.
<jbenet> and its important to keep the interface as clean as we can possible have it.
<ion> jbenet: Fair enough. So it would be better to add a new function to the Resolver interface?
<jbenet> ion: yeah a new function SGTM.

I’m going to do that then.

ion1 referenced this issue in ion1/go-ipfs Nov 5, 2015
Closes ipfs/go-ipfs#1818.

License: MIT
Signed-off-by: Johan Kiviniemi <[email protected]>
ion1 referenced this issue in ion1/go-ipfs Nov 5, 2015
Closes ipfs/go-ipfs#1818.

License: MIT
Signed-off-by: Johan Kiviniemi <[email protected]>
ion1 referenced this issue in ion1/go-ipfs Nov 5, 2015
Closes ipfs/go-ipfs#1818.

License: MIT
Signed-off-by: Johan Kiviniemi <[email protected]>
ion1 referenced this issue in ion1/go-ipfs Nov 5, 2015
Closes ipfs/go-ipfs#1818.

License: MIT
Signed-off-by: Johan Kiviniemi <[email protected]>
ion1 referenced this issue in ion1/go-ipfs Nov 5, 2015
Closes ipfs/go-ipfs#1818.

License: MIT
Signed-off-by: Johan Kiviniemi <[email protected]>
ion1 referenced this issue in ion1/go-ipfs Nov 7, 2015
Closes ipfs/go-ipfs#1818.

License: MIT
Signed-off-by: Johan Kiviniemi <[email protected]>
ion1 referenced this issue in ion1/go-ipfs Nov 7, 2015
Closes ipfs/go-ipfs#1818.

License: MIT
Signed-off-by: Johan Kiviniemi <[email protected]>
ion1 referenced this issue in ion1/go-ipfs Nov 7, 2015
Closes ipfs/go-ipfs#1818.

License: MIT
Signed-off-by: Johan Kiviniemi <[email protected]>
ion1 referenced this issue in ion1/go-ipfs Nov 8, 2015
Closes ipfs/go-ipfs#1818.

License: MIT
Signed-off-by: Johan Kiviniemi <[email protected]>
@ion1 ion1 mentioned this issue Nov 9, 2015
11 tasks
ion1 referenced this issue in ion1/go-ipfs Nov 13, 2015
Closes ipfs/go-ipfs#1818.

License: MIT
Signed-off-by: Johan Kiviniemi <[email protected]>
@lidel
Copy link
Member

lidel commented Jan 2, 2019

Just for the record, this is still an open issue for /ipns/ responses returned by go-ipfs v0.4.18.

2019-01-02--14-08-56

Etag is present, but we are missing an explicit IPNS expiration header.

AFAIK it could be either:

  • Cache-Control header with max-age
  • or Expires header with a date

Why?
There is a Last-Modified header so most browsers will use "Heuristic Freshness" to determine how long to cache that asset for. This is a weak cache hint, as RFC7234 suggests something like (current time - last modified time) / 10, however, there are subtle differences between Firefox and Chrome.

Due to this we should provide explicit cache expiration headers for /ipns/ to unify experience across all vendors.

@magik6k magik6k added the help wanted Seeking public contribution on this issue label Jan 2, 2019
@magik6k
Copy link
Member

magik6k commented Jan 2, 2019

This will sort-of depend on ipfs/kubo#5884, as ipns supports dns resolution too. 'Real' IPNS records support TTL too, but AFAIK it's currently not used (basically it's set to 0).

@hacdias hacdias self-assigned this May 31, 2023
@hacdias hacdias transferred this issue from ipfs/kubo Jun 1, 2023
@ipfs ipfs deleted a comment from welcome bot Jun 1, 2023
@hacdias hacdias changed the title max-age, ETag for /ipns requests Gateway: max-age, ETag for /ipns requests Jun 1, 2023
@hacdias hacdias moved this from 🥞 Todo to 🔎 In Review in IPFS Shipyard Team Sep 5, 2023
@hacdias hacdias moved this from 🔎 In Review to 🏃‍♀️ In Progress in IPFS Shipyard Team Sep 5, 2023
@github-project-automation github-project-automation bot moved this from 🏃‍♀️ In Progress to 🎉 Done in IPFS Shipyard Team Oct 18, 2023
@BigLep BigLep mentioned this issue Nov 9, 2023
11 tasks
@lidel
Copy link
Member

lidel commented Mar 13, 2024

Reopening. We have spec at https://specs.ipfs.tech/http-gateways/path-gateway/#cache-control-response-header, but boxo/gateway does not implement it fully.

This is not fixed fully yet.

IPNS Records are handled correctly thanks to #459 but we don't set Cache-Control with max-age based on TTL as noted in #459 (comment). I remember we did not have TTL there because Golang standard library is not exposing access to original TTL of DNS record when resolver from operating system is used.

Remaining work here

  • Always return Cache-Control on /ipns/{dnslink} responses. Either correctly wire up TTL from DNS TXT record of DNSLink, or use 5 minute as default fallback when it is not available.
  • port Cache-Control implementation from routing/http/server: add cache control #584, where value is public, max-age=<ttl>, stale-while-revalidate=<dht-or-other-expiration-window>, stale-if-error=<dht-or-other-expiration-window>
  • update gateway-conformance tests to test for presence of cache-control on DNSLink response

@lidel lidel reopened this Mar 13, 2024
@lidel lidel added the kind/bug A bug in existing code (including security flaws) label Mar 13, 2024
@lidel lidel changed the title Gateway: max-age, ETag for /ipns requests Gateway: Cache-Control, max-age=ttl, ETag for /ipns requests Mar 13, 2024
@lidel lidel moved this from 🎉 Done to 🥞 Todo in IPFS Shipyard Team Mar 13, 2024
@2color
Copy link
Member

2color commented Mar 14, 2024

port Cache-Control implementation from #584, where value is public, max-age=, stale-while-revalidate=, stale-if-error=

For DNSLink, what would you recommend the stale-while-revalidate be, since it has nothing to do with the DHT?

@lidel
Copy link
Member

lidel commented Mar 14, 2024

Good question. Unless there is a better value, I suggest using Amino DHT expiration window (48h) in cases like this, where we need a sensible hard ceiling on caching mutable things.

Rationale: DNSLink resolves to some CID. Returning a cached HTTP response body is ok as long the the data behind that CID could be (in theory) retrievable from the public swarm provider. So if we count from the moment we resolved dnslink and successfully retrieved it, 48h is the max window we can assume the retrieval of the same payload would be successful.

@hacdias hacdias removed their assignment May 27, 2024
lidel added a commit that referenced this issue Jul 29, 2024
just a precaution, since we still have TTL=0 for many DNSLink responses
due to #329 (comment)
lidel added a commit that referenced this issue Jul 29, 2024
)

* fix(gateway): Cache-Control header for UnixFS directories
* fix: limit static cache-control to /ipfs

just a precaution, since we still have TTL=0 for many DNSLink responses
due to #329 (comment)

---------

Co-authored-by: Joshua Noble <[email protected]>
wenyue pushed a commit to wenyue/boxo that referenced this issue Oct 17, 2024
…pfs#643)

* fix(gateway): Cache-Control header for UnixFS directories
* fix: limit static cache-control to /ipfs

just a precaution, since we still have TTL=0 for many DNSLink responses
due to ipfs#329 (comment)

---------

Co-authored-by: Joshua Noble <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Seeking public contribution on this issue kind/bug A bug in existing code (including security flaws) topic/gateway Issues related to HTTP Gateway
Projects
No open projects
Status: 🥞 Todo
Development

Successfully merging a pull request may close this issue.

8 participants
@ion1 @jbenet @lidel @whyrusleeping @2color @magik6k @hacdias and others