Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

send longer cache headers by default #1725

Merged
merged 15 commits into from
Jul 20, 2018
Merged
Show file tree
Hide file tree
Changes from 7 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions lib/all-badge-examples.js
Original file line number Diff line number Diff line change
Expand Up @@ -2219,6 +2219,9 @@ function findCategory(wantedCategory) {

function loadExamples() {
loadServiceClasses().forEach(ServiceClass => {
if (ServiceClass.category === 'debug') {
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you leave a comment in the code about the intent of this check?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. Perhaps it should check an env var like ENABLE_DEBUG_SERVICES.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, but can we replace it with this?

if (!ServiceClass.examples) {

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That feels like a good check to make, but perhaps it should fail a test or issue a warning, unless category is debug.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally if we wanted to create a badge but not add it to the examples page (for some reason), it would have been sufficient to define:

static get examples() {
  return [];
}

The reason I have specifically added a check here is because I've defined this badge with a category 'debug' which doesn't exist in the examples. If we remove that check, the next line const category = findCategory(ServiceClass.category); throws Error: Unknown category debug referenced in Time. If I had (wrongly IMO) defined this badge as being in some other category like 'downloads', just returning [] from examples() would be fine and I wouldn't have added this check. We want some way to ensure we don't call findCategory() on a category that doesn't exist.

It is probably also useful if:

  • There is some way to explicitly declare that particular badges should be excluded from the examples (we also do this with flip, for example)
  • loadExamples() throws an exception if example() is not implemented at all on some class, as in most cases that probably represents an error or oversight.

I don't have particularly strong opinions on the structure or convention we choose to to do that. Thoughts? (Note that [] is 'truthy' in javascript.)

Perhaps it should check an env var like ENABLE_DEBUG_SERVICES

Remember we're assembling examples for the front-end here. I'm not sure it is too important that it is configurable whether we show this debug badge on the examples page (presumably you'd want that on in local/dev if anywhere)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha! It didn't register that this was in loadExamples.

Could you move the check into the block right below?

i.e.

if (category === undefined) {
  if (ServiceClass.category === 'debug') {
    return;
  } else {
    throw 
  }
}

}
const category = findCategory(ServiceClass.category);
if (category === undefined) {
throw Error(`Unknown category ${ServiceClass.category} referenced in ${ServiceClass.name}`);
Expand Down
16 changes: 11 additions & 5 deletions lib/request-handler.js
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
'use strict';

const envFlag = require('node-env-flag');
// eslint-disable-next-line node/no-deprecated-api
const domain = require('domain');
const request = require('request');
Expand Down Expand Up @@ -79,13 +80,18 @@ function handleRequest (makeBadge, handlerOptions) {
return (queryParams, match, end, ask) => {
const reqTime = new Date();

let maxAge = envFlag(process.env.LONG_CACHE, false) ? 900 : 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about making maxAge a handler option? That would simplify testing. Also I’d suggest we make the environment variable BADGE_MAX_AGE_SECONDS which would facilitate granular config changes without code changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said before default cache value (900 seconds -> 15 min) is too long in my opinion. Anything > 0 have a change to improve performance. Let's start with a really low value, e.g. 30s.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d suggest we make the environment variable BADGE_MAX_AGE_SECONDS which would facilitate granular config changes without code changes.

👍 Done.

As I said before default cache value (900 seconds -> 15 min) is too long in my opinion.

Making this change usefully abstracts the issue of what BADGE_MAX_AGE_SECONDS should be set to in production. If its not set, we default to 0 (sensible default for dev, staging or self-hosting). The source code itself is now not opinionated on this issue. Discussion of what value BADGE_MAX_AGE_SECONDS should be set to on our production servers can occur in a pull request to the badges/ServerScript repo modifying https://github.com/badges/ServerScript/blob/master/start-shields.sh once this code change is merged.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good. There's one caveat, which is there's been some discussion of me attempting a deploy without logs or shell access: #1742 (comment). I do have the ability to access the remote repo where the code is, though I am not sure where to find ServerScript.

if (queryParams.maxAge !== undefined && /^[0-9]+$/.test(queryParams.maxAge)) {
ask.res.setHeader('Cache-Control', 'max-age=' + queryParams.maxAge);
ask.res.setHeader('Expires', new Date(+reqTime + queryParams.maxAge * 1000).toGMTString());
} else {
// Cache management - no cache, so it won't be cached by GitHub's CDN.
// always let queryParams.maxAge override the default
// regardless of env.LONG_CACHE setting
maxAge = queryParams.maxAge;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should maxAge be the floor, at least until we have a use case that demands it? I’d hate to have everyone tuning this to a lower number on their own if we can centralize the tuning instead.

Copy link
Member

@RedSparr0w RedSparr0w Jul 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should limit the minimum maxAge for now,
maybe once we have a maxAge per type of badge.

We currently have no cache and I think that the few that will know maxAge=0 is an option shouldn't affect the service too much.
(I would assume most people just leave the badges default)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a really strong opinion either way on this - I'd be OK with either in principle.

If we set the cache too long and loads of users start adding ?maxAge=0 to work around it, that is going to be really hard to mop up once we've tuned it better as the problem becomes distributed. That leads me towards using a floor.

However one concern I would have with making the default the floor would be this scenario:

We set the default to 300 secs

As a user I call mybadge.svg?maxAge=600 and get back the header Cache-Control: max-age=600
Then I call mybadge.svg?maxAge=60 and get back the header Cache-Control: max-age=300

That's very unexpected behaviour that will manifest in a really subtle way. As an end user I'd be going out of my mind trying to debug that once I noticed it.. Is there a way we could make that scenario less confusing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we set the cache too long and loads of users start adding ?maxAge=0 to work around it, that is going to be really hard to mop up once we've tuned it better as the problem becomes distributed. That leads me towards using a floor.

That's the situation I'm concerned about, too.

That's very unexpected behaviour that will manifest in a really subtle way. As an end user I'd be going out of my mind trying to debug that once I noticed it.. Is there a way we could make that scenario less confusing?

Two less confusing options when the maxAge parameter is out of the configured range could be:

  • Return an error. Clear, but heavy handed and fairly unfriendly.
  • Return a 302 temporary redirect. At least if people are trying to debug and inspect closely what's happening, use curl, etc., they will see it very clearly. This makes two requests instead of one, which isn't great.

Something I could ask, which might help us decide: what is the use case for setting maxAge to a small value? Is it so a developer can debug a badge that seems slow to update, or is it because we want to let them opt out of our caching?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option could be two config settings: a default maxAge and a minimum maxAge. I'm leaning toward having a really small default but if down the line we decided to bump it up to 300 sec e.g., I could see letting people reduce it to a reasonable lower number like 30 or 60. Probably not zero though.

When both numbers are zero we end up thinking in circles. 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something I could ask, which might help us decide: what is the use case for setting maxAge to a small value?

Yes - this is a good question to think about.. As things stand right now, the default is no cache so the only reason someone is going to be using this right now is to set a long cache. No user is using this right now expecting it to make the cache shorter, so modifying the feature so that it can only make the cache longer (nor shorter) is broadly consistent with current expectations of what this feature is used for.

Also based on some highly scientific research (GitHub advanced search for "shields.io" "maxAge=" in markdown files), most of the usage of this in the wild is with large values (larger than the ones we're suggesting for a default), so we're unlikely to cause much/any unexpected behaviour by ignoring small values.

I think given that context, it is sensible not to allow users to manually set a shorter cache. In terms of signalling, lets start off by communicating this by documenting the limit and see what happens..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a nice idea to document the limit. However the approach here won't work well, because it's pulling the value from the build-time environment instead of the server environment. If that setting changed post-deploy (which is what we're trying to accomplish by making this env-configurable), the front end wouldn't update. I suppose you could say "the server's configured default" which gives a hint that there is one…

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good point. Lets keep it vague for now but at least provide a hint..

}
if (maxAge === 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m a little confused whether maxAge is a number or a string at this point. Should we coerce the query param?

ask.res.setHeader('Cache-Control', 'no-cache, no-store, must-revalidate');
ask.res.setHeader('Expires', reqTime.toGMTString()); // Proxies, GitHub, see #221.
ask.res.setHeader('Expires', reqTime.toGMTString());
} else {
ask.res.setHeader('Cache-Control', 'max-age=' + maxAge);
ask.res.setHeader('Expires', new Date(+reqTime + maxAge * 1000).toGMTString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason to set Expires along with Cache-Control: max-age? To my reading of the spec, Cache-Control: max-age takes priority.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's explained here #1651 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for that pointer!

From #1651 (comment):

It may be pointless in a perfect world but different implementations take different approach, there will always be someone not following the spec or maybe also not supporting Cache-Control but supporting Expires.

It seems like we're acting on a kind of superstition. It would be great to have an example of a client which does this, or a reliable report that this is true.

I'm fine leaving it in for now, though let's add a reference that stays with the code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Cache-Control header was defined as part of the HTTP/1.1 specification would be ignored by any HTTP client which implements the HTTP/1.0 standard but not HTTP/1.1. This is definitely an edge case and we should expect any modern web browser or downstream cache to implement HTTP/1.1 but we can't safely assume that only clients implementing HTTP/1.1 will connect to us. I've added a comment clarifying this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for explaining that!

}

ask.res.setHeader('Date', reqTime.toGMTString());
Expand Down
24 changes: 22 additions & 2 deletions lib/request-handler.spec.js
Original file line number Diff line number Diff line change
Expand Up @@ -33,9 +33,12 @@ describe('The request handler', function() {
before(analytics.load);

let camp;
const initialLongCache = process.env.LONG_CACHE;

beforeEach(function (done) {
camp = Camp.start({ port: config.port, hostname: '::' });
camp.on('listening', () => done());
process.env.LONG_CACHE = initialLongCache;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting this in an afterEach seems slightly safer. It’ll run even if there’s a crash in some of this camp setup.

});
afterEach(function (done) {
clearRequestCache();
Expand Down Expand Up @@ -105,15 +108,32 @@ describe('The request handler', function() {
expect(handlerCallCount).to.equal(1);
});

it('should set the expires header to current time', async function () {
it('should set the expires header to current time + 15 mins if LONG_CACHE on', async function () {
process.env.LONG_CACHE = 1;
const res = await fetch(`${baseUri}/testing/123.json`);
expect(res.headers.get('expires')).to.equal(res.headers.get('date'));
const expectedExpiry = new Date(+(new Date(res.headers.get('date'))) + 900000).toGMTString();
expect(res.headers.get('expires')).to.equal(expectedExpiry);
expect(res.headers.get('cache-control')).to.equal('max-age=900');
});

it('should set the expires header to current time + max-age', async function () {
const res = await fetch(`${baseUri}/testing/123.json?maxAge=3600`);
const expectedExpiry = new Date(+(new Date(res.headers.get('date'))) + 3600000).toGMTString();
expect(res.headers.get('expires')).to.equal(expectedExpiry);
expect(res.headers.get('cache-control')).to.equal('max-age=3600');
});

it('should set Cache-Control: no-cache, no-store, must-revalidate if maxAge=0', async function () {
const res = await fetch(`${baseUri}/testing/123.json?maxAge=0`);
expect(res.headers.get('expires')).to.equal(res.headers.get('date'));
expect(res.headers.get('cache-control')).to.equal('no-cache, no-store, must-revalidate');
});

it('should set Cache-Control: no-cache, no-store, must-revalidate if LONG_CACHE off', async function () {
process.env.LONG_CACHE = 0;
const res = await fetch(`${baseUri}/testing/123.json`);
expect(res.headers.get('expires')).to.equal(res.headers.get('date'));
expect(res.headers.get('cache-control')).to.equal('no-cache, no-store, must-revalidate');
});

describe('the cache key', function () {
Expand Down
34 changes: 34 additions & 0 deletions services/time/time.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
'use strict';

const BaseService = require('../base');

module.exports = class Time extends BaseService {

async handle({library}) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this key library could be removed.

return { message: new Date() };
}

// Metadata
static get defaultBadgeData() {
return {
label: 'time',
color: 'blue',
};
}

static get category() {
return 'debug';
}

static get url() {
return {
base: 'servertime',
format: '',
capture: []
};
}

static get examples() {
return [];
}
};