Changing back the default redirect code from 301 to 302. #1619

Merged
merged 1 commit into from
Aug 23, 2017

Conversation

Eihrister
Contributor

A permanent redirect can not be changed remotely once it has been sent.

Permanent redirects (301 / 308) are exactly what their descriptive
name suggests. A 301 or 308 will be indefinitely cached client-side,
sometimes even when a user clears their cache.

They should be used to keep search engine page rankings by sending a
permanent redirect on a page by page basis, not for site to site or
protocols (http to https).

Especially on a new Grav installation, the default should not be 301.

My reasoning is that with the first visit to a new installation, before
even doing anything else, the 301 is already permanently cached in the
browser.

Additionally, temporary redirects can be influenced remotely and are not
that expensive on the webserver nowadays with event-based handlers.

@rhukster rhukster merged commit 667c434 into getgrav:develop Aug 23, 2017
@mahagr
Member

mahagr commented Aug 25, 2017

👍 Thanks for doing this. I've meant to start a discussion about this myself, as default 301 redirects make testing painful until you change the setting. In fact, changing the redirect to 302 has been my first task on every test site I've had.

@jrpub

jrpub commented Aug 27, 2017

@Eihrister

not for site to site or protocols (http to https).

Why should 301 not be used for site-to-site (www to non-www) or protocol (http to https) redirects? Is there a reason, or did I misunderstand? Thanks.

@Eihrister
Contributor Author

Eihrister commented Aug 27, 2017

First of all a nice overview of the 4 current main status codes:

| | Permanent | Temporary |
|---|---|---|
| Allows changing the request method from POST to GET | 301 | 302 |
| Doesn't allow changing the request method from POST to GET | 308 | 307 |
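
As a rough illustration of the table above, here is a minimal Python sketch that picks a status code from the two properties. The `pick_redirect` helper and its parameter names are made up for this example and are not part of Grav; only the `http.HTTPStatus` constants come from the standard library.

```python
from http import HTTPStatus

# Keyed by (permanent, keep_method), mirroring the table:
# "keep_method" means the client must NOT rewrite POST to GET.
_REDIRECTS = {
    (True, False): HTTPStatus.MOVED_PERMANENTLY,    # 301
    (False, False): HTTPStatus.FOUND,               # 302
    (False, True): HTTPStatus.TEMPORARY_REDIRECT,   # 307
    (True, True): HTTPStatus.PERMANENT_REDIRECT,    # 308
}


def pick_redirect(permanent: bool, keep_method: bool) -> HTTPStatus:
    """Return the redirect status matching the two properties (hypothetical helper)."""
    return _REDIRECTS[(permanent, keep_method)]


# A cautious default in the spirit of this PR: temporary, method may change.
default_status = pick_redirect(permanent=False, keep_method=False)
print(default_status.value, default_status.phrase)
```
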

@jrpub:

As I wrote in my original comment, once a "301 Moved Permanently" has been sent, it cannot be changed remotely. As the dictionary suggests, permanent is, well, without end. Permanent redirects are pretty much permanently cached in a lot of browsers (with some exceptions that clear them on browser exit).

The original (and current) intent of a "301 Moved Permanently"

The original intent was to let browsers (and later search engines) know that an article/page had moved to a new location, for example when a website is being restructured and its URLs are changing. That is what it is still used for with search engines today: they update their search results for the pages they have indexed.

This is very important, because site indexers only have a limited budget of time per website to index content. Plus, if they update the resource URL in their cache because of the 301, you get to keep your page rankings and what not.

A tale of cost

Once upon a time.... there were old webservers that were not event-based, dial-up lines and what not.
The thing was that a simple request would block the webserver and hold up other requests. So even sending a temporary redirect was costly on the server end. On top of that, the user's internet line was slower, so having to fetch a redirect status was costly for them as well.

Nowadays, such requests are not blocking anymore. nginx can literally handle north of a million requests per second with barely any CPU on virtually any modern system (I was maxing out my network link, not the system). This is due to the newer event-based model. Apache also has an event worker nowadays, which is capable of very high figures as well.

Internet lines have become faster too (not just in bandwidth, but in latency as well).
So on grounds of cost, for either side, choosing a 301 over a 302 is no longer that relevant.

(Not) Staying in control

Choosing the default of 302 over 301 in Grav has to do with control. When you first set up your website, you might not be entirely sure about its structure (domain names, redirects) yet. But once a 301 has been issued, it's literally impossible to influence remotely.

Some control can be had while setting it, but not once the damage is done. If you were accidentally sending someone from example.com to example.net, but you meant to send them to www.example.org, there is no chance you'll ever be able to convince the browser of your user/customer to come back to the correct site. You just lost a user, potentially for a very, very long time.

There is also no reason to do site-to-site redirection with a 301. See the cost section above. Plus, once a user is on the new site, they might keep using that instead anyway. And if not, they won't really care that it's costing them one extra round trip to the webserver.

You are trading a 301 for 302 and with one extra request you gain fine-grained control instead of uncontrollably permanently cached redirects. Worth it.

http:// to https://

You do not need a redirect at all after the first visit of a user to your http:// site if you're serving https://.
Let alone a "301 Moved Permanently". You don't know if it moved permanently. Maybe someone changes their mind later and will still want to offer their static content over http:// in countries where https:// is illegal.

My point is, it's much better to use current day technologies for this. There is Strict Transport Security, which is a hundred times more useful for this.

Setting this header on your https:// site (browsers ignore it when it arrives over plain http://) lets the browser know that your current site (and optionally all subdomains) is only accessible over https://. From then on, it will not matter if the user types http://example.com; the browser will not even bother trying to fetch the http:// site. It will assume it was typed as https://example.com instead and will not return to the http:// version for as long as the policy is cached.

However, as opposed to a "301 Moved Permanently", it can be influenced remotely:

  • The header is cached for the duration given in the max-age part of the header. Every time the browser visits the website and receives a new HSTS header, it resets the remaining duration to the new max-age.
  • Following on from the previous point, setting max-age to 0 effectively removes the policy and thereby undoes the forced http:// to https:// upgrade. Something you are not able to do with a 301.
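
To make the two points above concrete, here is a small Python sketch that builds the header value. The `hsts_header_value` function is a hypothetical helper for illustration; the header name `Strict-Transport-Security` and the `max-age` and `includeSubDomains` directives are the standard ones.

```python
HSTS_HEADER_NAME = "Strict-Transport-Security"


def hsts_header_value(max_age_seconds: int, include_subdomains: bool = False) -> str:
    """Build the value of an HSTS header (hypothetical helper)."""
    if max_age_seconds < 0:
        raise ValueError("max-age must be zero or positive")
    value = f"max-age={max_age_seconds}"
    if include_subdomains:
        value += "; includeSubDomains"
    return value


# Enforce https:// for one year, including subdomains.
enable = hsts_header_value(365 * 24 * 60 * 60, include_subdomains=True)

# Sent later over https://, this tells the browser to drop the cached policy.
disable = hsts_header_value(0)

print(f"{HSTS_HEADER_NAME}: {enable}")
print(f"{HSTS_HEADER_NAME}: {disable}")
```

Starting with a short max-age and raising it once everything works keeps the policy easy to back out of.
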

Is the link still used?

If you are issuing 301s, the browser will not return to your webserver for that link until the redirect has been removed from the browser cache. That in turn means that the request will not show up in your application logs, either.

This makes it more difficult to see when old redirects can be removed, as you won't be able to tell how often they're still being used. With temporary redirects, you will keep seeing them in the logs.
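
A minimal sketch of what that log check could look like in Python, assuming access logs in Common Log Format. The `count_redirect_hits` helper, the sample lines, and the paths in them are invented for illustration.

```python
import re
from collections import Counter

# Pulls the request path and status code out of a Common Log Format line.
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def count_redirect_hits(lines, statuses=("302", "307")):
    """Count how often each path was answered with a temporary redirect."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") in statuses:
            hits[match.group("path")] += 1
    return hits


# Hypothetical log lines; real logs would be read from a file instead.
sample = [
    '203.0.113.7 - - [27/Aug/2017:10:00:00 +0000] "GET /old-page HTTP/1.1" 302 0',
    '203.0.113.8 - - [27/Aug/2017:10:05:00 +0000] "GET /old-page HTTP/1.1" 302 0',
    '203.0.113.9 - - [27/Aug/2017:10:06:00 +0000] "GET /new-page HTTP/1.1" 200 5120',
]

for path, count in count_redirect_hits(sample).most_common():
    print(path, count)
```

A redirect whose count stays at zero over a long enough period is a candidate for removal.
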


Note: Same goes for 308 and 307 (respectively the newer variants of 301 and 302).

@rhukster
Member

Great write up! 👍

@jrpub

jrpub commented Aug 27, 2017

Good summary! Playing devil's advocate:

I was just wondering whether it will affect the ranking of some websites if people simply set up a redirect and it goes out as a 302. For me, culturally I would say, 301 is the default when I speak about redirects. Furthermore, there is still a lot of debate about how search engines behave with 302 redirects.

  • I would like to know the number of websites/people forcing HTTPS with HSTS :) Most people just do a redirect

  • Google seems to have confirmed (last January I believe, so it is very new) that there is no loss of page rank across a 302 redirect, but this may not be the case for other search engines, so 301 is safer from this point of view

  • I think the risk of duplicate content and/or redirected pages not being removed from search results might be bigger than the risk you describe. Even if big G confirmed that there is no loss of PR with 302, I am pretty sure that a 302 will stay indexed a lot longer (it would be quite logical...). A website dropping in search results because the default redirects are 302 would be very unfortunate (and more critical, I think, than the browser-cache problem).

  • It might be better to tackle the problem by setting a Cache-Control header on the 301 response, with some low default values?

Still so much controversy regarding 301/302. No perfect solution!

@Eihrister
Contributor Author

Eihrister commented Aug 27, 2017

  • HSTS is required to get a good ranking on things like the SSL Labs server test. It is widely supported, and whether or not other people use it, YOU can and should! Why should what other website owners do matter here?

  • I did mention that 301s were better for keeping search engine rankings, and I said that in my first comment as well. I explicitly mentioned when they are not a good idea and why. It seems you are correct, and I still held an old-world view when it comes to page ranking. But that, in turn, also means 301s are now really completely irrelevant.

Setting a proper caching header for 301s is difficult, and not all browsers used to handle them well (maybe they do now, but why risk it?).
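
For reference, the idea under discussion (a 301 whose caching is capped by a Cache-Control header) could be sketched in Python like this. The `redirect_response` helper and the target URL are hypothetical, and whether every browser honours the header on a 301 is exactly the open question raised above.

```python
from http import HTTPStatus


def redirect_response(location, status=HTTPStatus.FOUND, max_age=None):
    """Return (status line, headers) for a redirect (hypothetical helper).

    When max_age is given, a Cache-Control header asks clients to cache
    the redirect for at most that many seconds.
    """
    headers = [("Location", location)]
    if max_age is not None:
        headers.append(("Cache-Control", f"max-age={max_age}"))
    return f"{status.value} {status.phrase}", headers


# A 301 that clients are asked to forget after one hour.
status_line, headers = redirect_response(
    "https://example.org/new", HTTPStatus.MOVED_PERMANENTLY, max_age=3600
)
print(status_line)
for name, value in headers:
    print(f"{name}: {value}")
```
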

Again: this PR was to change the default behaviour. Nothing is stopping you from using a 301. If you don't like better control over your website, or prefer to use outdated technologies instead of those designed specifically for the purpose (HSTS), that's entirely up to you... :)
