Merge pull request #290 from justincjohnson/feat/gateway-redirects
IPIP: Gateway _redirects File
lidel authored Sep 23, 2022
2 parents 7e9612d + 33f4f44 commit 16cc443
Showing 5 changed files with 291 additions and 0 deletions.
77 changes: 77 additions & 0 deletions IPIP/0002-gateway-redirects-file.md
@@ -0,0 +1,77 @@
# IPIP 0002: _redirects File Support on Web Gateways

- Start Date: 2022-06-15
- Related Issues:
  - [ipfs/specs/issues/257](https://github.com/ipfs/specs/issues/257)
  - [ipfs/kubo/pull/8890](https://github.com/ipfs/kubo/pull/8890)
  - [ipfs/ipfs-docs/pull/1275](https://github.com/ipfs/ipfs-docs/pull/1275)

## Summary

Provide support for URL redirects and rewrites for web sites hosted on Subdomain or DNSLink Gateways, thus enabling support for [single-page applications (SPAs)](https://en.wikipedia.org/wiki/Single-page_application), and avoiding [link rot](https://en.wikipedia.org/wiki/Link_rot) when moving to IPFS-backed hosting.

## Motivation

Web sites often need to redirect from one URL to another, for example, to change the appearance of a URL, to change where content is located without breaking existing links (see [Cool URIs don't change](https://www.w3.org/Provider/Style/URI), [link rot](https://en.wikipedia.org/wiki/Link_rot)), to redirect invalid URLs to a pretty 404 page, or to enable URL rewriting.
URL rewriting in particular is a critical feature for hosting SPAs, allowing routing logic to be handled by front-end code. SPA support is the primary impetus for this RFC.

Currently the only way to handle URL redirects or rewrites is with additional software such as NGINX sitting in front of the Gateway. This software introduces operational complexity and decreases the uniformity of experience when navigating to content hosted on a Gateway, thus decreasing the value proposition of hosting web sites in IPFS.

This IPIP proposes the introduction of redirect support for content hosted on Subdomain or DNSLink Gateways, configured via a `_redirects` file residing underneath the root CID of the web site.

## Detailed design

Allow developers to configure redirect support by adding redirect rules to a file named `_redirects` stored underneath the root CID of their web site.
The format for this file is similar to those of [Netlify](https://docs.netlify.com/routing/redirects/#syntax-for-the-redirects-file) and [Cloudflare Pages](https://developers.cloudflare.com/pages/platform/redirects), but it supports only a subset of their functionality.

The format for the file is `from to [status]`.

- `from` - specifies the path to intercept (can include placeholders and a trailing splat)
- `to` - specifies the path or URL to redirect to (can include placeholders or splat matched in `from`)
- `status` - optional [HTTP status code](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status) (301 if not specified)

Rules in the file are evaluated top to bottom.

For performance reasons this proposal does not include forced redirect support (i.e. redirect rules that are evaluated even if the `from` path exists). In other words, redirect logic will be evaluated if and only if the requested path does not exist. If the requested path exists, we won't even check for the existence of the `_redirects` file.

If a `_redirects` file exists but cannot be processed (for example, because it fails to parse), the errors will be returned to the user viewing the site via the Gateway.

The detailed specification is added in [`http-gateways/REDIRECTS_FILE.md`](../http-gateways/REDIRECTS_FILE.md).

### Test fixtures
QmQyqMY5vUBSbSxyitJqthgwZunCQjDVtNd8ggVCxzuPQ4

See spec for testing details.

## Design rationale

Popular services today such as [Netlify](https://docs.netlify.com/routing/redirects/#syntax-for-the-redirects-file) and [Cloudflare Pages](https://developers.cloudflare.com/pages/platform/redirects) allow developers to configure redirect support
using a `_redirects` file hosted at the top level of the web site. While we do not intend to provide all of the same functionality, it seems desirable to use a similar approach to provide a meaningful subset of the functionality offered by these services.

- The format is simple and low on syntax
- Many developers are already familiar with this file name and format
- Using a text file for configuration enables developers to make changes without using other IPFS tools
- The configuration can be easily versioned in both version control systems and IPFS by virtue of the resulting change to the root CID for the content

### User benefit

Provides general URL redirect and rewrite support, which enables three important features:
1. Developers will be able to host single-page applications in IPFS.
2. The same configuration file can be used to set up pretty 404 pages.
3. The cost of switching hosting of an existing website to IPFS is lowered by making it possible to keep all legacy URLs working.

### Compatibility

If by some chance developers are already hosting sites that contain a `_redirects` file that does something else, they may need to update the contents of the file to match the new functionality. Errors returned to the user due to parsing errors will guide them regarding the required updates.

### Alternatives

- There was some discussion early on about a [manifest file](https://github.com/ipfs/specs/issues/257) that could be used to configure redirect support in addition to many other things. While the idea of a manifest file has merit, manifest files are much larger in scope and it became challenging to reach agreement on functionality to include.
There is already a large need for redirect support for SPAs, and this proposal allows us to provide that critical functionality without being hampered by further design discussion around manifest files.
In addition, similar to how Netlify allows redirect support to be configured in either a `_redirects` file or a more general [configuration file](https://docs.netlify.com/configure-builds/file-based-configuration/#redirects), there is nothing precluding IPFS from allowing developers to configure redirect support in an app manifest later on.
- There was some discussion with the [n0](https://github.com/n0-computer/) team about potential ways to improve the performance of retrieving metadata such as redirect rules, possibly including it as metadata with the root CID such that it would be included with the request for the CID to begin with.
I believe the performance concerns are alleviated by not providing forced redirect support, and looking for `_redirects` only if the DAG is missing a requested path. Nevertheless, if a more generic metadata facility were to be introduced in the future, it may make sense to reconsider how redirect rules are specified.

### Copyright

Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
5 changes: 5 additions & 0 deletions http-gateways/DNSLINK_GATEWAY.md
@@ -36,6 +36,7 @@ In short:
- [HTTP Response](#http-response)
- [Appendix: notes for implementers](#appendix-notes-for-implementers)
- [Leveraging DNS for content routing](#leveraging-dns-for-content-routing)
- [Redirects, single-page applications, and custom 404s](#redirects-single-page-applications-and-custom-404s)

# HTTP API

@@ -98,3 +99,7 @@ Same as [HTTP Response section in `PATH_GATEWAY.md`](./PATH_GATEWAY.md#http-resp
TXT records with known content providers for the data behind a DNSLink. IPFS
clients will be able to detect DNSAddr and preconnect to known content
providers, removing the need for expensive DHT lookup.

## Redirects, single-page applications, and custom 404s

DNSLink Gateway implementations are free to include `_redirects` file support defined in [`REDIRECTS_FILE.md`](./REDIRECTS_FILE.md).
1 change: 1 addition & 0 deletions http-gateways/README.md
@@ -40,3 +40,4 @@ model](https://en.wikipedia.org/wiki/Same-origin_policy).

* [SUBDOMAIN_GATEWAY.md](./SUBDOMAIN_GATEWAY.md)
* [DNSLINK_GATEWAY.md](./DNSLINK_GATEWAY.md)
* [REDIRECTS_FILE.md](./REDIRECTS_FILE.md)
204 changes: 204 additions & 0 deletions http-gateways/REDIRECTS_FILE.md
@@ -0,0 +1,204 @@
# `_redirects` File Specification

![draft](https://img.shields.io/badge/status-draft-yellow.svg?style=flat-square)

**Authors**:

- Justin Johnson ([@justincjohnson](https://github.com/justincjohnson))

----

**Abstract**

The Redirects File specification is an extension of the Subdomain Gateway and DNSLink Gateway specifications.

Developers can enable URL redirects or rewrites by adding redirect rules to a file named `_redirects` stored underneath the root CID of their web site.

This can be used, for example, to enable URL rewriting for hosting a single-page application, to redirect invalid URLs to a pretty 404 page, or to avoid [link rot](https://en.wikipedia.org/wiki/Link_rot) when moving to IPFS-based website hosting.

# Table of Contents

- [File Name and Location](#file-name-and-location)
- [File Format](#file-format)
  - [From](#from)
  - [To](#to)
  - [Status](#status)
  - [Placeholders](#placeholders)
    - [Splat](#splat)
    - [Comments](#comments)
    - [Line Termination](#line-termination)
    - [Max File Size](#max-file-size)
- [Evaluation](#evaluation)
  - [Subdomain or DNSLink Gateways](#subdomain-or-dnslink-gateways)
  - [Order](#order)
  - [No Forced Redirects](#no-forced-redirects)
- [Error Handling](#error-handling)
- [Security](#security)
- [Appendix: notes for implementors](#appendix-notes-for-implementors)
  - [Test fixtures](#test-fixtures)

# File Name and Location

The Redirects File MUST be named `_redirects` and stored underneath the root CID of the web site.

# File Format

The Redirects File MUST be a text file containing one or more lines with the following format (brackets indicate optionality).

```
from to [status]
```

## From

The path to redirect from.

## To

The URL or path to redirect to.

## Status

An optional integer specifying the HTTP status code to return from the request. Supported values are:

- `200` - OK
  - Redirect will be treated as a rewrite, returning OK without changing the URL in the browser.
- `301` - Moved Permanently (default)
- `302` - Found (commonly used for Temporary Redirect)
- `303` - See Other (replacing PUT and POST with GET)
- `307` - Temporary Redirect (explicitly preserving body and HTTP method of original request)
- `308` - Permanent Redirect (explicitly preserving body and HTTP method of original request)
- `404` - Not Found
  - Useful for redirecting invalid URLs to a pretty 404 page.
- `410` - Gone
- `451` - Unavailable For Legal Reasons
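
As an illustration of the format and status rules above, a single rule line could be parsed roughly as follows. This is a minimal sketch, not the reference implementation: the `Rule` type and `parse_rule` helper are hypothetical names, and splitting fields on whitespace is an assumption, since the field separator is not spelled out here.

```python
from typing import NamedTuple

# Status codes listed in the Status section above.
SUPPORTED_STATUSES = {200, 301, 302, 303, 307, 308, 404, 410, 451}
DEFAULT_STATUS = 301  # used when the optional status field is omitted


class Rule(NamedTuple):
    source: str  # the `from` field (`from` is a reserved word in Python)
    to: str
    status: int


def parse_rule(line: str) -> Rule:
    """Parse one `from to [status]` line (hypothetical helper).

    Assumption: fields are separated by runs of whitespace.
    """
    fields = line.split()
    if len(fields) not in (2, 3):
        raise ValueError(f"expected 'from to [status]', got: {line!r}")
    status = DEFAULT_STATUS
    if len(fields) == 3:
        if not (fields[2].isascii() and fields[2].isdigit()):
            raise ValueError(f"status is not an integer: {fields[2]!r}")
        status = int(fields[2])
    if status not in SUPPORTED_STATUSES:
        raise ValueError(f"unsupported status code: {status}")
    return Rule(fields[0], fields[1], status)
```

For instance, `parse_rule("/home /index.html")` yields a rule with the default status `301`, while a line carrying a status outside the list above is rejected.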

## Placeholders

Placeholders are named variables that can be used to match path segments in the `from` path and inject them into the `to` path.

For example:

```
/posts/:month/:day/:year/:slug /articles/:year/:month/:day/:slug
```

This rule will redirect a URL like `/posts/06/15/2022/hello-world` to `/articles/2022/06/15/hello-world`.
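
The placeholder behaviour described above can be sketched as segment-by-segment matching. The helper names `match_placeholders` and `substitute` are hypothetical, and ignoring leading and trailing slashes when comparing segments is an assumption of this sketch rather than something the specification states.

```python
from typing import Dict, Optional


def match_placeholders(pattern: str, path: str) -> Optional[Dict[str, str]]:
    """Return placeholder bindings if `path` matches `pattern`, else None."""
    # Assumption: leading and trailing slashes are not significant.
    pattern_segments = pattern.strip("/").split("/")
    path_segments = path.strip("/").split("/")
    if len(pattern_segments) != len(path_segments):
        return None
    bindings: Dict[str, str] = {}
    for expected, actual in zip(pattern_segments, path_segments):
        if expected.startswith(":"):
            bindings[expected[1:]] = actual  # named placeholder captures the segment
        elif expected != actual:
            return None  # literal segment mismatch
    return bindings


def substitute(target: str, bindings: Dict[str, str]) -> str:
    """Replace each `:name` segment of `target` with its captured value."""
    return "/".join(
        bindings.get(segment[1:], segment) if segment.startswith(":") else segment
        for segment in target.split("/")
    )
```

With the rule above, matching `/posts/06/15/2022/hello-world` binds `month`, `day`, `year`, and `slug`, and substituting those bindings into `/articles/:year/:month/:day/:slug` gives `/articles/2022/06/15/hello-world`.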

### Splat

If a `from` path ends with an asterisk (i.e. `*`), the remainder of the `from` path is slurped up into the special `:splat` placeholder, which can then be injected into the `to` path.

For example:

```
/posts/* /articles/:splat
```

This rule will redirect a URL like `/posts/2022/06/15/hello-world` to `/articles/2022/06/15/hello-world`.

Splat logic MUST only apply to a single trailing asterisk, as this is a greedy match, consuming the remainder of the path.
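
A minimal sketch of trailing-splat matching, assuming the part of `from` before the asterisk is treated as a literal prefix. The `match_splat` and `apply_splat` helpers are hypothetical names used only for illustration.

```python
from typing import Optional


def match_splat(pattern: str, path: str) -> Optional[str]:
    """Return the text captured by a trailing `*`, or None if there is no match."""
    if not pattern.endswith("*"):
        return None  # this sketch only handles patterns with a trailing splat
    prefix = pattern[:-1]
    if not path.startswith(prefix):
        return None
    return path[len(prefix):]  # greedy: everything after the prefix


def apply_splat(target: str, splat: str) -> str:
    """Inject the captured remainder into the `to` path."""
    return target.replace(":splat", splat)
```

Applied to the rule above, `match_splat("/posts/*", "/posts/2022/06/15/hello-world")` captures `2022/06/15/hello-world`, which `apply_splat` then injects into `/articles/:splat`.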

### Comments

Any line beginning with `#` will be treated as a comment and ignored at evaluation time.

For example:

```
# Redirect home to index.html
/home /index.html 301
```

is functionally equivalent to

```
/home /index.html 301
```

### Line Termination

Lines MUST be terminated by either `\n` or `\r\n`.
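
The comment and line-termination rules can be combined when splitting the file into rule lines. The sketch below uses a hypothetical `rule_lines` helper; skipping blank lines and tolerating leading whitespace before `#` are assumptions, not requirements stated above.

```python
from typing import List


def rule_lines(text: str) -> List[str]:
    """Split file content into rule lines, dropping comments."""
    lines = []
    for raw in text.split("\n"):
        line = raw.strip()  # also removes the `\r` of a `\r\n` terminator
        if not line:
            continue  # assumption: blank lines are ignored
        if line.startswith("#"):
            continue  # comment line, ignored at evaluation time
        lines.append(line)
    return lines
```

Both variants of the comment example above, with or without the `# Redirect home to index.html` line, produce the same single rule line under this sketch.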

### Max File Size

The file size MUST NOT exceed 64 KiB.
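
A size check along these lines could run before any parsing. The constant and function names are hypothetical; 64 KiB is 64 × 1024 = 65,536 bytes.

```python
MAX_REDIRECTS_FILE_SIZE = 64 * 1024  # 64 KiB = 65,536 bytes


def check_size(content: bytes) -> None:
    """Raise ValueError if the raw file content exceeds the size limit."""
    if len(content) > MAX_REDIRECTS_FILE_SIZE:
        raise ValueError(
            f"_redirects is {len(content)} bytes, "
            f"limit is {MAX_REDIRECTS_FILE_SIZE} bytes"
        )
```

Checking the byte length before decoding or parsing keeps the work done on an oversized file to a minimum.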

# Evaluation

## Subdomain or DNSLink Gateways

Rules MUST only be evaluated when hosted on a Subdomain or DNSLink Gateway, so that we have [Same-Origin](https://en.wikipedia.org/wiki/Same-origin_policy) isolation.

## Order

Rules MUST be evaluated in order, redirecting or rewriting using the first matching rule.
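
First-match-wins evaluation can be sketched as a simple loop. To stay short, this hypothetical `first_match` helper only understands exact paths and a trailing splat; placeholder matching is left out.

```python
from typing import List, Optional, Tuple

Rule = Tuple[str, str, int]  # (from, to, status)


def first_match(rules: List[Rule], path: str) -> Optional[Rule]:
    """Return the first rule whose `from` matches `path`, or None."""
    for rule in rules:
        pattern = rule[0]
        if pattern.endswith("*"):
            if path.startswith(pattern[:-1]):
                return rule  # stop at the first match; later rules are ignored
        elif pattern == path:
            return rule
    return None
```

Because evaluation stops at the first match, a catch-all such as `/* /index.html 200` placed at the top of the file would shadow every rule below it, which is why it normally goes last.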

## No Forced Redirects

All redirect logic MUST only be evaluated if the requested path is not present in the DAG. This means that any performance impact associated with checking for the existence of a Redirects File or evaluating redirect rules will only be incurred for non-existent paths.
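
The control flow implied by this rule might look like the sketch below. `handle_request` and its parameters are hypothetical: `existing_paths` stands in for a lookup in the DAG, `evaluate_redirects` stands in for loading and evaluating the Redirects File, and returning 404 when no rule matches is an assumption.

```python
from typing import Callable, Optional, Set, Tuple


def handle_request(
    path: str,
    existing_paths: Set[str],
    evaluate_redirects: Callable[[str], Optional[Tuple[str, int]]],
) -> Tuple[str, int]:
    """Return (path to serve or redirect to, HTTP status) for a request."""
    if path in existing_paths:
        # Path is present in the DAG: serve it without consulting `_redirects`.
        return path, 200
    result = evaluate_redirects(path)  # only reached for non-existent paths
    if result is None:
        return path, 404  # assumption: no matching rule means a plain 404
    return result
```

The important property is that `evaluate_redirects` is never called for a path that exists, so existing content pays no cost for redirect support.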

# Error Handling

If the Redirects File exists but there is an error reading or parsing it, the errors MUST be returned to the user with a 500 HTTP status code.
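
One way to surface parse errors to the user is to turn them into a 500 response. The sketch below is illustrative only: `parse_redirects` is a deliberately tiny stand-in parser, `redirects_error_response` is a hypothetical helper, and the wording of the error message is made up.

```python
from typing import List, Optional, Tuple


def parse_redirects(text: str) -> List[Tuple[str, str, int]]:
    """Tiny parser used only for this sketch; raises ValueError on bad lines."""
    rules = []
    for number, raw in enumerate(text.split("\n"), start=1):
        fields = raw.split()
        if not fields or fields[0].startswith("#"):
            continue  # blank line or comment
        if len(fields) not in (2, 3):
            raise ValueError(f"line {number}: expected 'from to [status]'")
        status = 301
        if len(fields) == 3:
            try:
                status = int(fields[2])
            except ValueError:
                raise ValueError(f"line {number}: status is not an integer") from None
        rules.append((fields[0], fields[1], status))
    return rules


def redirects_error_response(text: str) -> Optional[Tuple[int, str]]:
    """Return a (500, message) response if the file cannot be parsed, else None."""
    try:
        parse_redirects(text)
    except ValueError as err:
        return 500, f"could not parse _redirects: {err}"
    return None
```

Including the line number in the message gives site authors a concrete pointer to the rule that needs fixing.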

# Security

This functionality will only be evaluated for Subdomain or DNSLink Gateways, to ensure that redirect paths are relative to the root CID hosted at the specified domain name.

Parsing of the `_redirects` file should be done safely to prevent any sort of injection vector or daemon crash.

The [max file size](#max-file-size) helps to prevent an additional [denial of service attack](https://en.wikipedia.org/wiki/Denial-of-service_attack) vector.

# Appendix: notes for implementors

## Test fixtures

Sample files for various test cases can be found in `QmQyqMY5vUBSbSxyitJqthgwZunCQjDVtNd8ggVCxzuPQ4`.
Implementations are free to use it for internal testing.

```
$ ipfs ls QmQyqMY5vUBSbSxyitJqthgwZunCQjDVtNd8ggVCxzuPQ4
QmcBcFnKKqgpCVMxxGsriw9ByTVF6uDdKDMuEBq3m6f1bm - bad-codes/
QmYBhLYDwVFvxos9h8CGU2ibaY66QNgv8hpfewxaQrPiZj - examples/
QmU7ysGXwAtiV7aBarZASJsxKoKyKmd9Xrz2FFamSCbg8S - forced/
QmWHn2TunA1g7gQ7q9rwAoWuot2hMpojZ6cZ9ERsNKm5gE - good-codes/
QmRgpzYQESidTtTojN8zRWjiNs9Cy6o7KHRxh7kDpJm3KH - invalid/
QmYzMrtPyBv7LKiEAGLLRPtvqm3SjQYLWxwWQ2vnpxQwRd - newlines/
QmQTfvjGmvTfxFpUcZNLdTLuKV227KJkGiN6xooHVeVZAS - too-large/
```

For example, the "examples" site can be found in `QmYBhLYDwVFvxos9h8CGU2ibaY66QNgv8hpfewxaQrPiZj`.

```
$ ipfs ls /ipfs/QmYBhLYDwVFvxos9h8CGU2ibaY66QNgv8hpfewxaQrPiZj
Qmd9GD7Bauh6N2ZLfNnYS3b7QVAijbud83b8GE8LPMNBBP 7 404.html
QmSmR9NShZ89VEBrn9SBy7Xxvjw8Qe6XArD5GqtHvbtBM3 7 410.html
QmVQqj9oZig9tH3ENHo4bxV5pNgssUwFCXUjAJAVcZVbJG 7 451.html
QmZU3kboiyi9jV59D8Mw8wzuvsr3HmvskqhYRRhdFA8wRq 317 _redirects
QmaWDLb4gnJcJbT1Df5X3j91ysiwkkyxw6329NLiC1KMDR - articles/
QmS6ZNKE9s8fsHoEnArsZXnzMWijKddhXXDsAev8LdTT5z 9 index.html
QmNwEgMrExwSsE8DCjZjahYfHUfkSWRhtqSkQUh4Fk3udD 7 one.html
QmVe2GcTbEPZkMbjVoQ9YieVGKCHmuHMcJ2kbSCzuBKh2s - redirected-splat/
QmUGVnZaofnd5nEDvT2bxcFck7rHyJRbpXkh9znjrJNV92 7 two.html
```

The `_redirects` file is as follows.

```
$ ipfs cat /ipfs/QmYBhLYDwVFvxos9h8CGU2ibaY66QNgv8hpfewxaQrPiZj/_redirects
/redirect-one /one.html
/301-redirect-one /one.html 301
/302-redirect-two /two.html 302
/200-index /index.html 200
/posts/:year/:month/:day/:title /articles/:year/:month/:day/:title 301
/splat/* /redirected-splat/:splat 301
/not-found/* /404.html 404
/gone/* /410.html 410
/unavail/* /451.html 451
/* /index.html 200
```

Requests for non-existent paths should be intercepted and redirected to the destination path, returning the specified HTTP status code. The rules are evaluated in the order they appear in the file.

Any request for an existing file should be returned as is, and not intercepted by the final catch-all rule.
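
To make the expected behaviour concrete, the following sketch evaluates a few requests against the `_redirects` content shown above. It is an illustration under stated assumptions, not the reference implementation: the `parse`, `match`, and `resolve` helpers are hypothetical, `EXISTING` lists only the top-level files from the listing above, and returning 404 when nothing matches is assumed.

```python
from typing import Dict, List, Optional, Tuple

REDIRECTS = """\
/redirect-one /one.html
/301-redirect-one /one.html 301
/302-redirect-two /two.html 302
/200-index /index.html 200
/posts/:year/:month/:day/:title /articles/:year/:month/:day/:title 301
/splat/* /redirected-splat/:splat 301
/not-found/* /404.html 404
/gone/* /410.html 410
/unavail/* /451.html 451
/* /index.html 200
"""

# Top-level files from the listing above; directories are omitted for brevity.
EXISTING = {"/404.html", "/410.html", "/451.html", "/_redirects",
            "/index.html", "/one.html", "/two.html"}


def parse(text: str) -> List[Tuple[str, str, int]]:
    rules = []
    for line in text.split("\n"):
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue
        status = int(fields[2]) if len(fields) == 3 else 301
        rules.append((fields[0], fields[1], status))
    return rules


def match(pattern: str, path: str) -> Optional[Dict[str, str]]:
    """Bind placeholders and a trailing splat; None means no match."""
    pat, segs = pattern.split("/"), path.split("/")
    bindings: Dict[str, str] = {}
    for i, p in enumerate(pat):
        if p == "*" and i == len(pat) - 1:
            bindings["splat"] = "/".join(segs[i:])  # greedy remainder
            return bindings
        if i >= len(segs):
            return None
        if p.startswith(":"):
            bindings[p[1:]] = segs[i]
        elif p != segs[i]:
            return None
    return bindings if len(pat) == len(segs) else None


def resolve(path: str) -> Tuple[str, int]:
    if path in EXISTING:
        return path, 200  # existing content is never intercepted
    for pattern, target, status in parse(REDIRECTS):
        bindings = match(pattern, path)
        if bindings is not None:  # an empty dict is still a match
            segments = target.split("/")
            return "/".join(bindings.get(s[1:], s) if s.startswith(":") else s
                            for s in segments), status
    return path, 404  # assumption: no rule matched
```

Under this sketch, `/splat/a/b` resolves to `/redirected-splat/a/b` with status 301, an unknown path such as `/app/route` falls through to the catch-all and is rewritten to `/index.html` with status 200, and `/one.html` is served as is.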
4 changes: 4 additions & 0 deletions http-gateways/SUBDOMAIN_GATEWAY.md
@@ -50,6 +50,7 @@ Summary:
- [DNS label limits](#dns-label-limits)
- [Security considerations](#security-considerations)
- [URI router](#uri-router)
- [Redirects, single-page applications, and custom 404s](#redirects-single-page-applications-and-custom-404s)

# HTTP API

@@ -269,3 +270,6 @@ which in turn should redirect to

From there, regular subdomain gateway logic applies.

## Redirects, single-page applications, and custom 404s

Subdomain Gateway implementations are free to include `_redirects` file support defined in [`REDIRECTS_FILE.md`](./REDIRECTS_FILE.md).
