Support leader freshness with None consistency #614

Merged 1 commit on Jan 3, 2020
16 changes: 13 additions & 3 deletions DOC/CONSISTENCY.md
@@ -1,15 +1,24 @@
 # Read Consistency

-Even though serving queries does not require consensus (because the database is not changed), [queries should generally be served by the leader](https://github.com/rqlite/rqlite/issues/5). Why is this? Because, without this check, queries on a node could return results that are significantly out-of-date. This could happen for one, or both, of the following two reasons:
+Even though serving queries does not require consensus (because the database is not changed), [queries should generally be served by the leader](https://github.com/rqlite/rqlite/issues/5). Why is this? Because, without this check, queries on a node could return results that are out-of-date. This could happen for one, or both, of the following two reasons:

 * The node, while still part of the cluster, has fallen behind the leader in terms of updates to its underlying database.
 * The node is no longer part of the cluster, and has stopped receiving Raft log updates.

 This is why rqlite offers selectable read consistency levels of _none_, _weak_, and _strong_. Each is explained below.

-With _none_, the node simply queries its local SQLite database, and does not even check if it is the leader. This offers the fastest query response, but suffers from the potential issues listed above. _Weak_ instructs the node to check that it is the leader, before querying the local SQLite file. Checking leader state only involves checking local state, so is still very fast. There is, however, a very small window of time (milliseconds by default) during which the node may return stale data. This is because after the leader check, but before the local SQLite database is read, another node could be elected leader and make changes to the cluster. As result the node may not be quite up-to-date with the rest of cluster.
+## None
+With _none_, the node simply queries its local SQLite database, and does not even check if it is the leader. This offers the fastest query response, but suffers from the potential issues listed above.

-To avoid even this last possibility, rqlite also offers _strong_. In this mode, rqlite sends the query through the Raft consensus system, ensuring that the node remains the leader at all times during query processing. However, this will involve the leader contacting at least a quorum of nodes, and will therefore increase query response times.
+You can tell the node not to return results (effectively) older than a certain time, however. If the read request sets the query parameter `freshness` to a [Go duration string](https://golang.org/pkg/time/#Duration), the node will check that less time has passed since it was last in contact with the leader than the duration specified via `freshness`. `freshness` is ignored for all consistency levels except _none_, and is also ignored if set to zero.
+
+If you decide to deploy [read-only nodes](https://github.com/rqlite/rqlite/blob/master/DOC/READ_ONLY_NODES.md) however, _none_ combined with `freshness` can be quite effective.
+
+## Weak
+_Weak_ instructs the node to check that it is the leader, before querying the local SQLite file. Checking leader state only involves checking local state, so is still very fast. There is, however, a very small window of time (milliseconds by default) during which the node may return stale data. This is because after the leader check, but before the local SQLite database is read, another node could be elected leader and make changes to the cluster. As a result, the node may not be quite up-to-date with the rest of the cluster.
+
+## Strong
+To avoid even the issues associated with _weak_ consistency, rqlite also offers _strong_. In this mode, rqlite sends the query through the Raft consensus system, ensuring that the node remains the leader at all times during query processing. However, this will involve the leader contacting at least a quorum of nodes, and will therefore increase query response times.

_Weak_ is probably sufficient for most applications, and is the default read consistency level. To explicitly select consistency, set the query param `level` to the desired level.

@@ -18,6 +27,7 @@ Examples of enabling each read consistency level for a simple query is shown bel

 ```bash
 curl -G 'localhost:4001/db/query?level=none' --data-urlencode 'q=SELECT * FROM foo'
+curl -G 'localhost:4001/db/query?level=none&freshness=1s' --data-urlencode 'q=SELECT * FROM foo'
 curl -G 'localhost:4001/db/query?level=weak' --data-urlencode 'q=SELECT * FROM foo'
 curl -G 'localhost:4001/db/query' --data-urlencode 'q=SELECT * FROM foo' # Same as weak
 curl -G 'localhost:4001/db/query?level=strong' --data-urlencode 'q=SELECT * FROM foo'
 ```
29 changes: 25 additions & 4 deletions http/service.go
@@ -642,19 +642,25 @@ func (s *Service) handleQuery(w http.ResponseWriter, r *http.Request) {

 	isTx, err := isTx(r)
 	if err != nil {
-		http.Error(w, err.Error(), http.StatusInternalServerError)
+		http.Error(w, err.Error(), http.StatusBadRequest)
 		return
 	}

 	timings, err := timings(r)
 	if err != nil {
-		http.Error(w, err.Error(), http.StatusInternalServerError)
+		http.Error(w, err.Error(), http.StatusBadRequest)
 		return
 	}

 	lvl, err := level(r)
 	if err != nil {
-		http.Error(w, err.Error(), http.StatusInternalServerError)
+		http.Error(w, err.Error(), http.StatusBadRequest)
 		return
 	}

+	frsh, err := freshness(r)
+	if err != nil {
+		http.Error(w, err.Error(), http.StatusBadRequest)
+		return
+	}
+
@@ -665,7 +671,7 @@ func (s *Service) handleQuery(w http.ResponseWriter, r *http.Request) {
 		return
 	}

-	results, err := s.store.Query(&store.QueryRequest{queries, timings, isTx, lvl})
+	results, err := s.store.Query(&store.QueryRequest{queries, timings, isTx, lvl, frsh})
 	if err != nil {
 		if err == store.ErrNotLeader {
 			leader := s.leaderAPIAddr()
@@ -911,6 +917,21 @@ func level(req *http.Request) (store.ConsistencyLevel, error) {
 	}
 }

+// freshness returns any freshness requested with a query.
+func freshness(req *http.Request) (time.Duration, error) {
+	q := req.URL.Query()
+	f := strings.TrimSpace(q.Get("freshness"))
+	if f == "" {
+		return 0, nil
+	}
+
+	d, err := time.ParseDuration(f)
+	if err != nil {
+		return 0, err
+	}
+	return d, nil
+}
+
 // backupFormat returns the request backup format, setting the response header
 // accordingly.
func backupFormat(w http.ResponseWriter, r *http.Request) (store.BackupFormat, error) {
20 changes: 14 additions & 6 deletions store/store.go
@@ -29,6 +29,10 @@ var (
 	// operation.
 	ErrNotLeader = errors.New("not leader")

+	// ErrStaleRead is returned if executing the query would violate the
+	// requested freshness.
+	ErrStaleRead = errors.New("stale read")
+
 	// ErrOpenTimeout is returned when the Store does not apply its initial
 	// logs within the specified time.
 	ErrOpenTimeout = errors.New("timeout waiting for initial logs application")
@@ -80,10 +84,11 @@ func init() {
 // QueryRequest represents a query that returns rows, and does not modify
 // the database.
 type QueryRequest struct {
-	Queries []string
-	Timings bool
-	Tx      bool
-	Lvl     ConsistencyLevel
+	Queries   []string
+	Timings   bool
+	Tx        bool
+	Lvl       ConsistencyLevel
+	Freshness time.Duration
 }

 // ExecuteRequest represents a query that returns no rows, but does modify
@@ -565,8 +570,11 @@ func (s *Store) Query(qr *QueryRequest) ([]*sql.Rows, error) {
 		return nil, ErrNotLeader
 	}

-	r, err := s.db.Query(qr.Queries, qr.Tx, qr.Timings)
-	return r, err
+	// Read straight from database.
+	if qr.Freshness > 0 && time.Since(s.raft.LastContact()) > qr.Freshness {
+		return nil, ErrStaleRead
+	}
+	return s.db.Query(qr.Queries, qr.Tx, qr.Timings)
 }

// Join joins a node, identified by id and located at addr, to this store.