Skip to content

Commit

Permalink
Browse files Browse the repository at this point in the history
29692: ui: various glue fixes r=vilterp,couchand a=benesch

This PR restores the "no UI installed" message in short binaries:

![image](https://user-images.githubusercontent.com/882976/45196553-e713b880-b22a-11e8-928a-06c7a2da0f63.png)

I came across a few minor nits that seemed worth fixing too.

30135: sql: add '2.0' setting for distsql r=jordanlewis a=jordanlewis

The 2.0 setting for distsql (both a cluster setting and a session
setting) instructs the executor to use the 2.0 method of determining how
to execute a query: the query runs via local sql unless the query is
both distributable and recommended to be distributed, in which case it
runs via the distsql and is actually distributed.

Release note (sql change): add the '2.0' value for both the distsql
session setting and the sql.defaults.distsql cluster setting, which
instructs the database to use the 2.0 'auto' behavior for determining
whether queries run via distsql or not.

30148: storage: add new metrics for the RaftEntryCache r=nvanbenschoten a=nvanbenschoten

Four new metrics are introduced:
- `raft.entrycache.bytes`
- `raft.entrycache.size`
- `raft.entrycache.accesses`
- `raft.entrycache.hits`

30163: kv: Don't evict from leaseholder cache on context cancellations r=a-robinson a=a-robinson

This was a major contributor to the hundreds of NotLeaseHolderErrors per
second that we see whenever we run tpc-c at scale. A non-essential batch
request like a QueryTxn would get cancelled, causing the range to be
evicted from the leaseholder cache and the next request to that range to
have to guess at the leaseholder.

This is effectively an extension of #26764 that we should have thought to inspect more closely at the time.

Actually fixes #23543, which was not fully fixed before. Although I still haven't seen the errors drop all the way to 0, so I'm letting tpc-c 10k continue to run for a while longer to verify that they do. They are continuing to decrease about 15 minutes in. I don't think getting to 0 will be possible because there are still occasional splits and lease transfers), but it looks like it should be able to get down to single digit errors per second from the hundreds it was at before this change.

Also, avoid doing unnecessary sorting by latency of replicas in the dist_sender in the common case when we know who the leaseholder is and plan on sending our request there.

30197: sql/parser: fix the action for empty rules r=knz a=knz

Fixes #30141.


Co-authored-by: Nikhil Benesch <[email protected]>
Co-authored-by: Jordan Lewis <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Alex Robinson <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
  • Loading branch information
6 people committed Sep 13, 2018
6 parents 33c7d27 + 9333e4f + 096bb41 + 5a13ea2 + 397409c + 9f1e6a9 commit 62c8984
Show file tree
Hide file tree
Showing 22 changed files with 389 additions and 175 deletions.
2 changes: 1 addition & 1 deletion docs/generated/settings/settings.html
Original file line number Diff line number Diff line change
Expand Up @@ -54,7 +54,7 @@
<tr><td><code>server.shutdown.query_wait</code></td><td>duration</td><td><code>10s</code></td><td>the server will wait for at least this amount of time for active queries to finish</td></tr>
<tr><td><code>server.time_until_store_dead</code></td><td>duration</td><td><code>5m0s</code></td><td>the time after which if there is no new gossiped information about a store, it is considered dead</td></tr>
<tr><td><code>server.web_session_timeout</code></td><td>duration</td><td><code>168h0m0s</code></td><td>the duration that a newly created web session will be valid</td></tr>
<tr><td><code>sql.defaults.distsql</code></td><td>enumeration</td><td><code>1</code></td><td>default distributed SQL execution mode [off = 0, auto = 1, on = 2]</td></tr>
<tr><td><code>sql.defaults.distsql</code></td><td>enumeration</td><td><code>1</code></td><td>default distributed SQL execution mode [off = 0, auto = 1, on = 2, 2.0-off = 3, 2.0-auto = 4]</td></tr>
<tr><td><code>sql.defaults.optimizer</code></td><td>enumeration</td><td><code>1</code></td><td>default cost-based optimizer mode [off = 0, on = 1, local = 2]</td></tr>
<tr><td><code>sql.defaults.serial_normalization</code></td><td>enumeration</td><td><code>0</code></td><td>default handling of SERIAL in table definitions [rowid = 0, virtual_sequence = 1, sql_sequence = 2]</td></tr>
<tr><td><code>sql.distsql.distribute_index_joins</code></td><td>boolean</td><td><code>true</code></td><td>if set, for index joins we instantiate a join reader on every node that has a stream; if not set, we use a single join reader</td></tr>
Expand Down
9 changes: 9 additions & 0 deletions pkg/build/info.go
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,15 @@ func (b Info) Short() string {
b.Distribution, b.Tag, plat, b.Time, b.GoVersion)
}

// GoTime parses the utcTime string and returns a time.Time.
func (b Info) GoTime() time.Time {
val, err := time.Parse(TimeFormat, b.Time)
if err != nil {
return time.Time{}
}
return val
}

// Timestamp parses the utcTime string and returns the number of seconds since epoch.
func (b Info) Timestamp() (int64, error) {
val, err := time.Parse(TimeFormat, b.Time)
Expand Down
48 changes: 26 additions & 22 deletions pkg/kv/dist_sender.go
Original file line number Diff line number Diff line change
Expand Up @@ -433,23 +433,26 @@ func (ds *DistSender) sendSingleRange(
// Try to send the call.
replicas := NewReplicaSlice(ds.gossip, desc)

// Rearrange the replicas so that they're ordered in expectation of
// request latency.
var latencyFn LatencyFunc
if ds.rpcContext != nil {
latencyFn = ds.rpcContext.RemoteClocks.Latency
}
replicas.OptimizeReplicaOrder(ds.getNodeDescriptor(), latencyFn)

// If this request needs to go to a lease holder and we know who that is, move
// it to the front.
var knowLeaseholder bool
if !ba.IsReadOnly() || ba.ReadConsistency.RequiresReadLease() {
if storeID, ok := ds.leaseHolderCache.Lookup(ctx, desc.RangeID); ok {
if i := replicas.FindReplica(storeID); i >= 0 {
replicas.MoveToFront(i)
knowLeaseholder = true
}
}
}
if !knowLeaseholder {
// Rearrange the replicas so that they're ordered in expectation of
// request latency.
var latencyFn LatencyFunc
if ds.rpcContext != nil {
latencyFn = ds.rpcContext.RemoteClocks.Latency
}
replicas.OptimizeReplicaOrder(ds.getNodeDescriptor(), latencyFn)
}

br, err := ds.sendRPC(ctx, desc.RangeID, replicas, ba)
if err != nil {
Expand Down Expand Up @@ -1321,20 +1324,21 @@ func (ds *DistSender) sendToReplicas(
ambiguousError = err
}
log.VErrEventf(ctx, 2, "RPC error: %s", err)
if storeID, ok := ds.leaseHolderCache.Lookup(ctx, rangeID); ok && curReplica.StoreID == storeID {
// If the down replica is cached as the lease holder, evict
// it. The only other eviction happens below on
// NotLeaseHolderError, but if the next replica is the
// actual lease holder, we're never going to receive one of
// those and will thus pay the price of trying the down node
// first forever.
//
// NB: we could consider instead adding a successful reply
// from the next replica into the cache, but without a
// leaseholder (and taking into account that the local
// node can't be down) it won't take long until we talk
// to a replica that tells us who the leaseholder is.
ds.leaseHolderCache.Update(ctx, rangeID, 0 /* evict */)

// If the error wasn't just a context cancellation and the down replica
// is cached as the lease holder, evict it. The only other eviction
// happens below on NotLeaseHolderError, but if the next replica is the
// actual lease holder, we're never going to receive one of those and
// will thus pay the price of trying the down node first forever.
//
// NB: we should consider instead adding a successful reply from the next
// replica into the cache, but without a leaseholder (and taking into
// account that the local node can't be down) it won't take long until we
// talk to a replica that tells us who the leaseholder is.
if ctx.Err() == nil {
if storeID, ok := ds.leaseHolderCache.Lookup(ctx, rangeID); ok && curReplica.StoreID == storeID {
ds.leaseHolderCache.Update(ctx, rangeID, 0 /* evict */)
}
}
} else {
propagateError := false
Expand Down
5 changes: 3 additions & 2 deletions pkg/kv/dist_sender_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -269,9 +269,9 @@ func TestSendRPCOrder(t *testing.T) {
{
args: &roachpb.PutRequest{},
tiers: append(nodeTiers[5], roachpb.Tier{Key: "irrelevant", Value: ""}),
// Compare only the first resulting addresses as we have a lease holder
// Compare only the first resulting address as we have a lease holder
// and that means we're only trying to send there.
expReplica: []roachpb.NodeID{2, 5, 4, 0, 0},
expReplica: []roachpb.NodeID{2, 0, 0, 0, 0},
leaseHolder: 2,
},
// Inconsistent Get without matching attributes but lease holder (node 3). Should just
Expand Down Expand Up @@ -335,6 +335,7 @@ func TestSendRPCOrder(t *testing.T) {
ds := NewDistSender(cfg, g)

for n, tc := range testCases {
log.Infof(context.TODO(), "testcase %d", n)
verifyCall = makeVerifier(tc.expReplica)

{
Expand Down
16 changes: 7 additions & 9 deletions pkg/server/authentication.go
Original file line number Diff line number Diff line change
Expand Up @@ -356,19 +356,17 @@ const webSessionIDKeyStr = "webSessionID"

func (am *authenticationMux) ServeHTTP(w http.ResponseWriter, req *http.Request) {
username, cookie, err := am.getSession(w, req)
if err != nil && !am.allowAnonymous {
if err == nil {
ctx := req.Context()
ctx = context.WithValue(ctx, webSessionUserKey{}, username)
ctx = context.WithValue(ctx, webSessionIDKey{}, cookie.ID)
req = req.WithContext(ctx)
} else if !am.allowAnonymous {
log.Infof(req.Context(), "Web session error: %s", err)
http.Error(w, "a valid authentication cookie is required", http.StatusUnauthorized)
return
}

newCtx := context.WithValue(req.Context(), webSessionUserKey{}, username)
if cookie != nil {
newCtx = context.WithValue(newCtx, webSessionIDKey{}, cookie.ID)
}
newReq := req.WithContext(newCtx)

am.inner.ServeHTTP(w, newReq)
am.inner.ServeHTTP(w, req)
}

func encodeSessionCookie(sessionCookie *serverpb.SessionCookie) (*http.Cookie, error) {
Expand Down
66 changes: 17 additions & 49 deletions pkg/server/server.go
Original file line number Diff line number Diff line change
Expand Up @@ -18,9 +18,7 @@ import (
"compress/gzip"
"context"
"crypto/tls"
"encoding/json"
"fmt"
"html/template"
"io"
"io/ioutil"
"math"
Expand All @@ -33,7 +31,6 @@ import (
"sync/atomic"
"time"

assetfs "github.com/elazarl/go-bindata-assetfs"
raven "github.com/getsentry/raven-go"
gwruntime "github.com/grpc-ecosystem/grpc-gateway/runtime"
opentracing "github.com/opentracing/opentracing-go"
Expand All @@ -42,7 +39,6 @@ import (

"github.com/cockroachdb/cmux"
"github.com/cockroachdb/cockroach/pkg/base"
"github.com/cockroachdb/cockroach/pkg/build"
"github.com/cockroachdb/cockroach/pkg/gossip"
"github.com/cockroachdb/cockroach/pkg/internal/client"
"github.com/cockroachdb/cockroach/pkg/jobs"
Expand Down Expand Up @@ -1236,17 +1232,19 @@ func (s *Server) Start(ctx context.Context) error {
// endpoints.
s.mux.Handle(debug.Endpoint, debug.NewServer(s.st))

fileServer := http.FileServer(&assetfs.AssetFS{
Asset: ui.Asset,
AssetDir: ui.AssetDir,
AssetInfo: ui.AssetInfo,
})

// Serve UI assets. This needs to be before the gRPC handlers are registered, otherwise
// the `s.mux.Handle("/", ...)` would cover all URLs, allowing anonymous access.
maybeAuthMux := newAuthenticationMuxAllowAnonymous(
s.authentication, serveUIAssets(fileServer, s.cfg),
)
s.authentication, ui.Handler(ui.Config{
ExperimentalUseLogin: s.cfg.EnableWebSessionAuthentication,
LoginEnabled: s.cfg.RequireWebSession(),
GetUser: func(ctx context.Context) *string {
if u, ok := ctx.Value(webSessionUserKey{}).(string); ok {
return &u
}
return nil
},
}))
s.mux.Handle("/", maybeAuthMux)

// Initialize grpc-gateway mux and context in order to get the /health
Expand Down Expand Up @@ -1896,6 +1894,8 @@ func (s *Server) PGServer() *pgwire.Server {
return s.pgServer
}

// TODO(benesch): Use https://github.com/NYTimes/gziphandler instead.
// gzipResponseWriter reinvents the wheel and is not as robust.
type gzipResponseWriter struct {
gz gzip.Writer
http.ResponseWriter
Expand All @@ -1918,6 +1918,11 @@ func (w *gzipResponseWriter) Reset(rw http.ResponseWriter) {
}

func (w *gzipResponseWriter) Write(b []byte) (int, error) {
// The underlying http.ResponseWriter can't sniff gzipped data properly, so we
// do our own sniffing on the uncompressed data.
if w.Header().Get("Content-Type") == "" {
w.Header().Set("Content-Type", http.DetectContentType(b))
}
return w.gz.Write(b)
}

Expand All @@ -1944,43 +1949,6 @@ func (w *gzipResponseWriter) Close() error {
return err
}

func serveUIAssets(fileServer http.Handler, cfg Config) http.Handler {
return http.HandlerFunc(func(writer http.ResponseWriter, request *http.Request) {
if request.URL.Path != "/" {
fileServer.ServeHTTP(writer, request)
return
}

// Construct arguments for template.
tmplArgs := ui.IndexHTMLArgs{
ExperimentalUseLogin: cfg.EnableWebSessionAuthentication,
LoginEnabled: cfg.RequireWebSession(),
Tag: build.GetInfo().Tag,
Version: build.VersionPrefix(),
}
loggedInUser, ok := request.Context().Value(webSessionUserKey{}).(string)
if ok && loggedInUser != "" {
tmplArgs.LoggedInUser = &loggedInUser
}

argsJSON, err := json.Marshal(tmplArgs)
if err != nil {
http.Error(writer, err.Error(), 500)
}

// Execute the template.
writer.Header().Add("Content-Type", "text/html")
if err := ui.IndexHTMLTemplate.Execute(writer, map[string]template.JS{
"DataFromServer": template.JS(string(argsJSON)),
}); err != nil {
wrappedErr := errors.Wrap(err, "templating index.html")
http.Error(writer, wrappedErr.Error(), 500)
log.Error(request.Context(), wrappedErr)
return
}
})
}

func init() {
tracing.RegisterTagRemapping("n", "node")
}
89 changes: 66 additions & 23 deletions pkg/server/server_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ import (
"io/ioutil"
"net/http"
"net/url"
"os"
"path/filepath"
"reflect"
"testing"
Expand All @@ -43,6 +44,7 @@ import (
"github.com/cockroachdb/cockroach/pkg/storage/engine"
"github.com/cockroachdb/cockroach/pkg/testutils"
"github.com/cockroachdb/cockroach/pkg/testutils/serverutils"
"github.com/cockroachdb/cockroach/pkg/ui"
"github.com/cockroachdb/cockroach/pkg/util/hlc"
"github.com/cockroachdb/cockroach/pkg/util/httputil"
"github.com/cockroachdb/cockroach/pkg/util/leaktest"
Expand Down Expand Up @@ -858,6 +860,17 @@ func TestServeIndexHTML(t *testing.T) {
</html>
`

linkInFakeUI := func() {
ui.Asset = func(string) (_ []byte, _ error) { return }
ui.AssetDir = func(name string) (_ []string, _ error) { return }
ui.AssetInfo = func(name string) (_ os.FileInfo, _ error) { return }
}
unlinkFakeUI := func() {
ui.Asset = nil
ui.AssetDir = nil
ui.AssetInfo = nil
}

t.Run("Insecure mode", func(t *testing.T) {
s, _, _ := serverutils.StartServer(t, base.TestServerArgs{
Insecure: true,
Expand All @@ -874,32 +887,62 @@ func TestServeIndexHTML(t *testing.T) {
t.Fatal(err)
}

resp, err := client.Get(s.AdminURL())
if err != nil {
t.Fatal(err)
}
if resp.StatusCode != 200 {
t.Fatalf("expected status code 200; got %d", resp.StatusCode)
}
respBytes, err := ioutil.ReadAll(resp.Body)
if err != nil {
t.Fatal(err)
}
respString := string(respBytes)
expected := fmt.Sprintf(
htmlTemplate,
fmt.Sprintf(
`{"ExperimentalUseLogin":false,"LoginEnabled":false,"LoggedInUser":null,"Tag":"%s","Version":"%s"}`,
build.GetInfo().Tag,
build.VersionPrefix(),
),
)
if respString != expected {
t.Fatalf("expected %s; got %s", expected, respString)
}
t.Run("short build", func(t *testing.T) {
resp, err := client.Get(s.AdminURL())
if err != nil {
t.Fatal(err)
}
if resp.StatusCode != 200 {
t.Fatalf("expected status code 200; got %d", resp.StatusCode)
}
respBytes, err := ioutil.ReadAll(resp.Body)
if err != nil {
t.Fatal(err)
}
respString := string(respBytes)
expected := fmt.Sprintf(`<!DOCTYPE html>
<title>CockroachDB</title>
Binary built without web UI.
<hr>
<em>%s</em>`,
build.GetInfo().Short())
if respString != expected {
t.Fatalf("expected %s; got %s", expected, respString)
}
})

t.Run("non-short build", func(t *testing.T) {
linkInFakeUI()
defer unlinkFakeUI()
resp, err := client.Get(s.AdminURL())
if err != nil {
t.Fatal(err)
}
if resp.StatusCode != 200 {
t.Fatalf("expected status code 200; got %d", resp.StatusCode)
}
respBytes, err := ioutil.ReadAll(resp.Body)
if err != nil {
t.Fatal(err)
}
respString := string(respBytes)
expected := fmt.Sprintf(
htmlTemplate,
fmt.Sprintf(
`{"ExperimentalUseLogin":false,"LoginEnabled":false,"LoggedInUser":null,"Tag":"%s","Version":"%s"}`,
build.GetInfo().Tag,
build.VersionPrefix(),
),
)
if respString != expected {
t.Fatalf("expected %s; got %s", expected, respString)
}
})
})

t.Run("Secure mode", func(t *testing.T) {
linkInFakeUI()
defer unlinkFakeUI()
s, _, _ := serverutils.StartServer(t, base.TestServerArgs{})
defer s.Stopper().Stop(context.TODO())
tsrv := s.(*TestServer)
Expand Down
Loading

0 comments on commit 62c8984

Please sign in to comment.