Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
sql,log: new structured event
FirstQuery
This change aims to expose “time to first query”, a product metric that correlates with how easy it is for a user to start using CockroachDB. The naive approach to do this would be to store a boolean "has the first query been reported in telemetry yet" inside a system table. This naive approach is defective because it incurs a storage read-eval-write cycle with an access to KV storage, or disk, or a fan-out across the network, upon every query. This is an unacceptable performance overhead. Instead, we want the collection of this metric to have virtually no performance impact on the common case of workloads beyond the first query. To get to a better design, we need to better understand the requirement. A naive interpretation of the requirement would be to assume that we need the time to first query to be reported *exactly once*. Besides the fact that this is strictly impossible from a computer-theoretical perspective [1], an exactly-once approach would also incur unacceptable performance overhead, due to the need to synchronize between multiple nodes, when the first query is sent through a load balancer. So what can we do instead? In an initial implementation (#63812), we tried to *approximate* “exactly once” reporting with a design that combines “at least once” with a best effort of “at most once”. This was achieved using a cluster setting. That solution was storing the time of the first query inside a cluster setting, and relying on the auto-propagation of setting changes through the cluster to announce to other nodes that the first query had been reported already. The use of a cluster setting also takes care of persisting this information across node restarts, again towards the need to approximate “at most once”. After further discussion with stakeholders, it appears that there is no strong requirement to reach “at most once”. The data processing pipeline that consumes the "first query" event is only interested to trigger the first time the event is recognized for a given cluster and can be easily configured to ignore events reported after that. The only practical consideration is to minimize the amount of telemetry data (to ensure scalability over the CockroachCloud fleet), and so we are not interested to send a "first query" event for *every single query executed* either (obviously). Based on this new understanding, this commit simplifies the approach taken in #63812: it makes every CockroachDB *process* report the first SQL query it executes on the TELEMETRY channel. Whether the process has reported the event already or not is stored in a memory-only atomic variable. This approach technically has "at most once" semantics per node (an event can get lost if the logging output is stalled at the moment the event is reported), but approximates "at least once" semantics very well for multi-node clusters, or when nodes restart over time—as, presumably, the logging system will not be clogged all the time. There are two open questions in this approach however: - when a CockroachCloud cluster is initially configured by the CC intrusion control plane), there are “administrative” queries that are *not internal* but also *not user initiated*. With the current patch, the "first query" event will be reported for these administrative SQL queries. Is this desired? (NB: the same question applied to #63812) If not, we will need to agree on a reliable way to identify a user-initiated query. (A similar question exists when the user opens `cockroach sql`, but does not type anything in. The `cockroach sql` command sends queries in the background, and would trigger the "first query" event even without user interaction. Are we OK with this?) - this approach will cause a "first query" event to be logged *at least once per SQL pod* (when multiple pods are running side-by-side), and also *once after every pod restart*. To retrieve the “time **to** first query” (i.e. not “time **of** first query”), the consumer of these events must then compute the minimum timestamp of all the first query events received, and substract the timestamp of cluster creation. (The cluster creation timestamp can be obtained either via an external source, e.g. when the customer initiated the orchestration to create the cluster, or by retrieving the first ``node_restart`` event from the OPS logging channel.) Is this behavior acceptable? Of note: departure from "at most once" delivery has another advantage with regards to product telemetry: it teaches us something about the effective utilization of all the nodes in a cluster. If we know the cluster has 10 nodes but only 3 report "first query" events, that might teach us that the cluster is over-provisioned. [1] in a distributed system, an event can have guaranteed at-least-once delivery, or at-most-once, but never both. We also have the same constraint for CDC. Release note (cli change): CockroachDB now reports a `first_query` structured event on the TELEMETRY channel for the first SQL query executed on each server process.
- Loading branch information