You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
vtgate performs random tablet selection (with preference for local cell tablets) when deciding which tablet to use for serving a query. On larger MySQL setups, where there might be a high variance in replication lag between replica nodes, this behaviour can lead to
data inconsistencies that is hard to understand and work around.
Here's an example:
Let's suppose we have 2 tables, check_suites and check_runs.
A check_suites row has many associated check_runs rows.
A row in check_runs can only be created after creating the parent check_suites row.
Data is never deleted from check_suites.
Based on this description, an application that runs against a "regular" MySQL primary/replica setup, where connections are load balanced using e.g. haproxy can rely on the fact that it can read from check_runs, and every row will have a reference to a check_suites row that is guaranteed to exist.
The application will have a "time consistent" view of the data, where a read query will always see data at the same point of a previous query, or a later point in time.
When sending requests through vtgate, that expectation is broken if queries are distributed across more than one replica. 😔
The first query that is reading from check_runs might hit replica A, which has almost no replication lag. The second query, reading from check_suites, might hit replica B, which has over a second of replication lag. This can lead to the second query not being able to find any matching row in the check_suites table, which in turn can lead to errors or other weird behaviour in our application.
The application no longer has a "time consistent" view of the data, instead read queries see data at newer and older points in time.
The text was updated successfully, but these errors were encountered:
I'm not sure this would solve the problem we're facing in a satisfactory way. I might be wrong on this, so here's what I think the problems with transaction reads would be.
First, it's unclear what parts of our codebase need to be wrapped in transactions, and with a codebase as large as ours, it'd be a herculean effort to figure that out.
We could wrap every web request that comes in with transaction, but that will lead to gravely reducing the multiplexing done on the vttablet. We'd hit the vttablet connection limits that we've set very quickly, as connections would be "locked" for the duration of that web request. This would be made even worse with scatter or other multi shard queries.
It's also hard for me to estimate what the change in semantics would mean for our application, and which isolation level would be the correct one to use.
Overall, it feels like a much heavier solution to this problem than what we would require. 🤔
vtgate
performs random tablet selection (with preference for local cell tablets) when deciding which tablet to use for serving a query. On larger MySQL setups, where there might be a high variance in replication lag between replica nodes, this behaviour can lead todata inconsistencies that is hard to understand and work around.
Here's an example:
check_suites
andcheck_runs
.check_suites
row has many associatedcheck_runs
rows.check_runs
can only be created after creating the parentcheck_suites
row.check_suites
.Based on this description, an application that runs against a "regular" MySQL primary/replica setup, where connections are load balanced using e.g.
haproxy
can rely on the fact that it can read fromcheck_runs
, and every row will have a reference to acheck_suites
row that is guaranteed to exist.The application will have a "time consistent" view of the data, where a read query will always see data at the same point of a previous query, or a later point in time.
When sending requests through
vtgate
, that expectation is broken if queries are distributed across more than one replica. 😔The first query that is reading from
check_runs
might hit replicaA
, which has almost no replication lag. The second query, reading fromcheck_suites
, might hit replicaB
, which has over a second of replication lag. This can lead to the second query not being able to find any matching row in thecheck_suites
table, which in turn can lead to errors or other weird behaviour in our application.The application no longer has a "time consistent" view of the data, instead read queries see data at newer and older points in time.
The text was updated successfully, but these errors were encountered: