Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add snapshot methods to query builder/model #231

Open
matthewjumpsoffbuildings opened this issue Oct 1, 2024 · 9 comments
Open

Add snapshot methods to query builder/model #231

matthewjumpsoffbuildings opened this issue Oct 1, 2024 · 9 comments

Comments

@matthewjumpsoffbuildings
Copy link
Contributor

matthewjumpsoffbuildings commented Oct 1, 2024

Would be great if the snapshot method could be chained directly on query builder or model method calls. I tried to suss it myself to make a PR but I am not sure the best approach.

Ideally it would be great if you could do something like

// snapshot from a model
User::snapshot()->all();

// snapshot from a builder
$queryBuilder
    ->where('foo', 'bar')
    ->snapshot()
    ->get();

Ideally you should be able to insert the snapshot() method anywhere in the chain, but since snapshots use a closure to execute under the hood Im not sure how it could be made chainable

@taka-oyama
Copy link
Collaborator

taka-oyama commented Oct 1, 2024

A normal SELECT statement outside of a transaction is already a snapshot read only transaction.

@matthewjumpsoffbuildings
Copy link
Contributor Author

So running a SELECT sets currentSnapshot somewhere on the Connection and makes it execute the query in a snapshot? I am struggling to see where this happens in the code?

I remember there was some confusion between snapshots and timestamp bounds/single use transactions. Timestamp bounds and single use transactions are not the same as snapshots, as discussed here:

@matthewjumpsoffbuildings
Copy link
Contributor Author

matthewjumpsoffbuildings commented Oct 1, 2024

Also if it is the case that normal SELECT statements do somehow set currentSnapshot, how do you specify the timestamp bounds for the snapshot? Because the only place I see a snapshot being created is in the ManagesSnapshots trait, and the only time that happens is when Connection::snapshot($timestampBound) is called directly, which is surely not called when you just run a normal SELECT?

Eg this line here

$this->currentSnapshot = $this->getSpannerDatabase()->snapshot($options);

Is the only way to create a snapshot and pass the timestamp bound to it, and that doesnt appear to be called when running a normal SELECT?

@matthewjumpsoffbuildings
Copy link
Contributor Author

I just tested this, if I run something like

DB::connection()->table('totals')->get();

At no point does Connection::executeSnapshotQuery get called, nor does Connection::snapshot. Since Connection::snapshot seems to be the only place that actually runs this->getSpannerDatabase()->snapshot($options);, it is not the case that normal SELECT statements use snapshots. Rather, as before, they appear to be using single use transactions (optionally with timestamp bounds), which as discussed previously, are not the same as snapshots, and do not guarantee lock free reads

@matthewjumpsoffbuildings
Copy link
Contributor Author

Finally, it should actually not be the case that regular SELECT statements use snapshots. Snapshot use should require explicit specification.

This code should not run in a snapshot:

DB::connection()->table('totals')->get();

However something like this should be possible

DB::connection()->table('totals')->snapshot(new ExactStaleness(20))->get();

@taka-oyama
Copy link
Collaborator

taka-oyama commented Oct 1, 2024

From what I can read from here, all non read write transactions or DMLs are snapshot read-only transactions.

So running this is automatically running the statement as singleUse snapshot read-only transaction. The only reason I added Connection::snapshot(...) is to make it possible to set the staleness for all queries inside the closure for convenience.

@matthewjumpsoffbuildings
Copy link
Contributor Author

matthewjumpsoffbuildings commented Oct 1, 2024

I am confused by this. It appears perhaps the documentation is contradicting itself when it says this

Although reads using these timestamp bound modes are not part of a read-write transaction, they can block waiting for concurrent read-write transactions to commit. Bounded staleness reads attempt to pick a timestamp to avoid blocking, but may still block.

So its saying you can have a timestamp bound read that is not part of a read-write transaction (which according to the documentation you linked saying theres only 3 transaction modes, read-write, snapshot, partitioned, must mean that its in a snapshot right, since it cant be partitioned), and yet this read that is in a snapshot can still lock?

When the other documentation says snapshot never lock, as stated here

Snapshot transactions do not take locks. Instead, they work by choosing a Cloud Spanner timestamp, then executing all reads at that timestamp. Since they do not acquire locks, they do not block concurrent read-write transactions.

From my understanding, unless you explicitly call Database::snapshot() to create a Snapshot object, a read you run could still potentially lock, even with timestamp bounds

Perhaps this is a case of the documentation being unclear

@taka-oyama
Copy link
Collaborator

taka-oyama commented Oct 1, 2024

Bounded staleness reads attempt to pick a timestamp to avoid blocking, but may still block.

The doc really should go more into detail about how it can still block.

From what I have tested in production, switching from strong reads to stale reads resulted in better performance and less blocking overall.

@matthewjumpsoffbuildings
Copy link
Contributor Author

matthewjumpsoffbuildings commented Oct 1, 2024

The doc really should go more into detail about how it can still block.

Agreed, it seems confusing and like a direct contradiction currently, but it really does seem to me that the best practice if you want snapshot reads with a complete guarantee of no locking, that you should call Database::snapshot() and run execute() on the returned Snapshot object

From what I have tested in production, switching from strong reads to stale reads resulted in better performance and less blocking overall.

I am sure it does, but since the docs do seem to indicate that stale reads alone is not enough to guarantee no locks, it would make sense to add explicit snapshot usage options

Your closure snapshot() method on Connection is enough for me to have peace of mind, but adding the ability to trigger explicit snapshots on Builder and Model would be a nice to have

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants