Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Snapshots #145

Closed
matthewjumpsoffbuildings opened this issue Nov 24, 2023 · 14 comments · Fixed by #215
Closed

Snapshots #145

matthewjumpsoffbuildings opened this issue Nov 24, 2023 · 14 comments · Fixed by #215
Assignees
Labels
enhancement New feature or request

Comments

@matthewjumpsoffbuildings
Copy link
Contributor

I see that snapshot read only transactions are unsupported. Was this based on it not being important for most use cases, or is it too difficult to implement?

Or perhaps is it planned for future versions?

@taka-oyama
Copy link
Collaborator

Are you referring to stale reads? They are supported.

@taka-oyama taka-oyama added the question Further information is requested label Nov 24, 2023
@matthewjumpsoffbuildings
Copy link
Contributor Author

That feature is great, and will cover most use cases, but a snapshot, unlike a standard single instance read only transaction, creates a read only transaction that you can use multiple times in a request. It's not essential, but just like you seem to have made provision for other queries to use either the base database object, or a transaction object, it would also be good to have a way for a set of queries to use a snapshot that remains consistent across the handling of a method that contains multiple reads, even if data is modified in between the different reads

Heres the docs about snapshot - https://cloud.google.com/spanner/docs/samples/spanner-read-only-transaction#code-sample

@taka-oyama
Copy link
Collaborator

Ah, I see. I've never really had any use for such a case so it's not implemented.

I can put it on the TODO but implementing it seems a bit challenging, so I can't guarantee that it will get implemented anytime soon.

@matthewjumpsoffbuildings
Copy link
Contributor Author

Understandable, its not critical to have in most cases, but good to flag it for the future

@taka-oyama taka-oyama added enhancement New feature or request and removed question Further information is requested labels Nov 29, 2023
@matthewjumpsoffbuildings
Copy link
Contributor Author

@taka-oyama I just want to bump this after further investigation.

It appears that while the existing withStaleness() option is useful for performance, it is still not sufficient and a proper implementation of snapshots is essential.

The primary reason is locking. From the docs here https://cloud.google.com/spanner/docs/timestamp-bounds#timestamp_bound_types

Although reads using these timestamp bound modes are not part of a read-write transaction, they can block waiting for concurrent read-write transactions to commit. Bounded staleness reads attempt to pick a timestamp to avoid blocking, but may still block.

Even with timestamp bounds - there is still potential for locking of other read-write transactions.

On the other hand, snapshots take no locks and never block other transactions. In view of this, it is critical they be implemented

@matthewjumpsoffbuildings
Copy link
Contributor Author

I found that simply adding this to Connection

    public function snapshot($options = [], Closure $callback)
    {
        if($this->currentTransaction)
            throw new Exception('Cant run a snapshot in the middle of a transaction');

        $this->currentTransaction = $this->getSpannerDatabase()->snapshot($options);

        $result = $callback();

        $this->currentTransaction = null;

        return $result;
    }

And then updating the typing of currentTransaction in ManagesTransactions like so

trait ManagesTransactions
{
    /**
     * @var Transaction|Snapshot|null
     */
    protected Transaction|Snapshot|null $currentTransaction = null;

Is sufficient to make a simple way to call a snapshot query in a closure like this

$conn->snapshot(['exactStaleness' => new Duration(15)], function() {
    return Model::all();
});

@taka-oyama
Copy link
Collaborator

taka-oyama commented Jun 3, 2024

Even with timestamp bounds - there is still potential for locking of other read-write transactions.

Was there a specific case where it did block?

There is a bug in the current code that causes the staleness to be ignored when inside a read-write transaction.
I am planning to fix that in the next few days.

@taka-oyama
Copy link
Collaborator

taka-oyama commented Jun 3, 2024

Btw, if you enable data boost in your query, that's using snapshot in the background would this solve your case?

@taka-oyama
Copy link
Collaborator

There is a bug in the current code that causes the staleness to be ignored when inside a read-write transaction.
I am planning to fix that in the next few days.

This has been fixed in v7.4.2 and v8.1.1

@matthewjumpsoffbuildings
Copy link
Contributor Author

Btw, if you enable data boost in your query, that's using snapshot in the background would this solve your case?

This is not viable since databoost queries are incredibly limited in what kind of queries you can run, due to the requirements of being root partionable etc

@matthewjumpsoffbuildings
Copy link
Contributor Author

Even with timestamp bounds - there is still potential for locking of other read-write transactions.

Was there a specific case where it did block?

There is a bug in the current code that causes the staleness to be ignored when inside a read-write transaction. I am planning to fix that in the next few days.

No, there wasnt a specific case but in a high load application, it is an unacceptable risk, snapshots were designed for read only analytics queries that do not lock rows and are the correct way to run read only queries for optimal performance

@taka-oyama
Copy link
Collaborator

taka-oyama commented Jun 3, 2024

I couldn't find any mention about snapshots not locking in the docs but I finally found details on snapshot transaction in the API reference.

Snapshot read-only. Snapshot read-only transactions provide guaranteed consistency across several reads, but do not allow writes. Snapshot read-only transactions can be configured to read at timestamps in the past, or configured to perform a strong read (where Spanner will select a timestamp such that the read is guaranteed to see the effects of all transactions that have committed before the start of the read). Snapshot read-only transactions do not need to be committed.

@matthewjumpsoffbuildings
Copy link
Contributor Author

https://cloud.google.com/spanner/docs/reference/rest/v1/TransactionOptions#:~:text=For%20transactions%20that,are%20not%20needed.

For transactions that only read, snapshot read-only transactions provide simpler semantics and are almost always faster. In particular, read-only transactions do not take locks,

@matthewjumpsoffbuildings
Copy link
Contributor Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants