-
Notifications
You must be signed in to change notification settings - Fork 11k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[5.5] [WIP] Prevent database race conditions on replicated databases #20445
Conversation
Anyone know why i.e. I want this in - $statement->expects($this->once())->method('rowCount')->will($this->returnValue(['boom']));
+ $statement->expects($this->twice())->method('rowCount')->will($this->returnValue(['boom'])); But I get an undefined method. I tried |
I dunno, I always thought that was just the accepted behavior of read / write database setups and part of the complexity you take on when you use them. I'm not sure we really need any magic around that fact? |
Hi @taylorotwell , I see what you mean by it being left to the application to handle that complexity. However, I would like to mention that there are implementations of this in other frameworks that do provide connection "stickiness" for data integrity. E.g. https://github.com/taskrabbit/makara for Ruby on Rails apps I also agree with @laurencei that there could be a negative side effect for large |
Maybe this could be a config option?
Let people decide if they want it or not? Can be default to |
Eh, I'm fine without the configuration option. |
Made the tests fail. |
@taylorotwell - sorry - I only just saw your comment after I just pushed a config option already. You can now do Let me know if you want me to revert the config option now? Test's pass now. Something with mockery not letting me do I'll update the docs if this gets pulled in. |
Which test file did you change? I already have this pulled down and just want to get my tests passing locally because I already made some changes. |
I didnt change the tests. I changed - $this->setRecordsModified($statement->rowCount() > 0);
- return $statement->rowCount();
+ $count = $statement->rowCount();
+ $this->setRecordsModified($count > 0)
+ return $count; This way it only calls If you see my comment at the top, for some reason mockery wont allow |
Merged with some changes. |
I realize this thread is closed, but I want to chime in. I definitely understand the problem and have ran into replication issues, but this solution doesn't address intent. This solution can undermine the very benefit to having a read/write connection, and worse of all, it can lead to nondeterministic behavior, introducing performance issues. Consider, for example, an application that writes an audit log to the database. In that application, the audit log happens to write first, followed by an expensive read operation that has nothing to do with the audit log write. With the current solution, the expensive read will take place on the write connection, but it shouldn't. What we really need is a way to tell Laravel that our intentions for operations like the write, immediate read, are for it to read from the write connection. Although it implies something slightly different, a database transaction already implies this behavior. If we need a separate construct to be able to accomplish this outside of a database transaction, a similar construct should be created. Otherwise, a database transaction already reads from the write connection and should be used as the solution. |
@nerdo - I disagree.
You can create multiple database connections (to the same database). The main content has "sticky" writes, the second connection does not. Just get your log model to use the non-sticky connection - and that means you log writing never causes a sticky read.
Actually - you can do this if you want. Just keep At the end of the day, as mentioned in the original comments - it is difficult to do any other way. This can be solved in many different ways, and all have pros/cons that are different for each application. This was the simplest solution for most people. You are always welcome to submit a PR with an alternative solution if you can think of one :) |
Love this! Very useful especially in write-heavy production environments (such as mine). @laurencei forgive the ignorance, but you stated:
Could you provide an example for this? I'd probably do the opposite (i.e. by default sticky = true, but like the flexibility to be able to tell the system that I don't need to implement the sticky option for a specific query. My second question is, does this apply on a per-model basis, or the entire session? In other words, if I have the sticky option enabled, and write to the users table, then ask for a read from a different table, will the sticky option force the read from the other table to also be on the write connection? Thanks for the work! |
Unless there is a need to perform these operations outside of a database transaction, creating one is the alternative solution and doesn't require a pull request. In other words, the following code (and its variants) should have the same effect as the sticky solution: $allModels = DB::transaction(function () use ($args) {
MyModel::create($args);
return MyModel::all();
}); My argument is that a solution such as this more clearly shows the read and write operations are intentionally grouped together for a reason. Admittedly, the reason isn't clear without adding comments, but, as far as I know, this approach has no unintended side-effects. If doing these operations within a database transaction isn't desirable, then I think that a solution similar to transactions should be implemented for this purpose so that it's clearer what is happening, more deterministic, and easy to isolate the subtle nuances of what is happening under the hood. If the sticky solution is left on, then you always have to keep it in mind to avoid an accidental heavy read operation on the write database. This implies that the order of logical operations may matter when they really shouldn't. |
@nerdo and what if you can't group them together into a transaction like that? So for example, what if you write first, then only later on in the same session read those records? In that case, you can't group them into a db transaction, and replication lag may cause your read records to not be current. |
You’re right, a database transaction wouldn’t work in that case. But, the scenario you described makes an even stronger case for making it more clear what is actually happening. Manipulating the state of the connection (sticky config) can potentially lead to performance issues that are really hard to troubleshoot. An alternative might be to mark the query as “read from the write connection” instead of it being part of the state of the connection. |
I'm having the same issue producing view results from AWS Aurora. The data writes to the instance fine. On success of the ajax call; I load a div which then displays items in the selected database from the reader instance. However, the reader is 20ms behind and found that the data is only loading partial data that's been uploading. I added a setTimeout and found that 1 second, all the data loaded fine. Results are that the reader instance is too far behind however can't read from a writer instance due to write-only. A way of forcing write-only results to VIEW would be required. |
@laurencei @taylorotwell is this option configurable as I can't find documentation. |
@isnipepixels it's documented here: https://laravel.com/docs/5.5/database - look at the "sticky" option under "Read and Write Connections" |
@laurencei how does this affect queue daemon workers? It's occurred to me that once "sticky" has been set, and the write endpoint is now being used for reading, if it's not "reset" after the worker has completed a job, that worker will forever use the write connection for both reading and writing. Or have I missed something? |
@Magiweb - its an interesting thought. The reality is it would probably stick to using the The easiest solution is give your queue workers a duplicate database connection - but without the But then you run the risk of out-of-sync data during the queues. So each use case will be different. |
I disagree. It makes more sense to specify it as part of the connection (imo), since you are actually making a different layer2 connection to a different database server. It makes a lot of sense to specify it as part of the connection. |
@laurencei My understanding is that queue:listen is deprecated in 5.5 (or 5.6?) and in any event is far less resource efficient, so I don't think that is a viable long-term workaround. So perhaps then it would be useful to be able to specify the connection to use as a model / DB method, like DB::connection() or Model::on(), but for "read / write" host specification, if that's at all possible (I haven't looked into the code)? Something like Model::find(1)->useWrite(); (or ->useRead()) At least this way we have the sticky option for "normal", unsupervised use, which is great, but then we also have the added flexibility to specify which connection to use for those that want more granular control or need it such as in the queue worker case. Is it possible? |
@Magiweb - no, you are wrong, there is no depreciation of I think the easiest solution is just have a different "connection" for your queue workers (but pointing to the same database) - and on that connection dont have a sticky connection. |
@Magiweb ->useWrite(); (or ->useRead()) won't work on services like AWS as some servers wont allow view from the write server etc. So you will be in the same boat. We've currently had to have a temp fix of usleep(30000) after insert to read the data straight after, otherwise, it won't sync in time to the read server and will fail the rest of the script. |
@narwy Hmm I understand what you mean for "useRead()" perhaps since you don't ever want to accidentally attempt to write to a read replica. But I've got a lot of experience with AWS and have never had a problem reading from a "write" server, so useWrite() for a select after having just done write I think is still possible and very useful. And in fact is what "sticky" does, but it does it all the time after having done a write, so you have no control over it. Delaying for long periods of time is really not an elegant solution, for a number of reasons. |
This is an attempt to address the database race issues discussed in laravel/ideas#713
This is still WIP as it is not fully tested. I want to write some tests and do some integration testing before this is pulled in (if it is to be accepted).
The scenario occurs when you have a master/slave type replicating database setup. You setup a
write
and aread
connection in Laravel. If youwrite
to the database and then immediately query for something, there is a race condition that yourread
result will not include your recent changes. This is a problem when it occurs in the same request cycle - because you logically expect a result to be there, but it may not.For example:
write
database.User::all()
(which currently goes to theread
database) in the same request cycle.If the replication lag time is greater than the time between calls, then the result that is returned is not logical or expected. This applies if you want to use your new model on further logic as well:
write
database (and get returned theid
of the new insert).User::find($id_of_just_inserted_user)
(which currently goes to theread
database.Once again - it is possible that you cannot get the
User
you just created, because theread
database has not replicated the change yet.This PR attempts to address this by once a
write
action has occurred, any subsequentread
actions in the same request only will go back to thewrite
database, to prevent a race condition.In fact, the framework itself was trying to overcome this same problem with
selectFromWriteConnection()
on some connections. We can remove that code, since we are fixing it at the base level.The only negative I can see from this PR is if you did a small
write
, but then did some heavyreads
in the same request, then that would affect your write database. But it is a trade off between that performance, vs the risk of incorrect data integrity.