stability: disconnect + crash under load #4925
Comments
The immediate cause is tons of async intent resolutions. Queueing those up (as I'm planning to do for #4881) will help, although it's unclear how things got into a state where nothing can make progress.
The plan is to add a pool of intent resolution workers which pull intent resolution tasks from a shared channel. Back pressure is applied if the channel fills up, causing the calling goroutine to block.
Need to make sure this doesn't cause deadlock. We could drop the intent resolution on the floor if the channel is full.
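To make the worker-pool idea above concrete, here is a minimal Go sketch (not the actual CockroachDB code; names such as intentResolver and maybeEnqueue are made up for illustration): a fixed number of workers drain a bounded channel, and a non-blocking send drops the task when the channel is full, which is one way to apply back pressure without risking the deadlock mentioned above.

```go
// Hypothetical sketch: a fixed pool of workers drains intent-resolution
// tasks from a bounded channel; a full channel means the task is dropped
// rather than blocking the caller indefinitely.
package main

import (
	"fmt"
	"sync"
)

// intentTask stands in for whatever an intent-resolution request carries.
type intentTask struct {
	txnID string
	key   string
}

type intentResolver struct {
	tasks chan intentTask
	wg    sync.WaitGroup
}

func newIntentResolver(workers, queueSize int) *intentResolver {
	ir := &intentResolver{tasks: make(chan intentTask, queueSize)}
	ir.wg.Add(workers)
	for i := 0; i < workers; i++ {
		go func() {
			defer ir.wg.Done()
			for t := range ir.tasks {
				resolve(t) // the actual resolution RPC would happen here
			}
		}()
	}
	return ir
}

// maybeEnqueue hands the task to the pool without ever blocking the caller;
// if the queue is full the task is dropped, trading completeness for
// deadlock safety.
func (ir *intentResolver) maybeEnqueue(t intentTask) bool {
	select {
	case ir.tasks <- t:
		return true
	default:
		return false // queue full: drop on the floor
	}
}

func resolve(t intentTask) {
	fmt.Printf("resolving intent for txn %s at key %s\n", t.txnID, t.key)
}

func main() {
	ir := newIntentResolver(4, 128)
	ir.maybeEnqueue(intentTask{txnID: "txn-1", key: "a"})
	close(ir.tasks)
	ir.wg.Wait()
}
```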
Other brainstorming: intents are associated with a transaction. We could maintain a map of transaction to in-flight intent resolution so that we don't pile up a bunch of RPCs for the same transaction.
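Along the same lines, a rough sketch of the per-transaction bookkeeping that idea describes; the type and method names (inFlightTracker, tryStart, done) are hypothetical, not CockroachDB APIs.

```go
// Hypothetical sketch: remember which transactions already have an intent
// resolution in flight so we don't pile up many RPCs for the same txn.
package main

import (
	"fmt"
	"sync"
)

type inFlightTracker struct {
	mu       sync.Mutex
	inFlight map[string]struct{} // txn ID -> a resolution is already in flight
}

func newInFlightTracker() *inFlightTracker {
	return &inFlightTracker{inFlight: make(map[string]struct{})}
}

// tryStart reports whether the caller should launch a resolution for txnID;
// it returns false if one is already in flight.
func (t *inFlightTracker) tryStart(txnID string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	if _, ok := t.inFlight[txnID]; ok {
		return false
	}
	t.inFlight[txnID] = struct{}{}
	return true
}

// done marks the resolution for txnID as finished.
func (t *inFlightTracker) done(txnID string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.inFlight, txnID)
}

func main() {
	tr := newInFlightTracker()
	fmt.Println(tr.tryStart("txn-1")) // true: first resolution for this txn
	fmt.Println(tr.tryStart("txn-1")) // false: already in flight, skip the RPC
	tr.done("txn-1")
	fmt.Println(tr.tryStart("txn-1")) // true again after the first one finished
}
```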
Cc @nvanbenschoten.
An explosion of these calls for the same transaction (or a small number of transactions) was responsible for cockroachdb#4925.

Fixes cockroachdb#4925.
This is a last line of defense against issues like cockroachdb#4925.
unrelated-ish, but running single-node single-photo after 25min:
That's really slow for single-node single-photo. This was running on #5219, but not much faster on …
Something's pretty slow down at the RocksDB level again. I'll make a conscious effort to plumb the tracing further down that path.
hey @RaduBerinde, with some more tracing I see this:
basically there seem to be a lot of 10k-element scans going on. IIRC that's your chunking size limit? Can you tell, from what the photo app is selecting, whether that's going to be a query which you're about to optimize away using the batch-wide max results? I doubt that the photos app really wants the 10k rows back (haven't checked).
probably one of the selects here: https://github.com/cockroachdb/examples-go/blob/master/photos/db.go
Yeah, 10k is the chunk size limit. The batch-wide limit is not going to optimize anything away; it will just allow us to use the chunk limit in all scenarios - currently, we can only scan in chunks when we have a single span to scan. But this won't be a factor here. I looked at the SELECTs locally using EXPLAIN and there is one that looks problematic:
This means we will scan all comments with that photoID and sort them. We aren't using the … We could force it to do what we want by doing something like:

SELECT commentID, userID, message, timestamp FROM comments
WHERE photoID = $1 AND commentID IN
  (SELECT commentID FROM comments WHERE photoID = $1 ORDER BY timestamp DESC LIMIT 100)
ORDER BY timestamp DESC
Thanks for looking into this - I'm going to update the photos app with your query, at least for the time being. Just curious, shouldn't the WHERE clause show up in EXPLAIN above the scan?
Rather than changing the query, I think the photos schema could be adjusted. We could add a …
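The concrete suggestion is cut off above, so purely as an illustration of what such a schema adjustment might look like from the Go photos app, here is a sketch using database/sql against a hypothetical secondary index on (photoID, timestamp); the connection string, index name, and DDL are assumptions, not the app's actual code.

```go
// Purely illustrative: add a secondary index that matches the
// "WHERE photoID = $1 ... ORDER BY timestamp DESC LIMIT 100" access pattern.
package main

import (
	"database/sql"
	"log"

	_ "github.com/lib/pq" // CockroachDB speaks the PostgreSQL wire protocol
)

func main() {
	// Hypothetical local single-node connection string.
	db, err := sql.Open("postgres", "postgresql://root@localhost:26257/photos?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Hypothetical index that delivers comments for one photo already ordered
	// by timestamp, so a LIMIT query can stop after the first rows instead of
	// sorting every comment for the photo.
	const ddl = `CREATE INDEX IF NOT EXISTS comments_photo_ts_idx
	             ON comments (photoID, timestamp DESC)`
	if _, err := db.Exec(ddl); err != nil {
		log.Fatal(err)
	}
}
```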
sgtm, but I'll still go ahead and do the "stupid" update first to see whether that opens up another performance issue hidden behind it at the moment.
another thing: it'd be helpful if (at least … cc @RaduBerinde
EXPLAIN shows the high-level plan; it doesn't show expressions. You can see, however, that the scan range is restricted to …
It's a bit more complicated than that here. It's not really a filter that has a low-pass ratio; it's the fact that we retrieve all results for the given photoID to sort them by their timestamps (before we can return the top 100). As of a PR I have out, …
does that mean …
but wouldn't "oh we got these 10k results and did something but at the end we threw away all but 100" generally indicate the right thing, namely that whichever source the data is coming from is fetching the data in a way that doesn't use the right boundary condition? |
Yes, DEBUG spits out all the keys that are queried plus per-row information as it travels through the system.
Right, I was just saying it's not a property of a filter in this case. Maybe we want an …
+1
In cockroachdb#4925, we observed ineffective planning for a query in the photos app. We prefer to use the primary index and sort rather than use a non-covering index, which makes sense in general (non-covering indices require an expensive indexJoin), but in this case we also had a limit. In such a case, using the index would require looking only at the first rows instead of getting all matching rows and sorting. In this change we tweak the index selection: if we have a reasonable limit, we give a "boost" to all indices that match the ordering exactly. The boost exactly offsets the non-covering index penalty. In addition to the new tests, I also verified the photo app query in cockroachdb#4925 now uses the index. Fixes cockroachdb#5246.
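A toy Go sketch of the index-selection tweak that commit message describes; the cost model, constant, and names below are illustrative, not the real planner code: with a limit present, an index matching the ordering gets a boost that exactly offsets the non-covering penalty.

```go
// Hypothetical sketch of limit-aware index selection: non-covering indexes
// are penalized for the index join, but when the query has a limit and the
// index matches the requested ordering, a boost exactly offsets that penalty.
package main

import "fmt"

// nonCoveringPenalty stands in for the cost multiplier charged for an index join.
const nonCoveringPenalty = 10.0

type indexCandidate struct {
	name            string
	covering        bool    // index covers every column the query needs
	matchesOrdering bool    // index delivers rows in the requested ORDER BY
	baseCost        float64 // cost before penalties and boosts
}

// score ranks a candidate index for a query with the given LIMIT (0 = none);
// lower is better.
func score(c indexCandidate, limit int64) float64 {
	s := c.baseCost
	if !c.covering {
		s *= nonCoveringPenalty // index join makes non-covering indexes expensive
	}
	// The tweak: with a reasonable limit, an index that matches the ordering
	// exactly can stop after the first rows, so boost it by exactly the
	// non-covering penalty.
	if limit > 0 && c.matchesOrdering {
		s /= nonCoveringPenalty
	}
	return s
}

func main() {
	primary := indexCandidate{name: "primary", covering: true, baseCost: 1.0}
	byTime := indexCandidate{name: "comments_ts_idx", matchesOrdering: true, baseCost: 1.0}
	for _, c := range []indexCandidate{primary, byTime} {
		fmt.Printf("%-16s no limit: %.2f  with LIMIT 100: %.2f\n", c.name, score(c, 0), score(c, 100))
	}
}
```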
There is no clear and obvious culprit here, so I'll attempt to summarize as best I can.
build sha: e6793e1
4-node cluster with the photos example running against it.
Things are fine for a few hours, until the following happens:
Crash timelines and full logs. All crashes are OOM
node4: 17:40:41
node4.crash.txt
node3: 18:04:32
node3.crash.txt
node2: 18:09:45
node2.crash.txt
node1: 21:15:49
node1.crash.txt
Now, looking for originating events is where it gets a bit tedious.
One of the first signs of badness is around 17:26 when we get:
This happens on nodes 1, 2, and 4 at that precise time, and 20 seconds later on node 3.
Lease holders were the following, with 3 being the last one: