sql/kv: add guidance on large statements #7524
Comments
This Slack thread was about large statements, not large transactions (#4409 was about transaction size).
Ah, thanks for clarifying that!
Bumping this up in priority. It's worth noting that cockroachdb/cockroach#50330 will be worked on soon.
I think we should improve this messaging and update it this release. |
I'd like to add this to our list of known limitations (as I document limitations for the 21.1 release), but I'm not sure we should go into so much detail on write amplification/blob storage support in the known limitation description. We should keep it practical. I'm thinking something like:
@bdarnell Am I oversimplifying this limitation and the recommendations? Hopefully with that description, we can close this out. @rmloveland I'll lean on Rich to develop a more complete approach to documenting write amplification (#6433).
I don't think parameterized statements help enough to be worth calling out. (It's at best a factor of 2 for the encoding overhead, right? An 8 MiB blob is not going to be a good experience even if parameterization makes the difference for whether it fits under the 16 MiB limit.) I think the guidance is basically "don't do that" (if anything, you're undersimplifying :)). I'd put the note about cloud storage for blobs on the blob type page instead of tying it to the statement size limit, because blobs are inadvisable in CRDB well before you hit that limit (we recommend no more than 1 MiB per value).
SGTM. Thanks, Ben!
We discuss this briefly in #4409, and I recently filed an issue to improve the message in cockroachdb/cockroach#50330, but we should add guidance on large transactions. Recent customers have been tripped up by this limitation (cc @kai-niemi).
I've pulled out the relevant conversation [from Slack below](https://cockroachlabs.slack.com/archives/CHVV403F0/p1590406019235800), which shows input from @bdarnell and @knz.
We currently have a 16 MiB limit due to our implementation of the Postgres network protocol. It's a magic number we could change, but we've mainly discussed that in the context of lowering it, not raising it. Write amplification is the main thing that prevents us from excelling at workloads with a lot of large blobs. There's also increased memory usage, and especially increased variance in memory usage, that can lead to instability. And if rows containing large blobs are not insert-only, you run the risk of hitting the max range size with MVCC history (less of an issue now with the larger default range size).
We can provide guidance that using placeholders will help, since that way you only get charged for the size of the value, without the encoding overhead needed to fit it into the text of a SQL statement.
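To make the placeholder point concrete, here is a minimal sketch (not from the thread) using psycopg2 against a local CockroachDB node; the connection string and the `blobs` table are hypothetical. Inlining bytes as a hex literal roughly doubles their size on the wire, which is the "factor of 2" mentioned above:

```python
# Minimal sketch of inlined vs. parameterized writes of a large value.
# Assumes a hypothetical table: CREATE TABLE blobs (id INT PRIMARY KEY, data BYTES).
import psycopg2

conn = psycopg2.connect("postgresql://root@localhost:26257/defaultdb")
payload = b"\x00" * (4 << 20)  # 4 MiB of raw bytes

with conn, conn.cursor() as cur:
    # Anti-pattern: interpolating the value into the statement text.
    # Hex-encoding roughly doubles the size (4 MiB of data becomes ~8 MiB
    # of SQL text), so large values hit the 16 MiB message limit sooner:
    #   cur.execute("INSERT INTO blobs VALUES (1, x'%s')" % payload.hex())

    # Better: bind the value as a placeholder, so only the raw bytes plus
    # a small protocol overhead cross the wire.
    cur.execute(
        "INSERT INTO blobs VALUES (%s, %s)",
        (1, psycopg2.Binary(payload)),
    )
```

Even with placeholders, per the discussion above, values this large are inadvisable; the example only illustrates where the encoding overhead comes from.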
If large blobs are important to your application, I'd generally recommend storing them in S3 or similar and only storing references to them in your RDBMS.
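A rough sketch of that pattern (the bucket name, `documents` table, and key scheme are all hypothetical) using boto3 alongside the same connection:

```python
# Sketch of "store blobs in object storage, keep references in the RDBMS".
# Assumes a hypothetical table: CREATE TABLE documents (id INT PRIMARY KEY, s3_key STRING).
import uuid
import boto3
import psycopg2

s3 = boto3.client("s3")
conn = psycopg2.connect("postgresql://root@localhost:26257/defaultdb")

def save_document(doc_id: int, payload: bytes) -> None:
    # Upload the large blob to object storage...
    key = f"documents/{uuid.uuid4()}"
    s3.put_object(Bucket="my-app-blobs", Key=key, Body=payload)
    # ...and store only the small reference in the database, keeping both
    # statement size and per-value size well under the recommended limits.
    with conn, conn.cursor() as cur:
        cur.execute(
            "INSERT INTO documents (id, s3_key) VALUES (%s, %s)",
            (doc_id, key),
        )
```

This keeps MVCC history and range sizes small in the database while the object store handles the bulk bytes.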
See also: