-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
database/sql: missing row fetch size and other optimizations for large dataset #13067
Comments
Go prefers to start with simplicity instead of preemptively adding knobs. That explains why no explicit knobs yet exist. But in this case it seems a potential knob already does exist: the database DSN string. Your driver could support a hint in there. Or your driver could notice the result set is huge and buffer more, if it's buffering at all. Before we do anything, I'd want to hear from the driver author first and make sure they've done the optimizations on their side. |
hi, |
You can use StmtCfg to set prefetch memory and row count in rana/ora. -Ashwin |
Please take questions to the mailing list or other forums: https://golang.org/wiki/Questions |
@asambeka I'm still feeling out the best practices here. But the advice I would like you to consider would be to add any driver knobs you need into the context value after the driver adds support for the driver. For instance, your PrefetchRowCount property could be set at some lower request handler as a default, and by the time it reaches your request, you just pass it in normally. Context methods will be supported in go1.8+. |
CL https://golang.org/cl/38533 mentions this issue. |
Previously all arguments were passed through driver.IsValid. This checked arguments against a few fundamental go types and prevented others from being passed in as arguments. The new interface driver.NamedValueChecker may be implemented by both driver.Stmt and driver.Conn. This allows this new interface to completely supersede the driver.ColumnConverter interface as it can be used for checking arguments known to a prepared statement and arbitrary query arguments. The NamedValueChecker may be skipped with driver.ErrSkip after all special cases are exhausted to use the default argument converter. In addition if driver.ErrRemoveArgument is returned the argument will not be passed to the query at all, useful for passing in driver specific per-query options. Add a canonical Out argument wrapper to be passed to OUTPUT parameters. This will unify checks that need to be written in the NameValueChecker. The statement number check is also moved to the argument converter so the NamedValueChecker may remove arguments passed to the query. Fixes #13567 Fixes #18079 Updates #18417 Updates #17834 Updates #16235 Updates #13067 Updates #19797 Change-Id: I89088bd9cca4596a48bba37bfd20d987453ef237 Reviewed-on: https://go-review.googlesource.com/38533 Reviewed-by: Brad Fitzpatrick <[email protected]> Run-TryBot: Brad Fitzpatrick <[email protected]> TryBot-Result: Gobot Gobot <[email protected]>
In go1.9 drivers will be able to implement driver.NamedValueChecker on the driver.Conn. When inspecting the arguments it can determine if the argument is some type of option value and pass back driver.ErrRemoveArgument so it will not be passed into the query argument list, but still use it on the connection to configure it in some way. I believe this should fully resolve this issue. |
I have been trying to work with large datasets in Oracle and mySQL DBs. There are around 20 million records in these databases. Idea was to use golang to pull and export them as JSON. While I was able to use golang concurrency for export, I realized performance for pull from DB was VERY bad. Here is an estimate how bad it is:
golang (default driver setting): 30 min for 10 million (running next/scan in loop)
I was able to find configuration options for one of the Oracle drivers (gopkg.in/rana/ora.v3) and after setting PrefetchRowCount to 50K, I was able to reduce time taken to 5 min.
Big question, why is there no standard way to provide hint to underlying driver in database/sql package about configuration parameters(Or am I missing something here)? I think this is a big implementation issue, and will restrict golang use to very premetive scenarios around databases. This will also encourage non stadard usage of drivers. PrefetchRowCount is just one example, and there are many others including language (NLS settings) etc. Both ODBC and JDBC specifications have very well articulated and provide lot of options for performance tuning, and other aspects around data retrieval. Having atleast some of those will certainly make life easier.
The text was updated successfully, but these errors were encountered: