Execution Context integration for Database write operations #7072
Conversation
(force-pushed e29fc61 → 90a21d4)
distribution/lib/Standard/Database/0.0.0-dev/src/Connection/Connection.enso (outdated, resolved)
that the operation checks for errors like missing columns.

More expensive checks, like clashing keys are checked only on a sample of
1000 rows, so it is possible for the operation to encounter one of these
I wonder if this optimization should be a separate flag; 'dry run' means 'no real side effects' rather than 'speed up expensive checks'. The two flags might often be used together, but perhaps they should be different.
The dry run is not really a flag - it specifies the behaviour when the Output context is disabled.
The only alternatives would be a dry run that checks all values, or one that checks none at all (with the actual check happening on the proper execution).
I don't think we should be adding additional flags for this - the parameter would not apply at all when the context is enabled.
We can reconsider whether we want to run the check at all. I think checking a subset is a compromise that will allow us to catch some errors while retaining the dry run behaviour and not sabotaging the performance. Do you think we should instead not check at all in dry run mode? Or check all entries?
I like checking all entries because it actually does a 'proper' dry run - it verifies the behaviour that will happen on actual execution. The only worry was that it may be too expensive.
In the in-memory backend, this operation will be as expensive as any operation preparing the data beforehand (roughly O(N) cost).
The problem is a bit bigger in the DB backend, as all operations are done 'lazily' by default - they just construct more and more complex SQL queries, but do not run them. This check would require actually running the query. Still, the cost is comparable to the cost of attaching a Table visualization to any of the queries.
And then there is the issue that while `update_database_table` does the check but does not actually retain the data, `select_into_database_table` is meant to create a temporary dry run table. If we create it with all the data, it's not much different from the 'proper' run; the only differences are the table name and that it is a temporary table. Still, processing all the data gives us the closest experience to the actual run, apart from side effects.
I like processing all data in the dry runs. I would consider trying it and going back to smaller samples if the performance is really unsatisfactory. @jdunkerley what do you think?
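The trade-off discussed above - validating key clashes on a sample of rows versus on the full data - can be sketched in Python. This is a hypothetical illustration only, not the actual Enso implementation; the function name and row representation are invented:

```python
import random

def check_clashing_keys(rows, key_columns, sample_size=1000):
    """Return the first clashing key found, or None.

    Checking only a sample keeps a dry run cheap, at the cost of
    possibly missing clashes outside the sampled rows.
    """
    sample = rows if len(rows) <= sample_size else random.sample(rows, sample_size)
    seen = set()
    for row in sample:
        key = tuple(row[c] for c in key_columns)
        if key in seen:
            return key  # a duplicate key was detected
        seen.add(key)
    return None

rows = [{"id": 1, "x": "a"}, {"id": 2, "x": "b"}, {"id": 1, "x": "c"}]
print(check_clashing_keys(rows, ["id"]))  # (1,)
```

With `sample_size=1000` a three-row table is checked exhaustively, so the clash on `id=1` is always found; for a million-row table the same call would only inspect a random subset, which is exactly the compromise debated here.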
distribution/lib/Standard/Database/0.0.0-dev/src/Internal/Postgres/Postgres_Connection.enso (resolved)
(force-pushed 90a21d4 → fce9a4f)
Generally looks good.
A few comment style suggestions.
And some code style suggestions.
distribution/lib/Standard/Database/0.0.0-dev/src/Connection/Connection.enso (resolved)
Some operations, like writing to tables, require their target to be a
trivial query.
is_trivial_query : Boolean ! Table_Not_Found
This feels like something that the Context should answer.
Likewise, we can choose to insert with some columns removed?
- Why would it be the context? It's a property of the whole table, exactly because of (2) - I also want to check that the columns are intact and unmodified.
For example, `table.set "[X] + 2" "X"` will have its context completely unchanged, but the contents of the column X will all be shifted by 2. So when inserting into this table, should we join along the values of the original X or the updated X? It is ill-defined.
- And that's why I don't think we should allow any column modifications, be it rename or removal. After such operations it's no longer the same table, and we are inserting into the original one, not the modified one.
E.g. what if I have table T with columns A, B and C and remove C. Then I append to this new table a table with columns A and B and have `error_on_missing_columns=True`. C is missing from the table I will actually append into, but not from the table that I set as my target. Do I error on this missing C or not? It's unclear.
So due to these examples, IMO it only makes sense to append to a 'trivial' table.
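The "trivial target" rule argued above can be modelled in a few lines of Python. This is a hypothetical sketch of the idea, not the Enso implementation: a target counts as trivial only while it is a direct, unmodified table reference, and any derived query (set, rename, remove) disqualifies it:

```python
# Hypothetical model: a write target is "trivial" only if no
# transformations have been applied since the raw table reference.
class TableRef:
    def __init__(self, name, derived=False):
        self.name = name
        self.derived = derived

    def set(self, expression, column_name):
        # Any transformation yields a derived query,
        # which is no longer the original table itself.
        return TableRef(self.name, derived=True)

    def is_trivial_query(self):
        return not self.derived

t = TableRef("T")
print(t.is_trivial_query())                      # True
print(t.set("[X] + 2", "X").is_trivial_query())  # False
```

In this model the `table.set "[X] + 2" "X"` example from the discussion produces a non-trivial target, so a write operation would be rejected rather than face the ill-defined join semantics described above.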
distribution/lib/Standard/Database/0.0.0-dev/src/Connection/Connection.enso (outdated, resolved)
distribution/lib/Standard/Database/0.0.0-dev/src/Connection/Connection.enso (outdated, resolved)
? Side Effects

  Note that the `read` method is running without restrictions when the
  output context is disabled, but it can technically cause side effects,
  if it is provided with a DML query. Usually it is preferred to use
  `execute_update` for DML queries, or if they are supposed to return
  results, the `read` should be wrapped in an execution context check.
I feel this needs to be reworded - on such a primitive and core function this feels like it will confuse end users, but I agree we need some message here. One for the doc review work.
Fair. Just a note that it is a core but pretty advanced function. I guess once someone knows SQL they should be aware that executing an `UPDATE RETURNING` will not only read but will cause changes. What may not be as clear even to experienced SQL developers is that such a read in the IDE may be re-run many times while the workflow is being modified, and thus the side effect may be invoked multiple times as well (that's why we hide the effects behind the execution context, which we do not have here).
Maybe we should actually detect some keywords like `UPDATE, CREATE, DROP, ALTER, INSERT, DELETE` and warn here?
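The keyword-warning idea suggested here could look roughly like the following Python sketch. It is purely hypothetical (no such check exists in the source under review), and a token scan like this is deliberately crude - it ignores string literals and comments - but it illustrates the proposal:

```python
import re

# Keywords whose presence suggests the query may modify data.
SIDE_EFFECT_KEYWORDS = {"UPDATE", "CREATE", "DROP", "ALTER", "INSERT", "DELETE"}

def may_have_side_effects(sql: str) -> bool:
    """Heuristic: does the query contain a side-effecting SQL keyword?"""
    tokens = {tok.upper() for tok in re.findall(r"[A-Za-z_]+", sql)}
    return not tokens.isdisjoint(SIDE_EFFECT_KEYWORDS)

print(may_have_side_effects("SELECT * FROM t"))                 # False
print(may_have_side_effects("UPDATE t SET x = 1 RETURNING x"))  # True
```

A `read` wrapper could emit a warning (rather than refuse to run) when this heuristic fires, which matches the cautious tone of the suggestion: catch the common `UPDATE RETURNING` case without blocking advanced users.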
distribution/lib/Standard/Database/0.0.0-dev/src/Internal/Upload_Table.enso (five outdated threads, all resolved)
(force-pushed 26cbf3f → dfed355)
(force-pushed dfed355 → 5b8fc83)
(force-pushed 5b8fc83 → c6d5914)
build.sbt (outdated)
  "org.netbeans.api" % "org-openide-util-lookup" % netbeansApiVersion % "provided",
  "org.xerial" % "sqlite-jdbc" % sqliteVersion,
  "org.postgresql" % "postgresql" % "42.4.0"
  "org.graalvm.truffle" % "truffle-api" % graalVersion % "provided",
We need the GraalVM dependency for:
- importing the `Value` type - we need to return `Value`, not `Object`, as `Object` will cause a polyglot conversion that loses Enso warnings,
- accessing `TruffleLogger` to log that a maintenance operation failed.
You cannot use `truffle-api` in standard libraries. Try `sdk` - that's the JAR which exposes `Value`.
Btw. `truffle-api` transitively depends on `sdk` - bringing in more may seem to work, but it is not a good idea.
`truffle-sdk` does not seem to exist:
[warn] Note: Unresolved dependencies path:
[error] stack trace is suppressed; run 'last common-polyglot-core-utils / update' for the full output
[error] (common-polyglot-core-utils / update) sbt.librarymanagement.ResolveException: Error downloading org.graalvm.truffle:truffle-sdk:22.3.1
[error] Not found
[error] Not found
[error] not found: C:\Users\progr\.ivy2\localorg.graalvm.truffle\truffle-sdk\22.3.1\ivys\ivy.xml
[error] not found: https://repo1.maven.org/maven2/org/graalvm/truffle/truffle-sdk/22.3.1/truffle-sdk-22.3.1.pom
distribution/lib/Standard/Database/0.0.0-dev/src/Connection/Connection.enso (outdated, resolved)
`truffle-api` is an API for those who write interpreters. That is a different "level of Java" than the one used in standard libraries. Moreover, I don't think the Java types are even accessible.
(force-pushed 9f58d7d → 046c91c)
(force-pushed bc62843 → 7bbc16f)
"com.ibm.icu" % "icu4j" % icuVersion, | ||
"org.graalvm.truffle" % "truffle-api" % graalVersion % "provided" | ||
"com.ibm.icu" % "icu4j" % icuVersion, | ||
"org.graalvm.sdk" % "graal-sdk" % graalVersion % "provided" |
`graal-sdk` is the API to use in Java parts of standard libraries. Its javadoc is available here: https://www.graalvm.org/sdk/javadoc/
Btw. the Truffle javadoc also contains the `org.graalvm` classes, but that's because of the transitive dependency.
(force-pushed 2f06f55 → 516bba6)
- Add a check for transaction support. remove outdated check grow builder on seal to ensure all present CR1: Connection CR2: rephrasing docs CR2: Dry_Run_Operation clearer method names CR4: code style javafmt
- improve ref counting checkpoint add a test for not overwriting pre-existing tables OperationSynchronizer
- …before some DB operations. Better keep track of allocated dry run tables and ensure that a dry run name does not collide with a pre-existing user table.
- fix fixes
- …ibs" This reverts commit 0c98ff81dd4d95ad38cfbfe5e8872b9c546cbeb7.
- `"org.graalvm.sdk" % "graal-sdk" % graalVersion % "provided"` in helper Java libs
- …n failure will now be reported to stderr
(force-pushed 37598ba → fdafcab)
Pull Request Description
Closes #6887

Important Notes

Checklist
Please ensure that the following checklist has been satisfied before submitting the PR:
- Scala, Java, and Rust style guides. In case you are using a language not listed above, follow the Rust style guide.
- `./run ide build`.