Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dpp 392 todo cleanup postgre batch #9875

Merged
merged 7 commits into from
Jun 2, 2021

Conversation

nmarton-da
Copy link
Contributor

Pull Request Checklist

  • Read and understand the contribution guidelines
  • Include appropriate tests
  • Set a descriptive title and thorough description
  • Add a reference to the issue this PR will solve, if appropriate
  • Include changelog additions in one or more commit message bodies between the CHANGELOG_BEGIN and CHANGELOG_END tags
  • Normal production system change, include purpose of change in description

NOTE: CI is not automatically run on non-members pull-requests for security
reasons. The reviewer will have to comment with /AzurePipelines run to
trigger the build.


val packages: PGTable[DBDTOV1.Package] = PGTable(
tableName = "packages",
insertSuffix = "on conflict (package_id) do nothing",
Copy link
Contributor

@tudor-da tudor-da Jun 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This insert conflict action clause is not a property of a table but rather of a statement. I see this is also the only place where it is non-empty. I'd remove it from PGTable if it doesn't complicate code in other places

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree in general, but here in this DSL PGTable's responsibility is to:

  • provide method to create the insert SQL for preparedstatements (val insertStatement)
  • provide the method to assemble batch arrays (def prepareData)
  • provide method to set up data preparedStatement data from built batch arrays (def setupData)

We could of course walk some more lengths to have these responsibilities separated somehow, currently it seemed to me the right level of abstraction, but please hit me with ideas if you can think something better!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, for all the tables defined here we assume that there is only one SQL statement to insert data into them, and the generation of that statement is responsibility of the PGTable object. In this case, the insertSuffix property seems fine.

Copy link
Contributor

@mziolekda mziolekda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

* First draw of PGSchema and PostgresDbBatch

changelog_begin
changelog_end
changelog_begin
changelog_end
…rating on tables of small size, and since execution of those queries only at initialisation: this performance improvement is not necessary

changelog_begin
changelog_end
changelog_begin
changelog_end
changelog_begin
changelog_end
changelog_begin
changelog_end
@nmarton-da nmarton-da force-pushed the dpp-392-todo-cleanup-postgre-batch branch from c718c6f to 13a6fd4 Compare June 2, 2021 18:46
@nmarton-da nmarton-da merged commit 6e48034 into main Jun 2, 2021
@nmarton-da nmarton-da deleted the dpp-392-todo-cleanup-postgre-batch branch June 2, 2021 20:06

val packages: PGTable[DBDTOV1.Package] = PGTable(
tableName = "packages",
insertSuffix = "on conflict (package_id) do nothing",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, for all the tables defined here we assume that there is only one SQL statement to insert data into them, and the generation of that statement is responsibility of the PGTable object. In this case, the insertSuffix property seems fine.

@@ -785,7 +135,6 @@ object PostgresStorageBackend extends StorageBackend[RawDBBatchPostgreSQLV1] {
|FROM package_entries
|WHERE ledger_offset > ?;
|
|-- TODO append-only: we do not have currently an index to support efficiently this operation. either add or make sure it is okay to do full table scans here
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume we went with the make sure it is okay to do full table scans here option?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I think it is okay to not add indexes just for initialization purposes for a relatively small table.
But you are right, let's get some more broad approval here!
@meiersi-da @gerolf-da @mziolekda do you agree?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's compare the cost and risk profile of the two options:

  1. add index: low implementation cost and risk (one SQL statement, one manual test to check query plan), low runtime cost as indices work well in DB's
  2. no index: no implementation cost and risk, medium operational risk, as a customer might discover slow startup on a ledger having ingested lots of packages.

It seems to me that Option 1 offers the better trade-offs, and follow the mantra of prudent engineering and optimizes for a smooth operational experience of our customers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm 🤔 so already multiple opinions, that is fine, created task to mediate, capture and implement a decision here for the six cases: https://digitalasset.atlassian.net/browse/DPP-428

azure-pipelines bot pushed a commit that referenced this pull request Jun 9, 2021
This PR has been created by a script, which is not very smart
and does not have all the context. Please do double-check that
the version prefix is correct before merging.

@garyverhaegen-da is in charge of this release.

Commit log:
```
3cb8d5c Deprecate use of divulged contracts (#9930)
e9a2e2e Allow two different time formats as input for the metrics reporting interval (#9926)
1dc8472 Fix variable name (#9927)
4037b1c Add metrics to the http json service (#9923)
bd26c43 LF: fix TransactionCoderSpec (#9922)
69ef624 Fix flaky InstrumentedSourceSpec test (#9921)
cbcec32 LF: clean up the remaining Exception TODOs (#9913)
5374209 Add dependency proto3-suite for Setup.hs (#9917)
c77324d skip fix_bazel_cache on forks (#9920)
ddf93fc Address TODOs in JdbcIndexer (#9911)
a0d3e80 Add missing telemetryContext instances in the ledger-api-client's Ctx (#9916)
1e009b4 Upgrade canton to 0.25.0-snapshot supporting lf-1.13 and enable/fix tests (#9915)
db14536 LF: Freeze archive proto for LF 1.14 (#9912)
15f9dfb update compat versions for 1.14.0-snapshot.20210602.7086.0.f36f556b (#9900)
b4247c1 [In-memory fan-out] BuffersUpdater implementation (#9858)
7e733c3 DPP-390 remove appendonly todos (#9910)
512f1fd  LF: Preview version of LF 1.14 (#9906)
2da94c3 Add warning about using divulged contracts in pruning doc (#9883)
3dee3d0 LF: Fix encoder for Exception (#9904)
b22c046 ledger-api-test-tool: Only wait for parties on participants under test. (#9903)
b8c02d6 Bump perf test for for #9871 (#9902)
4660e57 Fix typoed participant name in ExceptionsIT (#9901)
316069d Postgres batching refinements (#9898)
7bc925e LF: Factorize the logic for AST lookup (#9871)
2fa250f DPP-387 Remove migration todos (#9896)
65f9395 Try to fix windows skipping again (#9893)
850c731 Include PackageMetadata in LF interface (#9892)
41266c3 Fix the formatting on the LF spec. (#9889)
dc79830 DPP-393 Enable sandbox tests for the append-only schema (#9876)
3c83b77 Parallel Ingestion renames (#9894)
6e48034 Dpp 392 todo cleanup postgre batch (#9875)
08ed7b6 rotate release duty after 1.14.0-snapshot.20210601.7081.0.7d716e6d (#9878)
081c54b Revert "Try to unflake JSON API failure tests (#9855)" (#9887)
df0f3ff maybe fix deleted release tags (#9891)
4b020dd Release ledger-api-bench-tool (#9890)
d801914 Release new snapshot (#9885)
4a4dde0 Implemented JdbcLedgerDaoTransactionLogUpdatesSpec (#9813)
```
Changelog:
```
- [Ledger API] Use of divulged contracts in later transactions is
deprecated. Divulged contracts were already incompatible with
pruning. We are working on features to handle the cases currently
covered by divulgence.
- for applications which support the --metrics-reporter-interval cli option, these now support both the java and scala duration string format (e.g. PT1M30S and 1.5min)
Small fixes to the doc
[HTTP/JSON API]
- metrics reporting can now be chosen via the command line option --metrics-reporter (currently hidden), valid options are console, csv://, graphite:// and prometheus://
- metrics reporting interval can also now be chosen via a command line option --metrics-reporting-interval (currently hidden)
[ledger api] Add warning about incompatibility between pruning and
divulged contracts to pruning docs.
```

CHANGELOG_BEGIN
CHANGELOG_END
garyverhaegen-da pushed a commit that referenced this pull request Jun 9, 2021
This PR has been created by a script, which is not very smart
and does not have all the context. Please do double-check that
the version prefix is correct before merging.

@garyverhaegen-da is in charge of this release.

Commit log:
```
3cb8d5c Deprecate use of divulged contracts (#9930)
e9a2e2e Allow two different time formats as input for the metrics reporting interval (#9926)
1dc8472 Fix variable name (#9927)
4037b1c Add metrics to the http json service (#9923)
bd26c43 LF: fix TransactionCoderSpec (#9922)
69ef624 Fix flaky InstrumentedSourceSpec test (#9921)
cbcec32 LF: clean up the remaining Exception TODOs (#9913)
5374209 Add dependency proto3-suite for Setup.hs (#9917)
c77324d skip fix_bazel_cache on forks (#9920)
ddf93fc Address TODOs in JdbcIndexer (#9911)
a0d3e80 Add missing telemetryContext instances in the ledger-api-client's Ctx (#9916)
1e009b4 Upgrade canton to 0.25.0-snapshot supporting lf-1.13 and enable/fix tests (#9915)
db14536 LF: Freeze archive proto for LF 1.14 (#9912)
15f9dfb update compat versions for 1.14.0-snapshot.20210602.7086.0.f36f556b (#9900)
b4247c1 [In-memory fan-out] BuffersUpdater implementation (#9858)
7e733c3 DPP-390 remove appendonly todos (#9910)
512f1fd  LF: Preview version of LF 1.14 (#9906)
2da94c3 Add warning about using divulged contracts in pruning doc (#9883)
3dee3d0 LF: Fix encoder for Exception (#9904)
b22c046 ledger-api-test-tool: Only wait for parties on participants under test. (#9903)
b8c02d6 Bump perf test for for #9871 (#9902)
4660e57 Fix typoed participant name in ExceptionsIT (#9901)
316069d Postgres batching refinements (#9898)
7bc925e LF: Factorize the logic for AST lookup (#9871)
2fa250f DPP-387 Remove migration todos (#9896)
65f9395 Try to fix windows skipping again (#9893)
850c731 Include PackageMetadata in LF interface (#9892)
41266c3 Fix the formatting on the LF spec. (#9889)
dc79830 DPP-393 Enable sandbox tests for the append-only schema (#9876)
3c83b77 Parallel Ingestion renames (#9894)
6e48034 Dpp 392 todo cleanup postgre batch (#9875)
08ed7b6 rotate release duty after 1.14.0-snapshot.20210601.7081.0.7d716e6d (#9878)
081c54b Revert "Try to unflake JSON API failure tests (#9855)" (#9887)
df0f3ff maybe fix deleted release tags (#9891)
4b020dd Release ledger-api-bench-tool (#9890)
d801914 Release new snapshot (#9885)
4a4dde0 Implemented JdbcLedgerDaoTransactionLogUpdatesSpec (#9813)
```
Changelog:
```
- [Ledger API] Use of divulged contracts in later transactions is
deprecated. Divulged contracts were already incompatible with
pruning. We are working on features to handle the cases currently
covered by divulgence.
- for applications which support the --metrics-reporter-interval cli option, these now support both the java and scala duration string format (e.g. PT1M30S and 1.5min)
Small fixes to the doc
[HTTP/JSON API]
- metrics reporting can now be chosen via the command line option --metrics-reporter (currently hidden), valid options are console, csv://, graphite:// and prometheus://
- metrics reporting interval can also now be chosen via a command line option --metrics-reporting-interval (currently hidden)
[ledger api] Add warning about incompatibility between pruning and
divulged contracts to pruning docs.
```

CHANGELOG_BEGIN
CHANGELOG_END

Co-authored-by: Azure Pipelines DAML Build <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants