
LF: Make DarReader ZipEntries immulatble #10243

Merged
merged 6 commits into from
Jul 12, 2021

Conversation

remyhaemmerle-da
Collaborator

CHANGELOG_BEGIN
CHANGELOG_END

Pull Request Checklist

  • Read and understand the contribution guidelines
  • Include appropriate tests
  • Set a descriptive title and thorough description
  • Add a reference to the issue this PR will solve, if appropriate
  • Include changelog additions in one or more commit message bodies between the CHANGELOG_BEGIN and CHANGELOG_END tags
  • Normal production system change, include purpose of change in description

NOTE: For security reasons, CI is not run automatically on pull requests from
non-members. The reviewer will have to comment with /AzurePipelines run to
trigger the build.

```scala
val buffer = new Array[Byte](buffSize)
var output = Bytes.Empty
Iterator.continually(zip.read(buffer)).takeWhile(_ >= 0).foreach { size =>
  output = output ++ Bytes.fromByteArray(buffer, 0, size)
}
```
Collaborator Author

Concatenation is constant time.
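
The loop above reads a zip entry in fixed-size chunks and appends each chunk to an accumulator, relying on cheap concatenation. A rough illustration in Java, using `ByteArrayOutputStream` as a stand-in for the immutable `Bytes` type (the names `ChunkedRead` and `readAll` are made up for this sketch, not taken from the PR):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class ChunkedRead {
    // Read an InputStream fully in fixed-size chunks, mirroring the Scala
    // loop above; ByteArrayOutputStream plays the role of the Bytes accumulator.
    static byte[] readAll(InputStream in, int buffSize) throws IOException {
        byte[] buffer = new byte[buffSize];
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        int size;
        while ((size = in.read(buffer)) >= 0) {
            output.write(buffer, 0, size); // append this chunk
        }
        return output.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello, dar entries".getBytes();
        System.out.println(new String(readAll(new ByteArrayInputStream(data), 4)));
        // prints: hello, dar entries
    }
}
```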

@remyhaemmerle-da remyhaemmerle-da added the component/daml-engine DAML-LF Engine & Interpreter label Jul 12, 2021
@remyhaemmerle-da remyhaemmerle-da added this to the Maintenance milestone Jul 12, 2021
@remyhaemmerle-da remyhaemmerle-da marked this pull request as ready for review July 12, 2021 15:41
Contributor

@cocreature cocreature left a comment

seems sensible, thanks!

```scala
go(Success(Map.empty)).map(ZipEntries(name, _))
}
): Try[ZipEntries] =
bracket(Try(darStream))(is => Try(is.close()))
```
Contributor
I’m not sure this is a good idea. IMHO, a function should not close a stream it did not open, and we did not open the zip stream here.
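
The ownership rule the reviewer is invoking can be sketched in Java with a hypothetical `countBytes` consumer (names invented for illustration): whoever opens a stream closes it, typically via try-with-resources, while a function that is merely handed an open stream leaves it open.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class StreamOwnership {
    // Hypothetical consumer: it did not open `in`, so it must not close it.
    static int countBytes(InputStream in) throws IOException {
        int n = 0;
        while (in.read() >= 0) n++;
        return n; // deliberately no in.close() here: the caller owns the stream
    }

    public static void main(String[] args) throws IOException {
        // The caller opened the stream, so the caller closes it.
        try (InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3})) {
            System.out.println(countBytes(in)); // prints: 3
        }
    }
}
```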

Collaborator Author


```scala
def fromByteBuffer(a: ByteBuffer): Bytes =
  new Bytes(ByteString.copyFrom(a))

def fromInputStream(a: InputStream): Bytes =
  new Bytes(ByteString.readFrom(a))

// reads at most `len` bytes, does not close `a`
def fromInputStream(a: InputStream, len: Int): Bytes = {
```
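
The contract stated in the comment (read at most `len` bytes, leave the stream open) can be sketched as follows. This is a Java illustration under the class name `BoundedRead`, not the actual `Bytes.fromInputStream` implementation:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class BoundedRead {
    // Read at most `len` bytes from `in`, never closing it. A single
    // InputStream.read call may return fewer bytes than requested, so we
    // loop until we have `len` bytes or hit end-of-stream.
    static byte[] readAtMost(InputStream in, int len) throws IOException {
        byte[] buffer = new byte[len];
        int off = 0;
        while (off < len) {
            int read = in.read(buffer, off, len - off);
            if (read < 0) break; // EOF before `len` bytes: return the shorter prefix
            off += read;
        }
        return off == len ? buffer : Arrays.copyOf(buffer, off);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("abcdef".getBytes());
        System.out.println(new String(readAtMost(in, 4))); // prints: abcd
        System.out.println(new String(readAtMost(in, 4))); // prints: ef (stream left open and readable)
    }
}
```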
Contributor
this seems like a fair amount of complexity and potential source of bugs. do the performance gains matter?

Collaborator Author

@remyhaemmerle-da remyhaemmerle-da Jul 12, 2021

I reverted the optimization. Do you prefer it like that?

Contributor
Thanks, that seems simpler! And just to be clear, I’m happy to make an optimization here if it provides a benefit, just not without some measurements.

Collaborator Author
👍

Comment on lines 92 to 98
```scala
private[GenDarReader] def readDalfNames: Try[Dar[String]] =
  bracket(get(ManifestName).map(_.toInputStream))(is => Try(is.close()))
    .flatMap(DarManifestReader.dalfNames)
    .recoverWith { case NonFatal(e1) =>
      findLegacyDalfNames.recoverWith { case NonFatal(_) =>
        Failure(Error.InvalidDar(this, e1))
      }
    }
```
Collaborator Author
@cocreature Do we have to handle DARs without a manifest?
If not, let me drop this code; otherwise, should we have a test for that? (It seems we currently do not have any.)

Contributor

The compiler will always add a manifest, and I’d generally consider anything not produced by the compiler to be unsupported, so it should be fine to drop it.
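
Since the compiler always writes a manifest, a reader can simply fail fast on archives without one. A minimal Java sketch of that policy (the class name, error message, and `Main-Dalf` attribute here are illustrative, not the DarReader’s actual code):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ManifestCheck {
    // Scan the archive for META-INF/MANIFEST.MF and fail if it is missing,
    // matching the decision to treat manifest-less DARs as unsupported.
    static String readManifest(byte[] darBytes) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(darBytes))) {
            for (ZipEntry entry; (entry = zip.getNextEntry()) != null; ) {
                if (entry.getName().equals("META-INF/MANIFEST.MF")) {
                    return new String(zip.readAllBytes()); // reads to the end of this entry
                }
            }
        }
        throw new IOException("invalid DAR: no manifest");
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny DAR-like zip in memory containing only a manifest entry.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buf)) {
            out.putNextEntry(new ZipEntry("META-INF/MANIFEST.MF"));
            out.write("Main-Dalf: a.dalf\n".getBytes());
            out.closeEntry();
        }
        System.out.print(readManifest(buf.toByteArray())); // prints: Main-Dalf: a.dalf
    }
}
```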

Collaborator Author

Great!

Collaborator Author

🔥

@mergify mergify bot merged commit 6fe6ae0 into main Jul 12, 2021
@mergify mergify bot deleted the remy-lf-zipentries branch July 12, 2021 21:13
azure-pipelines bot pushed a commit that referenced this pull request Jul 14, 2021
This PR has been created by a script, which is not very smart
and does not have all the context. Please do double-check that
the version prefix is correct before merging.

@akshayshirahatti-da is in charge of this release.

Commit log:
```
1f35db1 ledger-on-sql: Use random log entry ID allocation so we don't depend on `SeedService`. [KVL-1002] (#10255)
b8e2198 Separate traces from warnings in engine. (#10253)
0a7f2b1 [JSON-API] Small refactoring to start using akka http routes internally (#10252)
6e8ec1d LF: Drop old depreated code (#10251)
de7a08f [In-memory fan-out] Handle `null` submitters in getTransactionLogUpdates (#10248)
584169a Participant state v1-to-v2 adaptors [KVL-1002] (#10246)
e4d7cd7 LF: Deprecate com.daml.lf.data.TryOps.Bracket (#10249)
b59f365 [JSON-API] Correctly extract the request source URL/IP (#10244)
edaf541 Adapt `ledger-api-bench-tool` to work with LR [DPP-424] (#10171)
6fe6ae0 LF: Make DarReader ZipEntries immulatble (#10243)
05e5218 Consolidating EventStorageBackend (#10224)
3fd3abf Instrument buffer after contract state events stream (#10209)
643a2de v2 participant state API draft [KVL-998] (#10210)
5abcd04 Adds Unit tests for parallel-ingestion [DPP-455] (#10238)
3cdedcf kvutils: Extract validators from TransactionCommitter [KVL-1015] (#10235)
00d622f Make @UNTIL-LF exclusive, add @UNTIL-LF-FEATURE. (#10240)
2bcbd4e es: switch to persistent nodes (#10236)
7f4bc2a update compat versions for 1.14.2 (#10233)
4eba4b0 Ledger: decouple BD value serialization library from archive library. (#10218)
729afa8 Write a few lines on how to run tests against an Oracle database (#10230)
2b67ebb tf: refactor appr var (#10232)
202b7f7 add akshay to appr team (#10229)
2c5410c Release SDK 1.14.2 (#10226)
89d87ad update compat versions for 1.14.2-snapshot.20210708.7135.0.aa297840 (#10223)
4839344 TODO Cleanup: QueryNonPruned [DPP-459] (#10225)
0bea5e3 Allow retriable lookupMaximumLedgerTime on contention (#10211)
ca2a9e9 Release 1.14.2 snapshot (#10222)
999577a tweak ES cluster (#10219)
6680d36 Warn on DA.Exception import (#10201)
f19f5b0 LF: Simplify DarReader (#10217)
8f1a8f2 Release DAML SDK 1.14.1 (#10204)
ebb76dc LF: reorganize errors in com.daml.lf.archive (#10213)
6964645 Improvements for validation tests (#10214)
5f124c3 Avoid collision in execution log postfix (#10205)
9e27ae0 Reshuffle cocreature in daml-lf codeowners (#10215)
41b8448 LF: Simplify archive reader. (#10208)
38734f0 es-feed: ignore invalid files (#10207)
6586dde Move exercise context computation to the client side (#10199)
c92c678 Mark Daml profiler as stable (#10203)
cb1f4ec ci/windows: disable spool (#10200)
2326d42 Publish execution logs on unix platforms (#10194)
4269345 Drop cleanup of node_modules (#10195)
c8faed8 Deduplicate Java codegen running instructions (#10185)
e4c8c39 update compat versions for 1.14.1-snapshot.20210706.7133.0.8ec16e94 (#10196)
0d881f5 Improvements to the documentation with regards to offsets (#10180)
1d5ba4f feed elasticsearch cluster (#10193)
e2bdca6 Use PartialFunction for more concise code. (#10191)
fe98c48 release 1.14.1-snapshot.20210706.7133.0.8ec16e94 (#10186)
4fe626a Drop logging on CompletionSource (#10190)
582aa5f Fix a typo in an exceptions example. (#10188)
8750c0c reduce noise in Scala bindings (#10187)
98b5ffe Add divulgence warning and test in script service. (#10179)
8578e56 Tests for transaction validation (#10167)
05a72a3 update compat versions for 1.15.0-snapshot.20210705.7286.0.62aabcc4 (#10184)
7b93923 [In-memory fan-out] Performance optimizations [DPP-470] (#10127)
6c49619 Document how to run the Java codegen from the Daml assistant (#10181)
61aa774 Release SDK 1.15 RC (#10182)
```
Changelog:
```
- [JSON-API] If the service is put behind a proxy filling either of these headers X-Forwarded-For & X-Real-Ip then these will now be respected for logging the request source ip/url
- [Daml Compiler] Importing DA.Exception on LF < 1.14 now produces a
warning that exceptions require Daml-LF >= 1.14.
- [Daml Profiler] The Daml profiler is now a stable feature.
[Docs] Improvements to the documentation with regards to offsets
[Docs] Document how to use the Java codegen using the Daml assistant
```

CHANGELOG_BEGIN
CHANGELOG_END
cocreature added a commit that referenced this pull request Jul 14, 2021
* release 1.15.0-snapshot.20210713.7343.0.1f35db17


* 1.16 not 1.15

changelog_begin
changelog_end

Co-authored-by: Azure Pipelines DAML Build <[email protected]>
Co-authored-by: Moritz Kiefer <[email protected]>
cocreature added a commit that referenced this pull request Jul 15, 2021
I haven’t found any conclusive information as to why ON COMMIT doesn’t
work incrementally but
https://docs.oracle.com/en/database/oracle/oracle-database/19/adjsn/json-query-rewrite-use-materialized-view-json_table.html#GUID-8B0922ED-C0D1-45BD-9588-B7719BE4ECF0
recommends that for rewriting (which isn’t what we do here but both
involve a materialized view on json_table).

Benchmarks:

Before:
Benchmark            (batchSize)  (batches)  (numContracts)  Mode  Cnt  Score   Error  Units
InsertBenchmark.run         1000          1            1000  avgt    5  0.327 ± 0.040   s/op
InsertBenchmark.run         1000          3            1000  avgt    5  0.656 ± 0.043   s/op
InsertBenchmark.run         1000          5            1000  avgt    5  1.034 ± 0.051   s/op
InsertBenchmark.run         1000          7            1000  avgt    5  1.416 ± 0.106   s/op
InsertBenchmark.run         1000          9            1000  avgt    5  1.734 ± 0.143   s/op
QueryBenchmark.run          1000         10             N/A  avgt    5  0.071 ± 0.016   s/op

After:
Benchmark            (batchSize)  (batches)  (numContracts)  Mode  Cnt  Score   Error  Units
InsertBenchmark.run         1000          1            1000  avgt    5  0.217 ± 0.034   s/op
InsertBenchmark.run         1000          3            1000  avgt    5  0.232 ± 0.027   s/op
InsertBenchmark.run         1000          5            1000  avgt    5  0.226 ± 0.051   s/op
InsertBenchmark.run         1000          7            1000  avgt    5  0.225 ± 0.048   s/op
InsertBenchmark.run         1000          9            1000  avgt    5  0.232 ± 0.021   s/op
QueryBenchmark.run          1000         10             N/A  avgt    5  0.080 ± 0.014   s/op

The difference in query times is just noise and changes across runs.

So we get the expected behavior of inserts being independent of the
total ACS size now. We could still explore if we gain something by
avoiding the materialized view to reduce constant factors but that’s
much less of an issue.

fixes #10243

changelog_begin
changelog_end
cocreature added a commit that referenced this pull request Jul 15, 2021
mergify bot pushed a commit that referenced this pull request Jul 15, 2021