Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data dependencies support for reexports #11147

Merged
merged 13 commits into from
Oct 18, 2021
Merged

Data dependencies support for reexports #11147

merged 13 commits into from
Oct 18, 2021

Conversation

akrmn
Copy link
Contributor

@akrmn akrmn commented Oct 6, 2021

Pull Request Checklist

  • Read and understand the contribution guidelines
  • Include appropriate tests
  • Set a descriptive title and thorough description
  • Add a reference to the issue this PR will solve, if appropriate
  • Include changelog additions in one or more commit message bodies between the CHANGELOG_BEGIN and CHANGELOG_END tags
  • Normal production system change, include purpose of change in description

NOTE: CI is not automatically run on non-members pull-requests for security
reasons. The reviewer will have to comment with /AzurePipelines run to
trigger the build.

@akrmn akrmn force-pushed the data-deps-reexports branch from e76eb75 to 71f1c40 Compare October 6, 2021 15:33
]
]

, dataDependenciesTest "Using reexported values"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strictly speaking this is a different issue than the one described in #10772, but I think they can both be taken care of with the same solution, namely using the md_exports field of ModDetails, which includes all values, types and classes exported by a module regardless of their origin.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we support more than just full module reexports that’s even better 👍

@@ -488,7 +488,14 @@ convertModule lfVersion pkgMap stablePackages isGenerated file x depOrphanModule
templates <- convertTemplateDefs env
exceptions <- convertExceptionDefs env
interfaces <- convertInterfaces env (eltsUFM (cm_types x))
pure (LF.moduleFromDefinitions lfModName (Just $ fromNormalizedFilePath file) flags (types ++ templates ++ exceptions ++ definitions ++ interfaces ++ depOrphanModules))
let defs =
types
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got a bit tired of the super long line, so I factored this out. Please let me know if this is the style you'd prefer for splitting the expression into lines, or if you prefer operators at the end or something different.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t think we have any strong guidelines here. This looks perfectly reasonable.

@akrmn akrmn force-pushed the data-deps-reexports branch from 71f1c40 to 4db8fe7 Compare October 14, 2021 13:14
@akrmn akrmn marked this pull request as ready for review October 14, 2021 13:15
@akrmn akrmn requested a review from cocreature as a code owner October 14, 2021 13:15
@akrmn akrmn force-pushed the data-deps-reexports branch 2 times, most recently from 835def6 to 4dda6a7 Compare October 14, 2021 13:44
@akrmn
Copy link
Contributor Author

akrmn commented Oct 14, 2021

Please note that a few tests are failing, weirdly saying that a type or class defined in that module is not in scope, for example,

File:
  .daml/package-database/1.dev/daml-prim-c63a11a5757a45ef46e4df0f86039939c571b5c298fb7e6e1523dc0bdf102082-c63a11a5757a45ef46e4df0f86039939c571b5c298fb7e6e1523dc0bdf102082/GHC/Enum.daml
Hidden:   no
Range:    14:9-16:68
Source:   typecheck
Severity: DsError
Message:
  .daml/package-database/1.dev/daml-prim-c63a11a5757a45ef46e4df0f86039939c571b5c298fb7e6e1523dc0bdf102082-c63a11a5757a45ef46e4df0f86039939c571b5c298fb7e6e1523dc0bdf102082/GHC/Enum.daml:14:9:
  error:
  • Not in scope: type constructor or class ‘GHC.Enum.Enum’
  • In the export: GHC.Enum.Enum(GHC.Enum.succ, GHC.Enum.pred,
  GHC.Enum.toEnum, GHC.Enum.fromEnum, GHC.Enum.enumFrom,
  GHC.Enum.enumFromThen, GHC.Enum.enumFromTo,
  GHC.Enum.enumFromThenTo)

@akrmn
Copy link
Contributor Author

akrmn commented Oct 14, 2021

The error with GHC.Enum (and similar) can probably be fixed by filtering out any values/types defined in the module itself, but I'm not sure if that's the reason for all the failures. I'll try that for now.

@akrmn
Copy link
Contributor Author

akrmn commented Oct 14, 2021

That silenced some errors (and some warnings), but I'm still getting quite a few errors such as this one:

File:
  .daml/package-database/1.dev/daml-stdlib-26def009ea16788210e60d5b5927068ffe8a492b441937798898a4f317bfbb5b-26def009ea16788210e60d5b5927068ffe8a492b441937798898a4f317bfbb5b/DA/Internal/RebindableSyntax.daml
Hidden:   no
Range:    37:8-37:85
Source:   not found
Severity: DsError
Message:
  Could not find module
  ‘Pkg_312f93eed15148f6b5d47ac144b47c510af030d99135277f7a890156ad068540.GHC.Real’
  Perhaps you meant
  Pkg_312f93eed15148f6b5d47ac144b47c510af030d99135277f7a890156ad068540.GHC.Base (defined via package
  flags to be GHC.Base)
  Pkg_312f93eed15148f6b5d47ac144b47c510af030d99135277f7a890156ad068540.GHC.Enum (defined via package
  flags to be GHC.Enum)
  Pkg_312f93eed15148f6b5d47ac144b47c510af030d99135277f7a890156ad068540.GHC.Err (defined via package
  flags to be GHC.Err)
Failed to compile interface for data-dependency: daml-stdlib-26def009ea16788210e60d5b5927068ffe8a492b441937798898a4f317bfbb5b

@akrmn akrmn force-pushed the data-deps-reexports branch 3 times, most recently from c22e09d to 70b2ac3 Compare October 14, 2021 15:31
Copy link
Contributor

@cocreature cocreature left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, thank you!

@@ -488,7 +488,14 @@ convertModule lfVersion pkgMap stablePackages isGenerated file x depOrphanModule
templates <- convertTemplateDefs env
exceptions <- convertExceptionDefs env
interfaces <- convertInterfaces env (eltsUFM (cm_types x))
pure (LF.moduleFromDefinitions lfModName (Just $ fromNormalizedFilePath file) flags (types ++ templates ++ exceptions ++ definitions ++ interfaces ++ depOrphanModules))
let defs =
types
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t think we have any strong guidelines here. This looks perfectly reasonable.

]
]

, dataDependenciesTest "Using reexported values"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we support more than just full module reexports that’s even better 👍

@@ -20,24 +20,68 @@ def get_text(pkg): .text_interned_str | resolve_interned_string(pkg);

def norm_ty(pkg): if has("interned") then pkg.interned_types[.interned] else . end;

# @SINCE-LF 1.7
def norm_qualified_module(pkg; f):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we could factor out a generic struct normalization or something but not quite sure that works and definitely does not have to happen in this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah! I've wished for that way too many times while working on this

. (\(LFC.QualName q) -> LF.qualModule q)
. exportName
where
forbiddenModules =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that the same list targeted by

isInternal (GHC.moduleNameString -> x)
and can we use that function instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aha! So far it is, thanks for sharing this!

availInfoToExportInfo = \case
GHC.Avail name -> ExportInfoVal
<$> convertQualName name
GHC.AvailTC name pieces fields -> ExportInfoTC
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some way to detect a full module reexport? I’m slightly worried about the size of this expression exploding if you reexport a hole module that defines lots of identifiers. Probably not a huge issue but if we could detect those somehow and store them as a single "reexport this module" part instead of this that seems potentially much nicer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn’t have to be in this PR but probably worth thinking about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not with md_exports :( I'd have to look for a different API, probably something closer to the parser

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm yeah looks like we can only get it from hsmodExports. Another option to avoid large expressions from blowing up could be to emit multiple export expressions at the extreme end one per export but we could also do some form of batching like 5 exports max per item or whatever a sensible number is.

A large number of expressions isn’t as problematic as a single very large expression because the latter will be handled relatively poorly by some of our code.

@akrmn
Copy link
Contributor Author

akrmn commented Oct 14, 2021

I iteratively added a few modules to skip, for now,

                        [ "DA.Internal.Compatible"
                        , "DA.Internal.LF"
                        , "DA.Internal.Prelude"
                        , "GHC.Base"
                        , "GHC.Classes"
                        , "GHC.CString"
                        , "GHC.Enum"
                        , "GHC.Integer.Type"
                        , "GHC.Num"
                        , "GHC.Prim"
                        , "GHC.Real"
                        , "GHC.Show"
                        ]

Now most data-dependencies tests pass, but this feels extremely ad-hoc and prone to break :/

A few tests still fail, and they mention missing values from modules such as DA.List or DA.Action.State.Class, which really don't feel like they belong in the list above.

@cocreature
Copy link
Contributor

A few tests still fail, and they mention missing values from modules such as DA.List or DA.Action.State.Class, which really don't feel like they belong in the list above.

which test specifically is failing?

@akrmn
Copy link
Contributor Author

akrmn commented Oct 14, 2021

A few tests still fail, and they mention missing values from modules such as DA.List or DA.Action.State.Class, which really don't feel like they belong in the list above.

which test specifically is failing?

These two:

  • Cross DAML-LF version: 1.6 -> 1.7
  • Cross DAML-LF version: 1.7 -> 1.8

EDIT: which might make sense if the missing functions from DA.List weren't available before DAML-LF 1.8 ?

@cocreature
Copy link
Contributor

EDIT: which might make sense if the missing functions from DA.List weren't available before DAML-LF 1.8 ?

Thanks will take a look later today or early tomorrow. Your idea sounds like it might be on the right track but I haven’t thought about it too deeply.

@cocreature
Copy link
Contributor

I’ve looked into it and I think I understand the failure: data-dependencies for LF < 1.8 are very limited. Specifically, typeclasses and typeclass-constrained methods are not available here (they depend on LF type synonyms which only got introduced in 1.8).

So I think there are two options:

  1. Don’t emit those export markers for LF < 1.8.
  2. For LF < 1.8 filter out the things that rely on typeclasses.

2 seems harder to implement and at this point nobody should be on LF < 1.8 (it’s more than 2 years old iirc) so I’d suggest to go with 1.

@akrmn
Copy link
Contributor Author

akrmn commented Oct 15, 2021

thanks for looking into it @cocreature! I'll go with option 1

@akrmn akrmn force-pushed the data-deps-reexports branch 2 times, most recently from 04617bc to 9ccda71 Compare October 15, 2021 14:27
@akrmn
Copy link
Contributor Author

akrmn commented Oct 15, 2021

@akrmn akrmn force-pushed the data-deps-reexports branch 3 times, most recently from ab2323e to d7a32cc Compare October 18, 2021 08:19
@akrmn
Copy link
Contributor Author

akrmn commented Oct 18, 2021

[An] option to avoid large expressions from blowing up could be to emit multiple export expressions at the extreme end one per export but we could also do some form of batching like 5 exports max per item or whatever a sensible number is.

A large number of expressions isn’t as problematic as a single very large expression because the latter will be handled relatively poorly by some of our code.

@cocreature, how would I find a sensible number for the size of these batches? If that's too much trouble, I could go ahead with one expression per GHC.AvailInfo. That leads me to a design question, should these expressions simply be named $exports_<i> or do you have something else in mind?

@cocreature
Copy link
Contributor

how would I find a sensible number for the size of these batches? If that's too much trouble, I could go ahead with one expression per GHC.AvailInfo. That leads me to a design question, should these expressions simply be named $exports_<i> or do you have something else in mind?

I don’t think there is a supersystematic way of coming up with these expressions unfortunately. I’d probably use 10 or so as a start but it’s definitely somewhat arbitrary.

As for naming, yeah just appending a number sounds perfectly fine.

@akrmn akrmn force-pushed the data-deps-reexports branch 2 times, most recently from d3f991c to ef7e273 Compare October 18, 2021 10:09
@akrmn akrmn force-pushed the data-deps-reexports branch from ef7e273 to 22db0a0 Compare October 18, 2021 12:39
@akrmn akrmn enabled auto-merge (squash) October 18, 2021 14:40
@akrmn akrmn merged commit 6255837 into main Oct 18, 2021
@akrmn akrmn deleted the data-deps-reexports branch October 18, 2021 14:51
azure-pipelines bot pushed a commit that referenced this pull request Oct 20, 2021
This PR has been created by a script, which is not very smart
and does not have all the context. Please do double-check that
the version prefix is correct before merging.

@sofiafaro-da is in charge of this release.

Commit log:
```
8ff347d Expand type synonyms lazily (#11282)
2fd200f KV: support contextual logging in KeyValueConsumption [KVL-1143] (#11288)
f4df1cc converting server errors to proper client errors (#11184)
525e4ce Add GrpcStatus.toProto(JavaStatus) -> ScalaStatus conversion [KVL-1143] (#11292)
9b00a1a Rotate release rotation (#11291)
dd09c38 Upgrade rules-nodejs (#11290)
87f1418 [Short] Remove unnecessary traits from ApiSubmissionServiceSpec (#11254)
6b65a72 Merge npm install steps in create-daml-app tests (#11287)
a9f6afb kvutils: Rename `VersionedOffset` to `KVOffset`. (#11286)
98cf8d8 KV: introduce v2 error codes behind a CLI switch [KVL-1140] (#11224)
46f6877 Increase time model skew limits (#11273)
8f94cff kvutils: Use `VersionedOffsetBuilder` where possible, and introduce `VersionedOffsetMutator`. [KVL-1154] (#11277)
81fde97 Bazel: Call `_wrap_rule` directly when building the Scala REPL rule. (#11279)
edb2b04 Document scaladoc usage in Bazel (#11278)
6255837 Data dependencies support for reexports (#11147)
2fc7490 [Self-service error codes] Adapt error factories [DPP-656] (#11270)
b1a6b11 ledger-api-test-tool: Add static time awareness [KVL-1156] (#11266)
243c120 Add an LF typechecking benchmark (#11276)
2db8c13 Test authorization within Exercise choice. (#11246)
ec58ed6 Typecheck nested LF type applications more efficiently (#11253)
a940016 Add debugRaw corresponding to traceM in Haskell (#11259)
a988579 Fix Scala repl targets (#11272)
e4808f6 Extract UtilLF module from daml-lf-conversion to its own library (#11263)
38712c0 sandbox-next - Disable participant command deduplication [KVL-1156] (#11264)
82fa229 Add daml-util-ghc lib (#11260)
68a93cf kvutils - Use the ledger configured time for command dedup [KVL-1149] (#11239)
c8cede0 Bump startup time in create-daml-app tests (#11261)
4fac87f Bump the schema version for the JSON API (#11252)
5507670 update NOTICES file (#11256)
0063b10 Retry faster during startup (#11255)
04feb40 Improve reset service tests (#11240)
b3375fd update compat versions for 1.18.0-snapshot.20211013.8071.0.514e8b50 (#11237)
9ed1eb3 Address security notice on `ansi-regex` (#11243)
387d0e8 Make compatibility migrations `eternal` (#11242)
be899b3 Extract rejection_reason.proto from daml_kvutils.proto [KVL-1090] (#11235)
a1d94e1 LF: Create special case class for VersionedContractInstance (#11127)
b738988 Release 1.18 snapshot (#11234)
3c26852 LF: Use template Id in exercise node of fixed choice. (#11229)
139b6f3 CI: Set `PROJ_DIR` inside the bash lib, not outside. (#11236)
ce64cb2 interfaces: Do some TODOs (#11231)
7a88c7d trigger-service: dev-mode-unsafe flag (#11233)
909a1bf [DPP-417][DDP-612] Adapt ApiSubmissionService to support V2 error codes (#11052)
```
Changelog:
```
- [JSON API] Several kinds of gRPC server errors are now reported with
  associated HTTP statuses; for example, a Daml-LF interpreter error now
  returns a 400 instead of a 500, and an exercise on an archived contract
  returns a 409 Conflict instead of a 500.  Errors internal to JSON API
  (e.g. internal assertion failures) are no longer detailed in the HTTP
  response; their details are only logged.
  See `issue #11184 <https://github.com/digital-asset/daml/pull/11184>`__.

- [Daml Stdlib] Add `debugRaw` as a convenience wrapper around
  `traceRaw` when used inside a do-block. `debugRaw` compares to
  `debug` like `traceRaw` compares to `trace` meaning it expects a
  `Text` instead of calling `show` on an expression.

sandbox-next - Disable participant command deduplication

kvutils - Command deduplication uses the configured time model (static/wall) and not running always using wall-clock
[JSON API] Solving a bug that could cause the JSON API to return
correct result if a contract with the same key is observed twice
required a schema change. The JSON API data needs to be dropped
and the query store needs to reset. If you are migrating from a
previous version, either reset your database manually or start
the HTTP JSON API with one of the options that regenerate the
schema (`create-only`, `create-if-needed-and-start`, `create-and-start`).
```

CHANGELOG_BEGIN
CHANGELOG_END
sofiafaro-da pushed a commit that referenced this pull request Oct 20, 2021
This PR has been created by a script, which is not very smart
and does not have all the context. Please do double-check that
the version prefix is correct before merging.

@sofiafaro-da is in charge of this release.

Commit log:
```
8ff347d Expand type synonyms lazily (#11282)
2fd200f KV: support contextual logging in KeyValueConsumption [KVL-1143] (#11288)
f4df1cc converting server errors to proper client errors (#11184)
525e4ce Add GrpcStatus.toProto(JavaStatus) -> ScalaStatus conversion [KVL-1143] (#11292)
9b00a1a Rotate release rotation (#11291)
dd09c38 Upgrade rules-nodejs (#11290)
87f1418 [Short] Remove unnecessary traits from ApiSubmissionServiceSpec (#11254)
6b65a72 Merge npm install steps in create-daml-app tests (#11287)
a9f6afb kvutils: Rename `VersionedOffset` to `KVOffset`. (#11286)
98cf8d8 KV: introduce v2 error codes behind a CLI switch [KVL-1140] (#11224)
46f6877 Increase time model skew limits (#11273)
8f94cff kvutils: Use `VersionedOffsetBuilder` where possible, and introduce `VersionedOffsetMutator`. [KVL-1154] (#11277)
81fde97 Bazel: Call `_wrap_rule` directly when building the Scala REPL rule. (#11279)
edb2b04 Document scaladoc usage in Bazel (#11278)
6255837 Data dependencies support for reexports (#11147)
2fc7490 [Self-service error codes] Adapt error factories [DPP-656] (#11270)
b1a6b11 ledger-api-test-tool: Add static time awareness [KVL-1156] (#11266)
243c120 Add an LF typechecking benchmark (#11276)
2db8c13 Test authorization within Exercise choice. (#11246)
ec58ed6 Typecheck nested LF type applications more efficiently (#11253)
a940016 Add debugRaw corresponding to traceM in Haskell (#11259)
a988579 Fix Scala repl targets (#11272)
e4808f6 Extract UtilLF module from daml-lf-conversion to its own library (#11263)
38712c0 sandbox-next - Disable participant command deduplication [KVL-1156] (#11264)
82fa229 Add daml-util-ghc lib (#11260)
68a93cf kvutils - Use the ledger configured time for command dedup [KVL-1149] (#11239)
c8cede0 Bump startup time in create-daml-app tests (#11261)
4fac87f Bump the schema version for the JSON API (#11252)
5507670 update NOTICES file (#11256)
0063b10 Retry faster during startup (#11255)
04feb40 Improve reset service tests (#11240)
b3375fd update compat versions for 1.18.0-snapshot.20211013.8071.0.514e8b50 (#11237)
9ed1eb3 Address security notice on `ansi-regex` (#11243)
387d0e8 Make compatibility migrations `eternal` (#11242)
be899b3 Extract rejection_reason.proto from daml_kvutils.proto [KVL-1090] (#11235)
a1d94e1 LF: Create special case class for VersionedContractInstance (#11127)
b738988 Release 1.18 snapshot (#11234)
3c26852 LF: Use template Id in exercise node of fixed choice. (#11229)
139b6f3 CI: Set `PROJ_DIR` inside the bash lib, not outside. (#11236)
ce64cb2 interfaces: Do some TODOs (#11231)
7a88c7d trigger-service: dev-mode-unsafe flag (#11233)
909a1bf [DPP-417][DDP-612] Adapt ApiSubmissionService to support V2 error codes (#11052)
```
Changelog:
```
- [JSON API] Several kinds of gRPC server errors are now reported with
  associated HTTP statuses; for example, a Daml-LF interpreter error now
  returns a 400 instead of a 500, and an exercise on an archived contract
  returns a 409 Conflict instead of a 500.  Errors internal to JSON API
  (e.g. internal assertion failures) are no longer detailed in the HTTP
  response; their details are only logged.
  See `issue #11184 <https://github.com/digital-asset/daml/pull/11184>`__.

- [Daml Stdlib] Add `debugRaw` as a convenience wrapper around
  `traceRaw` when used inside a do-block. `debugRaw` compares to
  `debug` like `traceRaw` compares to `trace` meaning it expects a
  `Text` instead of calling `show` on an expression.

sandbox-next - Disable participant command deduplication

kvutils - Command deduplication uses the configured time model (static/wall) and not running always using wall-clock
[JSON API] Solving a bug that could cause the JSON API to return
correct result if a contract with the same key is observed twice
required a schema change. The JSON API data needs to be dropped
and the query store needs to reset. If you are migrating from a
previous version, either reset your database manually or start
the HTTP JSON API with one of the options that regenerate the
schema (`create-only`, `create-if-needed-and-start`, `create-and-start`).
```

CHANGELOG_BEGIN
CHANGELOG_END

Co-authored-by: Azure Pipelines Daml Build <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants