Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

F# sig data contains Too Much Information #12746

Closed
dsyme opened this issue Feb 18, 2022 · 7 comments
Closed

F# sig data contains Too Much Information #12746

dsyme opened this issue Feb 18, 2022 · 7 comments
Assignees
Labels
Area-Build Everything related to building F# compiler, tooling and VS extension. Area-Compiler-CodeGen IlxGen, ilwrite and things at the backend Feature Improvement Impact-Low (Internal MS Team use only) Describes an issue with limited impact on existing code. Theme-Performance
Milestone

Comments

@dsyme
Copy link
Contributor

dsyme commented Feb 18, 2022

While working on #11521 with @vzarytovskii we've identified that the F# signature data metadata resource blob contains Too Much Information.

In particular it contains

  • the metadata for "private" declarations in implementation files that don't have signature files
  • the metadata for "internal" declarations when there are no InternalsVisibleTo in the assembly

This means the reference assembly feature will not be quite as effective as we would like , as reference assemblies will get updated even when private implementation details change.

A prototype of the fixes needed is in #12674, however this needs more work. Note that the work needed to reduce the information on the signature bloc is of a different nature to the rest of the work on reference assemblies and is in a sense completely independent from a logical/correctness point of view. It also has some subtle edge cases, and thus we think this fix is best made separately, and perhaps released under preview or else have a command line option to disable it.

The reference assembly feature is still highly useful without this fix as there are plenty of cases where it's still useful without this.

@KevinRansom KevinRansom added the Impact-Low (Internal MS Team use only) Describes an issue with limited impact on existing code. label Mar 28, 2022
@vzarytovskii vzarytovskii added Area-Compiler-CodeGen IlxGen, ilwrite and things at the backend Area-Build Everything related to building F# compiler, tooling and VS extension. labels Jun 1, 2022
@auduchinok
Copy link
Member

In particular it contains
the metadata for "private" declarations in implementation files that don't have signature files
the metadata for "internal" declarations when there are no InternalsVisibleTo in the assembly

Keeping this info could really help if we want to use it later for, say, decompiling or debug related features. If one day we want to use signature data for a decompiler-like feature, it would be sad to find out that some needed info is missing, and we wouldn't be able to help users to do anything with assemblies coming from NuGet libraries in that case.

If this signature info is not taking too much space, maybe it's better to keep it?

@vzarytovskii
Copy link
Member

If this signature info is not taking too much space, maybe it's better to keep it?

Problem is MVID is generated taking this info into account, resulting in it to be different in case if private function arity or parameter names are changed.

And MVID will be used to decide whether reference assemblies shall be rebuilt.

That is the main reason we would like to trim it.

Theoretically, MVID calculation logic can be changed to only account for "public" sigdata members, but it will require substantial changes to the ilwriter.

@auduchinok
Copy link
Member

Theoretically, MVID calculation logic can be changed to only account for "public" sigdata members, but it will require substantial changes to the ilwriter.

That sounds like a much better solution, though, indeed it'd involve more work

@vzarytovskii vzarytovskii added this to the Backlog milestone Sep 21, 2022
@charlesroddie
Copy link
Contributor

The major problem here is that local functions are affected, not just top level explicit private/internal functions.

Original:

let f() = 1 + 1

Effects of changes:

let f() = 2 // Ok: no rebuild of dependent projects
let f() = // Error: Rebuild of dependent projects since an inner g function is defined
    let g(i: int) = i + i
    g 1

My estimates based on typical changes and builds in my work:

Currently reference assemblies in F# are performing at 25% of their potential, with benefits applying for changes that don't involve creating or changing functions. These are often formatting changes, or amendments to simple calculations or numbers or strings. This 25% is already a very noticeable improvement in the experience of programming in F#.

Enabling this for inner functions would get this to 90%, since working with functions is extremely common in F#, with the remaining 10% from removing explicitly defined top-level private/internal functions.

@vzarytovskii says on slack:

I guess we shall first ignore those functions when calculating mvid, and later on consider not generating them in signature blob

@charlesroddie
Copy link
Contributor

charlesroddie commented Feb 27, 2023

The previous conv has gone down a bad path, with a suggestion of intentionally including information in a signature blob which is not part of the public signature, i.e. intentional incorrectness! That should never even be considered. The spurious info should just be deleted.

Fixing the incorrectness (deleting sig info), rather than doing a workaround (preserving it and deleting mvids) is not only correct, but also easier (according to the above conv) and will also improve the responsiveness of the compiler because there is less reading to do, which may make working in IDEs on F# projects which depend on other F# projects more responsive.

@KevinRansom this issue is not Impact-Low but will be one of the top 10 improvements in F# in the last 10 years! It could save F# devs about 15mins per day of unnecessary builds.

@vzarytovskii
Copy link
Member

vzarytovskii commented Feb 27, 2023

The previous conv has gone down a bad path, with a suggestion of intentionally including information in a signature blob which is not part of the public signature, i.e. intentional incorrectness!

It is not intentional incorrectness, it is part of initial compiler design.

That should never even be considered. The spurious info should just be deleted.

It is a potential breaking change, so it cannot be "just deleted", it needs to be aproached very carefully.

Fixing the incorrectness (deleting sig info), rather than doing a workaround (preserving it and deleting mvids) is not only correct, but also easier (according to the above conv) and will also improve the responsiveness of the compiler because there is less reading to do, which may make working in IDEs on F# projects which depend on other F# projects more responsive.

It won't affect majority of the compiler flow, since reading is very fast, as is calculating mvid.

@KevinRansom Kevin Ransom FTE this issue is not Impact-Low but will be one of the top 10 improvements in F# in the last 10 years! It could save F# devs about 15mins per day of unnecessary builds.

Issue is Impact-Low, because the whole team agreed on it on triage, it is internal, it does not signal what the priority is, of where it is in planning. We also shouldn't be adding impact on non-bugs, not entirely sure why is it here.

Improvements to the type-checker or incremental building can (and probably will) result in much, much more significant improvements, rather than reference assemblies, since in majority of the cases, typechecking will still be involved even when reference assemblies are involved.

@T-Gro T-Gro self-assigned this Mar 6, 2023
@T-Gro T-Gro modified the milestones: Backlog, March-2023 Mar 6, 2023
@vzarytovskii vzarytovskii modified the milestones: March-2023, April-20203 Apr 3, 2023
@vzarytovskii vzarytovskii removed this from the April-2023 milestone May 4, 2023
@T-Gro T-Gro added this to the May-2023 milestone May 4, 2023
@T-Gro
Copy link
Member

T-Gro commented Jul 25, 2023

As of now, the sig data remains as it was (= "too much"), however the MVID calculation does not take into account which eliminates the main drawback for ref assembly effectiveness.

I will close this issue for now, shall be reopened if there is another reason (besides ref assemblies) to do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Build Everything related to building F# compiler, tooling and VS extension. Area-Compiler-CodeGen IlxGen, ilwrite and things at the backend Feature Improvement Impact-Low (Internal MS Team use only) Describes an issue with limited impact on existing code. Theme-Performance
Projects
Archived in project
Development

No branches or pull requests

7 participants