Ensure that Language Version support flows to Parser #6891

Merged

Conversation

KevinRansom
Member

So ... over the holiday weekend, I amused myself by trying to figure out a good way to flow the LanguageVersion to the parser.

As a proof of concept I used the wildcard self language feature provided by @gusty because (1) it's easy, and (2) it's wholly implemented in the parser.

In RTM F# this code fails to compile:

open System

type MyClass () =

    member _.FooBar() = printfn "Hello, World";;

Like this:

Microsoft (R) F# Compiler version 10.4.0 for F# 4.6
Copyright (c) Microsoft Corporation. All Rights Reserved.

c:\temp\underscore\Program.fs(8,13): error FS0010: Unexpected symbol '.' in member definition. Expected 'with', '=' or other token.

c:\temp\underscore\Program.fs(14,1): error FS3113: Unexpected end of input in type definition

c:\temp\underscore>

With this PR, it still fails by default, like this:

fsharp\artifacts\bin\fsc\Debug\net472\fsc @underscore.rsp
Microsoft (R) F# Compiler version 10.6.0.0 for F# 4.7
Copyright (c) Microsoft Corporation. All Rights Reserved.

c:\temp\underscore\Program.fs(8,13): error FS0010: Unexpected symbol '.' in member definition. Expected 'with', '=' or other token.

c:\temp\underscore\Program.fs(14,1): error FS3113: Unexpected end of input in type definition

c:\temp\underscore>

With language preview enabled it succeeds:

fsharp\artifacts\bin\fsc\Debug\net472\fsc --langversion:preview @underscore.rsp
Microsoft (R) F# Compiler version 10.6.0.0 for F# 4.7
Copyright (c) Microsoft Corporation. All Rights Reserved.

c:\temp\underscore>

So ...

  • I had to refactor the LanguageVersion/LanguageFeatures and move them out of CompileOps, so that LanguageFeature values can be used to identify features outside CompileOps.
  • I added a SupportsFeature to the LexBuffer, which is easily retrievable using parseState. This is a function that takes a LanguageFeature and returns true/false.

In the parser, the code to verify feature support looks like:

    if not (parseState.LexBuffer.SupportsFeature LanguageFeature.SingleUnderscorePattern) then
        raiseParseErrorAt (rhs parseState 2) (FSComp.SR.parsUnexpectedSymbolDot())
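
As a rough illustration of the shape of this, here is a minimal, self-contained sketch. The types ToyLexBuffer and ToyParseState and the function checkUnderscoreDot are hypothetical stand-ins invented for this example, not the compiler's real LexBuffer or parse state; only the idea of a lex buffer carrying a LanguageFeature -> bool function comes from the description above.

```fsharp
// Hypothetical stand-ins: these mirror names used in the PR but are NOT the compiler's real types.
[<RequireQualifiedAccess>]
type LanguageFeature =
    | SingleUnderscorePattern

/// A toy lex buffer that carries the feature predicate alongside its text.
type ToyLexBuffer =
    { Text: string
      SupportsFeature: LanguageFeature -> bool }

/// A toy parse state that exposes the lex buffer, as parseState does in the snippet above.
type ToyParseState = { LexBuffer: ToyLexBuffer }

/// Mimics the guard in the grammar action: report an error when the feature is off.
let checkUnderscoreDot (parseState: ToyParseState) =
    if not (parseState.LexBuffer.SupportsFeature LanguageFeature.SingleUnderscorePattern) then
        Error "Unexpected symbol '.' in member definition. Expected 'with', '=' or other token."
    else
        Ok "member _.FooBar() accepted"

// Two buffers over the same text: one with the feature disabled, one with it enabled.
let source = "member _.FooBar() = ()"
let oldState = { LexBuffer = { Text = source; SupportsFeature = (fun _ -> false) } }
let previewState = { LexBuffer = { Text = source; SupportsFeature = (fun _ -> true) } }
```

With these definitions, checkUnderscoreDot oldState yields an Error and checkUnderscoreDot previewState yields an Ok, which corresponds to the default and --langversion:preview runs shown earlier.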

ToDo:

  • Some tests
  • I need to ensure that a verification function flows to each point marked // @@@@@@@@ in the code.

@dsyme, @gusty let me know what you think

Thanks

| UNDERSCORE DOT pathOp { let (LongIdentWithDots(lid,dotms)) = $3 in (None,LongIdentWithDots(ident("_",rhs parseState 1)::lid, rhs parseState 2::dotms)) }
| UNDERSCORE DOT pathOp {
      if not (parseState.LexBuffer.SupportsFeature LanguageFeature.SingleUnderscorePattern) then
          raiseParseErrorAt (rhs parseState 2) (FSComp.SR.parsUnexpectedSymbolDot())
Contributor

@gusty gusty May 28, 2019

I'm not sure this approach will work for all features. Here, there was effectively already an error when this rule was absent, but what about features where the absence of the rule results not in an error message but in a different parse result?

Moreover, I think the spirit of this switch is to provide a mechanism for breaking changes. When the code for a feature previously resulted in an error (as with the underscore feature), adding the feature is normally not a breaking change.

I'm not sure what the best way to solve this is. Maybe adding a when featureX clause to the rule, but I think the fsy file doesn't support when clauses (though I'm not sure).

Member Author

@gusty,

The current parser does not really allow for retry on pattern matches or filters. Once it arrives in one of these blocks of code, the source code has been successfully parsed. At that point we can pretty much do only one of two things: error out, or modify the AST produced within this block.

If the source is to have a changed behavior based on language version, then produce an AST that reflects the needs of the new language feature, or if the new feature is not supported, then produce the AST required by the old language version. If the source didn't match before then provoke an error.
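
To make the two cases concrete, here is a minimal sketch. Everything in it (the Feature cases, the Ast type, the ParseError exception, and both action functions) is hypothetical and invented for illustration; it only models the decision described above, not the real grammar actions.

```fsharp
// Hypothetical mini-AST and feature names, for illustration only.
type Feature =
    | ChangedMeaning   // the source parsed before, but should now produce a different AST
    | NewSyntaxOnly    // the source was a parse error before

type Ast =
    | OldNode of string
    | NewNode of string

exception ParseError of string

/// Case 1: the text matched in both versions, so choose the AST by version.
let actionForChangedMeaning (supports: Feature -> bool) (text: string) : Ast =
    if supports ChangedMeaning then NewNode text else OldNode text

/// Case 2: the text did not match before, so the old version must provoke an error.
let actionForNewSyntax (supports: Feature -> bool) (text: string) : Ast =
    if supports NewSyntaxOnly then
        NewNode text
    else
        raise (ParseError "Unexpected symbol")
```

In both cases the grammar rule itself has already matched; the only version-dependent part is what the action does afterwards.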


The switch is not about allowing breaking changes in the language. Its purpose is to allow users of the compiler to access a specific version of the language.

We expect source code that compiled and ran using the F# X version of the language to continue to successfully compile and run with F# X+1 or F# X+2 or F# X+99 of the language.

Sometimes we will fix bugs in the implementation of the tooling that will cause the above not to be achieved. Cases where we fixed a bug should not be impacted by this switch. On the other hand, those bugs often become an intrinsic part of the language and never get fixed.

I hope this helps

Kevin

Contributor

Thanks for the explanation. If that's the case, it looks good to me.

@dsyme
Contributor

dsyme commented May 31, 2019

Seems ok to me

/// LanguageFeature enumeration
[<RequireQualifiedAccess>]
type LanguageFeature =
| LanguageVersion46 = 0
Member

Is it really a language feature? Wouldn't it be better as a separate enum for language versions? It could be quite straightforward to check whether a feature is supported by a language version.

val langFeatureSupported: langVersion -> langFeature -> bool

Member Author

@auduchinok,

There are two audiences … compiler users are focused on language version. Thus the langversion switch. Feature developers are focused on their feature. Until a feature is confirmed for a specific language version, it is assigned to the version beyond latest. The developer merely asks if their feature is supported by the specified LanguageVersion, and invokes the feature code or the old version failure path.

When a feature is approved for an RTM language version, the code will be modified to check for that specific language version and the feature flag will disappear. We will do this to keep the compatibility matrix from blowing our minds. The Roslyn guys tell me this is what they do too, because no one was happy with enabling individual features.

type LanguageVersion (specifiedVersion) =

// When we increment language versions here preview is higher than current RTM version
static let languageVersion46 = 4.6m
Member

@auduchinok auduchinok May 31, 2019

I don't think that using numbers is better than enum here.
Numbers could be good as an underlying type inside some LanguageVersion type and wouldn't be exposed this way.

Member Author

Well we do a numeric comparison on the number, to figure out if the version is supported or not. Sure we could compare enum ordinals, but it's not really much of a thing.
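
For readers following along, the scheme discussed in this thread can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the code in the PR: the Feature type, the requiredVersion function, and the value chosen for preview are all hypothetical. Only the use of decimal version numbers, the numeric comparison, and the rule that unconfirmed features sit at a version beyond the latest come from the discussion.

```fsharp
// Hypothetical sketch; names and the preview value below are assumptions.
type Feature =
    | SingleUnderscorePattern

let latestRtm = 4.6m
let preview = 9999m   // assumed: any value higher than every released version

/// Features not yet confirmed for an RTM version are pinned to the preview version.
let requiredVersion feature =
    match feature with
    | SingleUnderscorePattern -> preview

/// A feature is supported when the specified version is at least the required one.
let supportsFeature (specified: decimal) feature =
    specified >= requiredVersion feature
```

Once a feature is approved for a release, only the entry in requiredVersion would change (for example, from preview to a concrete version number), and callers of supportsFeature would stay the same.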

@KevinRansom KevinRansom force-pushed the langversion4wildcardself branch 4 times, most recently from c62d136 to 74dc501 Compare June 2, 2019 06:54
@KevinRansom KevinRansom changed the title [WIP] Ensure that Language Version support flows to Parser Ensure that Language Version support flows to Parser Jun 2, 2019
@KevinRansom
Member Author

@brettfo, @cartermp, Okay I have enabled languageversion feature tests for the wild card feature:
The tests run on coreclr and desktop, for 4.6 "old" and 4.7 "new".

Tests are here:
https://github.com/dotnet/fsharp/pull/6891/files#diff-4064c98175bdeeb7be8021a63f4f6ab3L12

Comparer here:
https://github.com/dotnet/fsharp/pull/6891/files#diff-117f38fa2884da281589cadfe52c7bffR1
I'm pretty pleased with the comparer: it is compatible with the Perl one, only with fewer bugs and better error output. And the regex patterns are not that mysterious.

Note: this reverts the tests modified when the feature was merged back to 4.6-compatible code, and adds new tests that run under 4.7 and 4.6.

The tests run in the fsharpsuite test suite and rely on SingleTestBuild and run.

I have also amended the places where the @@@@@ markers originally were, to flow LanguageVersion via the lexer and parser.
In service.fs, which is intended to be used by tooling rather than the compiler, the isFeatureSupported API always returns true because in Roslyn the compiler is the only place that knows about language version. The tooling knows everything.

@cartermp
Contributor

cartermp commented Jun 2, 2019

Also tagging @TIHan for review

if isStringEmpty content then expect
else expect.Replace(content, "")

let rdr = XmlReader.Create(new StringReader(nocontentxpect))
Member

Why not System.Xml.Linq.XDocument? It's (I think) a much better API than using a reader.
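
For comparison, a minimal sketch of reading a fragment with XDocument is shown below. The XML content and element names are made up for illustration and are not taken from the comparer in this PR; only XDocument.Parse, Descendants, and Attribute are real System.Xml.Linq APIs.

```fsharp
open System.Xml.Linq

// Hypothetical expected-output fragment; the element and attribute names are made up.
let expected = "<errors><error id=\"FS0010\">Unexpected symbol</error></errors>"

// Load the whole fragment into an in-memory tree instead of walking it with a reader.
let doc = XDocument.Parse expected

// Collect the id attribute of every <error> element.
let ids =
    doc.Descendants(XName.Get "error")
    |> Seq.map (fun e -> e.Attribute(XName.Get "id").Value)
    |> Seq.toList
```

The trade-off is that XDocument loads the whole document into memory, which is usually fine for small test baselines.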

Member Author

If I had used Google search for samples I'm sure I would have got the good api, but I used Bing.

@KevinRansom KevinRansom merged commit 522fc42 into dotnet:release/fsharp47 Jun 3, 2019

let FunctionAsLexbuf (bufferFiller: char[] * int * int -> int) : Lexbuf =
LexBuffer<_>.FromFunction bufferFiller
let StringAsLexbuf (supportsFeature: Features.LanguageFeature -> bool, s:string) : Lexbuf =
Contributor

@KevinRansom I know I spoke to you that I thought this approach was fine. Now that I'm looking at this more carefully, I don't think it is a good idea.

This feels like a code smell. I don't like that we need to pass supportsFeature in order to create a lexbuffer, it should not care about this. The only reason why is to be able to access supportsFeature in the parser (.fsy).
Lexbuffers and language features are two separate concerns and should not be intertwined this way; we have made this more complex.

@@ -421,6 +421,7 @@ parsAttributesMustComeBeforeVal,"Attributes should be placed before 'val'"
568,parsAllEnumFieldsRequireValues,"All enum fields must be given values"
569,parsInlineAssemblyCannotHaveVisibilityDeclarations,"Accessibility modifiers are not permitted on inline assembly code types"
571,parsUnexpectedIdentifier,"Unexpected identifier: '%s'"
10,parsUnexpectedSymbolDot,"Unexpected symbol '.' in member definition. Expected 'with', '=' or other token."
Contributor

@TIHan TIHan Jun 4, 2019

It is interesting that we have to put this kind of error message in FSComp. Before, we relied on what the parser would give us for the error. But since we are adding logic for language version, we have no choice but to copy the same error message that the parser would create, and then create a new resource in FSComp in order to raise the same error message in conjunction with language version. While this solves the problem for the language version, there is now potential for error text to be out of sync if the parser decides to change the text for the same kind of error in a different rule. Though, in this case, it is unlikely to change anytime soon.

While we do have other parser errors in FSComp, I don't think we have errors with exact text of what the parser would normally produce on its own; we only have specific parse errors that try to be more descriptive for the user.

This is sort of a code smell in my opinion, but not a big deal. It isn't necessarily due to the language version flag itself; this problem will exist if you try to preserve old parser logic and errors behind any kind of flag. However, I think a flag would only be absolutely necessary if we changed the syntax in a way that broke existing code, which isn't the purpose of the language version flag anyway, nor something that we ever wish to do.

Contributor

@TIHan TIHan left a comment

I realize this has already been reviewed and merged in. I should have reviewed this sooner.

I have a very strong feeling that trying to preserve old parser logic and errors behind any kind of flag in this manner is going to make our parser more complex than before. This is mainly the fault of YACC and the lack of backtracking.

I urge that we do not try to do anything more to our parser regarding language version, which effectively means I think we should halt any more changes to our compiler that try to preserve logic behind the language version flag.

In principle, I love the idea behind the language version flag. I'm just very concerned about how much complexity this will add to our compiler. We have put flags in parts of the compiler before, but the language version flag will be much more invasive.

@cartermp
Contributor

cartermp commented Jun 4, 2019

> In principle, I love the idea behind the language version flag. I'm just very concerned about how much complexity this will add to our compiler. We have put flags in parts of the compiler before, but the language version flag will be much more invasive.

Is the concern about the parser or LangVersion in general? Because we need LangVersion if we ever hope to merge language features in progressively, and it'll be necessary for .NET 5.

I can't speak for the validity of the change with regards to the parser, but we'll also need a way to guard new syntax and supply a meaningful error message about it. Is there a better alternative?

@TIHan
Contributor

TIHan commented Jun 4, 2019

> Is there a better alternative?

There certainly could be. I don't know what it is though. We would need to spend more time looking into it.

> Is the concern about the parser or LangVersion in general?

Both, but I'm putting more concern on the parser.

> Because we need LangVersion if we ever hope to merge language features in progressively, and it'll be necessary for .NET 5

I do not think we need LangVersion to handle this. An alternative is to have a preview/feature flag that enables specific features that we dub as preview-able. This would be less invasive and narrower in scope than a full-on LangVersion flag. As for .NET 5, I don't know what exactly makes it necessary.

@matthid
Contributor

matthid commented Jun 5, 2019

> There certainly could be. I don't know what it is though. We would need to spend more time looking into it.

Can we not just update the parser and emit the "old" error messages later, based on versioning?
This could allow us to emit errors similar to those C# already emits, where it tells us that the code doesn't compile now but would compile with a different compiler setting...
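
One possible reading of this suggestion is sketched below: the parser always accepts the new syntax and records which features it saw, and a later pass reports version errors. This is only an illustration of the idea; the Feature and ParseResult types, the toy parse function, and the message text are all hypothetical and do not describe how the compiler works.

```fsharp
// Hypothetical sketch: names, types, and message text are assumptions.
type Feature =
    | SingleUnderscorePattern

/// What an always-permissive parser might return: the tree plus the features it saw in use.
type ParseResult =
    { Tree: string
      FeaturesUsed: (Feature * int) list }   // feature and the 1-based source line where it was used

/// Toy "parser": accepts the new syntax unconditionally and records where it appeared.
let parse (lines: string list) : ParseResult =
    let used =
        lines
        |> List.mapi (fun i line -> i + 1, line)
        |> List.filter (fun (_, line) -> line.Contains("member _."))
        |> List.map (fun (n, _) -> SingleUnderscorePattern, n)
    { Tree = String.concat "\n" lines; FeaturesUsed = used }

/// Later pass: turn recorded uses into diagnostics for the selected language version.
let checkVersion (supports: Feature -> bool) (result: ParseResult) : string list =
    result.FeaturesUsed
    |> List.filter (fun (f, _) -> not (supports f))
    |> List.map (fun (f, line) ->
        sprintf "line %d: feature %A is not available in this language version" line f)
```

Because the check runs after parsing, the diagnostic can name the feature and suggest a different language version, instead of reproducing the old parser error text.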

@cartermp cartermp added this to the .NET Core 3.0 milestone Jun 17, 2019
@KevinRansom KevinRansom deleted the langversion4wildcardself branch July 2, 2019 22:35
@cartermp cartermp modified the milestones: .NET Core 3.0, 16.3 Aug 5, 2019
8 participants