Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problems with very long lines and/or files #119

Closed
dungpa opened this issue Jan 2, 2014 · 6 comments
Closed

Problems with very long lines and/or files #119

dungpa opened this issue Jan 2, 2014 · 6 comments

Comments

@dungpa
Copy link
Contributor

dungpa commented Jan 2, 2014

We use ranges from F# ASTs in order to look up constants (numbers, strings, etc.). It is handled by https://github.com/dungpa/fantomas/blob/master/src/Fantomas/SourceParser.fs#L24. However, constant lookups are unreliable. Range values produced by F# ASTs are really strange:

  1. For multi-line ranges, F# ASTs have offsets that we have to compensate in order to get correct string values. However, this hack is not always successful (see Incorrect formatting with """ #110).
  2. F# ASTs give bizarre results on long lines (e.g. lines with more than 512 characters) (see Fantomas replacing "" with random stuff #83, "This code fragment consists of illegal F# constructs." on valid F# code #85).

This code fragment

(new CsvFile<string * decimal * decimal>(new Func<obj, string[], string * decimal * decimal>(fun (parent : obj) (row : string[]) -> CommonRuntime.GetNonOptionalValue("Name", CommonRuntime.ConvertString(TextConversions.AsOption(row.[0])), TextConversions.AsOption(row.[0])), CommonRuntime.GetNonOptionalValue("Distance", CommonRuntime.ConvertDecimal("", TextConversions.AsOption(row.[1])), TextConversions.AsOption(row.[1])), CommonRuntime.GetNonOptionalValue("Time", CommonRuntime.ConvertDecimal("", TextConversions.AsOption(row.[2])), TextConversions.AsOption(row.[2]))), new Func<string * decimal * decimal, string[]>(fun (row : string * decimal * decimal) -> [| CommonRuntime.ConvertStringBack(CommonRuntime.GetOptionalValue((let x, _, _ = row in x))); CommonRuntime.ConvertDecimalBack("", CommonRuntime.GetOptionalValue((let _, x, _ = row in x))); CommonRuntime.ConvertDecimalBack("", CommonRuntime.GetOptionalValue((let _, _, x = row in x))) |]), (ProviderFileSystem.readTextAtRunTimeWithDesignTimeOptions @"C:\Dev\FSharp.Data-master\src\..\tests\FSharp.Data.Tests\Data" "" "SmallTest.csv"), "", '"', true, false)).Cache()

always have columns smaller than 512 although it has more than 1000 characters. At some point, there are a range like "line 2 - column 500, line 2 - column 22" as if columns have been wrapped around at 512.

This is the root of bugs #83, #85 and #110.

My questions:

  1. Where shall I report this? fsbugs? F# Compiler repo? F# Compiler Service repo?
  2. Is there any better approach for getting constants? They are useful for getting semantically equivalent values but preserving original styles.
  3. If we stick with this, how do we enhance reliability of look-ups?

cc @dsyme @ovatsus

@v2m
Copy link

v2m commented Jan 8, 2014

As I understand the crux of the problem lies in the way of storing positions used by F# compiler. If you peek into https://github.com/fsharp/fsharp/blob/master/src/fsharp/range.fs#L32 you'll see that position stores everything in one int32 value and reserves 9 bits to store column - this explains why values >= 2 ^10 will overflow

@v2m
Copy link

v2m commented Jan 8, 2014

As I understand this was done to reduce memory pressure because of HUGE amount of 'pos' instances in-flight. You can alter existing implementation of 'pos' though

@dungpa
Copy link
Contributor Author

dungpa commented Jan 8, 2014

Thanks. I see it now. I'll try to increase bit count of columns to see memory pressure.

Another question:

Why do I have to patch a range when it has multiple lines and consists of "\\\n" (e.g. backslash strings) in https://github.com/dungpa/fantomas/blob/master/src/Fantomas/SourceParser.fs#L38? Is there any reason for this discrepancy?

@ghost
Copy link

ghost commented Jan 8, 2014

There will be a reason, but not a good one :) Would be good to remove the source of that problem.

@dsyme dsyme changed the title Broken ranges in F# ASTs Broken ranges in F# ASTs for very long lines and/or files Sep 10, 2017
@dsyme dsyme changed the title Broken ranges in F# ASTs for very long lines and/or files Problems with very long lines and/or files Sep 10, 2017
@cartermp
Copy link

cartermp commented Mar 5, 2019

@nojaf @jindraivanek This may be able to be closed. F# 4.6 (and corresponding FCS) now use larger ranges. Range isn't notable in memory pressure in recent measurements of the F# tools (though boxing due to comparison was until that was recently fixed...)

@nojaf
Copy link
Contributor

nojaf commented Jul 12, 2019

Ranges in newer versions of the FCS have indeed improved, I don't think this is a problem anymore.

@nojaf nojaf closed this as completed Jul 12, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants