Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Encoding problems (maybe because of Encoding.Default?) #128

Closed
MilleBo opened this issue Feb 7, 2020 · 7 comments
Closed

Encoding problems (maybe because of Encoding.Default?) #128

MilleBo opened this issue Feb 7, 2020 · 7 comments
Labels
⚠ Bug Something isn't working as expected

Comments

@MilleBo
Copy link

MilleBo commented Feb 7, 2020

I just tried to build/compile a solution that contains variables with åäö and it seems to fail with errors like: Unexpected character '�'

I then checked the document object and it seems like it put the document encoding to SBCSCodePageEncoding:

CanBeEmbedded: true
ChecksumAlgorithm: Sha1
Container: {Microsoft.CodeAnalysis.Text.SourceText.StaticContainer}
Encoding: {System.Text.SBCSCodePageEncoding}
Length: 8825
Lines: {Microsoft.CodeAnalysis.Text.SourceText.LineInfo}

So I looked through your source code and it seems like you set Encoding.Default (line 224):

https://github.com/daveaglick/Buildalyzer/blob/master/src/Buildalyzer.Workspaces/AnalyzerResultExtensions.cs

Should we really use Encoding.Default? It seems like Microsoft recommend not to:

https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding.default?redirectedfrom=MSDN&view=netframework-4.8#System_Text_Encoding_Default

And here is another discussion:

https://stackoverflow.com/questions/18866583/why-is-default-encoding-in-c-sharp-not-recommend

Maybe it should be changed to something else? Like UTF-8?

@MilleBo MilleBo changed the title Encoding problems (guess because of Encoding.Default?) Encoding problems (maybe because of Encoding.Default?) Feb 7, 2020
@daveaglick
Copy link
Collaborator

Good call - it should probably be UTF-8. I’ve blocked out some Buildalyzer time this weekend so I’ll try that change and make sure none of the integration tests break.

@daveaglick daveaglick added the ⚠ Bug Something isn't working as expected label Feb 7, 2020
@MilleBo
Copy link
Author

MilleBo commented Feb 8, 2020

Great, thanks.

@MilleBo
Copy link
Author

MilleBo commented Feb 12, 2020

I actually tried to change it to UTF-8 and it doesn't seem to fix the issue so maybe it's just better to keep it as default.

@daveaglick
Copy link
Collaborator

Even if it doesn't solve this particular problem, I still think it's better not to set the default encoding now that you pointed it out. As for this problem, is it possible the file has a totally different encoding and isn't actually ASCII or UTF-8? I'd be curious if a call to SourceText.From(File.ReadAllText(yourFile)) from a test app crashes or not. If it does that suggests the problem is actually with Roslyn.

@MilleBo
Copy link
Author

MilleBo commented Feb 14, 2020

I checked a while ago and the file was actually some kind of ANSI encoding (but all other files in the solution was UTF-8). So the only solution that actually worked was to check the encoding for each specific file and use that when reading.

@daveaglick
Copy link
Collaborator

So this is likely an error caused not by the presence of unexpected characters, but rather the file being encoded with something other than the specified encoding. I think we can resolve it by using something like StreamReader.CurrentEncoding since we're loading the SourceText from a known file.

@daveaglick
Copy link
Collaborator

Just to circle back, we actually do need to set that encoding - the workspace won't compile if someone tries to without an encoding set for the SourceText. But Encoding.Default was also wrong. The key here is that we're loading the SourceText from a string which is always Encoding.Unicode so we can use that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
⚠ Bug Something isn't working as expected
Projects
None yet
Development

No branches or pull requests

2 participants