Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use multibyte character path for bundle_probe #44466

Merged
merged 1 commit into from
Nov 19, 2020
Merged

Use multibyte character path for bundle_probe #44466

merged 1 commit into from
Nov 19, 2020

Conversation

am11
Copy link
Member

@am11 am11 commented Nov 10, 2020

Fixes #44098.

@am11
Copy link
Member Author

am11 commented Nov 10, 2020

cc @janvorli

Waiting on a clean CI run. Checkout leg is failing: #44472.

@ghost
Copy link

ghost commented Nov 10, 2020

Tagging subscribers to this area: @vitek-karas, @agocke
See info in area-owners.md if you want to be subscribed.


Issue meta data
Issue content: Fixes #44098.
Issue author: am11
Assignees: -
Milestone: -

@am11 am11 marked this pull request as ready for review November 11, 2020 06:24
@am11 am11 requested a review from janvorli November 11, 2020 06:24
src/coreclr/src/vm/bundle.cpp Outdated Show resolved Hide resolved
src/coreclr/src/vm/bundle.cpp Outdated Show resolved Hide resolved
src/coreclr/src/vm/bundle.cpp Outdated Show resolved Hide resolved
@dotnet dotnet deleted a comment from janvorli Nov 11, 2020
Copy link
Member

@vitek-karas vitek-karas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - let's wait for @janvorli to take a second look

@am11
Copy link
Member Author

am11 commented Nov 11, 2020

Internal to CoreCLR; SString has a concept of internal representation, so if the string was originally UTF8, it will be preserved and GetUTF8 will return it as is (without conversion). I was trying to keep the diff smaller, as it is a deep rabbit hole to update all instances; but there seems to be few places where we have both wide and narrow variants of the same string available at the call sites of bundle probes. If it deemed profitable, we can update those in a separate PR by using narrow variant when constructing SString for probe call.

src/coreclr/src/vm/bundle.cpp Outdated Show resolved Hide resolved
src/coreclr/src/vm/bundle.cpp Outdated Show resolved Hide resolved
Copy link
Member

@janvorli janvorli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you!

src/coreclr/src/vm/bundle.cpp Show resolved Hide resolved
@janvorli
Copy link
Member

The win x86 leg was failing due to something that looks like a bug in xunit (xunit code modifying a collection while it was being iterated). I am re-running this leg.

Copy link
Member

@elinor-fung elinor-fung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also make sure the single-file doc in dotnet/designs reflects this change.

Edit: Actually, apparently the design doc does have char, but the implementation didn't match.

@am11
Copy link
Member Author

am11 commented Nov 13, 2020

When runtime repo itself is cloned in a path containing multibyte character, I found two issues.

Steps to repro (using three-bytes characters):

$ git clone https://github.com/dotnet/runtime runtime-ⒾⓇⓇⒺⒼⓊⓁⒶⓇ␣ⓉⒺⓍⓉ
$ cd runtime-ⒾⓇⓇⒺⒼⓊⓁⒶⓇ␣ⓉⒺⓍⓉ
$ ./build.sh -c Release
$ ./build.sh Host.Tests -c Release -test
  1. First the build errors, that were trivial to fix:
    --- a/src/coreclr/src/scripts/genDummyProvider.py
    +++ b/src/coreclr/src/scripts/genDummyProvider.py
    @@ -111,7 +111,7 @@ def generateDummyFiles(etwmanifest, out_dirname, extern, dryRun):
            providerName = trimProvName(providerNode.getAttribute('name'))
            providerName_File = escapeProvFilename(providerName)
    
    -        dummyevntprov = os.path.join(out_dirname, dummyevntprovPre + providerName_File + ".cpp")
    +        dummyevntprov = os.path.join(out_dirname.decode('utf-8'), dummyevntprovPre + providerName_File + ".cpp").encode('utf-8')
    
            if dryRun:
                print(dummyevntprov)
    diff --git a/src/coreclr/src/scripts/genEventPipe.py b/src/coreclr/src/scripts/genEventPipe.py
    index d8c6cdca24f..54aa704148d 100644
    --- a/src/coreclr/src/scripts/genEventPipe.py
    +++ b/src/coreclr/src/scripts/genEventPipe.py
    @@ -384,7 +384,7 @@ def generateEventPipeImplFiles(
            providerName_File = providerPrettyName.replace('-', '')
            providerName_File = providerName_File.lower()
            providerPrettyName = providerPrettyName.replace('-', '_')
    -        eventpipefile = os.path.join(eventpipe_directory, providerName_File + ".cpp")
    +        eventpipefile = os.path.join(eventpipe_directory.decode('utf-8'), providerName_File + ".cpp").encode('utf-8')
            if dryRun:
                print(eventpipefile)
            else:
  2. ToLower() in C# is using ICU transofrmation, so HostFxrPath.ToLower() in this case becomes: /users/am11/projects/regular_text-ⓘⓡⓡⓔⓖⓤⓛⓐⓡ␣ⓣⓔⓧⓣ/runtime_pr/artifacts/tests/release/ha/nativehosting/0/valid/dotnet/host/fxr/2.3.0/libhostfxr.dylib, while C++ transformation is ASCII only, so it becomes /users/am11/projects/regular_text-ⒾⓇⓇⒺⒼⓊⓁⒶⓇ␣ⓉⒺⓍⓉ/runtime_pr/artifacts/tests/release/ha/nativehosting/0/valid/dotnet/host/fxr/2.3.0/libhostfxr.dylib and fails the Assert.Contains instances in Nethost.cs. The fix for this was to match C++, ASCII only tolower transformation: http://sprunge.us/UPR16s.

With these two patches, full runtime repo build succeeded and all installer tests passed on macOS. It seems like there are other ToLower() cases in MSBuild scripts (which also uses .NET's ICU-backed transformation), but those are unrelated to PR changes and can be repro'd with the current master branch (and probably ToLower() transformation for hostfxr). Perhaps we should try to get rid of ToLower transformation entirely and rely on underlying filesystem's case-sensitivity choices; i.e. don't care about the casing explicitly in code like the other components of runtime repo.

@janvorli janvorli merged commit 0897d4a into dotnet:master Nov 19, 2020
@am11 am11 deleted the feature/cleanups/purge-codecvt branch November 19, 2020 21:05
@vitek-karas
Copy link
Member

Thanks a lot @am11 !

ThadHouse pushed a commit to ThadHouse/runtime that referenced this pull request Nov 20, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 28, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove use of codecvt in PAL
5 participants