Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add .mvid section to PE and remove dependency on MetadataReader #19133

Merged
merged 9 commits into from
May 2, 2017

Conversation

jcouv
Copy link
Member

@jcouv jcouv commented Apr 30, 2017

The dependency on MetadataReader is causing troubles for MSBuild integration. But the MVID is in a difficult place to extract in the PE and we don't want to replicate much MetadataReader logic.

So, for ref assemblies, we make another copy of the MVID in a new Section (".mvid"). Sections can be extracted much more easily. This PR includes a simple PE reader that finds the MVID in this new location in the binary.

@tmat @dotnet/roslyn-compiler for review.

// Section: VirtualSize (4)
uint virtualSize = ReadUInt32();

// Section: VirtualAddress (4), SizeOfRawData (4)
Copy link
Member Author

@jcouv jcouv Apr 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tmat One thing I noticed is that SizeOfRawData is 500. I wonder if there is a way to reduce that overhead by tweaking FileAlignment? After all, this section only contains 16 bytes. #Resolved

Copy link
Member

@tmat tmat May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly, the PE spec requires the value of FileAlignment to be

... a power of 2 between 512 and 64K, inclusive. #Resolved


VerifyEntryPoint(metadataOutput, expectZero: true);
VerifyMethods(metadataOutput, new[] { "C..ctor()" });
VerifyMVID(metadataOutput, hasMvidSection: true);
Copy link
Member Author

@jcouv jcouv Apr 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, any additional ideas for validation are appreciated. The only validation I have so far is that I can extract the MVID and it matches what we got from PEReader before. And for a regular assembly, there is no .mvid section (no MVID by new method). #Resolved

Copy link
Member

@tmat tmat May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd also try to run peverify on the ref assembly. #Resolved

Copy link
Member

@tmat tmat May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add a ref assembly test to deterministic tests. #Resolved

Copy link
Member Author

@jcouv jcouv May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have tests that do PEVerify on ref assemblies. That's the main reason I include throw null bodies in ref assemblies.
I'll add deterministic test. #Resolved

Copy link
Member Author

@jcouv jcouv May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tmat I added a test in deterministic tests, but note that most ref assemblies already use that option, since that is the main purpose. #Resolved

@CyrusNajmabadi
Copy link
Member

CyrusNajmabadi commented Apr 30, 2017

Tagging @heejaechang Who uses mvids in the OOP scenario. #Resolved

/// Extract the MVID using two different methods (PEReader and MvidReader) and compare them.
/// We only expect an .mvid section in ref assemblies.
/// </summary>
void VerifyMVID(MemoryStream stream, bool hasMvidSection)
Copy link
Member

@jaredpar jaredpar May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use explicit accessibility modifier. #Resolved

Copy link
Member

@tmat tmat May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: VerifyMvid instead of VerifyMVID #Resolved

{
}

return Guid.Empty;
Copy link
Member

@jaredpar jaredpar May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel like this is essentially hiding failure here. Consider using a TryReadAssemblyMvid pattern and potentially have a wrapper named ReadAssemblyMvidOrEmpty #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jaredpar I renamed the methods. Let me know if that's better.


In reply to: 114092229 [](ancestors = 114092229)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the exception handler still needed?

Copy link
Member Author

@jcouv jcouv May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so. If the ref assembly is truncated (there are section headers, but no actual sections), we risk reading past the end of the file. #Resolved

return FindMvidInSections(sections);
}

string ReadSectionName()
Copy link
Member

@jaredpar jaredpar May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use explicit accessibility modifier. #Resolved

return new string(buffer, 0, read);
}

Guid FindMvidInSections(ushort count)
Copy link
Member

@jaredpar jaredpar May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use explicit accessibility modifier. #Resolved

/// <summary>
/// This PEBuilder adds an .mvid section.
/// </summary>
private class ExtendedPEBuilder : ManagedPEBuilder
Copy link
Member

@tmat tmat May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cl [](start = 16, length = 2)

sealed #Resolved

/// <summary>
/// This PEBuilder adds an .mvid section.
/// </summary>
private class ExtendedPEBuilder : ManagedPEBuilder
Copy link
Member

@tmat tmat May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExtendedPEBuilder [](start = 22, length = 17)

Move to a separate file. #Resolved

@@ -186,7 +186,7 @@ internal static class PeWriter
debugDirectoryBuilder = null;
}

var peBuilder = new ManagedPEBuilder(
var peBuilder = new ExtendedPEBuilder(
Copy link
Member

@tmat tmat May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExtendedPEBuilder [](start = 32, length = 17)

Why not have a separate ReferenceAssemblyBuilder and only create it when emitting ref assembly? Instead of heaving a more generic builder whose behavior differs based on a boolean #Resolved

Copy link
Member Author

@jcouv jcouv May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried that approach, but it made logic below more complicated (ExtendedPeBuilder has an overload of Serialize with an extra parameter). #Resolved

buffer[read++] = (char)current;
}

return new string(buffer, 0, read);
Copy link
Member

@tmat tmat May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new string(buffer, 0, read); [](start = 18, length = 29)

Encoding.UTF8.GetBytes(bytes, 0, read) #Resolved


string ReadSectionName()
{
int length = 8;
Copy link
Member

@tmat tmat May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

length [](start = 16, length = 6)

const int maxLength #Resolved

for (int i = 0; i < count; i++)
{
// Section: Name (8)
string name = ReadSectionName();
Copy link
Member

@tmat tmat May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string name = ReadSectionName(); [](start = 16, length = 32)

Seems like this could be simplified quite a bit.

Something like

bytes[] name = ReadBytes(8);
if (name.Length == 8 && name[0] = '.' && name[1] == 'm' && name[2] == 'v' ...)
{
read data
}
else
{
Skip(size_of_section_header - 8);
}

#Resolved

@jcouv
Copy link
Member Author

jcouv commented May 1, 2017

@CyrusNajmabadi @heejaechang Note that the MVID is still always emitted in its regular spot. Also, the new section is currently only added for ref assemblies (emit metadata-only and strip private members), so I don't expect that would affect OOP. #Resolved

@jcouv
Copy link
Member Author

jcouv commented May 1, 2017

@KirillOsenkov FYI, one more way to get an MVID (just for ref assemblies). #Resolved

@KirillOsenkov
Copy link
Member

KirillOsenkov commented May 1, 2017

This is cool.

CC @jbevain - how does this feel? Can you think of a better/simpler place where to hide an MVID in the format? #Resolved

@jcouv
Copy link
Member Author

jcouv commented May 1, 2017

@jbevain @KirillOsenkov Just for context, there is another option that we brainstormed, but eventually rejected: part of the compilation hash gets stored in the date/timestamp.
That approach didn't seem great because this is "only" 31 bits and more importantly, we think we'll allow users to deterministically set that field in the future. #Resolved

@jcouv
Copy link
Member Author

jcouv commented May 1, 2017

@tmat I think I addressed your feedback. Let me know if good to go. Thanks #Resolved

// Section: Name (8)
byte[] name = ReadBytes(8);
if (name.Length == 8 && name[0] == '.' &&
name[1] == 'm' && name[2] == 'v' && name[3] == 'i' && name[4] == 'd')
Copy link
Member

@tmat tmat May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name[4] == 'd' [](start = 74, length = 14)

&& name[5] == '\0' so that we don't match on ".mvidXXX" #Resolved

return new Guid(guidBytes);
}
else
{
Copy link
Member

@tmat tmat May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: no need for else since the if-statement returns #Resolved

@tmat
Copy link
Member

tmat commented May 1, 2017

:shipit:

@jcouv
Copy link
Member Author

jcouv commented May 1, 2017

Thanks @tmat

@dotnet/roslyn-compiler for a second review.
This change adds an .mvid section in the binary and provides a class to read the MVID out from that section. This is for ref assemblies work with MSBuild team. #Resolved

- Non-public attributes on public APIs (emit attribute based on accessibility rule)
- ref assemblies and NoPia
- `/refout` and `/addmodule`, should we disallow this combination?
- `/refout` and `/addmodule`, should we disallow this combination? (answer: yes. This will not be supported in C# 7.1)
Copy link
Member

@cston cston May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these still open questions? If not, consider moving to another section in the document. #Resolved


if (hasMvidSection)
{
Assert.Equal(mvidFromModuleDefinition, mvidFromMvidReader);
Copy link
Member

@cston cston May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we checking the mvids are non-default? #Resolved

/// </summary>
private void VerifyMvid(MemoryStream stream, bool hasMvidSection)
{
Guid mvidFromModuleDefinition;
Copy link
Member

@cston cston May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Declaration can be moved into using, to match mvidFromMvidReader. Alternatively, mvidFromMvidReader and compare could be moved outside the using. #Resolved


BaseStream.Position = pointerToRawData;
byte[] guidBytes = new byte[16];
BaseStream.Read(guidBytes, 0, 16);
Copy link
Contributor

@AlekseyTs AlekseyTs May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BaseStream.Read(guidBytes, 0, 16); [](start = 20, length = 34)

Not checking how many bytes were read. #Resolved


builder.Add(new Section(MvidSectionName, SectionCharacteristics.MemRead |
SectionCharacteristics.ContainsInitializedData |
SectionCharacteristics.MemDiscardable));
Copy link
Contributor

@AlekseyTs AlekseyTs May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For testing it would be useful to test reading for scenarios when .mvid section isn't first. #Resolved

Copy link
Member

@tmat tmat May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

@jcouv jcouv May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tmat I'm going to bake in the assumption that the .mvid section is the first one. #Resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be better not to. Just in case someone runs another tool that adds a section. That should be perfectly fine thing to do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tmat The test you pointed me to doesn't dispose new PEReader(...) (line 496), as Chuck pointed out when I mirrored the test. You may want to review that file for other instances.

@AlekseyTs
Copy link
Contributor

AlekseyTs commented May 1, 2017

Done with review pass. #Closed

return FindMvidInSections(sections, reader);
}

private static Guid FindMvidInSections(ushort count, BinaryReader reader)
Copy link
Member

@cston cston May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

count is not used. Should we check there is at least one section? #Resolved

Copy link
Member Author

@jcouv jcouv May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, thanks #Resolved

private Blob _mvidSectionFixup = default(Blob);

// Only include the .mvid section in ref assemblies
private bool _withMvidSection;
Copy link
Member

@cston cston May 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readonly #Resolved

@jcouv
Copy link
Member Author

jcouv commented May 2, 2017

@tmat @AlekseyTs I added a test with .mvid section last.

@jcouv
Copy link
Member Author

jcouv commented May 2, 2017

@dotnet-bot test windows_release_unit64_prtest please

peBlob.WriteContentTo(peStream);

peStream.Position = 0;
var peReader = new PEReader(peStream);
Copy link
Member

@cston cston May 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using (var peReader = ...) { ... } #Resolved

for (int i = 0; i < count; i++)
{

// .mvid section must be first, if it's there
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment no longer applies.

{
return new Guid(guidBytes);
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

break;?

@cston
Copy link
Member

cston commented May 2, 2017

LGTM

}

reader.BaseStream.Seek(position, SeekOrigin.Begin);
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return true; [](start = 12, length = 12)

We probably shouldn't assume that Seek succeeds. Consider return reader.BaseStream.Seek(position, SeekOrigin.Begin) == position; instead.

MoveTo(0x3C, reader);
uint lfanew = reader.ReadUInt32();

MoveTo(lfanew, reader); // jump over the MS DOS Stub to the PE Signature
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MoveTo(lfanew, reader) [](start = 12, length = 22)

The result is not checked, we do not know where we are moving to, therefore, cannot assume that it is safe to continue after that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already checked above.

}

// DOS Header: Address of PE Signature (at 0x3C)
MoveTo(0x3C, reader);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MoveTo(0x3C, reader); [](start = 12, length = 21)

The result is not checked.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was already checked above:

 // DOS Header (64), DOS Stub (64), PE Signature (4), COFF Header (20), PE Header (224), 1x Section Header (40)
            if (reader.BaseStream.Length < 64 + 64 + 4 + 20 + 224 + 40)

MoveTo(lfanew, reader); // jump over the MS DOS Stub to the PE Signature

// PE Signature ('P' 'E' null null)
if (reader.ReadUInt32() != 0x00004550)
Copy link
Contributor

@AlekseyTs AlekseyTs May 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (reader.ReadUInt32() != 0x00004550) [](start = 12, length = 38)

This looks like an attempt to read at an unknown location, which can throw if we are near the end of the stream. #Closed

Copy link
Member Author

@jcouv jcouv May 2, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already checked above. #Resolved

{
return s_empty;
}
if (!ReadUInt32(reader, out uint lfanew))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what the variable name means. Perhaps 'signatureOffset' instead.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what it's called in the spec, which I agree is not very informative. I'll fix.

Copy link
Contributor

@AlekseyTs AlekseyTs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jcouv
Copy link
Member Author

jcouv commented May 2, 2017

Thanks all for the reviews, feedback and patience :-)
I'll go ahead and merge.

@jcouv jcouv merged commit 32a9e28 into dotnet:features/refout May 2, 2017
@jcouv jcouv deleted the refout-mvid branch May 2, 2017 22:42
@jcouv
Copy link
Member Author

jcouv commented May 4, 2017

Relates to #18612

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants