Get Span view of embedded resource data #26101
Comments
What is the scenario that is driving this? I have several concerns about this:
|
It does not need to read it. It can just map the file. There is a ton of code out there that expects the Stream returned by GetManifestResourceStream to be UnmanagedMemoryStream, gets a pointer out of it and parties on it. I do not think we would ever be able to change this implementation detail. I see this suggestion as a desire to make this flow more Span friendly. I agree that adding a Span-returning GetResource method on Assembly does not sound right. Maybe adding a Span-returning property on UnmanagedMemoryStream would help? |
In order to have uniform API between UWP and non-UWP, does it make sense to propose |
Let's pursue this as a potential Though it appears, the flow can already be simplified by a simple user extension method:
|
@atsushikan so the extension method should be:
public static unsafe ReadOnlySpan<byte> GetManifestResource(this Assembly a, string name)
{
    using (var s = (UnmanagedMemoryStream)a.GetManifestResourceStream(name))
    {
        return new ReadOnlySpan<byte>(s.PositionPointer, checked((int)s.Length));
    }
} |
That comment is stale or wrong. Stream.Dispose(bool) is a nop: |
This hasn't been touched in over a year. What's the actual API suggestion here? |
@GrabYourPitchforks To add a method to Assembly that directly returns a ReadOnlySpan of a manifest resource, without using hacks (or creating an UnmanagedMemoryStream), but most importantly one that is safe, so you don't have to mark your project as unsafe just to use the proposed hack |
I don't know if it's possible to do this "safely". Imagine a scenario where an assembly is loaded, the resource is returned as a In order to move the proposal forward somebody (anybody, really) would need to write down a proposed API signature and the behaviors that we'd expect that API to have. Once that's down we can move this forward to the next step. |
@GrabYourPitchforks What if the method returned an |
Neither |
@GrabYourPitchforks said:
Unfortunately, your statement that Also the memcopy does not seem to actually be preventing access violations in the first place. Sure, perhaps it would be possible to add checks for that, but right now it does not seem to be doing anything of the sort. This has also led me to discover a way to crash .NET 5 with an AV without using unsafe code, and without framework methods that are obviously unsafe (Marshal classes and/or the Unsafe class, etc). This is by getting the framework to return an UnmanagedMemoryStream pointing into an assembly that gets unloaded. This was fixed for Back to the current issue: I think the bottom line for this issue is that a pointer-based Span cannot keep an assembly alive via a pointer into the memory-mapped PE file. Technically Memory could keep the assembly alive via a custom IMemoryOwner, but that would be unsafe, as nothing will ensure the spans created from It's
I really want to say that a |
I am still using embedded resources instead of inlining them in the source code (a large byte array spanning some thousands of lines of source code). It might be stupid or pointless, but I have the feeling that inlined arrays slow down compilation and mess with version control. I stopped commenting on this issue because I realized after the response of @GrabYourPitchforks that I do not know many low-level details. The nature of the pointer returned by the Also, reading the source code of .NET Core, I see several examples (e.g. here) where it seems that inlining a large resource and getting a How does this pointer differ from the pointer returned by the private static ReadOnlySpan<byte> CategoryCasingLevel1Index => new byte[2176] |
The compiler and runtime together ensure that the syntax Unfortunately there is no generalized way to accomplish the same thing for an arbitrary Note: as an implementation detail, the particular stream returned by If the very particular scenario is "I want to be able to read an embedded resource as a ROS<byte>," then propose an API specifically for that scenario. Something like |
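To make the inline-array pattern quoted in the question above concrete, here is a minimal sketch. The type name, member names and byte values are hypothetical and chosen only for illustration. When a `ReadOnlySpan<byte>` property returns a `new byte[]` initialized entirely with constants, the C# compiler stores the bytes in the assembly image and returns a span over them, so no array is allocated at run time.

```csharp
using System;

internal static class EmbeddedData
{
    // Hypothetical payload: constant bytes only, so the compiler can place
    // them in the assembly's static data instead of allocating an array.
    public static ReadOnlySpan<byte> Header => new byte[] { 0x89, 0x50, 0x4E, 0x47 };

    // Hypothetical helper: checks whether a buffer begins with the header bytes.
    public static bool HasHeader(ReadOnlySpan<byte> data) => data.StartsWith(Header);
}
```

The optimization applies to the property form shown here; storing the array in a `static readonly byte[]` field would allocate as usual.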
Ah, from the sounds of things (per #40346), the entire PE file gets treated specially for spans pointing into them. That should make such a Ideally there would be a similar API for localizable resources accessed via For the case where the resource has fallen back to the main assembly it works exactly the same way as ManifestResourceSpan would. For the satellite assembly case, the code already reads into an array. For the stream version it pins the array, since GetStream requires returning an UnmanagedMemoryStream. That code could be simplified by using the pinned object heap for this array, but I'm not sure if these streams typically stick around long enough for that to make sense. In any case, since the underlying memory is a managed array, returning a ReadOnlySpan for those is not a problem at all. |
Well, that's what I started off by asking for, until ghost (whoever that was, I don't recall) redirected the discussion towards the already-existing UnmanagedMemoryStream object and renamed this issue... |
It's your issue, so I'm pretty sure you should be able to rename it back. And we have now determined this is actually implementable, and safely, which is also good. This also only became obviously possible to implement safely when the byte array case became safe for the unload case back in August. Jan favored something like a property on UnmanagedMemoryStream, but Levi has pointed out that is unsafe. I'm not sure if there is a better place for the method to live than Assembly. I mean, RuntimeAssembly would be correct, but that is internal by design. I think proposing it as a method on Assembly, and letting FXDC decide if there is some better place for it, is the best we can do unless somebody else comes along with a clever idea. So creating a formal API proposal sounds like the next step. You can update your first post to use the template from https://github.com/dotnet/runtime/issues/new?template=02_api_proposal.md to make this easy for the reviewers. If you do that, this can potentially move forward. For an example of another simple API proposal that adds members to an existing type, see #49407. Hope this helps. |
Done and done. |
Potentially the |
I do not think it is worth it to add a new Span-returning method to Assembly for this scenario. I think that doing nothing and recommending that anybody who needs this writes a bit of unsafe code is better than adding a new method to Assembly. Note that a Span-returning virtual method on Assembly would not solve the scenarios where the data are needed more than once: the slow lookup of the data stream by name would have to be done each time the data is needed, or the code would have to cache the stream and use unsafe code to convert it to a Span. |
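The "cache the stream and convert it with unsafe code" option mentioned above could look roughly like the following sketch. It assumes the resource lives in the same assembly as the caching type, that the runtime returns an `UnmanagedMemoryStream` for it (an implementation detail discussed in this thread, not a documented guarantee), and that the project is compiled with `AllowUnsafeBlocks`. The type name and resource name are hypothetical.

```csharp
#nullable enable
using System;
using System.IO;

internal static class CachedResource
{
    // Hypothetical resource name, used only for illustration.
    private const string ResourceName = "MyApp.Data.table.bin";

    // The by-name lookup happens once; the stream is kept alive for the
    // lifetime of this type and is never read from or disposed.
    private static readonly UnmanagedMemoryStream? s_stream =
        typeof(CachedResource).Assembly.GetManifestResourceStream(ResourceName) as UnmanagedMemoryStream;

    public static unsafe ReadOnlySpan<byte> Contents
    {
        get
        {
            UnmanagedMemoryStream? stream = s_stream;
            if (stream is null)
            {
                // Resource missing, or not backed by unmanaged memory.
                return default;
            }

            // The stream position stays at 0, so PositionPointer is the start of the data.
            return new ReadOnlySpan<byte>(stream.PositionPointer, checked((int)stream.Length));
        }
    }
}
```

As discussed elsewhere in the thread, a span built this way carries no general guarantee that the memory stays valid; the sketch relies on the caller and the resource sharing one assembly.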
Ok, but now that we understand that it is possible to implement such a feature safely in some cases, but it is not possible to do so for a general UnmanagedMemoryStream, there are potentially more options. Both The other two subclasses in the framework at the moment are safe if you keep the stream alive for the duration of using Span. Of the uses of just plain UnmanagedMemoryStream, we have two right now that can destabilize the runtime, even when only used as a stream. (Amusingly both of which would be completely safe if exposed as a span.) The last usage I'm not entirely sure about, but I'd not be shocked if it was also at risk of causing access violations. So to summarize:
|
How did we return to streams? Can we use the same trick to return other objects? private static ReadOnlySpan<byte> CategoryCasingLevel1Index => Resources.System.Globalization.CategoryCasingLevel1.bin; where Resources is a pseudo static class that has every embedded resource of the current assembly and "System.Globalization.CategoryCasingLevel1.bin" denotes an embedded resource with the file name "CategoryCasingLevel1.bin" in the /System/Globalization folder. Also, no time is wasted searching for a named resource, since all those pointers can be resolved once, at mapping time. |
That approach would basically be a C# compiler feature. The compiler certainly could do something like that, where it generates an RVA static that points into the bytes of the embedded resource. This might be slightly risky if anything tried to edit the assembly afterwards, since it might not understand RVA statics pointing into .mresource data, but that ought to be possible to overcome for things like the trimmer. I'm not certain if it is possible to make that safe for an Ildasm/Ilasm round trip. Also, naming the pseudo class "Resources" is basically guaranteed to clash with the code-behind helper for the resx file used by the resources tab in the project-level property page. It also just generally gives the impression of being related to "resources", which annoyingly is the complete official name of the localizable resources generated from resx files, rather than being about manifest resources (which is what a build action of "Embedded Resource" results in). A different alternative is not to have compiler magic, but to have a code generator that can provide a similar experience. The problem with that, though, is that at best it could implement the unsafe code pattern for you. It could cache the pointer and length safely though: since the generated code is embedded in the same assembly, it cannot be accessed unless the assembly is reachable, and thus there is no risk that the cached pointer is stale, because the property that reads from it would go away as part of assembly unloading. Such a code generator would also rely on the fact that right now the runtime implements the whole PE file as a valid target for interior pointers (for assemblies loaded from disk, as opposed to dynamically generated ones). While I think it is fine for System.Private.CoreLib to rely on such details, I'm not sure it is reasonable to generate code into user assemblies that relies on such runtime details. |
Jan had a good comment about caching (see #26101 (comment)). Assume for the sake of argument that we want to add a new API to public class Assembly
{
public virtual EmbeddedResourceInfo? GetEmbeddedResourceInfo(string resourceName);
}
public abstract class EmbeddedResourceInfo : IDisposable
{
public abstract Stream GetResourceStream(); // creates a new Stream instance on each call
public abstract ReadOnlySpan<byte> GetResourceContents();
} The For a ( |
Or make the existing public class ManifestResourceStream : UnmanagedMemoryStream
{
public ReadOnlySpan<byte> GetContents();
} Example of use: |
@jkotas Do you know if Mono also guarantees that the returned unmanaged memory stream points to a GC-trackable region of the image? |
Mono does not support unloadable code today. |
#40394 (despite its title) tracks the various known cases that must keep an assembly alive and for which we don't yet have tests in the test suite, so that when Mono implements unloading, they can ensure compatibility. So if we take advantage of this behavior we should document it over there. |
If we were to expose a new type for this, should we name it |
Hey stumbled on this issue, as I was looking for a code generator based solution that could bake the resource into the PE file directly for NativeAOT scenarios and this library EmbeddingResourceCSharp does the job nicely. |
Tagging subscribers to this area: @dotnet/area-system-reflection
Issue Details
Background and Motivation
I'd like to be able to get a ReadOnlySpan into an Assembly's Embedded Resources, in order to minimise memory allocations and copying when dealing with such resources.
At the moment the only way to get at an Embedded Resource is either via the Stream APIs, or by using unsafe code to read from the UnmanagedMemoryStream's pointer.
Proposed API
namespace System.Reflection
{
    public class Assembly
    {
        // ...
        public virtual Stream GetManifestResourceStream(string name);
        public virtual Stream GetManifestResourceStream(Type type, string name);
+       public virtual ReadOnlySpan<byte> GetManifestResourceSpan(string name);
+       public virtual ReadOnlySpan<byte> GetManifestResourceSpan(Type type, string name);
        // ...
    }
}
Usage Examples
Usage mirrors existing GetManifestResourceStream use, but with spans:
ReadOnlySpan<byte> ros1 = GetType().Assembly.GetManifestResourceSpan("Namespace.ResourceName.txt");
// do something with ros1
ReadOnlySpan<byte> ros2 = typeof(X).Assembly.GetManifestResourceSpan(typeof(X), "ResourceName.txt");
// do something with ros2
Alternative Designs
An alternative was discussed below where the ReadOnlySpan could be created from the UnmanagedMemoryStream, but this was deemed to be unsafe as the pointer from the UnmanagedMemoryStream does not contain a live reference to the assembly. If a Span is created directly from the pointer and then the assembly is unloaded, the application can crash when accessing the Span.
Risks
None known.
Original post
Hi,
Would it be possible to add an API to Assembly that allows applications to get a ReadOnlySpan<> view of an embedded resource, rather than a stream?
If this has been discussed before, please just point me at that issue.
Thanks.
|
@xoofx great idea but problem with Example embedded texture for
Is it correct or wrong? But my library |
The code-gen approaches mentioned are a feasible approach for these high-performance cases. I don't see how the ability to return the raw backing memory of the assembly file on disk will work with NativeAOT and trimming. Closing since this issue is > 5 years old with no concrete proposal that addresses the original ask (safe; no Streams; no cache needed; need to expose raw memory safely). |
I don't understand the reasons for closing this - there is a concrete API proposal (although no full agreement on that and multiple alternatives also). How is having a span directly to the embedded resource in the assembly fundamentally different from having a span to a The fact that this is 5 years old just means that it hasn't been resolved in 5 years, not a reason for closing this. |
What is not working with EmbeddingResourceCSharp for your use case? |
I'm not going to add an additional package that generates an inefficient solution to replace a 3 line hack (using |
The package is only used at compile time; it doesn't flow to runtime. Also, I'm not sure I follow what you mean by an inefficient solution. The |
I appreciate the feedback here about closing since there has been little activity or progress in the last year, and there is still not an API provided that addresses the requirements along with assembly unloading concern. I'll re-open the issue here for discussion purposes.
I was thinking about trimming in general. If there's not a reference to either a generated resource property name or a resource name as a literal string passed to well-known method(s), then the linker\trimmer would (or could) trim the resource since it wouldn't detect usage of it. However, at this time I don't think resources are trimmed but I can see wanting this in the future. In any case, a new API should consider being trimmer-friendly in these regards. So moving forward, IMO a nice approach would be to leverage the C# work to reference raw memory which supports trimming:
by creating an RVA static field for each embedded resource. This was also mentioned above in #26101 (comment) and #26101 (comment) and also has the advantage of not having to scan for a resource name. A RVA static field would work nicely with source generation of binary resources, although I imagine huge resources might slow down compilation time. |
@steveharter why did you reopen the issue? Check my picture as proof: it embeds a TGA file in the data as byte[] ( from |
I re-opened this issue since I think there is a path forward with a built-in source-gen feature and to continue discussion on that. Typically, we don't have discussions on closed issues. |
A built-in source generator for files that are already marked as embedded resources sounds like a much more intuitive system than the third-party source generator referenced here, which (as far as I can tell) requires adding an attribute to C# source code with a relative file path that isn't necessarily included in any csproj. |
@yaakov-h that's correct, as I said about packing into a NativeAOT executable. Embedded resources are loaded inside the native executable and read like data. Native executable means after If you want to know how to load embedded resources like a texture or picture from a native library, you can try out AppWithPlugin for NativeAOT. That's all. But I have never embedded resources in a native library. We could test with dotnet's native library (static or shared) // UPDATE: See my new repository! |
The existing source-gen as shown, which AFAICT just emits a |
Background and Motivation
I'd like to be able to get a ReadOnlySpan into an Assembly's Embedded Resources, in order to minimise memory allocations and copying when dealing with such resources.
At the moment the only way to get at an Embedded Resource is either via the Stream APIs, or by using unsafe code to read from the UnmanagedMemoryStream's pointer.
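For comparison, the safe Stream-based route that exists today can be sketched as follows. The helper type and method names are hypothetical. The sketch shows where the allocation and copy that this proposal aims to avoid come from.

```csharp
#nullable enable
using System;
using System.IO;
using System.Reflection;

public static class ResourceReader
{
    // Hypothetical helper: copies an embedded resource into a new managed array.
    public static byte[] ReadAllBytes(Assembly assembly, string name)
    {
        // GetManifestResourceStream returns null when the resource is not found.
        using Stream? stream = assembly.GetManifestResourceStream(name);
        if (stream is null)
        {
            throw new FileNotFoundException($"Embedded resource '{name}' was not found.", name);
        }

        // The copy below allocates a buffer and duplicates the resource bytes.
        using var buffer = new MemoryStream();
        stream.CopyTo(buffer);
        return buffer.ToArray();
    }
}
```

A caller can wrap the returned array in a `ReadOnlySpan<byte>` without unsafe code, at the cost of the copy.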
Proposed API
Usage Examples
Usage mirrors existing GetManifestResourceStream use, but with spans:
Alternative Designs
An alternative was discussed below where the ReadOnlySpan could be created from the UnmanagedMemoryStream, but this was deemed to be unsafe as the pointer from the UnmanagedMemoryStream does not contain a live reference to the assembly. If a Span is created directly from the pointer and then the assembly is unloaded, the application can crash when accessing the Span.
Risks
None known.
Original post
Hi,
Would it be possible to add an API to Assembly that allows applications to get a ReadOnlySpan<> view of an embedded resource, rather than a stream?
If this has been discussed before, please just point me at that issue.
Thanks.