API Proposal: Add Span accessor for MemoryMapped files #37227
Comments
That's a substantial risk. Edit: Consider your earlier example, which I've copied again below.
using MemoryMappedFile file = MemoryMappedFile.CreateFromFile("test.txt");
ReadOnlySpan<byte> span = file.CreateReadOnlySpan();
Console.WriteLine(Encoding.UTF8.GetString(span));
If you forget to put the using keyword in front of that, your application could AV. Just due to a single missing keyword. Nobody reviewing this code would have any indication that they're entering a world of manually managed memory. See also #33768, which any |
@GrabYourPitchforks An alternative design that ensures that the Now that you mention it though, |
@GrabYourPitchforks i've revised the proposal with |
Your "alternative design" callback-based proposal doesn't run the same risk of AVs as returning the span directly. The implementation of The |
I really hope we don’t end up with this callback design everywhere. I understand we don’t have the means to track lifetime like this outside of the stack but I fear we’ll end up with a set of overloads like this. |
@GrabYourPitchforks @VSadov @Maoni0 what if we had a Pinned Byte Array Heap backed by memory mapped files ... |
@Suchiman, you can use a third-party implementation like this. The library is on NuGet. My concern with memory-mapped files is different. I would like to iterate through mapped segments using
public abstract class ReadOnlySequenceSegment<T>
{
    internal protected virtual void Activate();
    internal protected virtual void Deactivate();
}
|
@davidfowl I agree. One thing I'd like to see the runtime maintain is an arbitrary memory range to object mapping. For example:
0x00007000_10000000 .. 0x00007000_1FFFFFFF -> [object A]
0x00007000_20000000 .. 0x00007000_200007FF -> [object B]
When the GC is running, for any It doesn't solve all problems. For example, it could make GC more expensive because now it's yet another location that needs to be queried during GC. Additionally, it won't solve AVs resulting from use-after-free. We still don't want to end up with an API surface where this is possible:
var obj = GetSomeObject();
var span = obj.GetSpan();
obj.Dispose();
var value = span[0]; // AV?
In general, AVs should only be possible within an
var obj = GetSomeObject();
var span = obj.GetSpan();
obj.Dispose(); // no-ops since GetSpan() was called and the Span was returned
var value = span[0]; // no AV
// we now just have to wait for obj's finalizer to kick in |
@GrabYourPitchforks Frozen Objects do that by setting up GC segments with arbitrary ranges and setting a flag that says the heap is read only. I don't know if much of that could be repurposed. I also don't know how these segment additions scale. What happens if you have 1000 segments? |
Definitely a good question for GC folks. @Maoni0, thoughts? I'm just pulling ideas out of thin air and haven't thought any of this through. :) |
@GrabYourPitchforks assuming checking the X (1000 in the above comment) ranges in addition to stack scan that already occurs for by-ref types is prohibitive, one mitigation could be that this is only done for Gen2 GCs or something. This would of course be in addition to the basic pay for play check where we wouldn't even enter this code path if there are no active memory mapped files that were mapped using this API. |
@richlander @jkotas I said I would tag you on the issue. This is a sorely missing capability only the runtime can solve: provably safe memory mapped files, very much like by-ref types or a mechanism similar to how Frozen Objects set up new segments. We need an assessment of how bad the perf impact is if there are 1000s of such things. |
The check to turn I am not convinced that this design works. |
I think this overcrowding of callback APIs can be mitigated with some (arguably) clever use of
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public class Spanwaitable<T> : INotifyCompletion, IDisposable
{
    // Per-thread flag: true only while a continuation is running on this thread.
    readonly ThreadLocal<bool> inside = new ThreadLocal<bool>();

    // The object is its own awaiter.
    public Spanwaitable<T> GetAwaiter()
    {
        return this;
    }

    // Runs the rest of the awaiting method synchronously, inside the guarded region.
    public void OnCompleted(Action continuation)
    {
        // check not disposed
        inside.Value = true;
        try
        {
            continuation();
        }
        finally
        {
            inside.Value = false;
        }
    }

    public void Dispose()
    {
        if (!inside.Value) throw new InvalidOperationException();
        // also check if not inside on another thread
    }

    public bool IsCompleted => inside.Value;

    // Only valid while the continuation is running.
    public Span<T> GetResult()
    {
        if (!inside.Value) throw new InvalidOperationException();
        return new T[10].AsSpan(); // produce the span here
    }
}
After all, C# already knows how to turn the code into a state machine to use for the continuation; the only downside is that you cannot create |
Just some thoughts on my current work with memory-mapped files; for reference, there are also some proposals on MMFs in #59776. I'm here to find better options for working with MMFs and to ask some critical questions that are of interest to me in general.
I'm not exactly sure what the general problem is when the MMF gets disposed. Isn't that the case with many C# "structures": you might hold a reference to an item that lives inside a PARENT structure, and if the PARENT structure gets disposed you ultimately get a It seems to me that this is well-accepted behavior. MemoryMappedFiles are internally unmanaged objects that are mostly created once; you get a ViewAccessor/Stream (depending on whether you map the full view or only parts), and if you close/dispose the accessor, it is already existing behavior that the SafeHandle is also disposed, right? So you already have this problem with the ViewAccessors and ViewStreams: if the MMF gets disposed, the ViewAccessors/ViewStreams are also invalid. In the end, if the MemoryMappedFile may only be closed when there are no references left, the MMF needs to keep track of the references it created (and maybe also needs reference counting)?
I'm mostly using CreateViewAccessor and writing my own wrappers around the Another thing I don't really understand is why it needs GC management. The MMF is already managed by the (Windows) memory manager, so why should the GC keep track of objects it doesn't even know? I admit that I have also had some OOM problems because the GC and the memory manager do not "communicate" when it comes to MMF I agree with @GrabYourPitchforks that missing a Another problem with the |
Background and Motivation
.NET has had access to memory mapped files for a long time, but using them requires either unsafe pointers, a BinaryReader|Writer like API, or Stream. With the advent of Span, one could access them more easily and pass them directly, without intermediate copies / buffers, to a larger set of APIs.
Proposed API
We would add
Unlike most other Span APIs, we allow passing long offset and int size in order to work with files larger than 2GB, which we cannot directly slice into due to all Span related classes being int length based. If you need to work with files larger than 2GB, you need to call CreateMemoryManager with increasing offsets.
Usage Examples
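The "increasing offsets" idea can be illustrated with APIs that exist today, since the proposed CreateMemoryManager is not available. This is only a rough sketch: MappedWindows and SumBytes are hypothetical names, the code needs AllowUnsafeBlocks, and it assumes a non-empty, writable file (CreateFromFile with FileMode.Open maps it read/write by default). Each window is at most windowSize bytes, so each span stays within the int length limit.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

// Hypothetical helper, not part of the proposal: walks a file window by window.
public static class MappedWindows
{
    public static unsafe long SumBytes(string path, int windowSize)
    {
        long fileLength = new FileInfo(path).Length;
        long total = 0;

        using MemoryMappedFile file = MemoryMappedFile.CreateFromFile(path, FileMode.Open);
        for (long offset = 0; offset < fileLength; offset += windowSize)
        {
            // The last window may be shorter than windowSize.
            int size = (int)Math.Min(windowSize, fileLength - offset);
            using MemoryMappedViewAccessor view =
                file.CreateViewAccessor(offset, size, MemoryMappedFileAccess.Read);

            byte* pointer = null;
            view.SafeMemoryMappedViewHandle.AcquirePointer(ref pointer);
            try
            {
                // PointerOffset covers the gap between the aligned mapping start and the requested offset.
                var span = new ReadOnlySpan<byte>(pointer + view.PointerOffset, size);
                foreach (byte b in span)
                {
                    total += b;
                }
            }
            finally
            {
                // The span must not be used after this point.
                view.SafeMemoryMappedViewHandle.ReleasePointer();
            }
        }

        return total;
    }
}
```

The span never leaves the try block, which is what keeps this version safe; the proposal is about getting the same access without the pointer code.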
Alternative Designs
We could also add a string.Create like API to the MemoryMappedViewAccessor, where MemoryMappedViewAccessor manages the lifecycle and the design ensures that the Span does not outlive the MemoryMappedViewAccessor.
which would be used like
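As a stand-in for the callback shape, here is a minimal sketch built on existing APIs. The delegate ReadOnlySpanAction, the extension method ReadSpan, and its parameters are all assumptions made for illustration, not the signature from the proposal. It needs AllowUnsafeBlocks. The span is only valid inside the callback, and AcquirePointer holds a reference on the view handle until ReleasePointer runs.

```csharp
using System;
using System.IO.MemoryMappedFiles;

// Hypothetical delegate: receives the mapped bytes plus caller-supplied state.
public delegate void ReadOnlySpanAction<TState>(ReadOnlySpan<byte> span, TState state);

// Hypothetical extension method, named only for this sketch.
public static class MemoryMappedViewAccessorExtensions
{
    public static unsafe void ReadSpan<TState>(
        this MemoryMappedViewAccessor accessor,
        long offset,
        int size,
        TState state,
        ReadOnlySpanAction<TState> action)
    {
        if (offset < 0 || size < 0 || offset > accessor.Capacity - size)
        {
            throw new ArgumentOutOfRangeException(nameof(offset));
        }

        byte* pointer = null;
        accessor.SafeMemoryMappedViewHandle.AcquirePointer(ref pointer);
        try
        {
            // The span exists only for the duration of the callback.
            action(new ReadOnlySpan<byte>(pointer + accessor.PointerOffset + offset, size), state);
        }
        finally
        {
            accessor.SafeMemoryMappedViewHandle.ReleasePointer();
        }
    }
}
```

A caller would pass a lambda such as `(span, sb) => sb.Append(Encoding.UTF8.GetString(span))` with a StringBuilder as state. Because the span is a parameter and cannot be captured or stored on the heap, it cannot outlive the call.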
Risks
Low risk as far as only adding APIs is concerned.
Designs that allow the Span to outlive the memory mapped file could encounter an access violation if trying to use the span past that point.