-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Need a method similar to S.R.CS.RuntimeHelpers.InitializeArray, but for spans #24961
Comments
This is related to: |
CC:@jkotas, @marek-safar, @alrz, @jcouv , @jaredpar, @stephentoub |
Having this in .NET Core 2.1 would be very useful. |
how about stackalloc on pointers? e.g. |
@alrz - the resulting type does not matter here. From the bigendian point of view we store int32 elements with their bytes in a wrong order. The data will not look like consecutive |
I understand that, but we're going to use this for stackalloc initializers on pointers as well, I guess the second alternative could work though In fact, stackalloc initializers "only" work on pointer types, no special codegen for Span. We're just passing an already initialized localloc to the Span constructor. |
@alrz - right, |
It is fairly easy to get a pointer from a span. For example “(void*)&s[0]” in IL |
@VSadov I just don't see why there has be an indirection here, we can construct both Span and ReadOnlySpan directly from a pointer (or |
One reason is that ReadOnlySpan can be used safely in the ReadOnlySpan case, which would not require dealing with pointers, and would be verifiable. Since span is basically a range-checking reference is is safer to use. Conceptually, it is the right thing to represent a chunk of data with known length. We will see though what CLR/FX guys will say when it gets to actually implementing that API. |
currently none of Span tests are verifiable due to the unsafe nature of the constructor... I am more biased towards the current implementation of the stackalloc inits, I think that approach need more changes compared to just returning a ref here. We should see how things work out in either of these cases. |
@VSadov, what's the conclusion here? Do we need this API? If yes, please mark the issue are ready for review and let's discuss it asap. |
We need this API to implement optimizations. Considering that this will have to be implemented as JIT intrinsic and be platform specific, I am very doubtful that it can make it for 7.3 Lets put it for the review though. |
@jkotas what are your thoughts on this? |
This method is Span equivalent of the existing The end-to-end story for this requires changes in C# compiler, runtime, JIT, and potentially debugger expression evaluator to come together. We should do this post .NET Core 2.1 when we have a runway for doing the work and testing that it all fits together well. |
Moving to future. |
How about this: class RuntimeHelpers
{
ReadOnlySpan<T> CreateSpan<T>(RuntimeFieldHandle fldHandle);
} |
Use class RuntimeHelpers
{
ReadOnlySpan<T> CreateSpan<T>(RuntimeFieldHandle fldHandle) where T : unmanaged;
} |
How expensive is that method? Is it possible to cache the return value somehow if it’s expensive? |
The following looks good. class RuntimeHelpers
{
ReadOnlySpan<T> CreateSpan<T>(RuntimeFieldHandle fldHandle)
} Not sure if It basically means that @ektrah Yes, runtime will have to cache the result if it did any transformations to the data - to correct endianness or alignment. The original field is technically accessible from the user code, so the transformation, if needed, cannot be done in-place. |
Alright, this is it: class RuntimeHelpers
{
ReadOnlySpan<T> CreateSpan<T>(RuntimeFieldHandle fldHandle);
} |
Why not put it on |
It is Span equivalent of |
Is this still required since the compiler has used |
@huoyaoyuan, the compiler is only able to do that where the element size is 1 byte, anything larger would require adjusting for potential differences in endianness. The compiler uses |
Some question found while implementing it. The storage of RVA field looks to be endianess aware, and |
In CoreCLR, it would need to allocate the copy with the right endianness on the loader heap. It is not that important to implement the big-endian support for CoreCLR. CoreCLR does not run on big-endian systems today and there are number of issues that would need to be fixed before it can. In Mono, implementing the endian swapping for Mono is more important. Mono runs on big-endian systems today. @lambdageek Could you please provide guidance for Mono? |
It looks like that the reversed form depends on the type of |
@jaredpar, if the runtime were to expose this soon, would C# be able to target it for .NET 6? I expect there'd be a bit of a back and forth to get all the ducks in a row, e.g. runtime exposes the API, C# takes advantage of it, runtime updates all places that could utilize it to do so (e.g. changing some internal array static fields to be span static props). |
Seems pretty reasonable and probably low / med-low costing. Curious: how much benefit is this expected to provide here? |
I think for Mono it will be very similar to how
|
@VSadov, can you comment on this, and how the costs would compare to just using a static T[] field? |
An interesting case is Ultimately, compared to wrapping a static array this saves the allocation and initialization of that array. I would say the savings are not huge. If there are other reasons why this would require nontrivial work or extra copy even on little-endian (for example due to alignment requirements), then it might not be worth it. |
What about after the array has already been initialized? This API for span will need to be called on every access, right? How does that compare to the cost of accessing the static array field? |
This API would have to be implemented as JIT intrinsic that turns it into address constant. The current |
If the API takes a handle, there could be some extra work with figuring the location of the field, unless it is an intrinsic that does it at JIT-time. Alternatively the API may also take a ref to the field (C# compiler could provide it), then it might be possible to just use the ref and make it roughly the same as wrapping a static field (on little endian machine). |
@jkotas - are there alignment guarantees for metadata blobs? |
It is up to IL producers to guarantee the alignment for RVA statics. For example, managed C++ does emit the RVA statics with the right alignment. The IL rewriters (at least the ones we own - e.g. crossgen) do preserve it. I think we can make the API throw when the blob is not sufficiently aligned. |
Then in terms of impact, assuming it's as-fast-or-faster to access one of these, there are a bunch of places we'd used them, just as there are a bunch of places we used the support that was added for |
How does returning pointers into data section of the executable mix with unloadability? Would we need to make a copy of the data if the assembly is part of an unloadable load context? I assume there's no good way to track the reference within the span. |
In coreclr span is similar to a byref parameter in terms of GC tracking. It should keep the context alive. |
|
Closed in favor of #60948 |
@jkotas it's still open. |
Initialization of literal arrays like
new int[]{1,2,3,4,5,6}
has a special path where the blittable data is stored directly in the PE data and at runtime, instead of assigning every element of the array to a corresponding constant, we callInitializeArray
with the target array instance and the token for the field that represents the data blob.Similar technique would be very useful when initializing spans.
We already have two scenarios:
stackalloc int[] {1,2,3,4,5}
(ReadOnlySpan<int>)new int[]{1,2,3,4,5}
in fact in the case with ReadOnlySpan conversion, it would be possible and desirable to just refer to the PE data directly.
The preferred form of the API would be:
Another acceptable alternative is:
(but the one above feels more convenient since verifiability is less of a problem)
A valid question to be asked here - "if it is possible to just load a reference to the field in the first place, why there is a need for the API?"
The problem is that the blob data is always stored in littleendian format, so on a bigendian machine the blob data is valid only for 1-byte sized elements.
Similarly to the case of
InitializeArray
, this API would allow the runtime to abstract away the endianness of the blob.In a littleendian context (which is the most common case) the implementation could trivially forward to the field data and in bigendian case it may do fixups by either making a copy of the data while changing the endianness or even by performing the fixup in-place.
NOTE: possibility of in-place fixup would require that the same blob is not used to initialize span data of different sizes - say shorts and longs.
Such restriction would be acceptable on the C# side and runtime could validate that such "sharing" did not happen, or make it undefined behavior if that happens.
NOTE: the presence of the API is statically known to the compiler, so it would be ok if some runtimes do not have it right away or never. Then optimization will simply not consider 1+ element sizes.
The text was updated successfully, but these errors were encountered: