Need a method similar to S.R.CS.RuntimeHelpers.InitializeArray, but for spans #24961

VSadov · 2018-02-07T23:38:33Z

Initialization of literal arrays like new int[]{1,2,3,4,5,6} has a special path: the blittable data is stored directly in the PE data, and at runtime, instead of assigning a constant to every element of the array, we call InitializeArray with the target array instance and the token for the field that represents the data blob.

A similar technique would be very useful when initializing spans.
We already have two scenarios:

stack allocated spans - stackalloc int[] {1,2,3,4,5}
optimized conversion of literal arrays to ReadOnlySpan - (ReadOnlySpan<int>)new int[]{1,2,3,4,5}

In fact, in the case of the ReadOnlySpan conversion, it would be possible and desirable to just refer to the PE data directly.

The preferred form of the API would be:

        ReadOnlySpan<T> GetReadOnlySpanFromTemplate<T>(RuntimeFieldHandle fldHandle);

Another acceptable alternative is:
(but the one above feels more convenient since verifiability is less of a problem)

        ref T GetRefToTemplateData<T>(RuntimeFieldHandle fldHandle);

A valid question to ask here: "if it is possible to just load a reference to the field in the first place, why is there a need for the API?"

The problem is that the blob data is always stored in little-endian format, so on a big-endian machine the blob data is valid only for 1-byte sized elements.
Similarly to InitializeArray, this API would allow the runtime to abstract away the endianness of the blob.
In a little-endian context (which is the most common case) the implementation could trivially forward to the field data, and in the big-endian case it may do fixups, either by making a copy of the data while changing the endianness or even by performing the fixup in-place.

NOTE: the possibility of in-place fixup would require that the same blob is not used to initialize span data of different element sizes, say shorts and longs.
Such a restriction would be acceptable on the C# side, and the runtime could validate that such "sharing" did not happen, or make it undefined behavior if it does.

NOTE: the presence of the API is statically known to the compiler, so it would be OK if some runtimes do not have it right away, or never get it. In that case the optimization will simply not be applied to element sizes larger than 1 byte.
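A minimal sketch of the endianness handling described above, written as ordinary library code rather than as the proposed runtime API. `TemplateBlob` and `AsInt32` are hypothetical names, and a `byte[]` stands in for the PE data blob; a real implementation would work on the field's RVA data instead.

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.InteropServices;

public static class TemplateBlob
{
    // Hypothetical helper, not a runtime API: views a little-endian blob
    // as 32-bit integers in the host's byte order.
    public static ReadOnlySpan<int> AsInt32(byte[] littleEndianBlob)
    {
        if (littleEndianBlob.Length % sizeof(int) != 0)
            throw new ArgumentException("Blob length must be a multiple of 4.", nameof(littleEndianBlob));

        if (BitConverter.IsLittleEndian)
        {
            // Common case: reinterpret the bytes in place, no copy.
            return MemoryMarshal.Cast<byte, int>(new ReadOnlySpan<byte>(littleEndianBlob));
        }

        // Big-endian host: copy the data while fixing the byte order.
        int[] copy = new int[littleEndianBlob.Length / sizeof(int)];
        for (int i = 0; i < copy.Length; i++)
        {
            copy[i] = BinaryPrimitives.ReadInt32LittleEndian(
                littleEndianBlob.AsSpan(i * sizeof(int), sizeof(int)));
        }
        return copy;
    }
}
```

Calling `TemplateBlob.AsInt32(new byte[] { 1, 0, 0, 0, 2, 0, 0, 0 })` yields the values 1 and 2 regardless of the host's byte order; only the big-endian branch allocates.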


VSadov · 2018-02-07T23:39:26Z

This is related to:
dotnet/roslyn#24621
dotnet/roslyn#24249

VSadov · 2018-02-07T23:42:35Z

CC:@jkotas, @marek-safar, @alrz, @jcouv , @jaredpar, @stephentoub

VSadov · 2018-02-08T00:26:40Z

Having this in .NET Core 2.1 would be very useful.
Is there any chance of this API making it through at this point?

alrz · 2018-02-08T04:45:00Z

how about stackalloc on pointers? e.g. int* p = stackalloc[] { 1,2,3 };

VSadov · 2018-02-08T05:13:51Z

@alrz - the resulting type does not matter here. From the big-endian point of view, we store int32 elements with their bytes in the wrong order. The data will not look like consecutive 1,2,3 on a big-endian machine.

alrz · 2018-02-08T05:22:44Z

I understand that, but we're going to use this for stackalloc initializers on pointers as well. I guess the second alternative could work though: ref T GetRefToTemplateData<T>(RuntimeFieldHandle fldHandle);

In fact, stackalloc initializers "only" work on pointer types, no special codegen for Span. We're just passing an already initialized localloc to the Span constructor.

VSadov · 2018-02-08T05:39:35Z

@alrz - right, GetRefToTemplateData would work equally well for stackalloc that results in a pointer or in a span.

VSadov · 2018-02-08T17:10:46Z

It is fairly easy to get a pointer from a span. For example “(void*)&s[0]” in IL

alrz · 2018-02-08T17:46:43Z

@VSadov I just don't see why there has to be an indirection here. We can construct both Span and ReadOnlySpan directly from a pointer (or a ref for that matter), but GetReadOnlySpanFromTemplate needs a conversion for anything other than ReadOnlySpan.

VSadov · 2018-02-08T17:58:04Z

One reason is that a ReadOnlySpan result can be used directly and safely in the ReadOnlySpan case, which would not require dealing with pointers and would be verifiable.

Since a span is basically a range-checking reference, it is safer to use. Conceptually, it is the right thing to represent a chunk of data with a known length.

We will see though what CLR/FX guys will say when it gets to actually implementing that API.

alrz · 2018-02-08T18:07:57Z

Currently none of the Span tests are verifiable due to the unsafe nature of the constructor... I am more biased towards the current implementation of the stackalloc inits; I think that approach needs more changes compared to just returning a ref here. We should see how things work out in either of these cases.

KrzysztofCwalina · 2018-02-23T16:53:46Z

@VSadov, what's the conclusion here? Do we need this API? If yes, please mark the issue as ready for review and let's discuss it asap.

VSadov · 2018-02-23T21:03:58Z

We need this API to implement optimizations.
It is not strictly needed - it is not blocking, but it is a "very good to have" one.

Considering that this will have to be implemented as JIT intrinsic and be platform specific, I am very doubtful that it can make it for 7.3

Let's put it up for review though.

terrajobst · 2018-02-27T18:39:21Z

@jkotas what are your thoughts on this?

jkotas · 2018-02-27T19:14:32Z

This method is the Span equivalent of the existing RuntimeHelpers.InitializeArray method. It is expected to only ever be called by C# compiler generated code, never directly by the user. The proposed design (GetReadOnlySpanFromTemplate) looks fine to me.

The end-to-end story for this requires changes in C# compiler, runtime, JIT, and potentially debugger expression evaluator to come together. We should do this post .NET Core 2.1 when we have a runway for doing the work and testing that it all fits together well.

ahsonkhan · 2018-02-28T07:52:14Z

> The end-to-end story for this requires changes in C# compiler, runtime, JIT, and potentially debugger expression evaluator to come together. We should do this post .NET Core 2.1 when we have a runway for doing the work and testing that it all fits together well.

Moving to future.

terrajobst · 2018-06-12T19:03:51Z

How about this:

class RuntimeHelpers
{
	ReadOnlySpan<T> CreateSpan<T>(RuntimeFieldHandle fldHandle);	
}

benaadams · 2018-06-12T19:08:44Z

Use unmanaged constraint, so it can't refer to objects or types containing object?

class RuntimeHelpers
{
    ReadOnlySpan<T> CreateSpan<T>(RuntimeFieldHandle fldHandle) where T : unmanaged;	
}

ektrah · 2018-06-12T19:15:45Z

How expensive is that method? Is it possible to cache the return value somehow if it’s expensive?

VSadov · 2018-06-12T22:07:45Z

The following looks good.

class RuntimeHelpers
{
    ReadOnlySpan<T> CreateSpan<T>(RuntimeFieldHandle fldHandle);
}

Not sure if unmanaged would be very useful. T must be blittable, in the sense that an array of bytes can be trivially mapped to T[]. Unmanaged structs may still get their fields aligned, packed, or reordered, and may contain IntPtr...

It basically means that T must be a primitive value type of a known size like int, char or an enum. Runtime will need to validate that.

@ektrah Yes, runtime will have to cache the result if it did any transformations to the data - to correct endianness or alignment. The original field is technically accessible from the user code, so the transformation, if needed, cannot be done in-place.
If no changes are needed, it would make sense to just wrap the field in a span and return it. This should generally be the case on little-endian machines.
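To illustrate the point about the unmanaged constraint, here is a small sketch. `ByteThenInt` and `LayoutDemo` are hypothetical names used only for this example. The struct satisfies `where T : unmanaged`, yet an element of it typically occupies more bytes than its fields add up to, so a packed byte blob could not be mapped onto a `ByteThenInt[]` directly.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical type used only for illustration.
public struct ByteThenInt
{
    public byte B;
    public int I;
}

public static class LayoutDemo
{
    // Sum of the field sizes: 1 + 4 = 5 bytes of actual data.
    public const int FieldBytes = sizeof(byte) + sizeof(int);

    // Compiles only for types that satisfy the unmanaged constraint.
    public static int SizeOfUnmanaged<T>() where T : unmanaged => Unsafe.SizeOf<T>();

    // Size the runtime uses for one ByteThenInt element, including any padding.
    public static int ElementSize => SizeOfUnmanaged<ByteThenInt>();

    // True when the layout contains bytes that do not belong to any field.
    public static bool HasPadding => ElementSize > FieldBytes;
}
```

With the default sequential layout the int field is aligned to 4 bytes, so `LayoutDemo.HasPadding` is expected to be true, while for a primitive such as int the element size and the data size coincide.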

terrajobst · 2018-06-26T18:05:23Z

Video

Alright, this is it:

class RuntimeHelpers
{
	ReadOnlySpan<T> CreateSpan<T>(RuntimeFieldHandle fldHandle);	
}

jaredpar · 2018-06-26T20:24:17Z

Why not put it on MemoryMarshal? This is where a number of unsafe Span helpers already exist. I'd prefer we keep all of those in one place so the entirety of the feature is easy to reason about.

jkotas · 2018-06-26T20:30:29Z

It is the Span equivalent of InitializeArray, which lives on RuntimeHelpers. There is nothing unsafe about either InitializeArray or CreateSpan.

huoyaoyuan · 2020-10-12T13:16:39Z

Is this still required since the compiler has used ldflda for a long time?

brian-reichle · 2020-10-12T22:05:33Z

@huoyaoyuan, the compiler is only able to do that where the element size is 1 byte; anything larger would require adjusting for potential differences in endianness. The compiler uses RuntimeHelpers.InitializeArray to make the necessary adjustments when initializing an array. I believe this issue is about getting something similar for ReadOnlySpan<> so that the compiler can avoid the need to create an array when the element size is larger than 1 byte.

https://sharplab.io/#v2:EYLgHgbALANALiAlgGwD4AEBMBGAsAKHQGYACLEgYRIG8CT6zSAlAUwEMATAeQDtkBPAMoAHNjwA8wfnBYA+EgFkAFAEoSAXnk8WAdxJSZAbQC6NEgAYYJbFcxWiV2CQCsViCQC+AbjoNiJVk5eARExcUQeOHkAOVUNLV0SCLgTM0trW3tHK1cSd28CDyA==
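A short sketch of the two cases described in this comment. `Tables` and its members are hypothetical names. The comments describe compiler behavior as stated in this thread at the time of the comment; later compilers may treat the second property differently.

```csharp
using System;

public static class Tables
{
    // One-byte elements: the compiler can point the span straight at the
    // data embedded in the assembly, so no array is allocated.
    public static ReadOnlySpan<byte> SmallPrimes => new byte[] { 2, 3, 5, 7, 11 };

    // Multi-byte elements: without a helper like the one requested in this
    // issue, the compiler has to create an int[] for this expression.
    public static ReadOnlySpan<int> SmallPrimesWide => new int[] { 2, 3, 5, 7, 11 };
}
```

Both properties read the same at the call site, for example `Tables.SmallPrimes[4]` and `Tables.SmallPrimesWide[4]`; the difference is only in what the compiler emits behind them.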

huoyaoyuan · 2020-10-18T15:42:09Z

Some questions came up while implementing it.

The storage of an RVA field appears to be endianness-aware, and InitializeArray has code to handle that.
Under an LE runtime, returning the address of the RVA field directly is OK. This also works for bytes.
Under a BE runtime, reversing the endianness should be required. But where should the reversed data be stored? Creating an array every time would completely negate the performance improvement.

jkotas · 2020-10-18T15:50:09Z

In CoreCLR, it would need to allocate the copy with the right endianness on the loader heap. It is not that important to implement the big-endian support for CoreCLR. CoreCLR does not run on big-endian systems today and there are a number of issues that would need to be fixed before it can.

In Mono, implementing the endian swapping is more important. Mono runs on big-endian systems today. @lambdageek Could you please provide guidance for Mono?

huoyaoyuan · 2020-10-18T16:45:28Z

It looks like the reversed form depends on the type T, so it may cause some architectural issues, although every field is supposed to have only one usage in practice.
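A small sketch of why the reversed form depends on the element type. `EndianSwap.SwapElements` is a hypothetical helper that reverses the bytes of each element in a copy of the blob; the same four bytes come out differently when treated as two 16-bit elements or as one 32-bit element.

```csharp
using System;

public static class EndianSwap
{
    // Hypothetical helper: returns a copy of the blob in which the bytes of
    // every elementSize-byte element are reversed.
    public static byte[] SwapElements(byte[] blob, int elementSize)
    {
        if (elementSize <= 0 || blob.Length % elementSize != 0)
            throw new ArgumentException("Blob length must be a multiple of the element size.");

        byte[] result = (byte[])blob.Clone();
        for (int offset = 0; offset < result.Length; offset += elementSize)
        {
            // Reverse only the bytes that belong to the current element.
            Array.Reverse(result, offset, elementSize);
        }
        return result;
    }
}
```

For the blob { 1, 2, 3, 4 }, `SwapElements(blob, 2)` gives { 2, 1, 4, 3 } and `SwapElements(blob, 4)` gives { 4, 3, 2, 1 }, so a single swapped copy cannot serve both element sizes.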

stephentoub · 2020-11-23T14:46:43Z

@jaredpar, if the runtime were to expose this soon, would C# be able to target it for .NET 6? I expect there'd be a bit of a back and forth to get all the ducks in a row, e.g. runtime exposes the API, C# takes advantage of it, runtime updates all places that could utilize it to do so (e.g. changing some internal array static fields to be span static props).

jaredpar · 2020-11-23T19:08:34Z

Seems pretty reasonable and probably low / med-low costing.

Curious: how much benefit is this expected to provide here?

lambdageek · 2020-11-23T20:35:47Z

> In CoreCLR, it would need to allocate the copy with the right endianness on the loader heap. It is not that important to implement the big-endian support for CoreCLR. CoreCLR does not run on big-endian systems today and there are number of issues that would need to be fixed before it can.
>
> In Mono, implementing the endian swapping for Mono is more important. Mono runs on big-endian systems today. @lambdageek Could you please provide guidance for Mono?

I think for Mono it will be very similar to how ves_icall_System_Runtime_CompilerServices_RuntimeHelpers_InitializeArray works today, with the following changes:

for LE we can just return a span over the result of mono_field_get_rva
for BE, we may need to add a second field to MonoFieldDefaultValue something like char *host_data; /* RVA swapped to host endianness using the type of MonoClassField* for sizes */
in mono_field_get_rva on BE we would allocate from the class mempool and write the byte-swapped data to host_data if it's not already initialized.
(extra credit) for AOT it would be nice to store the byte-swapped data in the AOT image and refer to that instead of doing a copy at runtime. I don't think this will be easy.
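The caching idea in the steps above can be sketched in managed code. This is an illustration under stated assumptions, not how Mono implements it: `Int32Template` is a hypothetical class, a `byte[]` stands in for the RVA data, and a managed array stands in for the `host_data` copy that would be allocated from the class mempool.

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical stand-in for a field's RVA data; names are illustrative only.
public sealed class Int32Template
{
    private readonly byte[] _littleEndianData; // data as stored in the image
    private int[] _hostData;                   // byte-swapped copy, published at most once

    public Int32Template(byte[] littleEndianData)
    {
        if (littleEndianData.Length % sizeof(int) != 0)
            throw new ArgumentException("Length must be a multiple of 4.", nameof(littleEndianData));
        _littleEndianData = littleEndianData;
    }

    public ReadOnlySpan<int> GetSpan()
    {
        // LE host: return a span over the original data, no copy.
        if (BitConverter.IsLittleEndian)
            return MemoryMarshal.Cast<byte, int>(new ReadOnlySpan<byte>(_littleEndianData));

        // BE host: build the swapped copy on first use and reuse it afterwards.
        int[] cached = Volatile.Read(ref _hostData);
        if (cached == null)
        {
            int[] swapped = new int[_littleEndianData.Length / sizeof(int)];
            for (int i = 0; i < swapped.Length; i++)
            {
                swapped[i] = BinaryPrimitives.ReadInt32LittleEndian(
                    _littleEndianData.AsSpan(i * sizeof(int), sizeof(int)));
            }
            // If another thread published its copy first, use that one.
            cached = Interlocked.CompareExchange(ref _hostData, swapped, null) ?? swapped;
        }
        return cached;
    }
}
```

Repeated calls to `GetSpan()` on the same instance return spans over the same memory on either kind of host, which is the property that makes per-access cost comparable to reading a static field.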

stephentoub · 2020-11-23T20:39:36Z

> how much benefit is this expected to provide here?

@VSadov, can you comment on this, and how the costs would compare to just using a static T[] field?

VSadov · 2020-11-23T22:28:35Z

An interesting case is (ReadOnlySpan<int>)new int[]{1,2,3,4,5}: on a little-endian machine that can be a direct reference to the metadata without copying or allocating anything. We can already do this as long as the element type is 1 byte in size. I am not sure how commonly this pattern is used, since requiring 1-byte elements seems quite limiting.

Ultimately, compared to wrapping a static array this saves the allocation and initialization of that array. I would say the savings are not huge.
The issue was logged with the assumption that supporting this might be fairly cheap too, especially on little-endian HW, which is nowadays almost everything.

If there are other reasons why this would require nontrivial work or extra copy even on little-endian (for example due to alignment requirements), then it might not be worth it.

stephentoub · 2020-11-23T22:33:39Z

> Ultimately, compared to wrapping a static array this saves the allocation and initialization of that array.

What about after the array has already been initialized? This API for span will need to be called on every access, right? How does that compare to the cost of accessing the static array field?

jkotas · 2020-11-23T22:48:08Z

This API would have to be implemented as a JIT intrinsic that turns it into an address constant.

The current InitializeArray is also implemented as a JIT intrinsic, for a similar reason.

VSadov · 2020-11-23T22:52:30Z

If the API takes a handle, there could be some extra work in figuring out the location of the field, unless it is an intrinsic that does it at JIT-time.

Alternatively the API may also take a ref to the field (C# compiler could provide it), then it might be possible to just use the ref and make it roughly the same as wrapping a static field (on little endian machine).

VSadov · 2020-11-23T22:55:57Z

@jkotas - are there alignment guarantees for metadata blobs?

jkotas · 2020-11-23T22:59:17Z

It is up to IL producers to guarantee the alignment for RVA statics.

For example, managed C++ does emit the RVA statics with the right alignment. The IL rewriters (at least the ones we own - e.g. crossgen) do preserve it.

I think we can make the API throw when the blob is not sufficiently aligned.
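A rough sketch of what such an alignment check could look like. `AlignmentCheck.ThrowIfMisaligned` is a hypothetical helper, the address is passed as a plain integer to keep the example free of unsafe code, and the exception type is an assumption; the thread does not say which exception the real API would throw.

```csharp
using System;

public static class AlignmentCheck
{
    // Hypothetical helper: throws when the address is not a multiple of the
    // element size. The exception type is an assumption for illustration.
    public static void ThrowIfMisaligned(long address, int elementSize)
    {
        if (elementSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(elementSize));

        if (address % elementSize != 0)
        {
            throw new InvalidOperationException(
                "Data at 0x" + address.ToString("X") + " is not aligned to " + elementSize + " bytes.");
        }
    }
}
```

For example, `ThrowIfMisaligned(0x1000, 4)` returns normally, while `ThrowIfMisaligned(0x1001, 4)` throws.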

stephentoub · 2020-11-24T01:49:39Z

Then in terms of impact, assuming it's as-fast-or-faster to access one of these, there are a bunch of places we'd use them, just as there are a bunch of places we used the support that was added for ReadOnlySpan<OneBytePrimitive>. A quick survey suggests there are dozens of places we'd use this with private static readonly char[] and private static readonly int[] fields, where we could avoid the array allocation / copy and get potentially faster access to boot, with access sites either just indexing into them or using them with APIs that support spans.

MichalStrehovsky · 2020-11-24T08:34:21Z

> An interesting case is (ReadOnlySpan<int>)new int[]{1,2,3,4,5} - on little-endian machine that can be a direct reference to the metadata without copying or allocating anything

How does returning pointers into data section of the executable mix with unloadability? Would we need to make a copy of the data if the assembly is part of an unloadable load context? I assume there's no good way to track the reference within the span.

VSadov · 2020-11-24T09:03:15Z

In CoreCLR, a span is similar to a byref parameter in terms of GC tracking. It should keep the context alive.
Spans are also stack-only and thus relatively short-lived, so keeping the context alive for too long is unlikely to be a problem.

jkotas · 2020-11-24T09:44:13Z

> In coreclr span is similar to a byref parameter in terms of GC tracking

#40346

jkotas · 2021-10-27T23:55:14Z

Closed in favor of #60948

teo-tsirpanis · 2021-10-29T15:13:28Z

@jkotas it's still open.

joshfree assigned KrzysztofCwalina Feb 20, 2018

msftgits transferred this issue from dotnet/corefx Jan 31, 2020

msftgits added this to the Future milestone Jan 31, 2020

stephentoub unassigned KrzysztofCwalina Feb 12, 2020

stephentoub mentioned this issue Nov 28, 2020

HTTP2: Optimize huffman encoding static table initialization #45297

Closed

stephentoub mentioned this issue Jan 22, 2021

Prefer static ReadOnlySpan<byte> properties over static readonly byte[] fields #33780

Open

stephentoub mentioned this issue Mar 1, 2021

Optimize asm size for the biggest corelib method - WebUtility..cctor #48906

Merged

stephentoub mentioned this issue Mar 30, 2021

[Http] Remove some unsafe code and save a string allocation dotnet/aspnetcore#31267

Merged

stephentoub mentioned this issue Apr 23, 2021

Change a byte[] in KnownColorTable to be ReadOnlySpan<byte> #51719

Merged

stephentoub mentioned this issue Jul 12, 2021

Support Invariant Mode Case Mapping #55520

Merged

stephentoub modified the milestones: Future, 7.0.0 Jul 13, 2021

gfoidl mentioned this issue Oct 13, 2021

Experiment with Roslyn optimization for ROS<T> in assembly data section #60327

Merged

stephentoub mentioned this issue Oct 27, 2021

[API Proposal]: ReadOnlySpan<T> CreateSpan<T>(RuntimeFieldHandle) #60948

Closed

jkotas closed this as completed Oct 29, 2021

ghost locked as resolved and limited conversation to collaborators Nov 28, 2021
