
Provide support for fixed capacity, variable length value type (inline) strings. #2099

Closed
Korporal opened this issue Dec 21, 2018 · 167 comments

@Korporal

Korporal commented Dec 21, 2018

Strings in C# are, to all intents and purposes, buffers of unlimited capacity, and for that reason they cannot be stored inline the way primitive types are. I'm proposing that consideration be given to introducing an additional string type that has a capacity declared at compile time, and therefore a maximum possible length.

This would make it possible to define classes or structs that contain strings yet have those strings appear inline, within the datum's memory, much as primitive types do.

This is a problem that came up in a sophisticated, very high performance client/server design in which we got huge benefits from being able to define fixed-length messages that contained strings. In our case we simulated fixed-capacity strings as properties that encapsulated fixed buffers (char or byte). This worked well but was messy, because the language offers no way to 'pass' a length (at compile time) into a fixed buffer declaration; one must declare the fixed buffer explicitly with a constant.

As a result we created a huge family of types named along the lines of ANativeString_64 and UNativeString_128 (ANSI and Unicode variants) and so on. As I say, this worked but was messy.

Each type was a pure struct (in the sense of the new generic constraint 'unmanaged'), so when used as a member field in another struct it left that containing struct pure, giving us contiguous chunks of memory that contained strings.

As I say this worked very well but was messy under the hood and challenging to maintain.

So could we consider a new primitive type:

string(64) user_name;

for example?

Such strings could be declared locally resulting in a simple stack allocated chunk, or as members within classes/structs in which case they appear inline just like fixed buffers do...

(Just to be clear, I'm not seeking the capacity to be defined at runtime but at compile time, and I know my syntax won't work as written; I just want to convey the idea.)

@HaloFour
Contributor

How do you propose something like this be implemented? Types in the CLR must have a known size, so the only method would be to emit a different (and incompatible) struct for every size of this "string".

@Korporal
Author

@HaloFour - In a similar way that fixed buffers are. In fact these could be implemented as fixed buffers but wrapped in some syntactic sugar.

@Korporal
Author

Korporal commented Dec 21, 2018

string(64)

Becomes

unsafe struct string_64
{
    int curr_length;
    fixed char text[64];
}

plus a bunch of properties etc.
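
For concreteness, here is a minimal sketch (not the author's actual code; the name String64, its members, and the truncation behaviour are illustrative assumptions) of the kind of wrapper struct and "bunch of properties" being described:

using System;

// Sketch only: a hand-rolled fixed-capacity string wrapper along the lines of
// the generated UNativeString_64 types discussed in this thread.
public unsafe struct String64
{
    public const int Capacity = 64;

    private int _length;            // current length, in chars
    private fixed char _text[64];   // inline storage

    public int Length => _length;

    public static implicit operator String64(string value)
    {
        var result = default(String64);
        int count = Math.Min(value.Length, Capacity);
        for (int i = 0; i < count; i++)
            result._text[i] = value[i];   // silently truncates longer input
        result._length = count;
        return result;
    }

    public override string ToString()
    {
        fixed (char* p = _text)
            return new string(p, 0, _length);
    }
}

Equality, comparison, and an interface like the INativeString mentioned later would sit alongside these members, which is where the per-capacity duplication becomes painful.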

@HaloFour
Contributor

You mention that fixed buffers didn't work for your solution. Any similar solution for strings would have the same limitation: the length would have to be a constant known at compile time. Why weren't fixed buffers sufficient for your purposes?

@HaloFour
Contributor

plus a bunch of properties etc.

That's the other problem: you have to generate a bunch of separate members just to make these things workable, and they'd all be incompatible with one another, as well as with all the normal string APIs.

@Korporal
Author

Korporal commented Dec 21, 2018

@HaloFour

We wanted consumers of our client/server API to be able to freely declare messages; here's what a user-defined message might look like (the code is unavailable to me just now, so excuse typos).

public class LoginMessage : Message
{
    public UNativeString_64 Name; // 'U' for Unicode
    public UNativeString_16 Password;
    public UNativeString_32 Application;
    // ...
}

This is how we wanted consumers to use it (and they do) but as you can see we needed a family of types (structs) for a large set of predefined capacities. We use a T4 template to generate these types. Because structs cannot inherit we could not use an abstract base class so we had to rely on an interface (INativeString).

That interface defined to/from string conversions and compare etc.

With the above design it worked well, but a user could not use a UNativeString_132 if that wasn't one of the variants we created, and we could not create one for every possible capacity; we stopped at 2048 or so, stepping up through 4, 8, 16, 32, 64, 96, 128 and so on.

So as you can see the consumer has no idea how UNativeString is implemented and they cannot even see the underlying fixed buffer. (There's also an ANativeString for single byte charsets).

In user code these members were interchangeable with string because of the conversion operators and so they had no idea that these were not actually strings.

At runtime a LoginMessage was a single contiguous block of memory, something we can serialize very quickly indeed (a million instances per second on an i7 3960 CPU core).
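
The library's actual mechanism copies a class's field block and is not shown in this thread; as a rough struct-based illustration of the single-block-copy idea (type and member names are hypothetical, and every field is assumed unmanaged):

using System.Runtime.CompilerServices;

// Hypothetical flattened message layout; all fields are unmanaged, so the
// whole instance is one contiguous block.
public unsafe struct LoginMessageFields
{
    public fixed char Name[64];
    public fixed char Password[16];
    public long SessionId;
}

public static unsafe class Wire
{
    // Serialize: one block copy of the struct's bytes into a buffer.
    public static byte[] Serialize(ref LoginMessageFields message)
    {
        var buffer = new byte[sizeof(LoginMessageFields)];
        fixed (byte* dst = buffer)
            Unsafe.CopyBlock(dst, Unsafe.AsPointer(ref message), (uint)buffer.Length);
        return buffer;
    }

    // Deserialize: one block copy back out of the received bytes.
    // (Assumes buffer holds at least sizeof(LoginMessageFields) bytes.)
    public static LoginMessageFields Deserialize(byte[] buffer)
    {
        LoginMessageFields message = default;
        fixed (byte* src = buffer)
            Unsafe.CopyBlock(&message, src, (uint)sizeof(LoginMessageFields));
        return message;
    }
}

Whether the layout really is identical on both ends is a separate question, which HaloFour raises below.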

@Korporal
Author

plus a bunch of properties etc.

That's the other problem, you have to generate a bunch of separate members just to make these things workable, and they'd all be incompatible with one another, as well as all normal string APIs.

Exactly, that's the motivation for this post, to introduce a new kind of string type that provides all this out of the box.

@HaloFour
Contributor

Seems like a very highly specialized solution that would have very narrow benefits but is massively complicated.

Exactly, that's the motivation for this post, to introduce a new kind of string type that provides all this out of the box.

You're not asking for one string type; you're asking for 2 billion potential string types, every single one of them with a separate set of members. That would certainly result in metadata explosion.

@Korporal
Author

Korporal commented Dec 21, 2018

Seems like a very highly specialized solution that would have very narrow benefits but is massively complicated.

Exactly, that's the motivation for this post, to introduce a new kind of string type that provides all this out of the box.

You're not asking for one string type, you're asking for 2 billion potential string types, every single one of them with a separate set of members. That would certainly result in metadata explosion.

@HaloFour

Well I'm certainly asking for something but not quite that. What we want (ultimately) is some syntactic mechanism that can convert this:

string(75) user_name;

into this:

unsafe struct string_75
{
    fixed char user_name[75];

    // ...various properties...
}

Or ideally a new type that is implemented better than this, but one that enables the user to declare the capacity at compile time without them needing any knowledge of the implementation.

C# could perhaps support passing constants into the declaration of instances of a type; this would probably be enough.

I mean support this:

public class MyClass(int size) // This feature would require the supplied value to be a compile-time constant.
{
    private fixed byte Name[size];
}

This way we could create types that require a compile-time constant, while letting the consumer of the type be the one who supplies that constant.

Then a consumer could just code:

MyClass(79) SomeMessage;

This, I think, is the fundamental requirement here: a way to propagate compile-time constants into type declarations...
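
As an aside, the closest approximation available in C# today (purely an illustration, with made-up names) is to smuggle the constant in as a type via a generic parameter; it also shows why that is not enough, because the constant still cannot size an inline fixed buffer:

using System;

public interface IConstInt { int Value { get; } }
public struct Size64 : IConstInt { public int Value => 64; }

// BoundedString<Size64> and BoundedString<Size128> are distinct types with
// different capacities, but the storage is still a heap-allocated string
// reference rather than an inline buffer - which is exactly the gap here.
public struct BoundedString<TCapacity> where TCapacity : struct, IConstInt
{
    private string _value;

    public string Value
    {
        get => _value ?? string.Empty;
        set
        {
            if (value.Length > default(TCapacity).Value)
                throw new ArgumentException("Value exceeds the declared capacity.");
            _value = value;
        }
    }
}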

@HaloFour
Contributor

This is I think the fundamental requirement here, a way to propagate compile time constants into type declarations...

That would involve CLR changes, and the end result is effectively the same; it's just that now the CLR has to generate potentially 2 billion different flavors of that class.

@Korporal
Author

Korporal commented Dec 21, 2018

This is I think the fundamental requirement here, a way to propagate compile time constants into type declarations...

That would involve CLR changes, and the end result is effectively the same, it's just that now the CLR has to generate potentially 2 billion different flavors of that class.

@HaloFour

Perhaps, so I guess what I'm asking for is a better way to solve this problem. On the surface it sounds rudimentary: provide support for fixed-capacity (value-type-based and hence inline) strings. If we forget what I've said above and what we currently do to implement this, and just step back and view this as an abstract problem - are there options?

Many other languages support the idea of fixed capacity strings so in principle this isn't a major challenge...or is it?

We used fixed buffer wrapper structs only because we had no other way to deliver this, but being able to do this at a deeper language/CLR level might well be slicker and less messy; some of the problems you mention may be due purely to the way we implemented this and not necessarily inherent in the problem itself.

@CyrusNajmabadi
Member

Let's work backwards on this a bit. In many cases the language has adopted these sorts of 'less easy to use' but 'much more performant' solutions when the gains were made quite explicit. Could you show some real-world examples that would benefit from this (along with measurements)? Basically, a real-world piece of code that you would envision using this.

To get the perf measurements, it would likely suffice to convert that real code to use fixed-size-buffers and see what the resulting difference was.

@HaloFour
Contributor

Many other languages support the idea of fixed capacity strings so in principle this isn't a major challenge...or is it?

The challenge here is the CLR which offers no real facility to accomplish this. Without the CLR it would be relatively easy. Languages that did support fixed-length strings, like Visual Basic, lost them in the transition to .NET.

@Korporal
Author

Korporal commented Dec 21, 2018

Let's work backwards on this a bit. In many cases the language has adopted these sorts of 'less easy to use' but 'much more performant' solutions when the gains were made quite explicit. Could you show some real-world examples that would benefit from this (along with measurements)? Basically, a real-world piece of code that you would envision using this.

To get the perf measurements, it would likely suffice to convert that real code to use fixed-size-buffers and see what the resulting difference was.

@CyrusNajmabadi @HaloFour

The application is a .NET-to-.NET messaging platform. Because of this, instances can be serialized (basically) as a memcpy, and the performance of this is outstanding (as I mentioned, one benchmark sees 1,000,000 instances per second on an i7 3960 core; these are small messages, but they did contain a few of these fixed-length strings).

Other forms of serialization do not achieve these levels.

Recent advances in C# (improved ref support, generic pointers, and the unmanaged constraint) solve many of the problems we had to address in our lower-level code, and we could now rewrite some of those lower layers and simplify them quite a lot.

However, the inline fixed-capacity strings remain a contrivance in our design; as I explained, this is quite ugly (but works very well).

Because a type layout in the CLR is the same across different machines (guaranteed if we're using the same version of the CLR at each end) it is very easy to serialize an instance (even of a class) provided all fields are inline. The recipient then simply creates an instance of the type and "overwrites" its field block with the received block of bytes that were sent.

That's the principle anyway (of course we also need to send and cache type descriptions and so on but this is all part of the low level handshaking and protocol).

All of this sits on top of a robust async socket management layer with ring buffers and so on, but at the outer level app developers can just create classes that inherit from Message and everything works very well and very, very fast (we do runtime checks to ensure their class truly consists only of unmanaged fields, caching this info for later reuse).

Of course sending the source for this is not easy as the system contains lots of other proprietary stuff and includes some dynamic method generation for certain things.

I'm sure you get the idea though and I'm sure you can understand how this achieves very high performance.
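
Regarding the runtime check that a message class consists only of unmanaged fields: one way such a check can be expressed on .NET Core (an assumption about its shape, not the author's actual code) is:

using System;
using System.Runtime.CompilerServices;

public static class MessageChecks
{
    // Rejects any message type whose field graph contains a reference type,
    // since such an object cannot be treated as one contiguous copyable block.
    public static void EnsureBlockCopyable<T>()
    {
        if (RuntimeHelpers.IsReferenceOrContainsReferences<T>())
            throw new InvalidOperationException(
                typeof(T) + " contains reference fields and cannot be block-copied.");
    }
}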

@HaloFour
Contributor

@Korporal

Because a type layout in the CLR is the same across different machines

That sounds like a very dangerous assumption. From my understanding, unless you're using explicit layout, the CLR will lay out the members of that struct any way it sees fit, which can differ based on platform.

Anyhow, in regards to benchmarking, you're probably going to have to demonstrate that difference and why only this particular solution is suitable. And you're likely going to have to compare that to the various other high-performant serialization libraries out there.

@CyrusNajmabadi
Member

CyrusNajmabadi commented Dec 21, 2018

The application is a .NET-to-.NET messaging platform. Because of this, instances can be serialized (basically) as a memcpy, and the performance of this is outstanding (as I mentioned, one benchmark sees 1,000,000 instances per second on an i7 3960 core; these are small messages, but they did contain a few of these fixed-length strings).

Other forms of serialization do not achieve these levels.

  1. Can you give numbers on what the values would be here with a normal string?
  2. Can you provide a small but somewhat realistic example program? I.e., I don't want a total micro-benchmark, but I would like to see something showing an expected usage pattern, with normal incoming data, and how this would be different.

Thanks!

@Korporal
Author

@Korporal

Because a type layout in the CLR is the same across different machines

That sounds like a very dangerous assumption. From my understanding, unless you're using explicit layout, the CLR will layout the native member of that struct any way it sees fit, which can differ based on platform.

Anyhow, in regards to benchmarking, you're probably going to have to demonstrate that difference and why only this particular solution is suitable. And you're likely going to have to compare that to the various other high-performant serialization libraries out there.

@HaloFour - All designs involve compromises and assumptions; provided one is very clear about what these are and takes steps to verify them at runtime where necessary, what we do works very well. For users who run Windows and the same hardware architecture (Intel, AMD) on the participating nodes (not a huge requirement), these performance gains are available.

@CyrusNajmabadi
Member

Anyhow, in regards to benchmarking, you're probably going to have to demonstrate that difference and why only this particular solution is suitable. And you're likely going to have to compare that to the various other high-performant serialization libraries out there.

Agreed. This is very much feeling like a library problem currently. It may be necessary to elevate it to a CLR/language problem, but it would be really necessary to demonstrate why existing library solutions are insufficient.

Note: CoreFx/asp.net was pretty involved in passing feedback along about the areas they needed help for perf. That's what led to all the ref/span/readonly stuff. I don't recall any feedback about this particular area. And they're def trying to make high-perf servers.

@Korporal
Author

The application is a .NET-to-.NET messaging platform. Because of this, instances can be serialized (basically) as a memcpy, and the performance of this is outstanding (as I mentioned, one benchmark sees 1,000,000 instances per second on an i7 3960 core; these are small messages, but they did contain a few of these fixed-length strings).

Other forms of serialization do not achieve these levels.

  1. Can you give numbers on what the values would be here with a normal string?
  2. Can you provide a small but somewhat realistic example program? I.e., I don't want a total micro-benchmark, but I would like to see something showing an expected usage pattern, with normal incoming data, and how this would be different.

Thanks!

  1. I could spend time doing that but as soon as the object's fields contain reference types one must use an alternative serialization method and even protocol buffers do not come close.

  2. I'm not sure exactly what you're seeking here - do you mean what the user would write, or what the underlying architecture looks like?

At the outermost level an app developer creates an instance of MessageChannel, which provides sync/async ways to send/receive data. For example, SendMessage (sync) takes a Message instance (that is, a user class derived from Message).

The base message class contains a lot of low-level mechanisms that enable us to get the address and size of the object's field block and then "memcpy" it to a byte[] (the rest you can probably envisage).

@Korporal
Author

Korporal commented Dec 21, 2018

I should add too that the system includes various optional compression and encryption modes, but these are not rocket science, as you can imagine. I cannot overstress the impact that making all data inline has; this is the key to outstanding performance (I used to work in the City of London many years ago and have a lot of experience in this area on various platforms and languages).

@CyrusNajmabadi
Member

the rest you can probably envisage).

I'd prefer it if there were a real piece of code that could be used as the exemplar case here. :)

Honestly, I'm not trying to make your life hard. I'm just pointing out that for language features that exist solely for perf needs, we need real-world code to look at and understand, so we can best assess what the right sort of solutions would be (and just to validate how things would improve).

--

Another way of putting it:

Imagine if we added this feature... and then you used it... and it didn't make performance any better. The feature would be a failure at its core goal. So we actually need some way of validating things.

Furthermore, imagine if we added this feature, and you couldn't use it, because there was some limitation (akin to the ref limitations we have), and your own use case violated that limitation. This would also then fail.

@CyrusNajmabadi
Member

I should add too that the system includes various optional compression and encryption modes, but these are not rocket science as you can imagine. I cannot over stress the impact that making all data inline has,

But you need to. Because other teams that are doing precisely this are not coming to us with this being a use case that must be addressed. These other teams are working in very competitive arenas, trying to squeeze out all the perf they can. Right now, this isn't a place they are finding problematic. So it's hard to gauge for certain if what you are saying is generally applicable, or if this is a very specific problem to your domain.

@CyrusNajmabadi
Member

Another way of putting it: You're the one asking for this to be done. Like it or not, that means the legwork is on you to provide enough compelling data to make others feel this is worthwhile. It's unlikely that anyone else is going to go do it for you. So, to maximize your chance of success here, it is necessary to go beyond just saying you'd find it useful for your use case :)

@Korporal
Author

Korporal commented Dec 21, 2018

the rest you can probably envisage).

I'd prefer it if there were a real piece of code that could be used as the exemplar case here. :)

Honestly, I'm not trying to make your life hard. I'm just pointing out that for language features that exist solely for perf needs, we need real-world code to look at and understand, so we can best assess what the right sort of solutions would be (and just to validate how things would improve).

--

Another way of putting it:

Imagine if we added this feature... and then you used it... and it didn't make performance any better. The feature would be a failure at its core goal. So we actually need some way of validating things.

Furthermore, imagine if we added this feature, and you couldn't use it, because there was some limitation (akin to the ref limitations we have), and your own use case violated that limitation. This would also then fail.

@CyrusNajmabadi - I understand, Cyrus. I guess the only way to show the benefit would be to alter our library to support an alternative serialization method, but the system is deeply predicated on this (at the core anyway), so that would be quite a lot of work.

Bear in mind that the performance gain here is pure CPU: the cost of CPU time in sending and receiving these messages is far lower than with something that uses XML serialization, MS binary serialization, or protocol buffers.

A memcpy of a message is very tiny - perhaps hundreds of nanoseconds or less on an i7 3960 (our reference CPU when working on this).

Your question is interesting because we need to compare this architecture with another one and I don't have that other one.

@HaloFour
Contributor

@Korporal

Your question is interesting because we need to compare this architecture with another one and I don't have that other one.

What I would suggest is reimplementing a very basic form of this serialization architecture that could be compared directly to other serialization methods. After all, any solution here would have to be very general purpose.

@Korporal
Author

I did have benchmarks of the serialization layer, I'll see if I can dig these out - that might be a start!

@Korporal
Author

@Korporal

Your question is interesting because we need to compare this architecture with another one and I don't have that other one.

What I would suggest is reimplementing a very basic form of this serialization architecture that could be compared directly to other serialization methods. After all, any solution here would have to be very general purpose.

I agree, and the recent support for generic pointers and the unmanaged constraint could be used to remove some of our runtime verification steps (like where we check that a type contains no reference fields).
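
A sketch of how the unmanaged constraint could replace that runtime verification (method names are hypothetical, not the library's API); a message type containing any reference-type field is then rejected at compile time:

using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class Channel
{
    // 'where T : unmanaged' makes "no reference fields" a compile-time
    // guarantee, so no reflection-based runtime check is needed.
    public static void Send<T>(in T message) where T : unmanaged
    {
        ReadOnlySpan<byte> bytes = MemoryMarshal.AsBytes(
            MemoryMarshal.CreateReadOnlySpan(ref Unsafe.AsRef(in message), 1));
        Transmit(bytes);
    }

    private static void Transmit(ReadOnlySpan<byte> bytes)
    {
        // Stand-in for the socket layer described earlier in the thread.
    }
}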

@Korporal
Author

I must get a flight soon, but thanks for taking the time to explore this.

@tannergooding
Member

This proposal is very similar to the fixed-size buffer proposal, which is already championed: #1314. However, this proposal seems more specialized.

@CyrusNajmabadi
Member

@CyrusNajmabadi - I understand Cyrus, I guess the only way to show the benefit would be to alter our library to support an alternative serialization method but the system is deeply predicated on this (at the core anyway) so this would be quite a lot of work.

Understood :) But... well... that comes with the territory. If you want to make a language change (esp. one related to perf), then this sort of comes with the territory. The only way to escape it is to get someone excited enough to do it for you :)

@Korporal
Author

Korporal commented Jan 6, 2019

@svick

@Korporal

I think the main problem with your arguments is that you're asking for a language feature that would specifically benefit your codebase. I don't think that's going to happen, not without demonstrating how that feature would benefit many other codebases.

It seems that you're correct here; also, from what others say, even features with wide appeal stand only a small chance of being included.

Specifically:

Anyway the main goal is to have the raw data inline - that's what enables very fast serialization; the overheads of setting/getting the string are secondary.

I would like to see some evidence for that. It seems to me that you're not eliminating costs, you're just moving them around. That can be beneficial in some cases (e.g. when you're working with a single property on a large type), but are those cases widespread enough?

Consider updating say an option price, we can do it pretty much like this:

Option * option_ptr = datastore.GetItem<Option>(key); // can be updated soon to use new "ref" support.

option_ptr->bid_price = new_price;

This is a tiny cost (including the GetItem()) and enables updates to data at a very high rate and very low CPU cost, perhaps just 8 bytes change (e.g. a Decimal) despite the fact the Option might have many fields (including text fields like name, exchange etc).

The datastore, incidentally, is rather specialized, proprietary, and local to the machine running the update operations; we can write to the store like this, for example:

Option some_new_option = ...;

datastore.Write(ref some_new_option);

Because the code (a bit dated now but we can convert a ref to a ptr and vice versa with support code) can serialize very rapidly (using what I'm calling "memcpy" for ease of discussion) this too is very fast and low CPU.

We can (for example) get at message 124,236 in a file and deserialize it very rapidly indeed.

That's an argument for fixed-width serialized format, but not necessarily fixed-width in-memory format. Also, a similar effect could be achieved by using a variable-width format along with an index, or even a database.

As soon as the format begins to deviate from its in-memory layout you begin to incur significant costs. Nothing comes close to a single "memcpy" (e.g. CopyBlock). We can do this and have "strings" because of the AString_XX stuff we have.

Furthermore, because the data is stored in a structure identical to its managed memory layout, we can use managed code (via pointers, though we could use ref more now that it's been extended) to update the data in place.

A key cost in high-performance systems like trading systems is needlessly moving data; the less data you move, and the faster you can move it, the better.

That's what confuses me about your approach: you are needlessly moving data, when compared with simple string fields:

Not really, most of the work is updates and most of it to non-string fields.

  • When you write a property, you copy the whole string, instead of just a pointer.
  • When you read a property, you always allocate the string (which includes a copy), instead of allocating it only once at deserialization. (And you could probably do even better if you used Span<char> instead.)
  • When you copy the struct, you copy all the strings, instead of just few pointers.

This is true, but as I've said earlier we don't update the "string" fields much at all; these may be part of a lookup key, or data that's used when reports are pulled, for example. Perhaps 85% of the work is updating primitive numeric fields and 15% is writing new items, both of which are very fast operations.

Bear in mind that the datastore is part of the update service's address space (a Windows service) but not part of the AppDomain; this is a specialized datastore technology (much of it written in C against Win32 as a native API), and without knowing that, some of what I've said in this thread may not appear to make much sense.

@MillKaDe

MillKaDe commented Jan 6, 2019

@Korporal

Check this proposal, which would add int parameter(s) to generics: #749

If that proposal gets implemented, you could do something like this:

struct ValueString<CH, const int SZ> {
  fixed CH chars[SZ]; // fixed size inline array with SZ elements of type CH
  // misc functions, properties, operators, ...
}

ValueString<char, 16> MyStringU16; // fixed size string-like value type with 16 Unicode chars
ValueString<byte, 64> MyStringA64; // fixed size string-like value type with 64 Ansi/Ascii chars

To reduce code bloat, the functions of ValueString could be implemented in an inner private empty (field-less) static class / struct. These inner helper functions would take a Span<> (which contains size and address of the fixed array) as parameter. The functions of the outer ValueString struct would be simple (and therefore maybe inline-able) wrappers around the inner work functions ...

Note that proposal #749 is not limited to chars, bytes, strings, one-dimensional arrays, fixed arrays...

@Korporal
Author

Korporal commented Jan 6, 2019

@Korporal

Check this proposal, which would add int parameter(s) to generics: #749

If that proposal gets implemented, you could do something like this:

struct ValueString<CH, const int SZ> {
  fixed CH chars[SZ]; // fixed size inline array with SZ elements of type CH
  // misc functions, properties, operators, ...
}

ValueString<char, 16> MyStringU16; // fixed size string-like value type with 16 Unicode chars
ValueString<byte, 64> MyStringA64; // fixed size string-like value type with 64 Ansi/Ascii chars

To reduce code bloat, the functions of ValueString could be implemented in an inner private empty (field-less) static class / struct. These inner helper functions would take a Span<> (which contains size and address of the fixed array) as parameter. The functions of the outer ValueString struct would be simple (and therefore maybe inline-able) wrappers around the inner work functions ...

Note, that proposal 749 is not limited to chars, bytes, strings, one-dimensional arrays, fixed arrays ...

@MillKaDe - Good lord, how did I miss that (I think someone else mentioned it and I glossed over it - inexcusable).

You are absolutely right; that is exactly what's called for. I think this would work for me - very glad you mentioned this!

Thanks

@CyrusNajmabadi
Member

If you think that LoginMessage2 is "better" than LoginMessage then we're at an impasse and I cannot force you to adjust your view.

I definitely think it's better. It's something you can do today. It uses the well-supported and understood 'Span/ReadOnlySpan' types. It's really simple (though it does require some boilerplate in a few places). It will interoperate with the rest of the high-perf, low-overhead side of C#/.NET.

Creating something new for this niche case seems pretty objectively worse. It would take years to get it. Would likely need an entirely new way of working with it. Would have to have a design around how it could work in the ref/span world, etc. etc.

@CyrusNajmabadi
Member

What I find interesting (and this is not a criticism of anyone, the team or the language) is that something that seems on the surface straightforward actually presents such big challenge.

You're proposing something that wants to introduce a very different programming model than the one C# has had since 1.0, while also interoperating seamlessly with 20+ years of existing APIs. That's non-trivial.

It's equivalent to me coming to Rust and asking it to have a totally different ownership model than it has today, or going to C++ and wanting lexical scoping to work entirely differently. It may be 'something that seems on the surface straightforward', but it only seems that way because it ignores the deep design decisions and history involved here.

@Korporal
Author

Korporal commented Jan 7, 2019

@CyrusNajmabadi - All I can say in response to your most recent remarks is that it seems to me you've ultimately designed yourselves into a corner. If inline string data types cannot be supported (and this is a rather trivial concept - just look at strings in Pascal or PL/1) without the Herculean effort you claim, then that has to tell us something about how you've all designed this.

I can see now why you've been so critical: it's not that what I asked for is some huge piece of functionality; it's that your design and model are too restrictive, too inflexible.

@YairHalberstadt
Contributor

@Korporal

If inline string data types cannot be supported (and this is a rather trivial concept just look at strings in Pascal or PL/1) without the Herculean effort you claim, then that has to tell us something about how you've all designed this.

Indeed it does. It tells us that C# is a safe, garbage collected language without an ownership model.

You might as well say that if Prolog cannot support object-oriented programming without Herculean effort, then that has to tell us something about how they've designed it.

This is how the .Net programming model works. End of story. If you need to do something the programming model doesn't support, use a different language.

@Korporal
Author

Korporal commented Jan 7, 2019

@YairHalberstadt @CyrusNajmabadi - These analogies don't really help, nor do I regard them as valid, to be frank. Creating a supposed analogy (make Prolog more OO, or change the scoping rules in C++) and then discrediting the analogy is referred to as a straw-man argument in philosophy and logic; it has no place in a serious technical discussion.

@HaloFour
Contributor

HaloFour commented Jan 7, 2019

@Korporal

If inline string data types cannot be supported

Span<char> inline = stackalloc char[100];

And it seems that there may be interest in treating fixed buffers as spans, which eliminates some boilerplate as you can use them in an expanding ecosystem of APIs.

These analogies don't really help nor do I regard them as valid to be frank.

Every language is a tradeoff of different philosophical concerns. Languages that allow arbitrary stack allocation and reinterpretation are inherently much less safe than C#, especially if they don't have an ownership model. C# and the CLR never have to concern themselves with whether or not the memory backing a string has gone out of scope. This is why the compiler is so strict when it comes to ref locals/returns.
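
To make the stackalloc line above concrete, a small illustrative example (the method and values are made up):

using System;

public static class SpanScratch
{
    public static string Greet(ReadOnlySpan<char> name)
    {
        // Fixed-capacity scratch buffer on the stack; nothing touches the
        // heap until a string is materialized at the boundary.
        // (Assumes name fits within the remaining capacity.)
        Span<char> inline = stackalloc char[100];
        "Hello, ".AsSpan().CopyTo(inline);
        name.CopyTo(inline.Slice(7));
        return new string(inline.Slice(0, 7 + name.Length));
    }
}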

@YairHalberstadt
Contributor

It is not a straw-man argument. You're arguing that something which is easy in a language with a completely different programming model is difficult in C#, and hence that C# is badly designed.

We're pointing out that this is an obviously nonsensical argument, and giving some examples of the sort of nonsense conclusions you would come to if you applied it.

@Korporal
Author

Korporal commented Jan 7, 2019

@Korporal

If inline string data types cannot be supported

Span<char> inline = stackalloc char[100];

And it seems that there may be interest in treating fixed buffers as spans, which eliminates some boilerplate as you can use them in an expanding ecosystem of APIs.

These analogies don't really help nor do I regard them as valid to be frank.

Every language is a tradeoff of different philosophical concerns. Languages that allow arbitrary stack allocation and reinterpretation are inherently much less safe than C#, especially if they don't have an ownership model. C# and the CLR never has to concern itself with whether or not the memory backing a string has gone out of scope. This is why the compiler is so strict when it comes to ref locals/returns.

Clearly there is no prospect of getting what I sought, and that's fine; if the experts see this as a huge challenge then I respect that. But I never asked for arbitrary stack allocation or reinterpretation! What I did seek was a mutable, fixed-capacity, value-type string-like type that could be used in struct fields in much the same way as primitive types or fixed buffers.

@Korporal
Author

Korporal commented Jan 7, 2019

It is not a strawman argument. You're arguing that something which is easy in a language with a completely different programming model is difficult in C#. Hence C# is badly designed.

We're pointing at that this is an obviously nonsensical argument, and giving some examples of the sort of nonsense conclusions you would come to if you applied that argument.

This is getting silly; I never said anywhere that C# was badly designed. That is a false statement, and I'm finished with this issue now. Thank you all.

@YairHalberstadt
Contributor

@Korporal

All I can say in response to your most recent remarks is that it seems to me you've ultimately designed yourselves into a corner. If inline string data types cannot be supported (and this is a rather trivial concept just look at strings in Pascal or PL/1) without the Herculean effort you claim, then that has to tell us something about how you've all designed this.

I can see now why you've been so critical, it's not that what I asked for is some huge piece of functionality, it's because your design and model is too restrictive, too inflexible.

@Korporal
Author

Korporal commented Jan 7, 2019

@YairHalberstadt - So quote me next time rather than paraphrasing me; my remarks are not logically equivalent to "C# is badly designed", else I would have said that.

@YairHalberstadt
Contributor

Ok. I'm sorry for paraphrasing you. Let's leave it here shall we?

@CyrusNajmabadi
Member

All I can say in response to your most recent remarks is that it seems to me you've ultimately designed yourselves into a corner

Yes. In the same way you would be designed into a corner if you wanted to do something against the rust ownership model, or the C++ lexical scoping model.

@Korporal
Author

Korporal commented Jan 7, 2019

All I can say in response to your most recent remarks is that it seems to me you've ultimately designed yourselves into a corner

Yes. In the same way you would be designed into a corner if you wanted to do something against the rust ownership model, or the C++ lexical scoping model.

@CyrusNajmabadi - I was done with this thread, yet you persist in arguing? Do you really want me to respond to this remark? I disagree with you, but let's please drop this now.

Thanks.

@CyrusNajmabadi
Member

Creating a supposed analogy (make Prolog more OO or change the scoping rules in C++) and then discrediting the analogy is referred to as a strawman argument in philosophy and logic, it has no place in a serious technical discussion.

I'm pointing out that you're asking to change something very intrinsic to the language. And yes, in that case, it's non-trivial to ask for that. I hoped, by way of example, to make that clearer to you.

@CyrusNajmabadi
Member

I disagree with you

Could you explain what you disagree with? My point still stands and is valid. It is non-trivial to change something fundamental to how the language was designed. I hoped, by way of analogy, to help make that more understandable. I can go more in depth about this specific issue if you want. But it seems like we're somewhat at an impasse there as well.

@CyrusNajmabadi
Member

If inline string data types cannot be supported (and this is a rather trivial concept just look at strings in Pascal or PL/1) without the Herculean effort you claim, then that has to tell us something about how you've all designed this.

And there's lots of stuff that is trivial in some languages but would be hard to express in Pascal or PL/1. What's your point? All languages make tradeoffs based on the things they find most and least valuable. It's trivial for me to do some things in C# that are much harder in other languages. And the same is true for me with Rust, Python, Go, and TypeScript (languages I use on a regular basis). All this tells you is that, in terms of design, the activity you are trying to get language support for is not something the language designers thought was important enough to support directly. And that's something I've been telling you since just a few posts into this issue. Your use case is hugely niche. It's not at all common, and it's already supported in a manner that is felt to be "good enough" by what the language has already shipped.

@Korporal
Author

Korporal commented Jan 7, 2019

@CyrusNajmabadi - Clearly this is not a capability that is going to get any support so there's little value in discussing the issue further. We could discuss the process we used to discuss the issue if you want, but that really isn't something I'd expect to do in an issue thread like this one.

@CyrusNajmabadi
Member

What I did seek was a value type mutable fixed capacity string like type which could be used in struct fields

You can have that today. It's called a fixed-size buffer. The main problem is that, when presented with this option, you rejected it. So you want something more than what you specified above; namely, a fixed-size buffer with a lot of convenience in the language that helps you avoid writing the code to work with it.

In other words, you just want things to be more pleasant. You're not asking for something that is not feasible today. This is certainly something you can want. But then the onus is on you to properly explain why this is so important a need. For example, there were lots of tangents about performance. However, your language proposal doesn't help with performance; it would boil down to just the same code that I showed above. So, really, it would be about just making things a little more pleasant. And, frankly, that's a hard thing to sell, since you'd be asking to make an utterly niche area of the language nicer for an incredibly tiny group.

@CyrusNajmabadi
Member

Clearly this is not a capability that is going to get any support so there's little value in discussing the issue further.

The primary issue here is that you haven't really been listening to relevant feedback and advice on how best to get a language change in. The most critical thing to realize is:

  1. Nothing gets into the language without a 'champion'.

That's really it. At the end of the day, nothing else really matters. So, if you want such a change to C#, the goal is to be able to get someone to 'champion' what you want (or something close enough). That means that the best thing you can do for the things you want is make a cohesive and compelling argument that is factual, accurate, and convincing as to why this is an exceptionally worthwhile thing to do.

The feedback you've been getting has been meant to help you understand why your current argument isn't there yet. For example, the tangents about perf aren't accurate (as you can get the perf you want today). The same holds for any sort of argument that implies that this sort of thing isn't possible today. Finally, little (if any) effort has been spent explaining why it would be so valuable to save a few lines of code (that you can write today) as to warrant a full-fledged language feature here.

So, effectively, the feature is a non-starter - not because it's not a good idea, but entirely because the issue does not put in the necessary effort in the right ways to convince someone to champion it.

Cheers, and I hope this helps!

@CyrusNajmabadi
Member

Note: "painted yourself into a corner" generally has a negative connotation. The idiom conveys the idea that you're now at the point where any direction you go you'll invariably have problems because you are going to walk through paint and make a mess. In other words, it directly implies that you did not think about what you were doing as you went along and now are in a situation you need to extricate yourself from.

@Korporal
Author

Korporal commented Jan 7, 2019

@CyrusNajmabadi - I disagree with some of the claims you make in your latest four posts, but as I say this isn't the place to discuss the discussion process itself. I find your tone a little rude and impatient, and frankly a discouragement to open, informal technical discussion. I've stated several times very recently in this thread that I recognize that what I asked people to consider is - in their informed opinion - very unlikely to ever see the light of day; I've acknowledged that and am quite prepared to cease further discussion, yet you persist.

@CyrusNajmabadi
Member

My advice was given in the spirit of making you as successful as possible at getting improvements to the language that would help you out. I tend to try to steer things in that direction, and I often attempt to move things away from over-focusing on areas that work against that. Having done this a long time, my goal is both to have things done as efficiently as possible and to avoid the things that often derail a proposal entirely.

I'm sorry you took my feedback negatively. However, I do recommend you keep a lot of what I mentioned in mind for the future.

Note: if your goal is simply to discuss things, I highly recommend gitter.im/dotnet/csharplang as a better venue. GitHub itself is a place for seriously moving proposals along to real language features. And if that is your goal, then a lot of what I was talking about and focusing on is very relevant.

Cheers!

@Korporal
Author

Korporal commented Jan 7, 2019

@CyrusNajmabadi - Again your condescending tone is evident:

However, i do recommend you keep a lot of what i mentioned in mind for the future.

I too recommend that you do the same.

I'm sorry you took my feedback negatively.

It isn't feedback that I take negatively; it is condescension, inaccuracies, false statements, inappropriate analogies, and misleading paraphrasing.

@Korporal Korporal closed this as completed Jan 7, 2019
@CyrusNajmabadi
Member

I will indeed work to make it clearer what I am attempting to help address, and what steering on an issue will help give it the highest chance of success. Thanks!
