-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow usage of the same attribute name in different branches within variant types #19
Comments
Keep in mind that discriminators can be grouped together, though sometimes you only want to share certain fields. |
What about: of Kind1, Kind2:
someAttribute: string |
And while we are at it, should it also be possible to have attributes of the same name but with a different type depending on the branch? This question was also raised in IRC and would improve this code for example, i.e. |
No.
No, it wouldn't. |
IRC discussion here (Starts at 12:38:06) |
Conclusion: Alternatively we can weaken the disjoint condition for 'case' in 'object'. case kind: NodeKind
of foo, bar, baz: children: seq[Node]
of foo: additonalFieldForFoo: int
of bar: onlyvalidForBar: float then it's much more obvious that 'children' is actually a shared field |
👍 for this variant |
Wouldn't this cause divergence in the case statement syntax for type variants and conditions though? Why not just allow if statements then? In addition, it still doesn't resolve the ambiguities created by an identifier being two types - What happens if |
@Varriount, there is no divergence in Araq's variant. |
@vegansk Ah, I see. However there is still the issue that we now have two different, subtle behaviors for case statements. Enacting this change means that case blocks accept duplicate labels iff they are in a type definition, but not when they are in regular code. That's what I meant by divergent behavior. This adds yet another piece of subtle complexity and inconsistency to Nim's syntax, which already has enough of those. I'd much rather allow duplIcate member names in each |
The limitation that two cases must either share all fields or none seems unreasonable. |
Allowing to do this would allow much nicer type definition and usage for types (in some cases) |
👍 to this and 👎 to @Araq's conclusion |
I thought @Araq's reply was completely unproductive in the first place. IIRC he seemed more pissed off because he didn't want to consider implementing it. But I'll let this rest, it's nice that this is finally getting attention. |
Dunno why you think that. My proposal would actually work, the other proposals are just nuts. I don't care how many thumbs down I get for it, you cannot vote for nonsense. |
@Araq What's the problem with the other proposal? If the fields with a shared name are required to also have the same type, then it accomplishes the same thing as your proposal, but with a more obvious syntax (at least to me). |
The proposal in the first post is how things are done in Haskell, and it frequently became useful: Entry = ref object
case kind
of File:
name: string
modified: datetime
size: int64
of Dir:
name: string
modified: datetime
of Device:
name: string |
Under my proposal this would become: Entry = ref object
case kind
of File, Dir, Device: name: string
of File, Dir:
modified: datetime
of File:
size: int64 which is clearer. IMO anyway. |
IMO the original proposal is better and more easily readable than @Araq 's proposal as over there I can easily visually parse which fields a particular branch has by just looking at that branch, whereas in Araq's proposal I have to look at the complete object to know that. |
Yeah and in my proposal you can easily see the "shared" aspect of the fields. |
Less code duplication wins in my book. So I have to vote for @Araq's proposal. Especially since I assume implementing it is significantly easier. |
Pro @Araq proposal:
Pro classic syntax:
So i feel that Araq proposal avoids redundancy (which is source of errors), but makes the type definition less readable. But all problems mentioned can be fixed with macroses. And while theoretically I can imagine various problems, practically it should be rarely an issue. The real definition in my archiver actually looks better with Araq style: Entry = ref object
case kind
of File,Dir:
name: string
modified: datetime
attr: uint32
group: uint32
of Dir:
files: seq[Entry]
of File:
size: uint64
crc32: uint32 So, my voice is to implement any proposal - this feature is important to have, while concrete syntax isn't of much issue. If users will be really unhappy with chosen syntax, alternative one may be added much later. |
I should mention that patty already supports a slightly more convenient syntax for defining variant types, and when this feature is available, it will be extended to support it, so it will translate variant Entry:
File(name: string, modified: datetime, size: int64)
Dir(name: string, modified: datetime)
Device(name: string) into whatever syntax is chosen, for instance type Entry = object
case kind
of File, Dir, Device: name: string
of File, Dir:
modified: datetime
of File:
size: int64 Whatever concrete syntax is chosen, one can always write a macro to improve on that, so I wouldn't worry too much about which syntax is best (both have advantages) |
The problem with the non-Araq syntax is that it is totally misleading, in ML/Rust/etc the |
First, both proposals are semantically equivalent and can be mechanically translated to each other (as @andreaferretti plans to implement in patty). You can put restrictions on top of it according to your taste. I worked only with Haskell, where data Entry = File {name::String, size:: Int, attr:: Int}
| Dir {name::String, attr:: Int, files:: [Entry]}
BTW, I just realized that people married to FP like me will anyway use patty and don't care about compiler syntax. And if classic syntax is really harder to deal, implementing @Araq syntax in compiler and classic syntax in library - will mean shifting one more complexity from compiler to library. |
How about this one? type Entry = object
kind: EntryKind
case kind
of File, Dir, Device:
name: string
case kind
of File, Dir:
modified: datetime
case kind
of File:
size: int64 It's a bit cumbersome, but I suspect codegen could be easier that way. Things like the GC traversal will just generate as many |
@Araq so is there an agreed design here? I remember you proposed using |
I prefer Araq's design. The pros of deduplication (which help prevent errors), clarity of shared fields, and potentially easier to implement outweighs the cons of slightly unfamiliar syntax IMO. It's too bad that it differs from regular |
@PMunch I don't think unfamilar syntax is the problem: one can use Deduplication might be nice, but on the other hand you'll easily forget to add all subbranches for the same value . Otherwise, it's just a matter of how often does one need to
I think the second usecase is way more useful and important: otherwise I need to always read the whole type to find out all possible fields for However that's something that need measure |
It does not work for my use case. I am parsing JSON-serialized data that looks like this:
The only way to properly represent this in nim objects is by using the same attribute name in different variants. To use your pattern the JSON data would have to look like this:
|
@petabyteboy That looks to me more like an issue for the json (de)serializer. |
The problem I keep thinking in the back of my head is "how are the fields actually positioned in memory" for case sections with common fields? Suppose we have this object: type Object* = object
case kind: Kind
of File:
modified: Date
size: int64
name: string
of Directory:
modified: Date
name: string
of Device:
uuid: UUID What is the underlying organization? Is it: union {
struct {
Date modified;
int64 size;
string name;
} u_File;
struct {
Date modified;
string name;
} u_Directory;
struct {
UUID uuid;
} u_Device;
}; Where we "just don't care" about the underlying layout? Or is it more like: union {
struct {
Date modified;
int64 size; // access prohibited whilst kind == Directory?
string name;
} u_File_Directory;
struct {
UUID uuid;
} u_Device;
}; where we make an attempt to keep the location of mutual fields unified? |
As long as it gets in, I don't mind which proposal is accepted. I think having macros for creating ADTs and pattern matching in the standard library would be nice. |
I agree, having either one would be better than nothing. Why not both, though? Opinions are divided and I can see why: the initial proposal seems more readable for simpler objects, while Araq's seems more readable (and more handy) for complex ones. As been said, one is more reliable to implement than the other, but from the user standpoint this discussion makes it seem like both are equally valid. Imo, user comfort should be a core principle in the production of any tool (and that seems to be true in Nim, in many cases), so this should weigh in the decision. So unless the hard option is actually unreasonably unreliable to implement, why not go for both? (Since Araq's proposal is easier, it could be implemented first, and then the other one could be implemented in a "will be ready when it's ready" kind of way.) |
I would love to have this feature and I expect I'll use variant types more often.
Either proposal is fine with me. Sure there are readability considerations on both sides. But the bigger question to me is how much additional custom code will I have to write to do normal conversions with each new variant type I add?
|
@hiteshjasani from Nim 1.2.0 your gist is wrong, it produces the expected result. |
Today I also hit the same issue as @petabyteboy when considering using object variants for TDLib JSON object deserialization - objects in TDLib scheme can share some fields under the same name and type, so that even Araq's proposal would've worked fine. Example of tdlib schema: //@class AuthenticationCodeType @description Provides information about the method by which an authentication code is delivered to the user
//@description An authentication code is delivered via a private Telegram message, which can be viewed in another client @length Length of the code
authenticationCodeTypeTelegramMessage length:int32 = AuthenticationCodeType;
//@description An authentication code is delivered via an SMS message to the specified phone number @length Length of the code
authenticationCodeTypeSms length:int32 = AuthenticationCodeType;
//@description An authentication code is delivered via a phone call to the specified phone number @length Length of the code
authenticationCodeTypeCall length:int32 = AuthenticationCodeType;
//@description An authentication code is delivered by an immediately cancelled call to the specified phone number. The number from which the call was made is the code @pattern Pattern of the phone number from which the call will be made
authenticationCodeTypeFlashCall pattern:string = AuthenticationCodeType;
//@description Information about the authentication code that was sent @phone_number A phone number that is being authenticated @type Describes the way the code was sent to the user @next_type Describes the way the next code will be sent to the user; may be null @timeout Timeout before the code should be re-sent, in seconds
authenticationCodeInfo phone_number:string type:AuthenticationCodeType next_type:AuthenticationCodeType timeout:int32 = AuthenticationCodeInfo;
In JSON that looks something like that: {
"@type": "authenticationCodeInfo",
"phone_number": "+123456789",
"type": {
"@type": "authenticationCodeTypeCall",
"length": 15
},
"next_type": {
"@type": "authenticationCodeTypeFlashCall",
"pattern": "somestring"
},
"timeout": 15
} I guess I'll have to generate a bit more sophisticated deserialization than just using |
@Clyybber has a good proposal: type
NodeKind = enum
foo
bar
bif
baz
bad
Node = object
case kind: NodeKind
of foo, bar, baz:
children: seq[Node]
of bad, bif:
dad: Node
case kind
of foo:
additionalFieldForFoo: int
of bar:
onlyValidForBar: float
else: discard
I like this because it's explicit, behaves as you might expect for a case |
Ironically, I prefer Araq's proposal (#19 (comment)) over mine :D |
@disruptek seems like this is the same as in #19 (comment) :) |
Is this going to happen? |
Yes. Eventually. |
I like the brevity of @Araq's version but c'mon, do we really want |
What do you think about @Araq's variant, which is unwrapped into @Bulat-Ziganshin variant, but it will raise an error if we there is the same field with different type? I started to look at the RFC, and it is clear that @Araq's version is more type-safe, but during the implementation I found that it is necessary:
After it I decided that it is probably a bit more natural (and easier) to build structs like it is done already, but to raise an error if type is not the same. It is not necessary to create and explain the special case for the type kinds, at least that coverage check is a bit different. Also, it would help to have a bit more clear mapping from Nim to C code. Entry = ref object
case kind
of File:
name: string
modified: datetime
size: int64
of Dir:
name: string
modified: time // errormsg: "type for the same field cannot be redefined"
of Device:
name: string Correct me if I am wrong - it is my first 60minutes in Nim's compiler code. |
|
Can we have both mine and Araq (#19 (comment)) proposals implemented? I bet his version would be useful for code generation and code clarity since it ensures that each field has its type written only once - it may be important for anonymous types. Also, can we allow both
As Pascal and ML languages shown, the second version is enough in most cases. And I just realized that we can't have guarantee fixed field position with any proposal - sometimes we just don't have layouts for all cases that's compatible for same-named fields. |
I fully support this. For me this is the biggest stumbling block with case variant objects and the reason I don't really use them. If this was fixed much of my code could be converted to them. |
type Object* = object
case kind: Kind
of File:
modified: Date
size: int64
name: string
of Directory:
modified: Date
name: string
uuid: UUID
of Device:
uuid: UUID The only solution I see is to have all available fields in the struct, and restrict access to separate fields, but I suppose it is not good idea and brakes the main idea of unions. It would help to save field during conversion from one kind to another, but it is prohibited. I think that the anonymous types is much longer story that they would have the same type-safe problem @Araq mentioned in the RFC until Nim has access to the fields(or the anonymous-types) without patter-matching guards. |
I agree with @petabyteboy:
So from the principle of least surprise, I would be in favor of the original proposal. That being said, I'd be fine with Araq's proposal, while I'm less enthusiastic about @Clyybber's proposal. |
I just recalled that Pascal allows to specify common fields BEFORE the case part, which is enough for most use-cases. So our example can be written as: Entry = ref object
name: string
modified: datetime
attr: uint32
group: uint32
case kind
of File:
size: uint64
crc32: uint32
of Dir:
files: seq[Entry] Moreover, Pascal also allows to make nested variable part, thus specifying common part for several variants Type
MyRec = Record
X : Longint;
Case byte of
2 : (Y : Longint;
case byte of
3 : (Z : Longint);
);
end; |
The more I used object variants the more I prefer @krux02 proposal. Eventually it would be nice to have sum types in the language. Note, @mratsim has already made an excellent module that emulates classes/interfaces with this exact concept. Now the only thing missing, is to make a patty fork that works with these proposal. |
#368 is the winning proposal. |
The following code snippet raises an error during compilation:
Error: redefinition of 'someAttribute'
.Since only a single branch, depending on
kind
, can be active at once, it would seem logical for this to work -- since no actual redefinition ofsomeAttribute
ever takes place. Currently I need to work around this by naming the fields for each branch differently and creating a custom umbrella accessor proc that has to check the object kind at runtime and return the correct field, which is associated with its own problems: nim-lang/Nim#3628.The text was updated successfully, but these errors were encountered: