[FR] Autovivification #549

Closed
rdje opened this issue Feb 7, 2022 · 18 comments

Comments

@rdje

rdje commented Feb 7, 2022

The idea here is to be able to automatically build deeply nested data structures made of Arr and Hash for intermediate nodes.
The leaves could be anything of course (Any).

Assuming z is a variable, writing

   z.a[7]{A}{B}{C}[8]{mykey}="Hello World"

would automatically create all the intermediate Arrs and Hashes until reaching the final Hash, which stores the "Hello World" Str under the key 'mykey'.

Note that in the above example I am assuming that the things inside {...} are stringified.

No matter how it is actually done, the idea is to provide a way to create deeply nested data structures based on a few well known types (Arr and Hash here) by simply referencing the leaves starting from the top level variable and letting NGS figure out (DWIM) what the intermediates are (Arr, Hash or something else) and initializing them as appropriate.
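For readers who have not used the Perl behaviour described here, the following is a minimal sketch in Python rather than NGS. The `autoviv` helper is a hypothetical name, and the sketch covers only Hash-like nodes; Arr-like intermediate nodes are not handled.

```python
from collections import defaultdict


def autoviv():
    """Return a dict that creates a nested autoviv() dict for any missing key."""
    return defaultdict(autoviv)


z = autoviv()
# No intermediate node is declared in advance; each missing key is
# created the first time it is accessed on the way to the leaf.
z["a"]["A"]["B"]["C"]["mykey"] = "Hello World"

print(z["a"]["A"]["B"]["C"]["mykey"])
```

The assignment is written as if the whole path already existed, which is the core of the request.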

I used autovivification systematically when parsing files and storing the extracted information in deeply nested data structures made of Perl5 lists and hashes, without having to predefine and initialize all the intermediate nodes (list / hash).

I just relied on Perl5 to DWIM to help me with constructing my model view dynamically.

People will find their own use cases but the bottom line is to write things as if the referenced terminal leaves were already present, NGS would then fill the gap (DWIM) and build the route or path to reach those final or terminal leaves.

The way I see it, autovivification may be seen as a generalization of the example you gave me in a previous post, when I asked how to define a data/variable member of a type MyType. You answered by writing

t.anything = aValue, assuming t:MyType

Here 'anything' may literally be anything, from a simple identifier to an Arr element reference to a Hash element reference or even to a full-fledged autovivified path to a leaf node value.

@ilyash-b
Contributor

ilyash-b commented Feb 8, 2022

> The idea here is to be able to automatically build deeply nested data structures made of Arr and Hash for intermediate nodes.

Sounds fine. In my opinion it should not be the default behavior as it can hide errors when you try to access something that you think should exist.
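The concern about hidden errors can be illustrated with a small sketch in Python rather than NGS (`autoviv` is a hypothetical helper, and the `config` data is made up): reading through a misspelled key creates it instead of failing.

```python
from collections import defaultdict


def autoviv():
    """Return a dict that creates a nested autoviv() dict for any missing key."""
    return defaultdict(autoviv)


config = autoviv()
config["server"]["port"] = 8080

# Typo: "sever" instead of "server". A plain dict would raise KeyError here.
port = config["sever"]["port"]

typo_was_created = "sever" in config  # the misspelled key now exists
port_is_empty = len(port) == 0        # and the "value" is an empty node
print(typo_was_created, port_is_empty)
```

With always-on autovivification the mistake only surfaces later, when the empty node is used as if it were a port number.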

> The way I see it is that autovivification may be seen as a generalization

I understand why, but from my perspective it's very hard to see it the same way, because right now these are two special syntactic forms: E1.field_name = E2 and E1[E2] = E3.

Background

  • I am very reluctant to add new syntax in general. Let's put {key} aside for now?

Design alternative 1 for discussion:

z.a[7]{A}{B}{C}[8]{mykey}="Hello World"

would be

z.deep_set(7, 'A', 'B', 'C', 8, 'mykey', "Hello World")

What I like about this solution is that it does not involve any new syntax and should be relatively easy to implement. Up for discussion:

  • The whole idea of having this facility as a function/method
  • The name deep_set
  • Parameters - maybe more idiomatic (but less convenient?) z.deep_set([7, 'A', 'B', 'C', 8, 'mykey'], "Hello World")
  • Return value: z or "Hello World" (z would be more idiomatic)
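To make the proposal concrete, here is a rough sketch of such a function in Python rather than NGS. It is not the prototype from this thread. It assumes the list-of-steps parameter variant, assumes that an integer step means an Arr-like node and any other step means a Hash-like node, and returns the container.

```python
def deep_set(root, path, value):
    """Set value at path inside root, creating missing intermediate nodes.

    Assumptions of this sketch: an int step selects a list (Arr-like) node,
    any other step selects a dict (Hash-like) node, and lists are padded
    with None. Negative indexes and existing nodes of the wrong type are
    not handled.
    """
    if not path:
        raise ValueError("path must contain at least one step")

    def fetch(node, step):
        if isinstance(step, int):
            return node[step] if step < len(node) else None
        return node.get(step)

    def put(node, step, child):
        if isinstance(step, int):
            # Pad the list so that index `step` exists.
            while len(node) <= step:
                node.append(None)
        node[step] = child

    node = root
    # Pair each step with the following one: the following step decides
    # which kind of node has to be created at the current step.
    for step, next_step in zip(path, path[1:]):
        child = fetch(node, step)
        if child is None:
            child = [] if isinstance(next_step, int) else {}
            put(node, step, child)
        node = child
    put(node, path[-1], value)
    return root


z = deep_set({}, ["a", 7, "A", "B", "C", 8, "mykey"], "Hello World")
print(z["a"][7]["A"]["B"]["C"][8]["mykey"])
```

Deriving the node type from the next step keeps the call free of type annotations, at the cost of not being able to ask for a Hash with integer keys.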

@rdje, let us know what you think

@rdje
Author

rdje commented Feb 8, 2022 via email

@rdje
Author

rdje commented Feb 8, 2022 via email

@rdje
Author

rdje commented Feb 8, 2022 via email

@rdje
Author

rdje commented Feb 8, 2022 via email

@ilyash-b
Contributor

ilyash-b commented Feb 8, 2022

I see why you would want orthogonal (A) description of what to look at and (B) what to do with that (get/set).

There is an implementation problem with the references suggested above. Syntactically, assignments take one of these forms:

  • var_name = expr - implemented as built in
  • expr1.field_name = expr2 - calls .= method
  • expr1[expr2] = expr3 - calls []= method

There is no support for arbitrary expr1 = expr2. I also don't know at this point in time how to do it without a performance penalty. Another thing to figure out: the implementation of such a reference.
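As an analogy only, Python has the same split: plain variable assignment is handled by the interpreter, while the field and index forms are dispatched to methods. The sketch below is Python, not NGS; `Traced` is a hypothetical class, and `__setattr__` and `__setitem__` stand in for the `.=` and `[]=` methods mentioned above.

```python
class Traced:
    """Records which hook each assignment form reaches."""

    def __init__(self):
        # Bypass our own __setattr__ so the log itself is stored normally.
        object.__setattr__(self, "log", [])

    def __setattr__(self, name, value):
        self.log.append(("field", name, value))

    def __setitem__(self, key, value):
        self.log.append(("index", key, value))


t = Traced()
t.field_name = 1  # field form, dispatched to __setattr__
t[2] = 3          # index form, dispatched to __setitem__

# An arbitrary expression on the left-hand side is rejected by the parser.
try:
    compile("a + 2 = 3", "<example>", "exec")
    arbitrary_lhs_allowed = True
except SyntaxError:
    arbitrary_lhs_allowed = False

print(t.log, arbitrary_lhs_allowed)
```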

Additionally, NGS already has get() and set() methods and the newly suggested method would fit in.

@rdje
Author

rdje commented Feb 8, 2022 via email

@rdje
Author

rdje commented Feb 8, 2022 via email

@ilyash-b
Contributor

ilyash-b commented Feb 8, 2022

I'm still thinking about refinements. Considering additional methods in the get() and set() multimethods. That would be more consistent with the rest of the language:

  • get(x:Any, path:Arr)
  • set(x:Any, path:Arr, val:Any)

> An LHS means a memory location

It's better to see it as a value. It can be on the stack, for example, if it's the result of a computation. Think a + 2 = 3.

I totally understand having more thoughts with time. I frequently have that too. Feel free.

@ilyash-b
Contributor

ilyash-b commented Feb 8, 2022

... except that get(x:Any, path:Arr) collides with get(x:Any, dflt:Any) so need to think more
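The collision can be sketched with a toy multimethod registry in Python rather than NGS. Everything here is hypothetical except the two signatures: the body of the existing `get` is a stand-in, and the rule that the most recently defined matching method wins is an assumption of the sketch.

```python
methods = []  # (parameter types, implementation), in definition order


def define(types, impl):
    methods.append((types, impl))


def call(*args):
    # Scan from the most recently defined method back to the oldest;
    # the first one whose parameter types all match wins.
    for types, impl in reversed(methods):
        if len(types) == len(args) and all(
            isinstance(arg, typ) for arg, typ in zip(args, types)
        ):
            return impl(*args)
    raise TypeError("no matching method")


# Existing method get(x:Any, dflt:Any); stand-in body: x, or dflt if x is None.
define((object, object), lambda x, dflt: dflt if x is None else x)


# Proposed method get(x:Any, path:Arr): walk a path of keys/indexes.
def get_path(x, path):
    for step in path:
        x = x[step]
    return x


define((object, list), get_path)

# The caller means "use [] as the default", but an Arr is also an Any,
# so the path method matches first and the default is never applied.
result = call(None, [])
print(result)
```

Any call whose default value happens to be an Arr would silently change meaning, which is why the two signatures cannot share one multimethod as they stand.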

ilyash-b changed the title from "Thoughts about autovivification support" to "Autovivification support" on Feb 8, 2022
ilyash-b pinned this issue on Feb 8, 2022
@rdje
Author

rdje commented Feb 8, 2022 via email

@ilyash-b
Contributor

ilyash-b commented Feb 9, 2022

> a + 2 = 3

I was trying to show an assignment to an LHS that is not assignable.

> would be to detect whether or not expr1 is mutable

I currently have no idea how to do it, especially at parsing time. I wouldn't like to have a function call for each assignment. That's a performance penalty, which is heavy in the current implementation.

> _internal_hash.deep_set("varname", expr)

Nope. There are 3 possible different opcodes which are generated by assignment to a variable: OP_STORE_LOCAL, OP_STORE_UPVAR, and OP_STORE_GLOBAL. None of them is using "varname". They use indexes to avoid the penalty of runtime lookup.
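The point about indexes can be sketched as follows, in Python rather than NGS. This is not NGS's actual VM: only the opcode name comes from the comment above, and `compile_locals`, `run`, and the instruction layout are hypothetical.

```python
def compile_locals(names):
    """Compile time: assign each local variable name a fixed slot index."""
    return {name: index for index, name in enumerate(names)}


def run(code, n_locals):
    """Run time: a tiny loop that stores into slots by index only."""
    slots = [None] * n_locals
    for opcode, index, value in code:
        if opcode == "OP_STORE_LOCAL":
            slots[index] = value  # direct indexing, no name lookup
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return slots


slot_of = compile_locals(["x", "y"])
# "y = 42" and "x = 7" compile to instructions that carry slot indexes;
# the names "x" and "y" do not appear in the generated code at all.
code = [
    ("OP_STORE_LOCAL", slot_of["y"], 42),
    ("OP_STORE_LOCAL", slot_of["x"], 7),
]
slots = run(code, n_locals=len(slot_of))
print(slots)
```

Routing every variable assignment through a name-keyed call such as deep_set("varname", expr) would put the name lookup back into the run-time path.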

> expr1.field_name = expr2 ≡ expr1.deep_set("fieldname", expr2)

While the idea of having one generic implementation is understandable, it's probably not the best tradeoff with performance. If we already syntactically know what we are looking at, we should call the more specific function.

> but is it really the case in reality, I doubt ! :)

:)


If you would like, @rdje, I could walk you through the implementation so you could get some feeling in general and about the involved mechanisms specifically.

@ilyash-b
Contributor

ilyash-b commented Feb 9, 2022

I was talking about performance, but I wouldn't like to give the wrong impression. NGS has very few performance optimizations and is generally pretty slow. I'm just trying not to make it worse. The worst offender is probably the naive implementation of a method call: the bottom-to-top scanning algorithm and type checking.

@rdje
Author

rdje commented Feb 9, 2022 via email

@ilyash-b
Contributor

ilyash-b commented Feb 9, 2022

Requesting feedback on the early-stage experimental implementation: https://gist.github.com/ilyash-b/fd87050fd55e5d0de65a19cbee4b3ceb

@ilyash-b
Contributor

ilyash-b commented Feb 10, 2022

Moving discussion to #550

ilyash-b changed the title from "Autovivification support" to "[FR] Autovivification" on Feb 11, 2022
@ilyash-b
Contributor

ilyash-b commented Apr 9, 2022

@rdje, clarifying the plan: I would like to add something like this to the language only after feedback on real usage (of the prototype we have here).

@ilyash-b
Contributor

ilyash-b commented May 1, 2022

Closing. Please re-open when real-world usage feedback is available.

ilyash-b closed this as completed on May 1, 2022
ilyash-b unpinned this issue on May 1, 2022