[superseded] ref syntax for lvalue expressions: `byRef: myref=x[1].foo[2]` #11824

timotheecour · 2019-07-25T01:41:16Z

this PR introduces a ref syntax for lvalue expressions, similar to what you can do in C++ with auto& a = someExpr()

compared to #11686 and prior attempts, there are 4 advantages:

better syntax: byRef: foo = expr vs alias(foo, expr)
side effect safe: the lvalue expression is evaluated only once, at declaration time, so any side effect will be evaluated only once. This avoids the pitfall mentioned here Alias sugar #11686 (comment)
type safe: CT errors are caught at declaration time, not on 1st use
allows export

See example 2 vs example 3 below for the comparison

By design, it only works with lvalues and will give a CT error on non-lvalues (for which ref makes no sense).
It is a pure library solution.
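
As a rough illustration of how a library-only approach along these lines could work, here is a minimal sketch. It is not the code from this PR: the macro name `byRefSketch` and the temporary `tmp` are assumptions made for the example. The idea is to take the address of the lvalue once, at declaration time, and expose it through a template that dereferences it on every use.

```nim
import std/macros

macro byRefSketch(def: untyped): untyped =
  ## Hypothetical sketch: expects a body of the form `name = lvalueExpr`.
  expectKind(def, nnkStmtList)
  expectLen(def, 1)
  let asgn = def[0]
  expectKind(asgn, nnkAsgn)
  let
    name = asgn[0]  # identifier to introduce
    rhs = asgn[1]   # lvalue expression, evaluated once below
  result = quote do:
    # `tmp` is made hygienic by `quote`, so it cannot clash with user symbols
    let tmp = addr(`rhs`)
    # every use of `name` expands to a dereference of the stored address
    template `name`: untyped = tmp[]

var count = 0
proc identity(a: int): int =
  inc count  # side effect, to observe how often the expression is evaluated
  a

var x = @[1, 2, 3]
byRefSketch: x1 = x[identity(1)]
x1 += 10
x1 += 100
doAssert x == @[1, 112, 3]
doAssert count == 1  # the lvalue expression ran only at declaration time
```

Because `addr` only accepts mutable locations, a sketch like this would reject non-lvalues and immutable values at compile time, which matches the behaviour described above.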

example 1: simple example

import std/macros
var x = @[1,2,3]
byRef: x1=x[1]
x1+=10
doAssert type(x1) is int
doAssert x == @[1,12,3]

example 2 illustrating side effect safety, type safety

  import std/macros
  var count = 0
  proc identity(a: int): auto =
    block: count.inc; a # introduces a side effect (or an expensive computation etc)
  var x = @[1,2,3]
  byRef: x1=x[identity(1)] # the lvalue expression is evaluated only here
  doAssert count == 1 
  x1 += 10
  doAssert type(x1) is int # use x1 just like a normal variable
  doAssert x == @[1,12,3]
  x1 += 100
  doAssert count == 1 # count has not changed
  # byRef: x2=y[0] # correctly gives CT error: undeclared identifier: 'y'

example 3: comparison vs #11686

  var count = 0
  proc identity(a: int): auto =
    block: count.inc; a
  var x = @[1,2,3]
  alias(x1, x[identity(1)])
  doAssert count == 0
  x1 += 10
  doAssert count == 1 # count has changed
  x1 += 100
  doAssert count == 2 # count has changed again => bad
  doAssert x == @[1,2+10+100,3]
  alias(x2, y[identity(1)]) # doesn't give CT error even though `y` undefined

note

see also [superseded] alias: myecho=echo to alias any symbol #11822 which defines an alias syntax that works with symbols instead of lvalues, it's a different concept
a byPtr is analogously defined in this PR for cases where unsafeAddr is needed instead of addr, eg for unsafe access to non-var params; it is useful in some situations

[EDIT]

this PR provides a safer pattern than directly calling addr or unsafeAddr since the address doesn't escape the scope where it's defined
also allows export, eg: byRef: foo*=bar

ghost · 2019-07-25T06:44:49Z

These seem out of place in macros. sugar would be more fitting.

zah · 2019-07-25T11:03:12Z

There is a way to implement this in the compiler while preserving memory-safety. It would be a larger reform that would also allow a more relaxed usage of the var, lent and openarray types as fields in objects, as long as they appear only in stack locations. Our code in Nimbus has numerous occasions where having such a support will bring significant optimisations and I plan to write a detailed RFC about it as time permits. The proposal will be somewhat inspired by Herb Sutter's plans for introducing lifetime analysis in C++.

Araq · 2019-07-25T11:55:21Z

There is a way to implement this in the compiler while preserving memory-safety. It would be a larger reform that would also allow a more relaxed usage of the var, lent and openarray types as fields in objects, as long as they appear only in stack locations.

Just to ensure we're on the same page: "stack location" is the wrong idea for this, there must be some kind of borrowing going on. I propose to re-use the "location is derived from the first parameter" idea that we use for var T return types, we can expand it via from later but so far we've done well without from.

zah · 2019-07-25T14:17:00Z

I'll try to clarify:

Let's use the term "referenced object" for an object owning memory that can be pointed to by a var/lent/openarray pointer. When I say "stack location", I'm referring mostly to the lifetime aspects. To ensure safety, you need to know that the lifetime of the referenced object is strictly enclosing the lifetime of the pointer. This is ensured when the pointer is a part of a newly created stack variable, because the lexical scopes ensure that this stack variable will be destroyed before the referenced object is destroyed.

The pointed memory itself can live on the heap (as it would be the case when you obtain an openarray view into a sequence). Most of the complications of the feature come from such containers that offer manual control over the referenced memory (i.e. you can call setLen on a sequence before it goes out of scope and this will invalidate all existing pointers pointing to it).

In the RFC, I'll explain how there is an isomorphism between the new rules and the current language features: any code written with the relaxed rules can already be written by extracting the continuation of a proc at a given line into a separate proc where the lent, var and openarray pointers appear as parameters. Thus, to ensure memory safety, we need to implement the required lifetime analysis anyway.
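
The equivalence described above can be illustrated with the rules Nim already has. This is a minimal sketch, and the proc name `rest` is hypothetical: the code that would follow a reference declaration is moved into a separate proc that receives the reference as a `var` parameter.

```nim
# Everything that would come after `byRef: x1 = x[1]` becomes the body
# of a proc taking `x1` as a `var` parameter.
proc rest(x1: var int) =
  x1 += 10
  x1 += 100

var x = @[1, 2, 3]
rest(x[1])  # the lvalue expression is evaluated once, at the call site
doAssert x == @[1, 112, 3]
```

The `var` parameter cannot outlive the call, which is the lifetime property the comment above attributes to pointers stored in stack variables.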

dom96

better syntax: byRef: foo = expr vs alias(foo, expr)

I prefer the latter. My favourite would be:

alias:
  foo = expr

I guess you've decided to use byRef because you wanted to implement byPtr too, but AFAIK that separation is unnecessary.

timotheecour · 2019-07-25T22:32:36Z

@dumjyl

These seem out of place in macros. sugar would be more fitting.

done.

PTAL, all comments addressed, except for @zah's comment (#11824 (comment)); my understanding is this would be a much more involved change, and that if it materializes it can be done in a backward compatible way to support more things inside byRef

timotheecour · 2019-08-23T17:34:06Z

@dom96 this PR is still marked as "changes requested"; I believe I've addressed all comments so far. In general the "changes requested" option is overkill compared to the plain "leave a comment" option, as it tends to stay there even after the comments were addressed

Araq · 2019-08-23T20:56:47Z

IMO the only thing left is bikeshedding about the names and whether it shouldn't be a Nimble module first, before we add hardly proven "syntax sugar". I quite like it though and maybe the chances we got it wrong are slim?

timotheecour · 2020-01-12T02:00:27Z

/cc @Araq ping

I've rebased to latest devel
added since + changelog
moved back to tsugar the test that was moved from distinctBase type trait for distinct types #13031 by @cdome (it had been moved because tsugar was too small, but now i'm adding more to tsugar so makes sense to add it back)
using a separate nimble module would just cause too much friction (we'd end up with sugar.byRef vs somepkg.byRef, for more harm than benefit); plus it'd prevent its use in compiler sources etc.

Araq · 2020-01-12T12:35:09Z

Time moved on since then and I don't like it much anymore. Firstly the name should not contain ref, ref is already a heap pointer in Nim! Secondly, byPtr vs byRef is the wrong idea, the macro enables a safer idiom because the underlying pointer cannot escape and so unsafeAddr should be fine. Why not drop the distinction and use the name alias for it?

timotheecour · 2020-01-13T01:31:47Z

it's not just about escaping: the point is to use byRef in most cases and only byPtr when needed, for the same reasons as the split addr vs unsafeAddr; not only does it help when auditing code, but it is safer too, as byRef prevents you from modifying let variables, unlike byPtr; see the example I have in the unittest:

block byPtrfBlock:
  type Foo = object
    x: string
  proc fun(a: Foo): auto =
    doAssert not compiles (block: byRef: x=a.x) # good, byRef prevents you from modifying immutable variable `a`
    byPtr: x=a.x # byPtr is more dangerous, allows you to modify immutable `a`
    x[0]='X'
  let foo = Foo(x: "asdf")
  fun(foo)
  doAssert foo.x == "Xsdf"

also, alias is different, reserved for future introduction of symbol alias as was already discussed elsewhere.

Araq · 2020-01-13T09:15:14Z

Fair enough but then the unsafeAddr variant doesn't need macro sugar, if it comes up, write the unsafeAddr out.

timotheecour · 2020-01-13T10:17:28Z

the situation is exactly symmetrical for byRef (via addr) vs byPtr (via unsafeAddr). With the byPtr macro you don't need a deref on each access:

byPtr: x = foo.bar[2].baz
x+=12
fun(x)
echo x

if you just use unsafeAddr, you need a deref on each access:

let x = foo.bar[2].baz.unsafeAddr
x[]+=12
fun(x[])
echo x[]

Araq · 2020-01-13T16:49:02Z

Yes, I know. unsafeAddr comes up rarely though and doesn't deserve sugar. Look, the macro encodes a safe idiom of addr. That's the only reason I like it so much, it adds something on top of addr, improved safety. The variant that uses unsafeAddr does no such thing.

timotheecour · 2020-01-13T18:39:02Z

PTAL; removed byPtr from this PR to unblock this (I hear your arguments and it's a sugar vs bloat tradeoff)

Araq · 2020-01-13T21:09:47Z

Well now it needs a good name, byRef is confusing since it has nothing to do with Nim's ref keyword.

timotheecour · 2020-01-24T00:35:05Z

can whoever (anonymously) marked my PTAL comments as spam kindly let me know why they did so?

a PTAL (please take another look) comment is commonly used to indicate that all reviewer comments were addressed (or need reviewer input) and that the PR is ready again for the reviewer to take a look. It also makes it visually clear when looking at a PR whether it's pending on reviewer or PR author.

This practice is common both in tech companies as well as in open source, eg:

Araq · 2020-01-24T11:04:59Z

can whoever (anonymously) marked my PTAL comments as spam kindly let me know why they did so?

Be assured it wasn't anybody from the core team. :-)

timotheecour · 2020-01-29T17:51:09Z

ping @Araq

Araq · 2020-01-30T20:36:54Z

The feature is fine now except I think it's ugly. Not your fault, but I want something like alias x = y.

timotheecour · 2020-01-31T02:36:09Z

Not your fault, but I want something like alias x = y.

but that's impossible, the colon is needed; did you see my answer here #11824 (comment) ?

byAddr x = expr is parsed as (byAddr x) = expr and changing that would require a parser change, eg a builtin pragma {.parse_as_var_definition.} which would parse byAddr x = expr instead as byAddr(x = expr)

macro byAddr*(def: untyped): untyped {.parse_as_var_definition.} = ...

But I don't see how something like that could be implemented robustly, as the parser would need access to semantic analysis when resolving something like foo x = expr, to check whether foo is annotated with {.parse_as_var_definition.}; I don't see how that could work with symbol aliasing/renaming (from sugar import byAddr as byAddr2), or with a fully qualified sugar.byAddr2, etc.
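
The parse described above can be inspected from a macro. The following is a small sketch in which the helper name `shapeOf` is hypothetical: it reports the node kind of the statement it receives and of that statement's first child, without ever resolving `byAddr`, `x` or `y`.

```nim
import std/macros

macro shapeOf(body: untyped): untyped =
  ## Hypothetical helper: returns the node kinds of the first statement
  ## in `body` and of its left-most child, as a string.
  let stmt = body[0]
  result = newLit($stmt.kind & "(" & $stmt[0].kind & ", ...)")

# `byAddr`, `x` and `y` are never looked up: the body is `untyped`.
let shape = shapeOf:
  byAddr x = y

echo shape
```

If the statement is parsed as stated in the comment above, the outer node is an assignment whose left-hand side is the command `byAddr x`, so a macro named `byAddr` never receives the whole definition.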

And I don't think we should introduce a new builtin keyword for that feature, which is just a library thing, because other similar constructs that may be needed (eg byUnsafeAddr as mentioned above) could be similarly defined (in a nimble package / user code / a future stdlib addition).

I think byAddr: x = expr is bearable, I got used to it easily

Araq · 2020-01-31T07:41:16Z

Well we can add a keyword to the language. Or maybe we can/should do let x {.byAddr.} = y

timotheecour · 2020-02-06T10:02:51Z

Well we can add a keyword to the language. Or maybe we can/should do let x {.byAddr.} = y

let x {.byAddr.} = y

that's more ugly, longer to type, and is confusing (given that you can write x+=2 even though it's shown as a let)
given that I expect to use this feature a lot (thanks to its performance advantages over let x = y; but also all other points mentioned in top post), we need a clean syntax.

so I opted for a new keyword:

byaddr x = y

=> see #13342

will close this PR if other PR is greenlighted

timotheecour · 2020-03-23T22:54:36Z

superseded by #13508
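
For reference, the timeline of this PR lists #13508 as `var b {.byaddr.} = expr`. A minimal usage sketch of that spelling follows, assuming the macro is provided by the `std/decls` module (the module name is not stated in this thread):

```nim
import std/decls

var x = @[1, 2, 3]
var x1 {.byaddr.} = x[1]  # x1 refers to x[1]; no copy is made
x1 += 10
doAssert x == @[1, 12, 3]
```

This mirrors example 1 from the top post, with the pragma form in place of `byRef: x1=x[1]`.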

This was referenced Jul 25, 2019

[superseded] alias: myecho=echo to alias any symbol #11822

Closed

Alias sugar #11686

Closed

Araq reviewed Jul 25, 2019

dom96 requested changes Jul 25, 2019

timotheecour force-pushed the pr_byRef branch from 7d99e47 to e662217 Compare July 25, 2019 22:19

timotheecour force-pushed the pr_byRef branch 2 times, most recently from e8b060f to b437242 Compare July 30, 2019 18:34

timotheecour changed the title ~~nim now has a ref syntax for lvalue expressions: byRef: myref=x[1].foo[2]~~ [feature] ref syntax for lvalue expressions: byRef: myref=x[1].foo[2] Aug 2, 2019

timotheecour force-pushed the pr_byRef branch from b437242 to 3c51aec Compare September 8, 2019 20:55

timotheecour force-pushed the pr_byRef branch from 3c51aec to 0896919 Compare January 12, 2020 01:45

timotheecour closed this Jan 13, 2020

timotheecour reopened this Jan 13, 2020

timotheecour mentioned this pull request Jan 13, 2020

[CI] tnetdial flaky test #13132

Closed

This was referenced Jan 13, 2020

successX now correctly shows html output for nim doc, nim jsondoc; fix #13121 #13116

Merged

clarify spec/implementation for let: move or copy? #13140

Closed

timotheecour added 8 commits January 23, 2020 02:43

byRef with option to export

042fd91

fixup

cc37767

fixup

cda97f1

add changelog + since

a9cd644

removed byPtr

87d6cf9

byRef => byAddr

ad8a731

_

6b48b9d

fix comments

0e66b9a

timotheecour force-pushed the pr_byRef branch from 051a689 to 0e66b9a Compare January 23, 2020 10:43

timotheecour added 4 commits January 23, 2020 02:45

address comment

1d6b6bd

disallow export: byAddr: barx*=foo.bar.x

edf6960

now using explicit export in test

2bec2dc

improve splitDefinition

b1e338f

timotheecour mentioned this pull request Feb 6, 2020

[superseded] new syntax for lvalue references: byaddr x = expr #13342

Closed

timotheecour mentioned this pull request Feb 26, 2020

new syntax for lvalue references: var b {.byaddr.} = expr #13508

Merged

timotheecour changed the title ~~[feature] ref syntax for lvalue expressions: byRef: myref=x[1].foo[2]~~ [superseded] ref syntax for lvalue expressions: byRef: myref=x[1].foo[2] Mar 23, 2020

timotheecour closed this Mar 23, 2020

timotheecour deleted the pr_byRef branch March 23, 2020 22:54

timotheecour restored the pr_byRef branch March 23, 2020 22:54

timotheecour deleted the pr_byRef branch March 23, 2020 22:55

timotheecour mentioned this pull request Apr 13, 2020

fix #13848: make var result work with nim cpp #13959

Merged
