[SUGGESTION] Named return values can possibly make `out` parameters redundant #626

nmsmith · 2023-07-13T01:36:42Z

Motivation

An out parameter works as follows:

The caller passes a pointer to a memory location.
The callee must treat the memory location as uninitialized, and must initialize it before the function returns.

This is a very useful pattern, but notably, this is very similar to how return values already work in most ABIs. (In most ABIs, whenever a return type requires more than a few registers, the caller must provide a pointer to the memory location that the return value should be written to.)

Consequently, I propose making a slight tweak to out parameters to unify them with the notion of return values. The end result would be that cpp2 has one fewer concept, without sacrificing expressive power or performance. In fact, it will likely make cpp2 programs more performant, because the proposed solution implies NRVO.

The proposal

In today's design, out parameters look like this:

foo: (out s1: std::string, out s2: std::string) = {
    s1 = "hello";
    s2 = "world";
}

main: () = {
    s1: std::string;
    s2: std::string;
    foo(s1, s2);
}

I am proposing to write them like this instead:

foo: () -> (s1: std::string, s2: std::string) = {
    s1 = "hello";
    s2 = "world";
}

main: () = {
    s1: std::string;
    s2: std::string;
    s1, s2 = foo();
}

The differences are:

out parameters are moved to the right side of the ->. In other words, they are treated as part of the return type.
The caller uses the syntax s1, s2 = foo(), instead of foo(s1, s2).

But notably, the compilation strategy remains the same:

Small return values are returned via registers.
Large return values are written directly to a memory location provided by the caller.

Basically, what we end up with is an intuitive syntax for guaranteed NRVO. Accordingly, it should be possible to return immovable types (e.g. std::atomic) via this approach. (Because in the style of C++17, we're not "optimizing away" a move. Instead, we're saying that no moves are required.) Ultimately, this is equivalent to out parameters—all we've done is change the syntax.
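To make the C++17 point concrete, here is a minimal sketch in today's C++ (not Cpp2, and not cppfront output). The type `pinned` and the function names are hypothetical, chosen only for illustration. Returning a prvalue of an immovable type compiles because C++17 requires no move at all, whereas returning a named local does not compile, because NRVO is still only a permitted optimization.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical immovable type: copying is deleted, so no move is declared either.
struct pinned {
    int value;
    explicit pinned(int v) : value{v} {}
    pinned(const pinned&) = delete;
    pinned& operator=(const pinned&) = delete;
};

// OK since C++17: the returned prvalue initializes the caller's object directly.
pinned make_pinned(int v) { return pinned{v}; }

// The same holds for std::atomic, which is neither copyable nor movable.
std::atomic<int> make_counter(int start) { return std::atomic<int>{start}; }

// Ill-formed today: returning a named local needs an accessible copy or move
// constructor, even if the compiler would elide the call (NRVO is optional).
// pinned make_named(int v) { pinned p{v}; return p; }

// Small self-check exercising both factories.
void elision_demo() {
    pinned p = make_pinned(7);                   // constructed in place, no move
    std::atomic<int> counter = make_counter(3);  // likewise
    assert(p.value == 7);
    assert(counter.load() == 3);
}
```

The commented-out `make_named` is the case the proposal would make well-formed: a named return variable that lives in caller-provided storage.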

The likely benefits of this approach include:

Better composability. With return values, it is possible to nest function calls (e.g. g(h(...), j(...))), but with out parameters (where functions return void) it is not.
Fewer concepts. We wouldn't need out parameters. Instead, we would just need to allow return values to be given names. (Indeed, cpp2 already has syntax for this.)
C++ would finally have the equivalent of "guaranteed NRVO"—a useful performance optimization.
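As a rough illustration of the composability point, here is the same pair of strings produced in today's C++ in both styles. This is a sketch under assumptions: the struct `greeting` and the helpers `greet_out`, `greet` and `join` are hypothetical names, and the struct only approximates what named return values would give.

```cpp
#include <cassert>
#include <string>

// Out-parameter style, spelled with references as today's C++ requires.
void greet_out(std::string& s1, std::string& s2) {
    s1 = "hello";
    s2 = "world";
}

// Return-value style: a small struct plays the role of named return values.
struct greeting {
    std::string s1;
    std::string s2;
};

greeting greet() { return greeting{"hello", "world"}; }

// A consumer of the two values.
std::string join(const greeting& g) { return g.s1 + " " + g.s2; }

// With out parameters the caller needs named temporaries before composing.
std::string joined_via_out() {
    std::string a;
    std::string b;
    greet_out(a, b);
    return join(greeting{a, b});
}

// With return values the calls nest directly.
std::string joined_via_return() { return join(greet()); }

// Both styles produce the same result; only the call-site shape differs.
void compose_demo() { assert(joined_via_out() == joined_via_return()); }
```

At a call site, `auto [s1, s2] = greet();` (structured bindings) is the closest existing C++ spelling of `s1, s2 = foo();`.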

JohelEGP · 2023-07-13T01:44:07Z

Cpp2 already has what you're suggesting:

So what would be left of your suggestion is: "remove out parameters to reduce concept count".
But I don't know if that's desirable, given that out represents "out parameters" (F.21).
Named (multiple) return values can't replace the "Exception" in F.21,
but out parameters can, so I doubt it's worth removing them.

EDIT: Actually, the above can be done with inout parameters.

JohelEGP · 2023-07-13T02:00:35Z

Actually, I don't think the current implementation of named return values can guarantee copy elision.
IIRC, it essentially lowers the named return values like out parameters, but on the callee.
So it really does assign to local variables, and doesn't return a prvalue.

Herb has previously stated his intention to support named parameters.
It might make sense to wait until then to be able to support copy elision on named return values.
Although that would still require reworking the current named return values feature.

nmsmith · 2023-07-13T02:11:10Z

Author

Yeah, the big difference between this proposal and the current semantics for named return values in cpp2 is that in the latter case, the return variables are not interpreted as references to a memory location provided by the caller. This is a missed opportunity, IMO.

But once you make that adjustment, out parameters seemingly become redundant. (Which is a good thing if true, because it would constitute a simplification.)

hsutter · 2023-07-13T15:10:15Z

Maintainer

Thanks! Note that Cpp2 does have two different features here (both of which Cpp1 also has, but here they're generalized with more language support and the ability to declare intent):

out parameters, which can accept an initialized or uninitialized argument (not just an uninitialized one). For an uninitialized argument, any function with an out argument acts as another (delegating) constructor. For an initialized argument, it assigns over the existing value.
Return values, which always produce a new value. These are always initialized by the callee, and always returned by value (if there are multiple return values they are in a generated struct returned by value).

They are similar but do have different use cases. I agree that more often you would just use return values, but out parameters are also useful and cover cases return values don't.
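The construct-or-assign behavior described for out parameters can be sketched in plain C++. This is only an illustration under assumptions: `out_slot` and `fill` are hypothetical names, and an empty `std::optional` stands in for an uninitialized variable. It is not cppfront's actual `cpp2::out` implementation.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for an `out` argument: an empty optional models an
// uninitialized variable, an engaged one models an initialized variable.
template <typename T>
class out_slot {
    std::optional<T>& target_;
    bool constructed_ = false;

public:
    explicit out_slot(std::optional<T>& target) : target_{target} {}

    // Construct when the target is uninitialized, otherwise assign over it.
    void set(T value) {
        if (target_.has_value()) {
            *target_ = std::move(value);
        } else {
            target_.emplace(std::move(value));
            constructed_ = true;
        }
    }

    bool constructed() const { return constructed_; }
};

// A function with an "out" argument; reports whether it acted as a constructor.
bool fill(std::optional<std::string>& s) {
    out_slot<std::string> out{s};
    out.set("hello");
    return out.constructed();
}

// Self-check: the same callee constructs a fresh target and assigns to an old one.
void out_slot_demo() {
    std::optional<std::string> fresh;
    std::optional<std::string> existing{"old"};
    assert(fill(fresh));
    assert(!fill(existing));
}
```

A plain return value has no equivalent of the assign branch inside the callee, which is one way to read the difference in use cases.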

Doing more for copy elision in the implementation of multiple return values is interesting though, good suggestion.

nmsmith · 2023-07-13T22:37:05Z

Author

@hsutter Thank you for your reply!

Your comment seems to mostly focus on describing how out parameters and return values work currently, and that description is consistent with my understanding. But you don't seem to have provided any reasoning as to why the two constructs cannot be merged into one.

Do you not believe it is possible to merge the two constructs into one, as I have proposed? It looks very possible to me.

jcanizales · 2023-07-13T23:54:34Z

I don't see what the difference in use cases is either.

In the caller I can assign the return value of a function to either an initialized variable or an uninitialized one, same as I can pass either as an out argument.

In the callee, the body has to produce a new value for each out parameter, same as for the returned one.

SebastianTroy · 2023-07-14T07:17:01Z

I believe the important aspect is that named return values have a fixed spot in memory, their scope starts and ends within the scope of the function call. Out parameters may be declared in another scope, and initialised with a function call within a smaller scope.

nmsmith · 2023-07-14T07:27:50Z

Author

named return values [...] their scope starts and ends within the scope of the function call. Out parameters may be declared in another scope.

I think you're mixing up the notions of variables and memory locations:

The scope of both NRVs and out parameters (the variables) is that of the callee's function body.
With guaranteed NRVO, the memory locations that NRVs and out parameters refer to are provided by the caller.

So there is no distinction there.

SebastianTroy · 2023-07-14T07:34:59Z

Surely the stack location where the out parameter is declared can be in a longer-lived scope than the lvalue created by the return. Won't this affect where the stack portion of the value's memory exists? Can't this prevent potential move or copy assignments?

nmsmith · 2023-07-14T07:45:37Z

Author

I'm not sure exactly what you're asking, but I can give you a blanket answer: a function that declares a named return value will behave exactly how a function that declares an out parameter will behave.

In my original post, I mentioned that what I am proposing is essentially just to adjust the syntax of out parameters such that they look like (and compose like) return values.

So by definition, the two features will behave the same.

realgdman · 2023-07-14T08:08:42Z

How about interoperability with c++1? Suppose I need to call some API function with out param.

nmsmith · 2023-07-14T08:16:06Z

Author

I'm not sure what you mean. C++1 doesn't have out parameters, it just has references. (References can be used to simulate out parameters, but the compiler can't tell the difference.) You can pass a reference to a C++1 function call the same as always.
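A small sketch of that point, with hypothetical function names: both functions below take `int&`, and nothing in the C++ type system says which one merely writes its argument and which one reads it first.

```cpp
#include <cassert>

// Intended as an out parameter: writes before any read.
void init_count(int& count) { count = 0; }

// Same parameter type, but reads the argument first (inout behavior).
void bump_count(int& count) { count = count + 1; }

// Passing an uninitialized variable is fine for the first call only.
int count_demo() {
    int n;          // uninitialized
    init_count(n);  // fine: the callee only writes
    bump_count(n);  // fine only because n has been initialized by now
    return n;
}

// Self-check on the result of the two calls.
void reference_demo() { assert(count_demo() == 1); }
```

Swapping the two calls in `count_demo` would still compile, but it would read an uninitialized `int`. That is the distinction Cpp2's `out` and `inout` make visible to the compiler.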

SebastianTroy · 2023-07-14T08:24:33Z

outerScopeString: string; // uninitialised
{
    stringInitFunc(outerScopeString); // via out param
}
// now have an initialised string in another scope with no move or copy

Vs

{
    lValueString = stringInitFunc(); // must move or copy this string elsewhere or it will go out of scope and destruct
}

These are different capabilities that may be important in non-trivial use cases.

nmsmith · 2023-07-14T08:31:35Z

Author

@SebastianTroy Irrespective of whether the function accepts an out parameter or returns a value, you can store the result in a variable declared in an outer scope:

outerScopeString: string; // uninitialised
{
    outerScopeString = stringInitFunc(); // via named return value
}

With NRVO, this doesn't require any copying or moving. So it behaves exactly the same as an out parameter.

JohelEGP · 2023-07-14T12:29:49Z

It seems like #123 touches upon this, too:

Out and inout (not referring to cppfront) parameters were born out of legacy requirements as far as I can tell. The problems out parameters are a solution for are much better solved with new features; structured bindings for multiple returns (and of course cppfront's solution for multiple returns) and return value optimization.

nmsmith · 2023-07-15T00:55:35Z

Author

@hsutter Yes, I think the code transformation would work roughly as you describe. But you wouldn't need to apply it to every cpp2 function definition and call. You'd only need to apply it to definitions that use named return values:

cpp2_func: () -> (w: widget) = {...}

And you're right: this implies that the cpp2 compiler needs to be able to tell which calls are to NRV functions.

That said, GCC and Clang have recently implemented guaranteed NRVO (following P2025), so if you're compiling your C++ code using those compilers, you can do a much simpler code transformation: declare the return variable in the first line of the function body, rather than in the signature:

// cpp2 code
cpp2_func: () -> (w: widget) = {
    w = foo();
}

// translated into cpp1, assuming NRVO is available
auto cpp2_func() -> widget {
    auto w = foo();
    return w;       // We also need to make sure that returns are explicit
}

This transformation is easier to implement, because it means that call sites don't need to be altered.

Of course, if you need this transformation to guarantee that no copies/moves are performed, P2025 (or something similar) would need to be standardized. Until then, the only transformation that is portable is the one you described.
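One way to observe what a given compiler does with the simpler transformation is to instrument `widget`. The sketch below is an assumption-laden illustration: this `widget` with its counters is hypothetical, and only the shape of `cpp2_func` follows the translation above. Under standard C++17 the copy count is zero, while the move count may be zero or one depending on whether NRVO is applied.

```cpp
#include <cassert>

// Hypothetical instrumented type that counts copy and move constructions.
struct widget {
    static inline int copies = 0;
    static inline int moves = 0;
    int id = 0;

    widget() = default;
    explicit widget(int i) : id{i} {}
    widget(const widget& other) : id{other.id} { ++copies; }
    widget(widget&& other) noexcept : id{other.id} { ++moves; }
};

// Returns a prvalue: elision is guaranteed since C++17.
widget foo() { return widget{42}; }

// Shape of the translated function: the return variable is a named local.
widget cpp2_func() {
    auto w = foo();  // guaranteed elision: w is initialized directly
    return w;        // NRVO is permitted but not required; may cost one move
}

// Self-check: no copies ever, and at most one move without guaranteed NRVO.
void nrvo_demo() {
    widget::copies = 0;
    widget::moves = 0;
    widget w = cpp2_func();
    assert(w.id == 42);
    assert(widget::copies == 0);
    assert(widget::moves <= 1);
}
```

If `widget` had no move constructor at all, `return w;` would be ill-formed today, which is why a portable guarantee needs either the call-site transformation or a standardized rule.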

JohelEGP · 2023-07-15T01:24:32Z

I think this is a more correct translation:

// translated into cpp1, assuming NRVO is available
struct cpp2_func__ret {
    widget w;
};
auto cpp2_func() -> cpp2_func__ret {
    auto res = cpp2_func__ret{foo()};
    return res;       // We also need to make sure that returns are explicit
}

Although to not require that all data members are initialized at the same time
(to not break today's use cases for the existing feature):

// cpp2 code
cpp2_func: (b: bool) -> (w: widget, v: widget) = {
    w = foo();
    stuff();
    if b {
        v = bar();
    } else {
        v = baz();
    }
}

// translated into cpp1, assuming NRVO is available
struct cpp2_func__ret {
    union { widget w; };
    union { widget v; };
};
auto cpp2_func(bool const& b) -> cpp2_func__ret {
    auto res = cpp2_func__ret{};
    res.w = foo();
    stuff();
    if (b) {
        res.v = bar();
    } else {
        res.v = baz();
    }
    return res;       // We also need to make sure that returns are explicit
}

Different from today is that those initializations are not runtime checked.

JohelEGP · 2023-07-15T11:28:09Z

FWIW, this works (https://compiler-explorer.com/z/oKEd86qfY):

#include "https://raw.githubusercontent.com/hsutter/cppfront/main/include/cpp2util.h"
#include <cassert>

namespace cpp2 {
template<typename T>
class out_member {
    T* t;

    //  Each out in a chain contains its own uncaught_count.
    int  uncaught_count   = Uncaught_exceptions();
    bool called_construct = false;

public:
    out_member(T* t_) noexcept : t{t_} { Default.expects(t); }

    //  In the case of an exception, if the parameter was uninitialized
    //  then leave it in the same state on exit (strong guarantee)
    ~out_member() {
        if (called_construct && uncaught_count != Uncaught_exceptions()) {
            std::destroy_at(t);
        }
    }

    auto construct(auto&& ...args) -> void {
        if constexpr (requires { std::construct_at(t, CPP2_FORWARD(args)...); }) {
            Default.expects( t );
            std::construct_at(t, CPP2_FORWARD(args)...);
        }
        else {
            Default.expects(!"attempted to copy assign, but copy assignment is not available");
        }
    }

    auto construct_list(auto&& ...args) -> void {
        if constexpr (requires { std::construct_at(t, T{CPP2_FORWARD(args)...}); }) {
            Default.expects( t );
            std::construct_at(t, T{CPP2_FORWARD(args)...});
        }
        else {
            Default.expects(!"attempted to copy assign, but copy assignment is not available");
        }
    }

    auto value() noexcept -> T& {
        Default.expects( t );
        return *t;
    }
};
}

struct widget {
  std::string x;
  ~widget() { }
};
widget foo() { return {"1"}; }
widget bar() { return {"2"}; }
widget baz() { return {"3"}; }
void stuff() { }

struct cpp2_func__ret;
// Reenable structured bindings support.
#include <tuple>
template<> struct std::tuple_size<cpp2_func__ret> : std::integral_constant<int, 2> { };
template<> struct std::tuple_element<0, cpp2_func__ret> : std::type_identity<widget> { };
template<> struct std::tuple_element<1, cpp2_func__ret> : std::type_identity<widget> { };
struct cpp2_func__ret { // No longer an aggregate.
    union { widget w; }; // Now an anonymous union member.
    union { widget v; };
private: // For `cpp2_func` access.
    friend auto cpp2_func(bool const& b) -> cpp2_func__ret;
    cpp2_func__ret() { }
private:
    template<class T> void move_construct(T&& that) noexcept(std::is_rvalue_reference_v<T&&>) {
      auto _w = cpp2::out_member<widget>{&w};
      auto _v = cpp2::out_member<widget>{&v};
      _w.construct(std::move(that).w);
      _v.construct(std::move(that).v);
    }
    template<class T> cpp2_func__ret& move_assign(T&& that) noexcept(std::is_rvalue_reference_v<T&&>) {
      w = std::move(that).w;
      v = std::move(that).v;
      return *this;
    }
public:
    cpp2_func__ret(const cpp2_func__ret& that) { move_construct(that); }
    cpp2_func__ret(cpp2_func__ret&& that) noexcept { move_construct(std::move(that)); }
    cpp2_func__ret& operator=(const cpp2_func__ret& that) { return move_assign(that); }
    cpp2_func__ret& operator=(cpp2_func__ret&& that) noexcept { return move_assign(std::move(that)); }
    ~cpp2_func__ret() {
      std::destroy_at(&v);
      std::destroy_at(&w);
    }
    void f() {
      auto[a,b] = *this; // Test that structured bindings works at type scope.
    }
    // Reenable structured bindings support.
    template<int I, class T> requires (I==0) friend auto get(T&& x) -> decltype((CPP2_FORWARD(x).w)) { return CPP2_FORWARD(x).w; }
    template<int I, class T> requires (I==1) friend auto get(T&& x) -> decltype((CPP2_FORWARD(x).v)) { return CPP2_FORWARD(x).v; }
};

auto cpp2_func(bool const& b) -> cpp2_func__ret {
    auto __res = cpp2_func__ret{};
    auto w = cpp2::out_member<widget>{&__res.w};
    auto v = cpp2::out_member<widget>{&__res.v};
    w.construct(foo());
    stuff();
    if (b) {
        v.construct(bar());
    } else {
        v.construct(baz());
    }
    return __res;
}

int main() {
  auto [w, v] = cpp2_func(true);
  assert(w.x == "1");
  assert(v.x == "2");
}

I don't know what else could be broken by switching to anonymous union members.

JohelEGP · 2023-07-15T12:07:15Z

Summarizing Herb's #540 (comment):

cpp2_func: () -> widget = { /*…*/ }
[…] the suggestion […] for cpp2_func to emit Cpp1 auto cpp2_func( widget& ) […]

Although the issue author's reply #540 (comment) is:

Yes, I think the code transformation would work roughly as you describe.

Because (N)RVO happens via the return value,
as the sample code that follows it suggests,
it should have been something like:

cpp2_func: () -> (w: widget) = { /*…*/ }
[…] the suggestion […] for cpp2_func to emit Cpp1
struct __ret_t { widget w; };
auto cpp2_func() -> __ret_t
[…]

FWIW, as a thought experiment, what if there were no return values at all and only out parameters were allowed, either as they are currently implemented or possibly with an updated implementation -- what would/wouldn't work in the use cases you have in mind?

operator=: (out this, …) would be unified with any other
f: (out x, …):

x: t = (…);
x    = (…);
x: t = f(…); // Now possible.
x    = f(…);

Whereas on the main branch:

// `operator=` remains the same.
x: t; // Can't initialize with `f(…)`.
f(out x);

jcanizales · 2023-07-17T15:40:14Z

what would/wouldn't work in the use cases you have in mind?

This one's easy. You can easily/naturally chain function calls when f(x) represents the output values (g(f(x))), but not when they're written result: T; f(x, out result)

A follow-up that's pertinent to the multiple named return values case, is that it opens the language syntax to the possibility of splatting when chaining function calls. This is something other languages have had for a while (e.g. in Julia: f(g(x)...) will fill up multiple parameters of f correctly if g returns a named tuple).
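For reference, today's C++ can approximate splatting with `std::apply`, which unpacks a tuple into a parameter list. The functions `divmod`, `recombine` and `splat_demo` below are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical producer returning two values as a tuple.
std::tuple<int, int> divmod(int n, int d) { return {n / d, n % d}; }

// Hypothetical consumer taking two separate parameters.
int recombine(int quotient, int remainder) { return quotient * 10 + remainder; }

// std::apply unpacks the tuple into recombine's parameter list,
// roughly what Julia spells f(g(x)...).
int splat_demo(int n) { return std::apply(recombine, divmod(n, 10)); }

// Self-check: splitting by 10 and recombining round-trips the value.
void apply_demo() { assert(splat_demo(47) == 47); }
```

The call shape is `std::apply(f, g(x))` rather than `f(g(x)...)`, so it chains less naturally than language-level splatting would.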

svew · 2023-11-20T00:37:24Z

I've been having similar thoughts recently about out parameters vs. return values.

I don't expect this idea to be popular, but I'm leaning towards what Herb thought about earlier, that the way forward is no return values, just out parameters. I think it appropriately reduces complexity, unifies existing concepts, and does a better job of expressing the inputs vs. the outputs of a function by defining them in the same space.

By only using out parameters, there are still ways of progressively omitting parts of the function call that we don't need:

// Starting point:

func: (in a: int, out b: int) = { b = a * 5; }

// Don't need to name the return type, omit it:

func: (in a: int, out _: int) = {  return a * 5; }

// Don't need the statement, we're only returning one expression:

func: (in a: int, out _: int) = a * 5;

// We're going to write many functions where we don't care to name the return value,
// why force ourselves to write a discard identifier every time? Omit it:

func: (in a: int, out: int) = a * 5;

// Return type can be deduced, omit it:

func: (in a: int, out) = a * 5;

To handle calling syntax, let's invent Universal Function Return Syntax (UFRS) (no idea if this has already been thought of, props to whoever did), which says that out parameters can be treated as return values of a function call, so that the above function can be called like so:

result := func(2); // 10
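A rough C++ approximation of that calling convention, as a sketch only: the adapter `returning` is a hypothetical helper, not something cppfront provides, and it assumes the out parameter is the last one and that its type is default-constructible (which real `out` parameters do not require).

```cpp
#include <cassert>
#include <utility>

// Out-parameter style function: a C++ reference parameter in last position.
void func(int a, int& b) { b = a * 5; }

// Hypothetical adapter: supplies the trailing out argument and returns it.
// Assumes Out is default-constructible, unlike a true `out` parameter.
template <typename Out, typename F, typename... In>
Out returning(F&& f, In&&... in) {
    Out out{};
    std::forward<F>(f)(std::forward<In>(in)..., out);
    return out;
}

// Self-check mirroring `result := func(2);`.
void ufrs_demo() {
    int result = returning<int>(func, 2);
    assert(result == 10);
}
```

A language-level version would not need the explicit `<int>` or the default construction, since the compiler would know which parameter is the output.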

I think for larger function calls, it does a great job of laying out what the inputs and outputs of the function are. It's nice that inout parameters are placed alongside the out parameters (because conceptually, they are both outputs of the function), so you can see clearly that the below function has 3 inputs, and 4 outputs:

reencrypt: (
    in      encrypted_data:    std::span<std::byte>,
    inout   temp_buffer:       std::span<std::byte>,
    inout   reencrypted_data:  std::span<std::byte>,
      out   was_reencrypted:   bool,
      out   reencrypted_size:  size_t,
) = { ... }

As for cpp1 generation, there's a few different ways I could imagine this working, either emitting out parameters as return values and structs instead of out parameters, or making a CPP2_UFCS_0 function or macro, but for function returns.

svew · 2023-11-20T01:27:45Z

Ultimately though, I don't think it matters whether we toss out out parameters in favor of return types, keep our return types and ditch out parameters, or whatever other options.

The most pressing issue that makes this worth talking about is the fact that we have two very different language constructs that express the exact same intent (output of a function) for both the function author and function caller, and the differences between them are usage-specific micro optimizations that I think cppfront can find ways of detecting and fixing automatically without additional cognitive overhead.

msadeqhe · 2023-11-21T10:55:02Z

@hsutter wrote:
FWIW, as a thought experiment, what if there were no return values at all and only out parameters were allowed, either as they are currently implemented or possibly with an updated implementation -- what would/wouldn't work in the use cases you have in mind?

@JohelEGP wrote:
operator=: (out this, …) would be unified with any other f: (out x, …):
x: t = (…);
x    = (…);
x: t = f(…); // Now possible.
x    = f(…);
Whereas on the main branch:
// `operator=` remains the same.
x: t; // Can't initialize with `f(…)`.
f(out x);

Good point. Do you mean we may put out parameters before the other parameters in the parameter list (except the this parameter)?

cls1: type = {
    // declarations...

    //  : (inout this, in a, in b) -> (r)
    fnc1: (inout this, out r, in a, in b) = {
        r = a + b;
    }

    // declarations...
}

var1: cls1 = ();

var2: = var1.fnc1(0, 1);
// Parameters:
//   this = var1
//   r = [temporary object r]
//   a = 0
//   b = 1
// Assignment:
//   var2 = r

var1.fnc1(out var2, 0, 1);
// Parameters:
//   this = var1
//   r = var2
//   a = 0
//   b = 1

So we can think of it this way: if out parameters are placed at the start of the parameter list, they are implicitly similar to return values:

cls1: type = {
    // Multiple `out` parameters (return values)
    operator=: (out this, out a, out b, in x, in y, in z) = {
        // statements...
    }

    // `out m` cannot be implicitly a return value!
    fnc1: (inout this, out a, in x, out m) = {
        // statements...
    }
}

(var1, a1, b1): cls1 = (0, 1, 2);
// Parameters:
//   this = [temporary object this]
//   a = [temporary object a]
//   b = [temporary object b]
//   x = 0
//   y = 1
//   z = 2
// Assignment:
//   var1 = this
//   a1 = a
//   b1 = b

m1: i32;
var2: = var1.fnc1(0, out m1);
// Parameters:
//   this = [temporary object this]
//   a = [temporary object a]
//   x = 0
//   m = m1
// Assignment:
//   var2 = a

In the last line, we have to write out m1 explicitly to pass it as an argument, because the out m parameter is not adjacent to the this and out a parameters: the in x parameter sits between them.

6 replies

msadeqhe Nov 21, 2023

In a nutshell, the function declaration syntax will be like this in which <this-parameter> is optional:

NAME: (<this-parameter>, <out-parameters>..., <non-out-parameter>, <rest-of-the-parameters>...)

If out parameters are within <rest-of-the-parameters>..., they will not return values implicitly.

But if out parameters are within <out-parameters>..., they will be implicitly return values.

Also a single unnamed out parameter is like an unnamed return value, and a simple return value; can be used to set its value as suggested by @svew.

msadeqhe Nov 21, 2023

But there is a problem

inout this parameter returns a value in operator=.

It's inconsistent with how out parameters are going to mean return values.

So other alternative solutions may be considered.

msadeqhe Nov 21, 2023

In fact, every function with this signature (including operator+ and other operators) is inconsistent with making only out parameters into return values:

CLASSNAME: type = {
    NAME: (<this-parameter>, ...) -> CLASSNAME = { ... }
}

When a member function returns an instance of the class, the this parameter acts like a return value regardless of whether it is out this, inout this or even in this.

So the this parameter in operator= is no different from the one in other operators such as operator+.

jcanizales Nov 21, 2023

inout this parameter returns a value in operator=.

It's inconsistent with how out parameters are going to mean return values.

I think the fundamental issue there is that out parameters are semantically equivalent to return values, but inout parameters are not!

msadeqhe Nov 22, 2023

Thanks, so I'll make out this to be a special case.

msadeqhe · 2023-11-22T08:36:02Z

@hsutter wrote:
FWIW, as a thought experiment, what if there were no return values at all and only out parameters were allowed, either as they are currently implemented or possibly with an updated implementation -- what would/wouldn't work in the use cases you have in mind?

I think your idea (having out parameters only instead of return values) is working well.

It also neatly explains the surprising fact that UFCS on operator= ignores the out this parameter. The answer would be: although it is the this parameter, it is an out parameter too, and out parameters are implicit return values. So out this in operator= simply means the function returns the this object; it has no this object of its own, just as a static member function has none.

What if we remove return values in favor of out parameters by assuming this parameter is always the first one in member functions, and out parameters are return values if they are before other (non-out) parameters?

NAME: (<this-param>, <out-params>..., <non-out-param>, <rest-params>...)
= {
    /* statements... */
}

In which we have the following parameters:

<this-param> can be either in this, inout this or move this, but it cannot be out this!
Because out this belongs to <out-params> instead.
<out-params>... are a list of out parameters.
They are return values. Therefore out this belongs here.
<non-out-param> is a regular parameter except that it cannot be out parameter.
It can be either in, inout, move or forward parameter.
<rest-params>... are a list of other parameters. They can be in, inout, move, forward or out parameter, but out parameter is not return value here, because there is <non-out-param> before it.
So out parameter in <rest-params>... are not return values.

Treating out this differently from other this parameters simply means that it returns the this object, because the object is not constructed yet! So out this is a special case.

Now let's explore what can be done in this way:

example: type = {
    // It's the constructor with multiple `out` parameters (return values).
    operator=: (out this, out a, in x, out b) = {
        // statements...
    }
}

b1: i32;
// Here, we don't set `a` parameter explicitly.
(var1, a1): example = (0, out b1);
// Parameters:
//   this = [temporary object]
//   a = [temporary object]
//   x = 0
//   b = b1
// Assignment:
//   var1 = this
//   a1 = a

// Here, we set `a` parameter explicitly.
var2: example = (out a1, 0, out b1);
// Parameters:
//   this = [temporary object]
//   a = a1
//   x = 0
//   b = b1
// Assignment:
//   var2 = this

For example, the signature of operator+ will be like this:

cls1: type = {
    // declarations...
    
    //       : (inout this, in that) -> (r: cls1)
    operator+: (inout this, out r: cls1, in that) = {
        r = cls1();
        r.value = this.value + that.value;
    }

    // declarations...
}

a: cls1 = ();
b: cls1 = ();
c: _ = a + b;

IMO this change is worth it.

4 replies

msadeqhe Nov 22, 2023

For example, the signature of operator+ will be like this:
...
operator+: (inout this, out r: cls1, in that) = {
    r = cls1();
    r.value = this.value + that.value;
}
...

r: cls1 is simply this, it can be just out this.

Also, to distinguish it from inout this, we may rename inout to ref:

...
operator+: (ref this, out this, in that) = {
    r: = cls1();
    r.value = this.value + that.value;
    return r;

    /*
    this.value = this.value + that.value;
     */
}
...

We have to use return, because we used out this instead of out r: cls1 (named out parameter). Alternatively (as it's commented in the example), this may be returned automatically as we have written out this in the parameter list.

msadeqhe Nov 22, 2023

Also static member functions can return an instance of the type with out this:

cls1: type = {
    // declarations...

    //  : (a, b) -> cls1 = {
    fnc1: (out this, a, b) = {
        // statements...
    }

    // declarations...
}
...

So static member functions and operator= are consistent in this manner.

jcanizales Nov 23, 2023

assuming this parameter is always the first one in member functions, and out parameters are return values if they are before other (non-out) parameters

Style guides already suggest to group all out parameters together (usually at the end): https://google.github.io/styleguide/cppguide.html#Inputs_and_Outputs

I'd like that to just be enforced by the language, although I realize it would prevent compatibility with existing C++ code.

JohelEGP Nov 23, 2023

In the C++ Core Guidelines: F.21: To return multiple “out” values, prefer returning a struct or tuple.

msadeqhe · 2023-11-22T08:53:19Z

Return values can be either move or forward too:

fnc1: () -> (        r: i32) = { ... }
fnc2: () -> (move    r: i32) = { ... }
fnc3: () -> (forward r: i32) = { ... }

We may make move to be the default return value as described in paper d0708.

Hence fnc1 and fnc2 may use out parameter instead of return values:

//  : () -> (r: i32) = { ... }
fnc1:   (out r: i32) = { ... }

//  : () -> (move r: i32) = { ... }
fnc2:       (out  r: i32) = { ... }

But we still need forward return values, maybe keyword forwardout or outward (!) can be used in this case:

//  : () -> (forward r: i32) = { ... }
fnc3:    (forwardout r: i32) = { ... }

IMO the name of inout may be changed to ref to make it clear that they're different to out parameters. Also it seems we need a better keyword than forwardout or outward (!).

0 replies

msadeqhe · 2023-11-22T10:59:09Z

msadeqhe
Nov 22, 2023

But from the readability point of view, IMO return types are better than out parameters.

Considering out parameters:

cls1: type = {
    operator=: (out this, out a, out b, x, y) = {
        // statements
    }

    fnc1: (inout this, out a, out b, x, y) = {
        // statements
    }
}

The equivalent return values are more expressive:

cls1: type = {
    // `out this` is changed to be a return value.
    // It reduces concept count. It's the point of this topic here.
    operator=: (x, y) -> (this, a, b) = {
        // statements
    }

    fnc1: (inout this, x, y) -> (a, b) = {
        // statements
    }
}

So out this and other out parameters have to be placed after -> as return values.

Although we write them as return values, the concept can still be the same as out parameters.

Because we can explicitly pass out arguments to them:

(var1, a1, b1): cls1 = (0, 1);
// Parameters:
//   this = [temporary object]
//   a = [temporary object]
//   b = [temporary object]
//   x = 0
//   y = 1
// Assignment:
//   var1 = this
//   a1 = a
//   b1 = b

(var1, a1): cls1 = (out b1, 0, 1);
// Parameters:
//   this = [temporary object]
//   a = [temporary object]
//   b = b1
//   x = 0
//   y = 1
// Assignment:
//   var1 = this
//   a1 = a

 var1:  cls1 = (out a1, out b1, 0, 1);
// Parameters:
//   this = [temporary object]
//   a = a1
//   b = b1
//   x = 0
//   y = 1
// Assignment:
//   var1 = this

cls1(out var1, out a1, out b1, 0, 1);
// Parameters:
//   this = var1
//   a = a1
//   b = b1
//   x = 0
//   y = 1

(a1, b1): = var1.fnc1(0, 1);
// Parameters:
//   this = var1
//   a = [temporary object]
//   b = [temporary object]
//   x = 0
//   y = 1
// Assignment:
//   a1 = a
//   b1 = b

a1: = var1.fnc1(out b1, 0, 1);
// Parameters:
//   this = var1
//   a = [temporary object]
//   b = b1
//   x = 0
//   y = 1
// Assignment:
//   a1 = a

var1.fnc1(out a1, out b1, 0, 1);
// Parameters:
//   this = var1
//   a = a1
//   b = b1
//   x = 0
//   y = 1

This is what @nmsmith suggested, but in a way that lets us optionally pass them as out arguments (as @svew suggested).

4 replies

msadeqhe Nov 22, 2023

I have to mention that, to make it work consistently, issue #823 (about "Inconsistent Initialization") must be resolved.

msadeqhe Nov 22, 2023

Alternatively, out arguments may be after all arguments:

(var1, a1, b1): cls1 = (0, 1);
(var1, a1): cls1 = (0, 1, out b1);
 var1: cls1 = (0, 1, out a1, out b1);
cls1(0, 1, out var1, out a1, out b1);

(a1, b1): = var1.fnc1(0, 1);
 a1: = var1.fnc1(0, 1, out b1);
var1.fnc1(0, 1, out a1, out b1);

msadeqhe Nov 22, 2023

Both move and forward will still be valid return values (compared with dropping return values in favor of out parameters):

// With return values:
fnc1: () -> (a, move b, forward c) = {...}

// But with `out` parameters:
fnc1: (out a, out move b, out forward c) = {...}
fnc1: (out a, out b, forwardout c) = {...}
fnc1: (out a, out b, outward c) = {...}
...?

msadeqhe Nov 22, 2023

Interestingly, out will be for return values only, and in cannot be for return values, so there will be a separation between in and out stuff in function signatures:

fnc1: (in a, inout b, move c, forward d) -> (out x, move y, forward z) = {...}

fnc2: (in name, in age) -> (out id, out result) = {...}
fnc3: (   name,    age) -> (    id,     result) = {...}

Return values are out by default. Parameters are in by default.

msadeqhe · 2023-11-22T12:37:46Z

msadeqhe
Nov 22, 2023

Implementation

There may be many ways to achieve this unification. Here's my 2 cents.

Return Types are `out` Parameters

Regardless of whether the Cpp2 syntax uses named return values or out parameters, internally the code should declare functions with out parameters. So for example:

fnc1: (x, y) -> (a, b) = {...}

It will be transpiled to this code (although the programmer cannot write out parameters directly):

fnc1: (x, y, out a, out b) -> void = {...}
// or alternatively:
// fnc1: (out a, out b, x, y) -> void = {...}

And when we call the function:

(a1, b1): = fnc1(0, 1);

It will be transpiled to:

temp_a1: <T> T;
temp_b1: <T> T;

fnc1(0, 1, out temp_a1, out temp_b1);
// or alternatively:
// fnc1(out temp_a1, out temp_b1, 0, 1);

(a1, b1): = (temp_a1, temp_b1);

I know it's very simplified; in reality it involves many more implementation details.

It allows Cpp2 functions to keep the return values for error codes too.
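
To make the lowering above more concrete, here is a hedged C++ sketch of how the declaration and the call site might look in Cpp1. It rests on several assumptions: plain `int&` stands in for whatever out-parameter wrapper a real lowering would use, all types are fixed to `int`, the body of `fnc1` is invented, and `lowered_call` is a hypothetical helper that only exists to show the call-site steps.

```cpp
#include <cassert>
#include <utility>

// Assumed lowering of `fnc1: (x, y) -> (a, b)`: the named return values
// become trailing reference parameters. The body is invented for illustration.
void fnc1(int x, int y, int& a, int& b) {
    a = x + y;
    b = x * y;
}

// Assumed lowering of the call site `(a1, b1): = fnc1(0, 1);`.
std::pair<int, int> lowered_call() {
    int temp_a1;  // deliberately uninitialized: fnc1 must assign it
    int temp_b1;  // deliberately uninitialized: fnc1 must assign it
    fnc1(0, 1, temp_a1, temp_b1);
    return {temp_a1, temp_b1};  // stands in for `(a1, b1): = (temp_a1, temp_b1);`
}

void usage_example() {
    std::pair<int, int> result = lowered_call();
    assert(result.first == 1);   // a = 0 + 1
    assert(result.second == 0);  // b = 0 * 1
}
```

With this shape the function's actual return slot stays free, which is what would let it carry an error code.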

0 replies

msadeqhe · 2023-11-22T13:39:57Z

msadeqhe
Nov 22, 2023

Alternative syntax instead of `out` argument

Considering we write out to pass out parameters or return values:

a: i32;
var1: = function(0, 1, out a);

What if we directly use -> instead of out keyword?

a: i32;
var1: = function(0, 1)->(a);

Although it may feel alien, -> directly means that the identifiers after it receive the return values of the function call. It's another syntax that visually shows what happens.

2 replies

JohelEGP Nov 22, 2023

From #77 (comment), responding to replacing forward t.first with t>>.first:

For Cpp2 though, one of my stakes in the ground is that declaration syntax is consistent with use syntax, so for Cpp2 I'd want to change the syntax in both places (declaration and use) or neither rather than introduce an asymmetry there.

msadeqhe Nov 23, 2023

This is an example with operator*:

fnc1: (x, y) -> (m, a) = {...}
fnc2: (x, y) -> (n, b) = {...}

(m, a): = fnc1(0, 1);
(n, b): = fnc2(0, 1);
var1: i32 = m * n;

var2: i32 = fnc1(0, 1, out a) * fnc2(0, 1, out b);
var2: i32 = fnc1(0, 1)->(a) * fnc2(0, 1)->(b);
var2: i32 = fnc1(0, 1)->a * fnc2(0, 1)->b;
//     m: = fnc1(0, 1)->a
//                     n: = fnc2(0, 1)->b

// var1 == var2

I assumed -> has higher precedence than operator* in an expression.
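
For comparison, the `out a` form of that expression can already be written in today's C++ with reference parameters. This is only a sketch: the function bodies are invented, plain `int&` stands in for an out parameter, and `usage_example` is a hypothetical helper.

```cpp
#include <cassert>

// Invented bodies: each function returns its first named value (m or n)
// and writes its second (a or b) through a reference parameter.
int fnc1(int x, int y, int& a) {
    a = x - y;
    return x + y + 2;
}

int fnc2(int x, int y, int& b) {
    b = y - x;
    return x + y + 3;
}

void usage_example() {
    int a;
    int b;
    // Mirrors `var2: i32 = fnc1(0, 1, out a) * fnc2(0, 1, out b);`
    int var2 = fnc1(0, 1, a) * fnc2(0, 1, b);
    assert(var2 == 12);  // (0 + 1 + 2) * (0 + 1 + 3)
    assert(a == -1);
    assert(b == 1);
}
```

The two calls write to distinct variables, so the unspecified evaluation order of the operands of `*` does not affect the result here.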

jcanizales · 2023-11-23T00:45:33Z

jcanizales
Nov 23, 2023

In C++, a semantic difference between out parameters and return values that I hadn't considered before is that the types of arguments are an input to overload resolution, while the return type is an output. I haven't thought about how that could affect cpp2.
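
A small C++ sketch of that difference follows. The function names `parse`, `parse_int`, `parse_double`, and `usage_example` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <string>

// Overloads that differ only in the out-parameter type are legal,
// because argument types take part in overload resolution.
void parse(const std::string& text, int& result)    { result = std::stoi(text); }
void parse(const std::string& text, double& result) { result = std::stod(text); }

// The return-value versions cannot share one name: C++ rejects functions
// that differ only in return type, so each needs its own name here.
int    parse_int(const std::string& text)    { return std::stoi(text); }
double parse_double(const std::string& text) { return std::stod(text); }

void usage_example() {
    int i = 0;
    double d = 0.0;
    parse("42", i);   // the int& overload is chosen from the argument type
    parse("2.5", d);  // the double& overload is chosen from the argument type
    assert(i == 42);
    assert(d == 2.5);
    assert(parse_int("42") == 42);
    assert(parse_double("2.5") == 2.5);
}
```

If out parameters were rewritten as return values, a pair of overloads like `parse` would need some other way to be told apart.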

6 replies

msadeqhe Nov 23, 2023

Another Implementation for Cpp2 return values in Cpp1

Cppfront may generate two overloads for Cpp2 functions. For example, when I write this function:

fnc3: (x, y) -> int = {...}

Cppfront may generate these overloads from it:

fnc3: (x, y) -> int = {...}
fnc3: (x, y, out int) -> void = {...}

Of course, we cannot write out parameters directly; cppfront generates them from the return values. So in Cpp1, we can call fnc3 either way without noticing any difference:

// Cpp1
int var3;
fnc3(0, 1, var3);

// Cpp1
// Also this one works too:
var3 = fnc3(0, 1);

So it seems this feature (unifying return types and out parameters) is usable in Cpp1 too, if we write the function in Cpp2.
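
A hedged sketch of what the two generated Cpp1 overloads might look like. This is an assumption about a possible lowering, not what cppfront emits: the parameter types are fixed to `int`, the body is invented, and plain `int&` stands in for an out-parameter wrapper.

```cpp
#include <cassert>

// Value-returning form, matching the Cpp2 declaration as written.
// The body (x + y) is invented for illustration.
int fnc3(int x, int y) {
    return x + y;
}

// Assumed generated companion: the same result is written through a
// trailing reference parameter instead of being returned.
void fnc3(int x, int y, int& result) {
    result = fnc3(x, y);
}

void usage_example() {
    int var3;
    fnc3(0, 1, var3);            // out-parameter style
    assert(var3 == 1);
    assert(fnc3(0, 1) == var3);  // return-value style
}
```

The two overloads differ in arity, so a call with two arguments and a call with three arguments are never ambiguous.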

JohelEGP Nov 23, 2023

You don't need to hack cppfront to make an implementation.
You can take the proposed Cpp2,
write how you think it should lower to Cpp1,
and test that with the many corner cases brought up.

That works best for weeding out ideas.
I'm personally not enthused, since the unification is hard, and I don't see an obvious path forward.

msadeqhe Nov 23, 2023

@JohelEGP, you're right. Maybe I should fork and pull request the changes too.

I hope I can understand the source code, but I'm not sure.

msadeqhe Nov 23, 2023

Cppfront may generate these overloads from it:
fnc3: (x, y) -> int = {...}
fnc3: (x, y, out int) -> void = {...}

~~The second overload conflicts with int& parameters which are generated from inout parameters. So this implementation needs workarounds too.~~ I think I shouldn't write my thoughts immediately here 😅.

EDIT: Well, it works. The type of an out parameter in the generated code is cpp2::out<T> in current Cpp2 (today). I thought its type was T& (as explained in paper d0708). So it doesn't need any workarounds.
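
The reason a wrapper type avoids the conflict can be sketched in plain C++. Note the assumptions: `out_param` is a hypothetical minimal stand-in written for this example and is not cppfront's `cpp2::out<T>`, and `accumulate` with its bodies is invented.

```cpp
#include <cassert>

// Hypothetical stand-in for an out-parameter wrapper. It is NOT cpp2::out<T>,
// only a minimal type that is distinct from T&.
template <typename T>
class out_param {
    T* target_;
public:
    explicit out_param(T* target) : target_(target) {}
    void set(const T& value) { *target_ = value; }
};

// As if lowered from an inout parameter: reads and updates `total`.
void accumulate(int x, int y, int& total) {
    total += x + y;
}

// As if lowered from an out parameter: only writes through `result`.
void accumulate(int x, int y, out_param<int> result) {
    result.set(x + y);
}

void usage_example() {
    int total = 10;
    accumulate(1, 2, total);                    // binds to the int& overload
    assert(total == 13);

    int result = 0;
    accumulate(1, 2, out_param<int>(&result));  // binds to the wrapper overload
    assert(result == 3);
}
```

Because the wrapper's constructor is explicit, an `int` lvalue can only select the `int&` overload, so the two signatures coexist without ambiguity.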

AbhinavK00 Nov 23, 2023

out parameters are T&. Also inout parameters are T&, but they have to be initialized before. Cpp2 can treat all T& parameters from Cpp1 API as inout parameters in Cpp2.

That'd break initialisation guarantee when the intent on cpp1 side was to use as an out parameter.

Btw, I don't think we should go from changing returns to out parameters but the other way around. Johel already pointed out cpp core guidelines which favour return values and that should be right practice but out parameters NEED to exist due to current cpp1 code which uses them.

msadeqhe · 2023-11-25T06:21:50Z

msadeqhe
Nov 25, 2023

Alternative Syntax

It has fewer parentheses. It removes the need for the out keyword (compared with using out parameters to mean named return values). It helps in unifying out parameters and return values, because it favors named return values over return types:

fnc1: (a: i32, b: i32 -> x: i32, y: i32) = { x = a + b; y = 0; }
// It's equal to:
// fnc1: (a: i32, b: i32) -> (x: i32, y: i32) = { x = a + b; y = 0; }

fnc2: (a: i32, b: i32 -> x: i32) = { x = a + b; }
// It's equal to:
// fnc2: (a: i32, b: i32) -> (x: i32) = { x = a + b; }

fnc3: (a: i32, b: i32) = {}
// It's equal to:
// fnc3: (a: i32, b: i32) = {}

When we call functions:

(x, y): = fnc1(0, 1);
fnc1(0, 1 -> x, y); // similar to declaration
// It's equal to:
// fnc1(0, 1, out x, out y);

x: = fnc2(0, 1);
fnc2(0, 1 -> x); // similar to declaration
// It's equal to:
// fnc2(0, 1, out x);

fnc3(0, 1);

Also declaration syntax is consistent with use syntax.

3 replies

msadeqhe Nov 25, 2023

The signature of operator= will be like this:

...
operator=: (a, b -> this) = {...}
// It's equal to:
// operator=: (out this, a, b) = {...}
...

It's more expressive to explain what operator= is doing... it actually returns this object (implicitly) in current Cpp2 (today). That's why UFCS on operator= completely ignores this object like if it was a static member function and out this is not a parameter.

MaxSagebaum Nov 28, 2023

I do not understand why it is so important to "save" some brackets. Could you please explain this to me?

In my view, it is usually more important to focus on readability and not on removing characters to type.

The out keyword can not really be removed, since you need it to declare functions, that are compatible with Cpp1.

I do not have the full overview, but your proposed syntax clashes a little bit with function arguments e.g.:

fnc1: (a : (i32 -> i32) -> y: i32) = ...
// in contrast to
fnc1: (a : (i32) -> i32)  -> i32 = ...

svew Dec 1, 2023

I'm with @MaxSagebaum on this, it's a nice thought but ultimately putting -> inside the function signature with the other parameters doesn't seem any different than just specifying a return value or return struct.
