This issue was moved to a discussion.

[SUGGESTION] Treat a keyword as an identifier #392

Closed
msadeqhe opened this issue Apr 24, 2023 · 22 comments

Comments

@msadeqhe

msadeqhe commented Apr 24, 2023

Preface

I suggest using the "keyword"$ syntax to treat keywords as identifiers in Cpp2.

Suggestion Detail

Some identifiers in Cpp2 are keywords in Cpp1, such as and, or, etc. These aren't valid identifiers in Cpp1, so Cpp2 adds the cpp2_ prefix to them when generating Cpp1 code, as discussed in this issue. For example, the Cpp2 identifiers and, or, etc. become cpp2_and, cpp2_or, etc. in the generated Cpp1 code.
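
As a minimal Cpp1 sketch of why the prefix is needed: and and or are alternative operator tokens in Cpp1, so they can't name variables, whereas the prefixed spellings are ordinary identifiers. The function name both_set is made up for illustration; only the cpp2_ prefix comes from the description above.

```cpp
// `int and = 1;` would not compile in Cpp1: `and` is an alternative token for `&&`.
// The prefixed spellings are ordinary identifiers.
int cpp2_and = 1;
int cpp2_or = 2;

// Hypothetical helper: uses `and` as the operator and the prefixed names as variables.
bool both_set() {
    return cpp2_and != 0 and cpp2_or != 0;
}
```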

On the other hand, some keywords in Cpp2 aren't keywords in Cpp1, such as type, next, etc. These are valid identifiers in Cpp1, so a Cpp1 API that uses them as names can't be reached from Cpp2:

// ERROR! `type` is a keyword, whereas it's an identifier in Cpp1.
v0: = type.name();

// ERROR! `next` is a keyword, whereas it's an identifier in Cpp1.
while next < 10 next next++ {
    //: statements...
}
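
For reference, the Cpp1 side that such Cpp2 code is trying to reach might look like the sketch below. The namespace cpp1_api, the struct descriptor and the function count_to_ten are hypothetical names chosen for illustration; the point is only that type and next are ordinary identifiers in Cpp1.

```cpp
#include <string>

namespace cpp1_api {

// Hypothetical Cpp1 API: both names below are plain identifiers in Cpp1.
struct descriptor {
    std::string name() const { return "widget"; }
};

descriptor type;  // `type` is a keyword in Cpp2
int next = 0;     // `next` is a keyword in Cpp2

// Cpp1 equivalent of the loop above.
int count_to_ten() {
    while (next < 10) {
        next++;
    }
    return next;
}

}  // namespace cpp1_api
```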

I suggest a syntax such as "keyword"$ to access identifiers from a Cpp1 API that are keywords in Cpp2:

// OK. `type` is an identifier.
v0: = "type"$.name();

// OK. `next` is an identifier.
while "next"$ < 10 next "next"$++ {
    //: statements...
}

I think the "keyword"$ syntax is good enough because Cpp2 will have similar syntax for reflection and code generation, as described in this page of the Wiki, and this feature is semantically related to them.

Your Questions

Will your feature suggestion eliminate X% of security vulnerabilities of a given kind in current C++ code?

No.

Will your feature suggestion automate or eliminate X% of current C++ guidance literature?

No.

Considered Alternatives

I considered the syntax @keyword, but it would resemble meta-class functions. Another possible syntax is @"keyword", which doesn't resemble meta-class functions and also differs from the capture syntax.

Also we can use a prefix such as cpp1_. For example:

// OK. `cpp1_type` will become `type` in Cpp1.
v0: = cpp1_type.name();

// OK. `cpp1_next` will become `next` in Cpp1.
while cpp1_next < 10 next cpp1_next++ {
    //: statements...
}

In this way, the cpp1_ prefix, in addition to the cpp2_ prefix, should be reserved for the Cpp2 compiler, and user-defined identifiers with those prefixes shouldn't be allowed.

By the way, treating a keyword as an identifier isn't needed frequently enough to dedicate a new symbol to it.

EDIT 1: keyword<> is another alternative solution as described in this comment.

@JohelEGP
Contributor

JohelEGP commented Apr 24, 2023

Therefore it's not possible to use such identifiers in Cpp2 when dealing with Cpp1 API

I've actually found the Cpp2-specific keywords to be contextual, except for the <cstdint> shorthands (e.g., i8), and inspect. So this works: https://godbolt.org/z/vYPf6MG6h.

int next = 0; // Cpp1.
f: () = {
    while next < 10 next next++ {
        //: statements...
    }
    while 10 < next next next++ {
        //: statements...
    }
}
main: () = {
    :(is) = { [[assert: (is)is(is)]] }(:() = {});
}

Including the :(is) = { [[assert: (is)is(is)]] }(:() = {});.

@msadeqhe
Author

You're right 😬, that's very nice but it doesn't work in every situation, for example:

type: type = {}

main: () = {
    variable: type = 0;
}

@JohelEGP
Contributor

I prefer cpp2_type over "type"$ to lower to Cpp1 as type.

@msadeqhe
Author

Also, Cpp2 may gain more keywords as the language evolves, and proposed keywords such as result (in this issue) are already valid identifiers in Cpp1.

@JohelEGP
Contributor

For variable: type = (), you want type to be a Cpp1 identifier. In this case, it clashes with Cpp2 syntax itself. It can be solved with an indirection: std::type_identity_t<type>: https://godbolt.org/z/TGoqsebrW.
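
A minimal Cpp1 sketch of why this indirection is harmless: an identity alias names exactly the type it wraps. To avoid assuming a C++20 compiler, the sketch defines its own type_identity in a hypothetical namespace sketch; it has the same shape as std::type_identity from the standard library. Presumably the workaround helps in Cpp2 because type then no longer appears directly after the colon, but that part is not shown here.

```cpp
#include <type_traits>

struct type {};  // a Cpp1 type that happens to be called `type`

namespace sketch {
// Same shape as C++20's std::type_identity / std::type_identity_t.
template <class T> struct type_identity { using type = T; };
template <class T> using type_identity_t = typename type_identity<T>::type;
}  // namespace sketch

// The alias names exactly the wrapped type, so it adds no behaviour.
static_assert(std::is_same<sketch::type_identity_t<type>, type>::value,
              "the indirection is transparent");

// Declares a variable of type `type` without writing `type` right after a colon.
sketch::type_identity_t<type> variable{};
```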

@JohelEGP
Contributor

These are the sets of Cpp2-specific keywords:

{i8,       i16,       i32,       i64,       u8,
 u16,      u32,       u64,       i8_fast,   i16_fast,
 i32_fast, i64_fast,  u8_fast,   u16_fast,  u32_fast,
 u64_fast, i8_least,  i16_least, i32_least, i64_least,
 u8_least, u16_least, u32_least, u64_least, inspect});
{next, copy, move, forward, pre, post, final,
 in, 
 as, is, type, 
 assert, throws,  
 implicit, out,   
 inout});            

I can't come up with another example like #392 (comment)'s.
A simpler alternative for variable: type = () is qualifying type: variable: ::type = () (https://godbolt.org/z/W3v48se5K).

@msadeqhe
Author

Thanks. It's a good idea to use the helper type std::type_identity_t<type> or to write the qualified name ::type. But ::type doesn't work for local types:

main: () = {
    type: type = {}
    variable: ::type = ();
}
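
In Cpp1 terms, a minimal sketch of the same limitation: a type declared inside a function has no qualified name, so ::type can only ever reach a global declaration. The function name local_vs_global and the id members are made up for illustration.

```cpp
struct type { int id = 1; };  // global `type`

int local_vs_global() {
    struct type { int id = 2; };  // local `type` shadows the global one
    type local;      // unqualified: the local type
    ::type global;   // qualified: always the global type
    return local.id * 10 + global.id;
}
```

Here local_vs_global() returns 21, which shows that the two spellings resolve to different types.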

@JohelEGP
Contributor

JohelEGP commented Apr 24, 2023

Well, the only reason to name it type in that case would be for some Cpp1 library that reflects the identifier, or a wicked Cpp2 one.

@MichaelCook

Perhaps a backslash prefix would be a little more readable.

v0: = \type.name();
while \next < 10 next \next++ {
    //: statements...
}

@JohelEGP
Contributor

Makes me wonder whether raw string literals could just be backslash-escaped (see #302).

@msadeqhe
Author

msadeqhe commented Apr 24, 2023

@MichaelCook, I like the idea of using the backslash syntax \keyword.

Escape sequences in string literals start with a backslash to change the meaning of a character, why not just use it to change the meaning of a keyword and treat it as an identifier?

@JohelEGP, Good idea but in my opinion, raw (non-interpolated) string literals break the general capture syntax (thing)$ everywhere in the language. Also string literals without prefix or suffix will make it possible to have operator'' and operator"" or Tagged Template Strings or any other versatile syntax in Cpp2.

@msadeqhe
Author

msadeqhe commented Apr 24, 2023

@MichaelCook, I like the idea of using the backslash syntax \keyword.

Because it only works on keywords, if Cpp2 has binary operator \, it wouldn't conflict with it:

// Treats `type` keyword as identifier.
x: \type = ();

// binary operator \
y: = var0 \ var1;

Also unary prefix and postfix operators \ wouldn't conflict with \keyword because they would be applied to identifiers.

@msadeqhe
Author

msadeqhe commented Apr 24, 2023

These are other use cases and reasons why it does matter in addition to C and Cpp1 language interop:

  • When Cpp2 introduces a new keyword in the future, a simple find and replace from keyword to \keyword (or whatever else) helps programmers to migrate the source code of their programs and libraries.
  • \keyword (or whatever else) is more readable than workarounds to treat a keyword as an identifier.
  • It allows using keywords as the names of interfaces and functions in APIs:
variable.\if(count < 10).\return();
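
The find-and-replace migration from the first bullet could be sketched roughly as follows. This is only an illustration under stated assumptions: escape_keyword is a hypothetical helper, not part of any Cpp2 tooling, and it naively rewrites every whole-word match, including those inside comments and string literals.

```cpp
#include <regex>
#include <string>

// Hypothetical migration helper: prefixes whole-word uses of `keyword` with a backslash.
// Naive on purpose: it does not skip comments or string literals.
// Assumes `keyword` contains only letters, digits and underscores.
std::string escape_keyword(const std::string& source, const std::string& keyword) {
    const std::regex whole_word("\\b" + keyword + "\\b");
    return std::regex_replace(source, whole_word, "\\" + keyword);
}
```

For example, escaping result in the text `result = result_of(result);` leaves result_of untouched, because the underscore keeps it from being a whole-word match.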

@jcanizales

Therefore it's not possible to use such identifiers in Cpp2 when dealing with Cpp1 API

Could you give an example C++ API that would not be usable from Cpp2? Naming variables is not such a case.

@MichaelCook

How's this:

struct myapi_t {
    int type;
};

@JohelEGP
Contributor

JohelEGP commented Apr 25, 2023

See #392 (comment).
https://godbolt.org/z/GoK86hn53:

struct myapi_t {
    int type;
};
main: () -> int = myapi_t().type;

@msadeqhe
Author

msadeqhe commented Apr 25, 2023

Also, Cpp1 type traits in the standard library have a member type named type. For example:

std::is_integral<T>::type
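
A short Cpp1 sketch of what that member is, for concreteness. The alias names int_is_integral and float_is_integral are made up; the trait and its nested type member are standard.

```cpp
#include <type_traits>

// The nested member is literally spelled `type`, so a Cpp2 caller must be able to write it.
using int_is_integral = std::is_integral<int>::type;      // std::true_type
using float_is_integral = std::is_integral<float>::type;  // std::false_type

static_assert(std::is_same<int_is_integral, std::true_type>::value, "int is integral");
static_assert(std::is_same<float_is_integral, std::false_type>::value, "float is not integral");
```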

@msadeqhe
Author

msadeqhe commented Apr 29, 2023

Also, const doesn't work as a type name, although both Cpp1 and Cpp2 have a const keyword:

const: type = {}

main: () = {
    x: const = "";
}

It is accepted with a qualified name, but Cpp2 then generates invalid Cpp1 code:

const: type = {}

main: () = {
    x: ::const = "";
}

@msadeqhe
Author

msadeqhe commented Apr 29, 2023

According to @JohelEGP's helpful comment:

...
I can't come up with another example like #392 (comment)'s.
A simpler alternative for variable: type = () is qualifying type: variable: ::type = () ...

If Cpp2 accepts all keywords as identifiers with qualified names, and if Cpp2 would allow a way to have qualified names for local declarations (e.g. _::identifier), and if Cpp2 could treat keywords as identifiers in expressions where there is an operator, I think it's going to solve the issue, because it's a general rule and easy to understand/follow:

// `while` is not a keyword, because it's left-hand-side of `:`.
while: type = {
    // `return` is not a keyword, because it's left-hand-side of `:`.
    return: () = {}
}

main: () = {
    // `do` is not a keyword, because it's left-hand-side of `:`.
    // `do` is a local type.
    do: type = { /* definition */ }
    // `_::do` is a qualified name.
    variable: _::do = /* definition */

    // `for` is not a keyword, because it's left-hand-side of `:`.
    // And `while` is an identifier.
    for: ::while = ();
    // `for` is an identifier because there is an operator dot.
    // And `return` is an identifier too.
    for.return();

    // Also, to be clear, `for` could be required to be qualified, but that is a little restrictive.
    _::for.return();

    // `for` is a keyword, because there isn't any operator between `for` and `args`.
    for args do (arg) { /* statements */ }

    // `return` is not a keyword, because it's left-hand-side of `:`.
    return: (param) -> ::while = ();

    // AMBIGUOUS!
    // When there is an ambiguity, they are keywords.
    // So `return` is a keyword here.
    return(/* something */);

    // This `return` is not a keyword, because its name is qualified.
    _::return(/* something */);

    // `if` and `for` are not keywords, because there is an operator between them.
    value: = if * for;

    // `do` is not a keyword, because it's already a qualified name.
    my_namespace::do();
}

In this way, keywords would be fully contextual (except with operator(), because parentheses are used for both function calls and grouping), and no new syntax would be required to treat keywords as identifiers.
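
The rule can be sketched as a tiny classifier that looks only at the tokens on either side of a keyword-spelled name. This is a rough illustration, not how any Cpp2 compiler works: Role and classify are hypothetical names, and the operator set is deliberately limited to the two operators used in the examples above. Prefix operators are not handled, so something like a keyword followed by a unary operator would need a further rule.

```cpp
#include <set>
#include <string>

enum class Role { Keyword, Identifier };

// Hypothetical helper: decides how a keyword-spelled token would be read
// under the proposed rule, given only its neighbouring tokens.
Role classify(const std::string& prev, const std::string& next) {
    // Deliberately tiny operator set, taken from the examples in the proposal.
    static const std::set<std::string> operators = {".", "*"};

    if (next == ":") return Role::Identifier;   // left-hand side of a declaration
    if (prev == "::") return Role::Identifier;  // qualified name
    if (operators.count(prev) > 0 || operators.count(next) > 0) {
        return Role::Identifier;                // adjacent to an operator
    }
    return Role::Keyword;  // everything else, including the ambiguous `return(...)`
}
```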

EDIT

This way has a problem if Cpp2 supports operator overloading for '' and "" in the future (of course perhaps Cpp2 never supports them but it's a possibility to consider):

// This is AMBIGUOUS:
//   Is `return` an object with `operator""`?
//   Or is `return` a keyword and returns string `"something"`?
return "something";

// This is OK. `return` is an object with `operator""`.
_::return "something";

It is similar to how operator() is ambiguous.

@msadeqhe
Author

msadeqhe commented Apr 29, 2023

To simplify the rule, maybe Cpp2 should require every identifier whose name equals a keyword to be qualified:

// `_::` must be omitted for naming declarations.
while: type = /* definition */
do: _::while = /* definition */

// `_::` must be omitted for accessing members with :: or dot.
// Because these are already qualified names.
_::do::type.as.forward;

_::return(args);

// OK. Keywords cannot be qualified names.
value: = _::if * _::for;

// `inspect` is not a keyword, because it's already a qualified name.
my_namespace::inspect();

So they must always be qualified: either namespace::keyword for identifiers inside a namespace, ::keyword for global identifiers, or _::keyword for local identifiers (if Cpp2 supports something similar).

Instead of "keyword"$ or @"keyword" or \keyword, it's going to be namespace::keyword or ::keyword or _::keyword. Those are alternative ways to achieve the same thing...

@msadeqhe
Author

msadeqhe commented Jun 17, 2023

Instead of "keyword"$ or @"keyword" or \keyword, it's going to be namespace::keyword or ::keyword or _::keyword. Those are alternative ways to achieve the same thing...

Another option to consider is keyword<>.

Types, functions and variables can be templates already. So keyword<> would mean a type, function or variable.

if: (condition: bool, forward yes_value: int, forward no_value: int) -> forward int = {
/* Or force the template notation <> within declaration:
if: <> (condition: bool, forward yes_value: int, forward no_value: int) -> forward int = {
*/
    if condition {
        return yes_value;
    }
    else {
        return no_value;
    }
}

if: <T> (condition: bool, forward yes_value: T, forward no_value: T) -> forward T = {
    if condition {
        return yes_value;
    }
    else {
        return no_value;
    }
}

main: () = {
    // if<> is not a keyword here.
    x: = if<>(2 * 2 == 4, 1, 0);

    // if<bool> is not a keyword here.
    y: = if<bool>(2 * 2 == 4, true, false);

    // if<> is not a keyword here.
    z: = (2 * 2 == 4).if<>(1, 0);
}

But keywords cannot be templates, so template arguments after a keyword mean it is not a keyword.

This requires empty template argument list as described in this bug.

@msadeqhe
Author

msadeqhe commented Jun 17, 2023

But keywords cannot be templates, so template arguments after a keyword mean it is not a keyword.

Also, it depends on whether Cpp2 will have Cpp1-style static_cast<X>(arg), dynamic_cast<X>(arg), etc. (keywords that look like templates).

Or would Cpp2 have stand-alone <something> as an expression (without identifier before it)?

if <something> { ... }

That's suggested in this issue.

Repository owner locked and limited conversation to collaborators Aug 30, 2023
@hsutter hsutter converted this issue into discussion #642 Aug 30, 2023
