Operator definitions (in standard library) #1267

Closed
skyfex opened this issue Jul 20, 2018 · 7 comments
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.

@skyfex

skyfex commented Jul 20, 2018

This proposal is superficially similar to previous operator-overloading proposals (see #427 and #871), but I think there's a case to be made for a related proposal.

This is also related to proposals like complex numbers and vector operations in the standard library (see #947).

Proposal

There should be a syntax for defining operators and the name of the function that implements them (in the cases where the operator is not implemented by a single instruction). This syntax could/should only be permissible in the standard library.

Example:

operator (a / b) __udivdi3(a: u64, b: u64) u64 { ... }
operator (a + b) addVec3(a: Vec3, b: Vec3) Vec3 { ... }

An operator should ideally be verified by the compiler to have no side effects of any kind (see #520). Possibly there should be other, even stricter requirements (no non-inline calls?).
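For context on the first example: a routine such as `__udivdi3` is what a compiler calls when the target has no instruction for 64-bit unsigned division. Below is a minimal sketch in C of what such a routine computes. The name `soft_udiv64` and the shift-and-subtract algorithm are assumptions chosen for illustration; this is not the actual compiler_rt or libgcc implementation.

```c
#include <stdint.h>

/* Hypothetical stand-in for a runtime routine like __udivdi3.
   Computes n / d by restoring (shift-and-subtract) division,
   producing one quotient bit per loop iteration.
   Division by zero is not meaningful; this sketch just returns 0. */
uint64_t soft_udiv64(uint64_t n, uint64_t d) {
    if (d == 0) {
        return 0;
    }
    uint64_t quotient = 0;
    uint64_t remainder = 0;
    for (int i = 63; i >= 0; i--) {
        /* Bring down the next bit of the dividend. */
        remainder = (remainder << 1) | ((n >> i) & 1u);
        if (remainder >= d) {
            remainder -= d;
            quotient |= (uint64_t)1 << i;
        }
    }
    return quotient;
}
```

The loop and the branch inside it are the kind of hidden cost the thread discusses: a single `/` in the source can expand to dozens of instructions on such a target.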

Rationale

(Not in order of importance)

  1. Niceness - I've always found it nice when operators are defined in the standard library (see Julia and Nim). It makes the language/compiler feel less magic, and you document which operators are present and which types they are defined for in a way that's guaranteed to be correct and up to date.

  2. Explicit over implicit and magic - Which operators have an implementation in the compiler_rt library (and the mapping from operator to function name) feels awfully magic. I think all this is closely tied to LLVM? If this could be handled more explicitly in the standard library, it could make it easier to port to other platforms, implement backends other than LLVM, etc.

  3. Easier to add operators on other types - There's good arguments against operator overloading in general, but if we can add operators to the standard library, we could more easily add operators on complex numbers and vectors. Only supporting it for your standard ints and floats is pretty arbitrary from a computer architecture standpoint after all. Not really a good reason for it other than "C did it".

  4. More future-proof. Both GCC and Clang/LLVM have support for vector extensions to C. This says a lot about the desire for vector operations in C, but because they can't be added in a standardised way, we've ended up with ugly extensions.

  5. Easier to better support various CPU and DSP architectures.
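To make rationale 4 concrete, here is a minimal sketch of the C vector extension that both GCC and Clang accept. The `vector_size` attribute and the lane-wise `+` are part of that extension; the names `vec4` and `add_vec4` are made up for this example.

```c
/* GCC/Clang vector extension: four floats packed into 16 bytes.
   This is a compiler extension, not standard C. */
typedef float vec4 __attribute__((vector_size(16)));

/* The built-in + operator applies lane by lane to vector types. */
vec4 add_vec4(vec4 a, vec4 b) {
    return a + b;
}
```

On targets with SIMD support this typically compiles to a single vector add, but the syntax is not portable to compilers that lack the extension.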

Discussion

The only downside I can see to this proposal is that people might be confused that operator definitions exist but are only allowed in the standard library. I think that's a small price to pay.

If Zig only allows operators on a certain set of types, it should have a clear philosophy behind it. Why only ints and floats? Because they are implemented by machine instructions? But some architectures don't have instructions for float types, and some have them for vector and complex types. I think Zig should implement them in all cases where there exists a (non-obscure) architecture that implements them.

As an aside: It'd be kind of cool if the compiler could take some Zig source code and translate all the operators into explicit function calls and write out the resulting code. It'd also be cool if the standard library defined assembly for an operator even if it's just a single instruction. That would make it nice for people who are just starting to learn about assembly and processor architecture, even if the code is ignored by the compiler. You could also imagine a super-simple compiler backend (like TCC for C) that actually used those definitions. Such a backend could make it easier to get started on new and custom architectures maybe?

andrewrk added this to the 0.4.0 milestone Jul 20, 2018
andrewrk added the proposal label Jul 20, 2018
@thejoshwolfe
Contributor

This proposal seems to be proposing adding a lot of complexity to the language for the sake of better documentation for a fixed set of functions. This doesn't seem necessary.

> I've always found it nice when operators are defined in the standard library (see Julia and Nim). It makes the language/compiler feel less magic, and you document which operators are present and which types they are defined for in a way that's guaranteed to be correct and up to date.

I think we're going to have to settle for regular ol' documentation for this. If it makes you feel any better, these aren't names invented by Zig (if they were, they would be named better); they're used by GCC and LLVM: https://gcc.gnu.org/onlinedocs/gccint/Integer-library-routines.html

The standard library should be as un-special as possible. I don't like the idea of adding not just syntax but also restrictions on the syntax for a subset of zig code.

> There's good arguments against operator overloading in general,

And that argument is in the zen of zig: no hidden control flow. This has been discussed before and is unlikely to change in any future version of zig.

@skyfex
Author

skyfex commented Jul 24, 2018

> This proposal seems to be proposing adding a lot of complexity

A lot? That seems like an exaggeration to me.

> If it makes you feel any better, these aren't names invented by zig (if they were, they would be named better); they're used by gcc and llvm.

Zig shouldn't be bound by C/GCC/LLVM's bad decisions. I also think it's a mistake to be too bound to what LLVM is doing just because it's the first backend. Hopefully Zig will have more compilers/backends at some point. It would be great to have the mapping of operators to functions defined strictly somewhere, so that you ensure interoperability between compilers, no?

> The standard library should be as un-special as possible.

But it is already special. It defines the compiler_rt functions, whose use is incredibly magical and special. Yeah, you could in theory define them outside the standard library, but that would be very problematic.

I do agree with the statement though. Maybe we could make a "language definition file" of some sort instead. Or what about a "global scope library" file? Operators are essentially global functions. Perhaps people could change this file if they want to use Zig for very special-purpose code like shader languages, DSP code, etc. where they may want to provide a bunch of global functions for the user.

> And that argument is in the zen of zig: no hidden control flow.

But that's clearly a lie. A convenient lie is OK, but if you're going to do it you should go out of your way to make it clear, explicit and out in the open. There's also a pretty simple solution to this: don't allow control flow in operator functions, and force them to be inlined. You could still define all the operators that most people want (algebraic operators on vectors and complex numbers).
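As a rough illustration of the restriction suggested above (no control flow, always inlined), here is what such an operator body could look like, sketched in C. The `Vec3` struct and `vec3_add` are hypothetical names, and C cannot enforce either rule: `static inline` is only a hint to the compiler.

```c
/* Hypothetical three-component vector type. */
typedef struct {
    float x;
    float y;
    float z;
} Vec3;

/* Straight-line body: no branches, loops, or calls, so a compiler
   could in principle verify "no hidden control flow" syntactically. */
static inline Vec3 vec3_add(Vec3 a, Vec3 b) {
    Vec3 result = { a.x + b.x, a.y + b.y, a.z + b.z };
    return result;
}
```

Algebraic operators on vectors and complex numbers fit this shape; routines such as software division do not, because they need loops or branches.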

If you don't ban control flow inside operators, but simply issue a warning, you would actually get some really useful warnings. It should be possible to suppress those warnings, but for inexperienced programmers working on microcontrollers and such it'd be very useful to know that the compiler has magically introduced functions that have hidden control flow and consume a boat-load of CPU cycles.

I agree that it seems like a lot of effort for a small thing, but the way I see it, the effort fixes one of the very few parts of Zig that are ugly and tied to "whatever C was doing". I think it's worth it.

Here's a pretty clear argument why something like this (doesn't have to be this exact proposal) is necessary:

Look at how many C based shading languages there are. I think Zig would make a great shading language, but you absolutely need vector operators for this application. If Zig doesn't add operators for vectors, you'd need to fork the language itself. You'd need to maintain an entire copy of the language specification just to redefine the operators. In that case I think people are more likely to continue to use the existing C based languages. But if you only need to swap out the standard library or some other special file (which is pretty reasonable for such a special case as a shader language) then you could use Zig out of the box.

Now, you could say that it'd be easier to solve the problem by just adding operators on vectors into the language. This is problematic for two reasons. One reason is that int and float are special to the language, while a vector type would probably not be. It's logical that hardcoded operators are only defined on special built-in types, but if you support them for vectors it seems arbitrary which types have operators. The other reason is: which architectures are you going to support built-in operators for? Whatever you choose, it's going to seem entirely arbitrary. I mean, the current selection is really just based on what the common architectures supported over two decades ago.

Making such arbitrary decisions in the core language is pretty bad, but such decisions are pretty common in the standard library.

@skyfex
Author

skyfex commented Jul 24, 2018

I changed the example from operator "+" to operator (a + b). This should require fewer changes to the parser, and it's a more flexible and explicit syntax.

@0joshuaolson1

0joshuaolson1 commented Jul 24, 2018

DSLs don't make much sense to me if they share the base language's grammar, in which case purely syntactical convenience doesn't get you very far in design space from just calling library functions anyway.

Also, are the current 'special' operations guaranteed atomic in some situations? Vector addition would lose that.

Unless SSE or whatever LLVM supports. Which is good - LLVM's performance-enhancing features aren't standardized with say GCC, but (touching another topic) neither are its alternatives I don't foresee needing to prematurely abstract Zig for.

@skyfex
Author

skyfex commented Jul 24, 2018

> DSLs don't make much sense to me if they share the base language's grammar

We're not really talking about enabling DSLs though. We're talking about supporting architectures other than your standard x86 and ARM. Or even just supporting x86 and ARM better than C does. Remember that modern versions of these have single instructions for much more than just int and float.

> Also, are the current 'special' operations guaranteed atomic in some situations? Vector addition would lose that.

Not that I'm aware of. That's an impossible guarantee to make, as even 32-bit integer additions are not atomic on an 8-bit architecture, for instance.
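A small C sketch of why that is: on a CPU with only 8-bit arithmetic, a 32-bit add has to be carried out as four byte-wide adds with carry propagation, so another thread or interrupt could observe a half-updated value in between. The function name `add32_bytewise` is made up for this illustration, and the loop only models what an 8-bit target would do in separate instructions.

```c
#include <stdint.h>

/* Models a 32-bit addition on an 8-bit machine: one byte-wide add
   per step, passing the carry to the next more significant byte. */
uint32_t add32_bytewise(uint32_t a, uint32_t b) {
    uint32_t result = 0;
    unsigned carry = 0;
    for (int i = 0; i < 4; i++) {
        unsigned byte_a = (a >> (8 * i)) & 0xFFu;
        unsigned byte_b = (b >> (8 * i)) & 0xFFu;
        unsigned sum = byte_a + byte_b + carry;
        result |= (uint32_t)(sum & 0xFFu) << (8 * i);
        carry = sum >> 8; /* 0 or 1 */
    }
    /* A carry out of the top byte is dropped, matching wrapping uint32_t addition. */
    return result;
}
```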

> Which is good - LLVM's performance-enhancing features aren't standardized with say GCC, but (touching another topic) neither are its alternatives I don't foresee needing to prematurely abstract Zig for.

Not sure what you're getting at here. Both Clang/LLVM and GCC support vector intrinsics and C extensions for using operators on vectors (MSVC does not, though).

@BarabasGitHub
Contributor

BarabasGitHub commented Jul 24, 2018

I'm always a bit torn on this issue. On the one hand it's nice to have arithmetic operators on vector units or maybe even arrays (like in Fortran), but on the other hand it quickly becomes ambiguous or non-trivial what the result will have to be.

For example, for vectors, would a multiply (*) be element-wise, or a dot or cross product? What happens when you do equals (==)? Should it return a single boolean or a boolean vector?

There's a discussion about vectors here: #903
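The ambiguity is easy to see when the three candidate meanings of `*` are written out as separate functions. A minimal sketch in C, where `Vec3` and all three function names are hypothetical:

```c
/* Hypothetical three-component vector type. */
typedef struct {
    float x;
    float y;
    float z;
} Vec3;

/* Candidate 1: element-wise product, result is a vector. */
Vec3 vec3_mul_elementwise(Vec3 a, Vec3 b) {
    Vec3 r = { a.x * b.x, a.y * b.y, a.z * b.z };
    return r;
}

/* Candidate 2: dot product, result is a scalar. */
float vec3_dot(Vec3 a, Vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Candidate 3: cross product, result is a vector perpendicular to both. */
Vec3 vec3_cross(Vec3 a, Vec3 b) {
    Vec3 r = {
        a.y * b.z - a.z * b.y,
        a.z * b.x - a.x * b.z,
        a.x * b.y - a.y * b.x
    };
    return r;
}
```

All three are reasonable readings of `a * b`, and they do not even share a return type, which is why named functions sidestep the question.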

andrewrk modified the milestones: 0.4.0, 0.5.0 Sep 28, 2018
@andrewrk
Member

andrewrk commented May 3, 2019

> But it is already special. It defines the compiler_rt functions, whose use is incredibly magical and special. Yeah, you could in theory define them outside the standard library, but that would be very problematic.

This is a misunderstanding - compiler_rt is not actually part of the standard library. Zig ships with compiler_rt in source form and builds it from source lazily for the specified target because LLVM generates calls to these functions for some operations. The way things are set up is required in order to use LLVM. GCC does an equivalent thing with libgcc. But it doesn't restrict the Zig language; the language defines how the operators shall work, and compiler_rt is an implementation detail.

> As an aside: It'd be kind of cool if the compiler could take some Zig source code and translate all the operators into explicit function calls and write out the resulting code.

This is equivalent to building the code into machine code, isn't it? The operators and intrinsics (e.g. @sqrt) directly correlate to hardware instructions on some targets, and to function calls on other targets.

> If Zig only allows operators on a certain set of types, it should have a clear philosophy behind it. Why only ints and floats? Because they are implemented by machine instructions?

Yes, because they are implemented by machine instructions.

> But some architectures don't have instructions for float types, and some have them for vector and complex types. I think Zig should implement them in all cases where there exists a (non-obscure) architecture that implements them.

I agree - and that's the philosophy behind it. Note that since this proposal was made, Zig now has Vectors / SIMD (see #903).

As for the overall proposal, thank you for taking the time to type it up, explain the context, and defend it.

The proposal has some ambitious ideas about how programming languages should work and be future-proof. However, I'm fairly confident that the status quo will be better for the success of the Zig project. You mentioned some interesting use cases along the way, such as an experimental new architecture. I would be interested in discussing such use cases separately.

5 participants