Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Defining zero() seriously #34003

Open
jiahao opened this issue Dec 2, 2019 · 45 comments
Open

Defining zero() seriously #34003

jiahao opened this issue Dec 2, 2019 · 45 comments
Labels
julep Julia Enhancement Proposal maths Mathematical functions

Comments

@jiahao
Copy link
Member

jiahao commented Dec 2, 2019

(This issue is a follow-up to #34000 #28854 #31303)

The docstring for zero(x) says

Get the additive identity element for the type of x

However, defining this additive identity precisely seems to be confusing, especially for union types, and in the presence of type promotion rules. The rest of this issue is dedicated to exploring more precisely what the semantics of zero(T) is (to help figure out what exactly the semantics should be).

  1. In many situations, zero(::Union) is undefined when it could be defined. Intuitively, one might expect that the presence of multiple zeros might complicate the definition. However, type promotion seems to allow for a uniquely defined result. If we take the definition above seriously, the statements
julia> Float16(0.0) + Float32(1.0) === Float32(1.0) #works for Float32
true

julia> Float16(0.0) + Float16(1.0) === Float16(1.0) #and for Float16
true

would argue that zero(Union{Float16,Float32}) === Float16(0.0) (it is currently undefined).

  1. A counterargument to the above would be that Int(0) is also a valid choice for zero(Union{Float16,Float32}), since
julia> 0 + Float16(1.0) === Float16(1.0) # works because of type promotion
true

However, the statement above would also imply that Int(0) is a valid choice for zero(Float16).
We can exclude this value by requiring zero() to be endomorphic, i.e. zero(::T) :: T, and by extension zero(T) :: T.

  1. Int(0) appears to be the result of zero(T) wheneverT <: Union{Missing,Number,Complex}, even though it is not always the unique (or even correct) additive identity in T:
julia> zero(Union{Int,Complex})
0

julia> zero(Union{Real,Complex}) #due to type promotion, see #2
0

julia> 0 + Complex{Bool}(0) === Complex{Bool}(0) # Counterexample: actually Complex{Int}(0), so Int(0) is not an additive identity over Complex
false

julia> Int(0) + Int16(0) === Int(0) #Counterexample: Int16 <: Real but adding Int(0) to it promotes to Int
true

julia> Int(0) + false === Int(0) #Counterexample: Bool <: Real but adding Int(0) to it promotes to Int
true

Proposal

  1. zero(T) for non-union types should be defined as the unique additive identity such that
    a. zero(T)::T, i.e. zero() is an endomorphism, and
    b. The identity zero(T) + zero(T) === zero(T) is satisfied for all T.

  2. zero(Union{T,S}) === zero(T) when zero(T) + zero(S) === zero(S), taking into account type promotion rules.

  3. If no unique endomorphic zero exists, then zero(T) should be undefined, except for zero(Bool) === false as a special case.

This definition would preserve the endomorphic property for leaf types, like zero(Float16) === Float16(0.0).

This definition would preserve the result for zero(Bool) === false, to be consistent with current arithmetic semantics of Bool.

This definition would preserve the semantics of #28854 for zero(Missing) === missing.

This definition would define zero(Union{Float16,Float32,Float64}) to be Float16(0.0) (currently undefined).

This definition would define zero(Union{Float16,Int16}) to be Int16(0) (currently undefined).

This definition would change zero(Real) to be false (and similarly in Case 3 above; currently Int(0)); arguably a bug fix.

This definition would change zero(Union{Bool,Int}) to be false (currently Int(0)); arguably a bug fix.

@jiahao
Copy link
Member Author

jiahao commented Dec 2, 2019

The two cases that would change under the proposal above would be preserved if we defined zero(Bool) === Int(0), which is arguably the actual arithmetic semantics being described. Might be less of a breaking change to pick Int(0) over false.

@StefanKarpinski StefanKarpinski added julep Julia Enhancement Proposal maths Mathematical functions labels Dec 2, 2019
@oscardssmith
Copy link
Member

Would it make sense to define a zero singleton that just adds to everything the way we want it to, or would that break type stability?

@piever
Copy link
Contributor

piever commented Dec 3, 2019

I hope this does not add too much noise to the discussion, but a similar need (of a "symbolic zero" object z, such that a+z === a for all a), was expressed in FluxML/Zygote.jl#275. I imagine this would be more useful now that in the past, as the type instability is not so bad for performance due to small union optimizations.

@StefanKarpinski
Copy link
Member

Would it make sense to define a zero singleton that just adds to everything the way we want it to

That's already essentially what false is.

@MasonProtter
Copy link
Contributor

MasonProtter commented Dec 3, 2019

As it happens I was just thinking about this here: FluxML/Zygote.jl#329 (comment)

I think it would make sense to have an analogue to LinearAlgebra.I, perhaps LinearAlgebra.Z (which I guess we can't export until 2.0) which has the right properties of zero just like UniformScaling. Call it UniformZero or whatever.

@StefanKarpinski

That's already essentially what false is.

Maybe for Numbers, but not anything else:

julia> false + [1 0; 0 1]
ERROR: MethodError: no method matching +(::Bool, ::Array{Int64,2})

Also, as @MikeInnes has pointed out, you can't dispatch on false which is sometimes desirable.

@jiahao
Copy link
Member Author

jiahao commented Dec 7, 2019

Another use case I ran into is defining zero(::Nothing) = nothing - permissible if considering that Nothing has only one element (nothing), then one can define the trivial algebra over it. It turned out to be useful in an edge case with the method

ForwardDiff.partials(x, i...) = zero(x)

@nalimilan
Copy link
Member

I'd rather not define any arithmetic operation on nothing. Its main raison d'être is that it throws an error as soon as you try to do something with it, which helps catching bugs. We have missing when you need propagation.

@kmsquire
Copy link
Member

kmsquire commented Dec 8, 2019

Would it make sense to define a zero singleton that just adds to everything the way we want it to

That's already essentially what false is.

So, this suggests that, when writing code that is supposed to be generic over numbers, we should always use false instead of 0? 🧐

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Dec 9, 2019

Yes, that's essentially what it suggests. And true for the generic unit value.

@kmsquire
Copy link
Member

kmsquire commented Dec 9, 2019

But doesn’t using false to mean “the additive identity” (0) and true as “the additive unit value” (1) seem like we’re overloading the meaning of what are traditionally just Boolean values?

@MikeInnes
Copy link
Member

I think it seems more sensible if you think of Bool as equivalent to UInt1; boolean operations are just special cases of the bitwise operations.

@kmsquire
Copy link
Member

kmsquire commented Dec 11, 2019

Even if it can be justified, it still seems strange to me to write

a += true

in order to add 1 to a in a generic way.

@StefanKarpinski
Copy link
Member

I don't disagree; what would you prefer to write?

@MasonProtter
Copy link
Contributor

MasonProtter commented Dec 11, 2019

We already have a better generic 1: LinearAlgebra.I.

julia> using LinearAlgebra

julia> x = 5f0 + im; x + I
6.0f0 + 1.0f0im

julia> julia> rand(5, 5) + I
5×5 Array{Float64,2}:
 1.78862   0.772261  0.661623  0.913183  0.390289
 0.366553  1.78299   0.545654  0.433197  0.36714
 0.969424  0.646091  1.02947   0.68133   0.284309
 0.14118   0.273732  0.127835  1.35712   0.424081
 0.180714  0.120144  0.712769  0.548235  1.18592

I think moving UniformScaling to Base and adding an analogous UniformZero would be a nice solution.

@mcabbott
Copy link
Contributor

Note that I imposes some restrictions on shape, e.g. UniformScaling(0) + rand(3,4) is an error. And its type doesn't contain its value. Unlike the Zero() singleton linked above.

@kmsquire
Copy link
Member

Taking into account the discussion above (and trying to link it back to the original issue):

Regarding the zero singleton (for number types and ignoring matrices), my preference would be to simply use 0 and 1 for the additive identity and additive unit value, respectively.

This is currently problematic, of course, since these are Ints, and standard promotion rules apply (hence the existence of zero(T), one(T) and oneunit(T) and the original proposal by @jiahao ).

Some proposals to address this:

  1. Set zero() = false and one() = true, and use zero() and one().

  2. Create Zero() and One() (and maybe Oneunit()) singletons, as proposed above.

In either case, fix the issues raised by @jiahao in the original issue.

From a user’s perspective, there isn’t much difference between these.

An additional nicety would be to have the compiler convert actual 0s and 1s to zero(), one(), or oneunit() of it can determine that the corresponding 0 or 1 is being added or multiplied with something other than an Int. Would this be too hard, fragile, or magical?

@tkf
Copy link
Member

tkf commented Dec 12, 2019

If you are going with singleton zero and one, I think it is worth considering supporting arbitrary monoids, as doing so does not require much more effort. I did this in InitialValues.jl and it simplified code doing reduction a lot. The idea is to define something like

abstract type Monoid <: Function end

struct Id{F}
    op::F
end

struct Add <: Monoid end
const + = Add()
struct Mul <: Monoid end
const * = Mul()

(::F)(::Id{F}, x) where {F <: Monoid} = x
(::F)(x, ::Id{F}) where {F <: Monoid} = x
(::F)(x::Id{F}, ::Id{F}) where {F <: Monoid} = x

although Julia does not seem to support the code like the last three lines. But you can use macro to automate it (and defining structs like Add).

Regarding the zero singleton (for number types and ignoring matrices), my preference would be to simply use 0 and 1 for the additive identity and additive unit value, respectively.

One possibility to making code concise with those constants is to use Unicode:

const 𝟘 = Zero()
const 𝟙 = One()

@matthieugomez
Copy link
Contributor

@jiahao You also need to think about the case where there is no endomorphic zero, such as for Date. Note that zero(x::Date) is defined to be Day(0) in #35554

@goretkin
Copy link
Contributor

xref: #18367 (comment) (and onward)

@oschulz
Copy link
Contributor

oschulz commented Jun 11, 2020

I think it would be very cool if we had true singletons for one and zero, and maybe for +/- infinity, too.

I recently came across a need for this with samples and weights. In some use cases, the samples are unweighted, in others not, and I want to write some generic code that can handle both scenarios. So it would be nice to have weights = fill(One(), n) use zero memory, and for operations like all(isone, weights) to run in zero time.

Also, since we already have singleton for things like π and , we could have sin(π) == 𝟙, log(ℯ) == 𝟙, log(𝟘) == MinusInf(), and so on (if that would be considered not too breaking).

@stevengj
Copy link
Member

As I noted elsewhere (#36103 (comment)) adding a Zero() singleton type would tend to lead to type-unstable code, since the main reason to use it seems to be for accumulation where you aren't sure if the collection is empty.

@oschulz
Copy link
Contributor

oschulz commented Jun 13, 2020

I would rather see it's main use in generic code, where you'd use an array of One in one scenario, and an array of some numeric type in another (e.g. my example with statistical weights). But which of the two it is would be known at dispatch time, and the code would be type stable.

I wouldn't use it for accumulation, etc.

@nalimilan
Copy link
Member

Note that UnitWeights in StatsBase is essentially the array of ones what you describe.

@oschulz
Copy link
Contributor

oschulz commented Jun 13, 2020

Yes, but constructs like UnitWeights don't allow for single-element representation (which then later can be combined in an array again). It's actually one thing I really dislike about the StatsBase weights implementation - no way to have single elements. So I can never represent a single variate with it's weight without loosing what kind of weight it is. It also leads to behavior like broadcast(identity, uweights(10)) isa Array{Int64}. I prefer the way Unitful does this: tagging values, not arrays explicitly (but implicitly, of course).

But I don't want to focus too much on a specific use case here - I think having zero-mem-cost arrays of ones and zero that can preserve their property under broadcasting (e.g. fill(𝟘, 10) .* rand(10) isa Array{Zero}), possibly with special optimizations (one could consider things like fill(𝟙, 10) .* A === A) would very useful for generic algorithms that produce optimal code.

@oschulz
Copy link
Contributor

oschulz commented Jun 13, 2020

StaticNumbers.jl provides some of this, but it's quite heavyweight (has to be) and also can't define things like sin(π) = 𝟘 without pirating Base. And zero, one and +/- infinity are just special, so I think there's a good case to be made for having singletons for them as part of Julia itself.

@mcabbott
Copy link
Contributor

A nice idea for One from @YingboMa (on slack) would be to do this:

_strides(A::DenseArray) = (One(), Base.strides(A)[2:end]...)

Perhaps combined with #30432, as a lighter-weight alternative to ArrayLayouts.jl.

@tkf
Copy link
Member

tkf commented Jun 13, 2020

As I noted elsewhere (#36103 (comment)) adding a Zero() singleton type would tend to lead to type-unstable code, since the main reason to use it seems to be for accumulation where you aren't sure if the collection is empty.

@stevengj Base now has a singleton _InitialValue which is like the singleton identity element but "untyped" (in the sense it is not aware of the reducing function op; so, somewhat less type-safe). This generates very efficient code thanks to union-splitting and it's possible to improve this much more if you use the tail-call function-barrier pattern #34293 (which is like forced union-split). This is something you need to have anyway to implement something like collect and map based on mutate-or-widen strategy. So, I think singleton identity elements actually play nicely with well-written generic code in Julia.

@oschulz
Copy link
Contributor

oschulz commented Jun 22, 2020

Will _InitialValue become part of the official API at some point, if it works out well?

@tkf
Copy link
Member

tkf commented Jun 22, 2020

Yeah, it'd be nice to turn this into an API. But, for defining foldl etc., I think it'd be much better to provide overload API like _foldl_impl so that the implementers do not care about the initial values in the first place. (But then if we are adding this API it'd be better to add the full transducer-compatible foldl API...)

@oschulz
Copy link
Contributor

oschulz commented Jun 22, 2020

I have a feeling these are really separate things:

  • A kind of generic neutral element, for things like reductions, etc. - it's not really a zero, always, semantically

  • Singleton types for special values like one, zero and possibly +- infinity.

@tkf
Copy link
Member

tkf commented Jun 22, 2020

My intention was to share that there is enough usage experience for supporting the usefulness of singleton identity elements. It just happened to be reductions (although it's not a coincidence; that's where you really need to take "zeros" seriously).

For other special values like oneunit, infinity, etc. I think we need to accumulate usage experience before considering adding it to Base. So, somebody has to write a package and use it in the wild; e.g., make sure it works well with FillArrays.jl. (Yes, it'd be nice sin(π) works as-is. But you can always wrap π etc.)

@oschulz
Copy link
Contributor

oschulz commented Jun 22, 2020

Oh, sorry if my statement sounded critical - it wasn't meant to be. I absolutely do agree with you, I think having singleton identity elements would be extremely useful!

I was just trying to disentangle them from the mathematical one and zero singletons, since I had a feeling that some doubts expressed here about having Zero() were that it wouldn't a good identity element.

So, somebody has to write a package and use it in the wild

Yes, I was actually thinking about putting together a "OnesAndZeros.jl" or "SpecialNumbers.jl", as a proof of concept. I'm not sure yet how much is possible without committing type piracy regarding to Base, though. But maybe quite a bit. :-)

Maybe it can have a commit_base_type_piracy() function that enables some extra functionality. :-)

@oschulz
Copy link
Contributor

oschulz commented Jun 22, 2020

As for sin(π) == 𝟘, I'm not sure how many real-world use cases there would be that would actually profit from it, performance-wise. But it just would be so elegant ...

@tkf
Copy link
Member

tkf commented Jun 24, 2020

I was just trying to disentangle them from the mathematical one and zero singletons, since I had a feeling that some doubts expressed here about having Zero() were that it wouldn't a good identity element.

Got it. Thanks for clarifying this.

My worry is that the same "doubt" could be brought up even for other generic special singleton uses other than the initial values of reductions. If it's in a common API, it could be end up in the state variables of the loops. Of course, this may not happen so frequently because the special singletons are mostly for dispatches (IIUC). Having something like "SpecialNumbers.jl" in the ecosystem would be great for understanding this.

@oschulz
Copy link
Contributor

oschulz commented Jun 24, 2020

Ok, I'll go ahead with that, then, hopefully soon.

@oscardssmith
Copy link
Member

I hate to suggest broadening scope, but given this issue and the massive number of trig+pi (sinpi, cospi etc) functions we are starting to get, should this issue instead be taking symbolic numbers seriously and add a Symbolic type which subtypes Number which is similar (and possibly deprecates for 2.0) current Irrational numbers? Then we could have make a non symbolic times a symbolic number yield a special type so sinpi(3) be written as the imo more elegant sin(3*pi).

@tkf
Copy link
Member

tkf commented Jun 25, 2020

I think it makes sense to explore the design of singleton/partially-typed numbers here.

By the way, the topic in the OP is still important but unfortunately not much discussed. I think it's "orthogonal" to singleton numbers.

@jiahao How about re-posting the OP as a new issue? Something similar is required for, e.g., #31635. It'd be nice to have a dedicated place to discuss this.

@oschulz
Copy link
Contributor

oschulz commented Jun 25, 2020

By the way, the topic in the OP is still important but unfortunately not much discussed. I think it's "orthogonal" to singleton numbers.

I fully agree!

@oschulz
Copy link
Contributor

oschulz commented Jun 25, 2020

@perrutquist just pointed me to Zeros.jl so my ideas for a SpecialNumbers.jl are largely realized, already!

@oschulz
Copy link
Contributor

oschulz commented Jun 25, 2020

Zeros.jl even includes has a Zeros.@pirate Base, akin to the commit_base_type_piracy() I had been speculating about.

@oschulz
Copy link
Contributor

oschulz commented Jun 25, 2020

It's really cool to see that

using Zeros, BenchmarkTools
A = fill(One(), 10^6)
@benchmark all(isone, $A)

gives a mean run time of 1 ns, even though Zeros.jl doesn't include a specialized method for that. Kudos to @perrutquist and the Julia compiler team!

@StefanKarpinski
Copy link
Member

It would be weird if it took time to do a computation on something that takes no memory to store 😁

julia> sizeof(A)
0

@oschulz
Copy link
Contributor

oschulz commented Jun 25, 2020

It would - still, it's cool to see how Julia optimizes the for-loop in Base._all(isone, A, :) away completely. :-)

@oschulz
Copy link
Contributor

oschulz commented Jun 25, 2020

Even count(isone, A) runs in 1 ns - I didn't realize that Julia could turn a counting for-loop into a predictable value. Or is that LLVM itself?

@oschulz
Copy link
Contributor

oschulz commented Jun 25, 2020

It would be weird if it took time to do a computation on something that takes no memory to store

It's not quite true, tough. :-)

using Zeros, BenchmarkTools
A = fill(One(), 10^6)
@benchmark sum($A)

takes 10 μs.

But foldl(+, A, init = 0) and mapreduce(identity, +, A, init = 0) do run in 1 ns.

It's strange - Base.sum(A; dims = :) takes 10 μs, but mapreduce(identity, +, A, init = 0, dims = :) takes 1 ns. mapreduce(identity, +, $A, init = 0, dims = 1) takes 10 μs, though, like sum. But doesn't sum(A) use mapreduce with dims = :, internally?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
julep Julia Enhancement Proposal maths Mathematical functions
Projects
None yet
Development

No branches or pull requests