
Untyped number literals #2995

Closed
oprypin opened this issue Jul 14, 2016 · 28 comments · Fixed by #6074

Comments

@oprypin
Member

oprypin commented Jul 14, 2016

I suggest that number literals like 7 should not be immediately bound to the type Int32, but instead they should be implicitly convertible to any Number. Similarly, fractional number literals like 3.43 should be implicitly convertible to any Float. To be more exact, no "conversion" would ever take place, ideally the literals would be untyped until they're actually used.

In situations where a concrete type has to be chosen anyway (e.g. when writing a = 0), the literals default to Int32/Float64 as they do now.
Typed literals like 1.2f32 keep working unchanged.
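
A quick illustration of those defaults and suffixes (this is just current behaviour, which the proposal keeps):

a = 0       # no context forces a type: defaults to Int32
b = 3.43    # defaults to Float64
c = 1.2f32  # explicit suffix: Float32

puts typeof(a)  # => Int32
puts typeof(b)  # => Float64
puts typeof(c)  # => Float32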

This change is expected to be backwards compatible; it just leads to less verbose code and more permissive compilation, with no downsides (other than compiler complexity), in my opinion.


In other words, I'm sick of these errors:

def shoot(x : Float32, y : Float32)
end

shoot(1.23, 4.56)
# no overload matches 'shoot' with types Float64, Float64
record Point, x : Float64, y : Float64

Point.new(0, 0)
# no overload matches 'Point.new' with types Int32, Int32
record Color, r : UInt8, g : UInt8, b : UInt8, a : UInt8 = 0

Color.new(255, 128, 0)
# no overload matches 'Color.new' with types Int32, Int32, Int32

Color.new(255u8, 128u8, 0u8) 
# still broken, can you see why?
# instance variable '@a' of Color must be UInt8, not Int32

This change would make these examples work.
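
For comparison, this is roughly what the same calls have to look like under the current rules, with explicit suffixes everywhere (including the default value of a):

shoot(1.23f32, 4.56f32)

Point.new(0.0, 0.0)

record Color, r : UInt8, g : UInt8, b : UInt8, a : UInt8 = 0u8
Color.new(255u8, 128u8, 0u8)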

@refi64
Contributor

refi64 commented Jul 14, 2016

To an extent, this would probably be classified as a basic form of type deduction.

@oprypin
Member Author

oprypin commented Jul 14, 2016

This probably shouldn't change (still a union of two types):

a = 5i8
a = 10 if true

An interesting complicated case:

struct Point(T)
  def initialize(@x : T, @y : T)
  end
end
Point.new(1, 0.5)  # Should be Point(Float64) - this is arguable

@asterite
Member

Yes, literals should be special. It's a bit hard to implement, but it's definitely a good idea, and we'd also like the language to work this way.

Question: should 1 pass as a BigInt too? :-)

@oprypin
Member Author

oprypin commented Jul 14, 2016

@asterite I think we should consider only numbers that have literals.

@asterite
Member

I mean:

def foo(x : BigInt)
end

foo(1)
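
For reference, today that call only compiles with an explicit conversion, along these lines:

require "big"

def foo(x : BigInt)
end

foo(BigInt.new(1))  # or foo(1.to_big_i)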

@asterite
Member

We could probably make it work in those cases too, by hardcoding the rules in the compiler. That should cover most of the cases. I still need to think about how to implement this...

@oprypin
Member Author

oprypin commented Jul 14, 2016

@asterite I mean you can't create a BigInt directly with a literal, so it does not apply. It's not the reason why it shouldn't apply, just a criterion.

I think data types that are not tied to the compiler shouldn't be considered. It's very easy for this to get out of hand with all these special cases.

@ozra
Contributor

ozra commented Jul 14, 2016

I wouldn't mind also seeing BigInt / BigFloat get some more status - it would make scientific computing with Crystal a breeze!

@trans

trans commented Sep 25, 2016

Could there be a set of literal types, e.g. Int::Any, Float::Any, etc. These types are never used concretely but are coerced using conversion methods defined in them. That way it would be extensible. Is that a helpful way to approach it?

@ozra
Contributor

ozra commented Sep 25, 2016

I think a compiler-level, pragmatic "literal to type" mapping procedure would be most appropriate, instead of involving types and conversion methods. This would be deterministic and reasonable enough, imo:

Subject: number literal without kind-suffix:

Order of Choice:

  • If it's a real number (has . or e): try the literal as one of these, in order: Float64, Float32, BigDecimal?
  • If it's a whole number, try, in order: Int32, Int64, Float64, Float32, UInt64, UInt32, Int8, UInt8, BigInt?, Int16, UInt16

Contexts:

  • When used as an arg in a call: there could of course be a bunch of permutations and signature-match attempts when there are several arguments, unless the first one matches immediately. Often Int is used as the restriction, and then there will be no retries, so compilation time will hardly be bogged down. As for combinations, the same type on all loose args should be tried first, step by step, before combinations of different types (most reasonable earliest match, less processing).
  • When the lvalue is an ivar: look up the possible types of the ivar (available since the top-level phase) and select the first matching type according to the precedence list above (normally int ivars will have just one type rather than a union of ints). If none match, the ivar is obviously not of a number type; error as usual.

I find the orders above reasonable for the preferred match (should several possible signature matches exist). The 16-bit types really take a back seat, since they're unlikely to ever be used in signatures except where there is an overload for every int type (low-level stuff), and even then they're not the preferred match when slapping on an unspecified literal. The reasonable use case for them is compact data in structs, arrays and such.
Whether Int32 or Int64 should come first is debatable. Right now it's Int32 "all the way", so that's why I imagined it like that.

As I've already mentioned, I like the idea of letting BigNums in on the game here :-)

What do you think?

@trans

trans commented Sep 25, 2016

Thinking about it some more, me thinks it's complicated 😉

For instance, would the compiler be smart enough to know:

200.times do |i|
    i  # is Int8
end

But then

200.times do |i|
    j = i + 57  # ruroh
end

And it would have no idea for:

def x(n)
  n + 1
end
x(1)

And then

def x(n)
  n + 5000000000
end
x(1)

If n is int32 we are almost certainly going to get the wrong answer there. We could put the literal first, but most literals are put at the end in my experience. And in any case it is weird that + would not be commutative.

Maybe this is all easier to do than I realize, but it sure seems like a hairy mess.

Maybe a simple way to handle it is to always default to the smallest size (for speed), then have operations always up-size to the next size (for accuracy):

Int8 + Int8 -> Int16

And then at certain points in execution the compiler can override this if it can figure out where they can be reduced. But the programmer can also help by telling it where to do it, e.g.

    def x(n : Int8)
       ...
    end

    a = 10         # Int8
    b = a + a      # Int16
    x(b)           # error no x(Int16)
    ~b             # pseudo-code for type reduce b, if possible
    x(b)           # ok, x(Int8)

I would expect the upscaling to max out at UInt64, but BigInt could get in on the act if a compiler flag is set?

Just sort of thinking out loud here.

@ozra
Contributor

ozra commented Sep 25, 2016

That's why I specifically left local variables out of the possible contexts: only the call-signature and ivar-assignment cases keep it simple.

@lbguilherme
Contributor

Order of Choice:

  • If it's a real number (has . or e): try the literal as one of these, in order: Float64, Float32, BigDecimal?
  • If it's a whole number, try, in order: Int32, Int64, Float64, Float32, UInt64, UInt32, Int8, UInt8, BigInt?, Int16, UInt16

I would go as far as saying that if there are multiple overloads taking different types of Int and the literal fits in more than one of them, then there should be an ambiguity error, not a predefined order. This is what I mean:

def foo(x : Int)
  p typeof(x)
end

foo(5) # Int32
foo(3.1) # Float64
foo(999999999999999999) # Int64

def bar(x : Int8)
  p typeof(x)
end

def bar(x : UInt16)
  p typeof(x)
end

def bar(x : Float32)
  p typeof(x)
end

bar(5) # ambiguous (between Int8 and UInt16)
bar(500) # UInt16
bar(-10) # Int8
bar(1.5) # Float32
bar(99999999) # Float32

def baz(x : Int8)
  p typeof(x)
end

baz(99) # Int8
baz(9999) # no overload matches

@RX14
Contributor

RX14 commented Sep 26, 2016

Shouldn't Int64 be the default now that 64-bit is the standard word size in most of the world's computers?

@lbguilherme
Contributor

@RX14 There are still advantages to using Int32 instead of Int64: it takes less memory in classes/structs, it is more cache friendly, and integer multiplication is a little faster on Int32 than on Int64, even on 64-bit hardware. I don't think those alone are reasons to keep using Int32, but I don't see a reason to change the default either.

@ozra
Contributor

ozra commented Sep 27, 2016

@lbguilherme I don't agree it should error in that situation.

I think @RX14 is on the right track with Int64 as the default, but this would cause classes, arrays, etc. in existing Crystal programs to potentially double in size, as @lbguilherme mentions, since many people rely on the compiler guessing ivar types.

Personally I'd like:

  • ivars must be type declared or default assigned - guessing will only be done from that.
  • if default assigned, thereby resulting in a guess: any number literals in the initializing expression must be suffixed with an explicit type.
  • literal numbers can now be tried as Int64 first.

The literal-typing should also apply to container-literals (array, etc.) that might grow substantially in size. As long as at least one literal is typed, the rest would be inferred from that in a given (T)-context (an array literal for instance).
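
A hypothetical sketch of how that could play out for an array literal (not current behaviour; today mixing 1u8 with plain literals produces a union element type):

a = [1u8, 2, 3]  # one suffixed literal fixes T, so this would be Array(UInt8)
b = [1, 2, 3]    # no suffixed literal: the default applies (Array(Int32) today, Array(Int64) if the default changes)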

A small amount of work in type-defs (class, struct, container-literals) lets us get away with less unnecessary explicitness in general code. It also makes it clearer what the type members are.

Regarding Int64 vs Int32, @lbguilherme - I did extensive benchmarks with real programs (doing copious amounts of calcs on large amounts of data, thereby affecting cycles and caching) in C++ a whole bunch of years ago, keeping all data-structures the same, only switching main int-size for code, and I found no measurable differences - except for integer division, where Int32 was noticeably faster. That last point might have changed since then with new processors though. (Important note: x86 tests only, and only three different machines)

So in general, using Int64 in code is only a pro ("higher roof").

@RX14
Contributor

RX14 commented Sep 27, 2016

Hmmn, yes. I can't imagine space is really an issue because we're only changing the typing of local vars. They live on the stack or in registers. Anything that makes it off the stack has to have its size declared anyway (class, etc).

@ozra
Contributor

ozra commented Sep 28, 2016

@RX14 - no, as mentioned: with the current way Crystal guesses (preliminary inference) ivar types in classes, it would affect the size of a lot of classes in existing code bases. Or did you mean in the context of a change similar to what I mentioned?

In that case, yep. The biggest jump in space consumption on 64-bit archs is the unavoidable pointers, since they make up a huge amount of the data structures as "references". But the joy of 64-bit (48, really) addressing does make that a price worth paying.

@RX14
Contributor

RX14 commented Sep 28, 2016

ivars must be type declared or default assigned - guessing will only be done from that.

Isn't this currently the case?

@ozra
Contributor

ozra commented Sep 28, 2016

No, they're also guessed from assigns in initialize() etc., and literals are not required to be explicitly typed in that context. That would make all @my_ivar = 1 initializations change their type from Int32 to Int64, if the literal were deemed "Int64 primarily", without any other changes.

@faustinoaq
Contributor

faustinoaq commented Sep 25, 2017

Why not use Num as an alias?

I mean

alias Num = Int | Float
def shoot(x : Num, y : Num)
end

shoot(1.23, 4.56)
record(Point, x : Num, y : Num)

Point.new(0, 0)
record(Color, r : Num, g : Num, b : Num, a : Num = 0)

Color.new(255, 128, 0)

Currently the above alias outputs:

Error in line 1: can't use Int in unions yet, use a more specific type

However, I can use an Int | Float union in def parameters:

def shoot(x : Int | Float, y : Int | Float)
end

shoot(1.23, 4.56)
shoot(1.23, 4)
shoot(1.23_f32, 4u8)

https://carc.in/#/r/2sis

@lbguilherme
Contributor

@faustinoaq This is slow! The compiler will have to keep runtime information about which type of int it is, and for every method call on it, check all possible runtime types.
For method arguments this works because they are restrictions, not casts. But then the literal values will simply be Int32 and Float64 and nothing changes from the original issue here.

@andy-twosticks

There's a lot of clever stuff going on here, but all I really want, personally, is this:

def foo(x : Int8)
  puts x
end

foo(4)

4 is a perfectly valid Int8, and the compiler knows that foo takes an Int8. So why do I need to put 4_i8?

@watzon
Contributor

watzon commented Mar 15, 2018

Really the same could be said for many literals. Take this simple example:

class FooClass
  
  def initialize(@foo : Hash(String, String | Int32))
  end
  
  def puts_foo
    puts @foo
  end
  
end

foo = FooClass.new({"hello" => "world"})
foo.puts_foo

which throws the error

Error in line 12: instantiating 'FooClass:Class#new(Hash(String, String))'

in line 3: instance variable '@foo' of FooClass must be Hash(String, Int32 | String), not Hash(String, String)

even though the hash clearly should fit the constraints. The same thing happens with Arrays.
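
For reference, a workaround today is to give the literal an explicit value type so it matches the ivar:

foo = FooClass.new({"hello" => "world"} of String => String | Int32)
foo.puts_foo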

@straight-shoota
Member

@watzon What you're describing is about generic variance (see #3803). That's got nothing to do with literals. FooClass.new Hash(String, String).new would not match either.

@RX14
Contributor

RX14 commented Mar 15, 2018

@straight-shoota no, that can be solved. Just in the same way as you'd promote the integer type to the literal, you'd promote the hash type to the literal.

For example

FooClass.new({"hello" => "world"})

could compile but

hash = {"hello" => "world"}
foo = FooClass.new(hash)

would not.

I would support a simple first fix: when a literal is in method args, its exact type is taken from the method definition, if one is defined and it's unambiguous.
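
Applied to the earlier integer example, that rule would read roughly like this (hypothetical behaviour, not current Crystal):

def foo(x : Int8)
end

foo(4)  # would compile: the restriction fixes the literal's type to Int8
x = 4
foo(x)  # would still fail: x is an ordinary Int32 variable, not a bare literal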

@straight-shoota
Member

@RX14 To me it makes no sense for literals to behave differently depending on context. When you refactor an argument out to a variable, everything breaks. And you just ask WTF?

@RX14
Contributor

RX14 commented Mar 15, 2018

@straight-shoota well, would you prefer something or nothing? Tracking types through variables is far harder than tracking them through method args, and likely even more fragile. Would you rather something broke because you took it out of some method args, or because you used the variable in an expression and the compiler couldn't change the variable type because of an edge case?

I know which one I'd prefer; at least whether a literal is in or out of method args is predictable.
