Untyped number literals #2995
To an extent, this would probably be classified as a basic form of type deduction.

This probably shouldn't change (still a union of two types):

```crystal
a = 5i8
a = 10 if true
```

An interesting complicated case:

```crystal
struct Point(T)
  def initialize(@x : T, @y : T)
  end
end

Point.new(1, 0.5) # Should be Point(Float64) - this is arguable
```

---
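For context, under the rules in place at the time of writing, `Point.new(1, 0.5)` fails to unify `T` (Int32 vs Float64). A minimal sketch of the usual workaround, converting the literal explicitly (`Int32#to_f` returns a `Float64` in Crystal):

```crystal
struct Point(T)
  def initialize(@x : T, @y : T)
  end
end

# Mixed literals fail to unify T, so convert explicitly.
# Under the deduction discussed here, Point.new(1, 0.5)
# would infer Point(Float64) without the conversion.
Point.new(1.to_f, 0.5) # Point(Float64)
```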
Yes, literals should be special. It's a bit hard to implement, but it's definitely a good idea, and we'd also like the language to work this way. Question: should …

---
@asterite I think we should consider only numbers that have literals.

---
I mean:

```crystal
def foo(x : BigInt)
end

foo(1)
```

---
We could probably make it work in those cases too, by hardcoding the rules in the compiler. That should cover most of the cases. I still need to think how to implement this...

---
@asterite I mean you can't create a BigInt directly with a literal, so it does not apply. It's not the reason why it shouldn't apply, just a criterion. I think data types that are not tied to the compiler shouldn't be considered. It's very easy for this to get out of hand with all these special cases.

---
I wouldn't mind also seeing BigInt / BigFloat get some more status - it would make scientific computing with Crystal a breeze!

---
Could there be a set of literal types, e.g. `Int::Any`, `Float::Any`, etc.? These types would never be used concretely but would be coerced using conversion methods defined in them. That way it would be extensible. Is that a helpful way to approach it?

---
I think a compiler-level, pragmatic "literal to type" mapping procedure would be most appropriate, instead of involving types and conversion methods. This would be deterministic and reasonable enough imo.

Subject: number literal without kind-suffix. Order of choice:

Contexts:

I find the orders above to be reasonable for the preferred match (should there exist several possible signature matches). 16-bit types really belong in the second tier, since they're not likely to ever be used in signatures except in cases where there is an overload for every int type (low-level stuff), and even then they're not the preferred match for an unsuffixed literal. The reasonable use case for those is just compact data in structs, arrays and such. As I've already mentioned, I like the idea of letting BigNums in on the game here :-) What do you think?

---
Thinking about it some more, methinks it's complicated 😉 For instance, would the compiler be smart enough to know:
But then
And it would have no idea for:
And then
If n is Int32 we are almost certainly going to get the wrong answer there. We could put the literal first, but most literals are put at the end in my experience. And in any case it is weird that … Maybe this is all easier to do than I realize, but it sure seems like a hairy mess. Maybe a simple way to handle it is to always default to the smallest size (for speed), then have operations always up-size to the next size (for accuracy).

At certain points in execution the compiler can override this if it can figure out where types can be reduced. But the programmer can also help by telling it where to do it, e.g.

```crystal
def x(n : Int8)
  ...
end

a = 10    # Int8
b = a + a # Int16
x(b)      # error: no x(Int16)
~b        # pseudo-code for "type-reduce b, if possible"
x(b)      # ok, x(Int8)
```

I would expect the upscaling to max out at UInt64, but BigInt could get in on the act if a compiler flag is set? Just sort of thinking out loud here.

---
That's why I specifically left local variables out of the possible contexts: covering only the call-signature and ivar-assign cases keeps it simple.

---
I would go as far as saying that if there are multiple overloads taking different types of Int and the literal fits in more than one of them, then there should be an ambiguity error, not a predefined order. This is what I mean:

```crystal
def foo(x : Int)
  p typeof(x)
end

foo(5)                  # Int32
foo(3.1)                # Float64
foo(999999999999999999) # Int64

def bar(x : Int8)
  p typeof(x)
end

def bar(x : UInt16)
  p typeof(x)
end

def bar(x : Float32)
  p typeof(x)
end

bar(5)        # ambiguous (between Int8 and UInt16)
bar(500)      # UInt16
bar(-10)      # Int8
bar(1.5)      # Float32
bar(99999999) # Float32

def baz(x : Int8)
  p typeof(x)
end

baz(99)   # Int8
baz(9999) # no overload matches
```

---
Shouldn't Int64 be the default, now that 64-bit is the standard word size in most of the world's computers?

---
@RX14 There is still an advantage to using Int32 instead of Int64: Int32s take less memory in classes/structs, they are more cache-friendly, and integer multiplication is a little faster on Int32 than on Int64, even on 64-bit hardware. I don't think those are reasons to keep using Int32, but I don't see a reason to change the default either.

---
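To illustrate the memory point, a small sketch (struct names are invented for illustration); Crystal's `sizeof` reports the byte size of a type at compile time:

```crystal
# Two hypothetical point structs, differing only in int width.
struct P32
  def initialize(@x : Int32, @y : Int32)
  end
end

struct P64
  def initialize(@x : Int64, @y : Int64)
  end
end

puts sizeof(P32) # 8 bytes
puts sizeof(P64) # 16 bytes
```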
@lbguilherme I don't agree it should error in that situation. I think @RX14 is on the right track with Int64 as default, but this would lead to classes, arrays, etc. in existing Crystal programs potentially doubling in size, as @lbguilherme mentions, since many people rely on the compiler guessing ivar types. Personally I'd like:

The literal-typing should also apply to container literals (arrays, etc.) that might grow substantially in size. As long as at least one literal is typed, the rest would be inferred from that in a given (T) context (an array literal, for instance). A small amount of work in type defs (class, struct, container literals) lets us get away with less unnecessary explicitness in general code. It also makes it clearer what the type members are.

Regarding Int64 vs Int32, @lbguilherme - I did extensive benchmarks with real programs (doing copious amounts of calculations on large amounts of data, thereby affecting cycles and caching) in C++ a whole bunch of years ago, keeping all data structures the same and only switching the main int size used in code, and I found no measurable differences - except for integer division, where Int32 was noticeably faster. That last point might have changed since then with newer processors, though. (Important note: x86 tests only, and only three different machines.) So in general, using Int64 in code is only a pro ("higher roof").

---
Hmmn, yes. I can't imagine space is really an issue, because we're only changing the typing of local vars. They live on the stack or in registers. Anything that makes it off the stack has to have its size defined anyway (class, etc.).

---
@RX14 - no, as mentioned: with the current way Crystal guesses (preliminary inference of) ivar types in classes, it would affect the size of a lot of classes in existing code bases. Or did you mean in the context of a change similar to what I mentioned? In that case, yep. The biggest jump in space consumption on 64-bit arch is the unavoidable pointers, since they make up a huge amount of the data structures as "references". But the joy of 64-bit (48, really) addressing does make that a price worth paying.

---
Isn't this currently the case?

---
No, they're also guessed from assigns in initialize() etc., and literals are not required to be specifically typed in that context. That would make all …

---
Why not use an alias? I mean:

```crystal
alias Num = Int | Float

def shoot(x : Num, y : Num)
end

shoot(1.23, 4.56)

record(Point, x : Num, y : Num)
Point.new(0, 0)

record(Color, r : Num, g : Num, b : Num, a : Num = 0)
Color.new(255, 128, 0)
```

Currently the above alias outputs:

However I can use:

```crystal
def shoot(x : Int | Float, y : Int | Float)
end

shoot(1.23, 4.56)
shoot(1.23, 4)
shoot(1.23_f32, 4u8)
```

---
@faustinoaq This is slow! The compiler will have to keep runtime information about which type of int it is, and for every method call on it, check all possible runtime types.

---
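A sketch of the distinction being drawn here, with illustrative method names: a union-typed argument keeps runtime type information and dispatches per call, while a generic argument is specialized at compile time:

```crystal
# Union restriction: x may be any of the union's concrete types
# at runtime, so calls on x dispatch over the runtime type.
def shoot_union(x : Int32 | Float64)
  x.abs
end

# Generic restriction: T is fixed per call site at compile time,
# so each instantiation is monomorphic, with no runtime checks.
def shoot_generic(x : T) forall T
  x.abs
end
```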
There's a lot of clever stuff going on here, but all I really want, personally, is this:

4 is a perfectly valid Int8, and the compiler knows that foo takes an Int8. So why do I need to put …

---
Really the same could be said for many literals. Take this simple example:

```crystal
class FooClass
  def initialize(@foo : Hash(String, String | Int32))
  end

  def puts_foo
    puts @foo
  end
end

foo = FooClass.new({"hello" => "world"})
foo.puts_foo
```

which throws an error even though the hash clearly should fit the constraints. The same thing happens with Arrays.

---
@straight-shoota no, that can be solved. Just in the same way as you'd promote the integer type to the literal, you'd promote the hash type to the literal. For example `FooClass.new({"hello" => "world"})` could compile but

```crystal
hash = {"hello" => "world"}
foo = FooClass.new(hash)
```

would not. I would support a simple first fix: when a literal is in method args, its exact type is taken from the method definition, if defined and if ambiguous.

---
@RX14 To me it makes no sense for literals to behave differently depending on context. When you refactor an argument out to a variable, everything breaks. And you just ask WTF? |
@straight-shoota well, would you prefer something or nothing? Because tracking types through variables is far harder than just through method args, and likely even more fragile. Would you rather something broke because you took it out of some method args, or because you used the variable in an expression and the compiler couldn't change the variable type because of an edge case? I know which one I'd prefer; at least a literal in or out of method args is predictable.

---
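A sketch of how this "first fix" would behave (hypothetical future semantics; the method name is invented for illustration):

```crystal
def takes_i8(n : Int8)
end

takes_i8(4) # proposed: the literal adopts Int8 from the signature

x = 4       # x defaults to Int32, as today
takes_i8(x) # still an error under this fix, which is exactly the
            # refactoring concern raised above
```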
I suggest that number literals like `7` should not be immediately bound to the type `Int32`, but instead should be implicitly convertible to any `Number`. Similarly, fractional number literals like `3.43` should be implicitly convertible to any `Float`. To be more exact, no "conversion" would ever take place; ideally the literals would be untyped until they're actually used.

In situations where a type is required (e.g. when writing `a = 0`), the literals default to `Int32`/`Float64` like they used to. Typed literals like `1.2f32` keep working like they used to.

This change is expected to be backwards compatible, just leading to less verbose code and more permissive compilation, with no downsides (other than compiler complexity), in my opinion.
In other words, I'm sick of these errors:
This change would make these examples work.
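A sketch of code that errors under the rules described above but would compile under this proposal (illustrative; hypothetical future semantics):

```crystal
x : Int8 = 7       # 7 would adopt Int8 instead of defaulting to Int32
f : Float32 = 3.43 # 3.43 would adopt Float32
a = 0              # no target type: still defaults to Int32, as before
```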