Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Literal type narrowing regression #10898

Closed
chancancode opened this issue Sep 13, 2016 · 24 comments
Closed

Literal type narrowing regression #10898

chancancode opened this issue Sep 13, 2016 · 24 comments
Assignees
Labels
Bug A bug in TypeScript Fixed A PR has been merged for this issue

Comments

@chancancode
Copy link
Contributor

TypeScript Version: [email protected]

Code

type FAILURE = "FAILURE";
const FAILURE = "FAILURE";

type Result<T> = T | FAILURE;

function doWork<T>(): Result<T> {
  return FAILURE;
}

function isSuccess<T>(result: Result<T>): result is T {
  return !isFailure(result);
}

function isFailure<T>(result: Result<T>): result is FAILURE {
  return result === FAILURE;
}

function increment(x: number): number {
  return x + 1;
}

let result = doWork<number>();

if (isSuccess(result)) {
  increment(result);
//          ~~~~~~
//          Argument of type 'string | number' is not assignable to parameter of type 'number'.
//            Type 'string' is not assignable to type 'number'.
}

Expected behavior:

  1. result (before the narrowing) should be "FAILURE" | number
  2. result (after the narrowing) should be number

Actual behavior:

  1. result (before the narrowing) is string | number
  2. result (after the narrowing) is string | number
@chancancode
Copy link
Contributor Author

This is working on 20160911 but broke in 20160912, most likely related to #10676

@chancancode
Copy link
Contributor Author

It seems like the change is somewhat intentional – that let is getting the "base primitive types" now. However, since the function (doWork, and also isSuccess and isFailure have explicit annotations), I suspect it is possible to keep this pattern working without breaking the other goals?

@mhegazy
Copy link
Contributor

mhegazy commented Sep 13, 2016

the change is intentional. see #10676 for more details. I do not think this is a pattern we explicitly thought about; so we need to go back and talk about this.

a work around would be an explicit type annotation to result: let result: Result<number>

@chancancode
Copy link
Contributor Author

@mhegazy keep me posted! 👍 I did try to read #10676, and there seems to be some thoughts/effort in making functions with explicit return type annotation work, so I am crossing my fingers :P

@mhegazy mhegazy added Bug A bug in TypeScript In Discussion Not yet reached consensus labels Sep 13, 2016
@mhegazy
Copy link
Contributor

mhegazy commented Sep 13, 2016

//cc: @ahejlsberg

@ahejlsberg
Copy link
Member

The issue here is that when we infer the type number | "FAILURE" for a mutable location, we can't tell that "FAILURE" is a special type that you don't want widened to string. Indeed, had the type been 1 | "FAILURE" you probably would have wanted the 1 to be widened to number.

One possible solution is to change the first two lines to:

type FAILURE = "FAILURE" & {};
const FAILURE: FAILURE = "FAILURE";

This turns the FAILURE type into an intersection that won't be widened, but otherwise is similar to the literal type "FAILURE". But I think an even better change would be to use an object as the sentinel failure value. That way Result<T> would also work when T is a string.

type FAILURE = { __kind__: "FAILURE" };
const FAILURE: FAILURE = { __kind__: "FAILURE" };

@chancancode
Copy link
Contributor Author

I agree the last solution would work and arguably better, but there are still a few things that doesn't feel right to me:

  1. When there is an explicitly typed return value, it feels wrong to infer the type on the let, instead of just taking the returned type verbatim. IMO, the explicit return type is a pretty strong signal that you know what you are expecting, and the widening caused by the inference is probably just going to invite bugs.
  2. Even with the widening the error still seems wrong. I would expect that it fails on the call to isSuccess because the argument type doesn't match the signature (which really highlights my problem with 1). If that's somehow okay, then the narrowing should have worked - in the current version it seems like the value is T assertion is just silently dropped, making the isResult a noop.

@ahejlsberg
Copy link
Member

When there is an explicitly typed return value, it feels wrong to infer the type on the let, instead of just taking the returned type verbatim. IMO, the explicit return type is a pretty strong signal that you know what you are expecting, and the widening caused by the inference is probably just going to invite bugs.

Consider this modified example:

function doWork<T>(x: T): Result<T> {
    return x;
}

const c = doWork(1);  // Result<1>
let v = doWork(1);  // Result<1> widened to Result<number>

In both cases doWork(1) returns a Result<1>, preserving the fact that the argument was the literal 1. The type is widened to Result<number> when we infer it as the type for v because we're inferring for a mutable location (e.g. you might later want to assign a Result<123> to v). The explicitly typed return value isn't really an indicator of intent.

The modified example above works when FAILURE is changed in one of the ways I proposed. However, in your original example, the widening of Result<number> (which is equivalent to number | "FAILURE") ends up producing number | string because the "FAILURE" type is just a string literal type like any other.

Even with the widening the error still seems wrong. I would expect that it fails on the call to isSuccess because the argument type doesn't match the signature (which really highlights my problem with 1). If that's somehow okay, then the narrowing should have worked - in the current version it seems like the value is T assertion is just silently dropped, making the isResult a noop.

What happens is that when inferring from number | string to T | "FAILURE", we infer number | string for T, just as we would have inferred number if you had passed a plain number. Effectively, any type is a valid argument for isSuccess because anything will match T | "FAILURE" as there is no constraint on T.

@ahejlsberg ahejlsberg added Breaking Change Would introduce errors in existing code and removed Bug A bug in TypeScript labels Sep 14, 2016
@ahejlsberg
Copy link
Member

I'm re-labeling as Breaking Change as this is working as intended and is an effect of our new approach to better preserving literal types.

@gcnew
Copy link
Contributor

gcnew commented Sep 15, 2016

I was thinking about this issue and what dawned on me is that maybe it will be most intuitive if explicit literal, non-template types are not widened. It's subtle, but the purpose of the union is that the literal options are something you expect and prepare to discriminate on even in mutable locations. Maybe I'm wrong, but I wasn't able to come up with a counter example.

@mhegazy mhegazy removed the In Discussion Not yet reached consensus label Sep 20, 2016
@mhegazy
Copy link
Contributor

mhegazy commented Sep 20, 2016

Some principals/intuitions that were involved in this change:

  • A literal is substitutable with its symbolic name, i.e. replacing func("foo") to const foo = "foo"; func(foo); should behave the same
  • Types are always treated the same, regardless whether they are inferred or explicitly listed. This has been a principal that TS has operated under, and there are a lot of assumptions that are built on this, e.g. declaration file emit does not differentiate between inferred and explicit type annotations.
  • Identifying a declared type should be always possible. and once identified it should not be changed. To clarify the terminology, a declared type is either the explicit type annotation on a declaration, or the type of the initialization expression if one is not present, other wise any. This is what makes var x = 1 be a number.
  • One final intuition, is that mutable locations (e.g. let, array element, or property declaration) are not literal type locations in general, mainly as literal types are not very useful in these conditions. e.g. var x = 0; assuming that the type of x is 0 is not very useful, and most likely the user intent is to have x be a number.

With these all in perspective, I believe the position we are in now is much better than where we were. there are some breaking changes that this entails, but overall the system is better positioned.

@mhegazy mhegazy closed this as completed Sep 20, 2016
@wycats
Copy link

wycats commented Sep 21, 2016

My TLDR:

  • I think it's fatally problematic for const and let to produce different types for the same rhs expression (especially in the only-assigned-at-declaration let case).
  • I think we're over-rotating on concerns about let mutation
  • I think allowing the semantics of let mutation of literal subtypes to diverge from other kinds of subtypes is creating more problems than it solved.

Onward:


@mhegazy I'm quite a bit confused about the way you're applying the stated principles to this problem. Among other things, I'm pretty concerned about so many special rules applied to literal subtypes that are not applied to other subtypes. I'll get to that in a bit.

A literal is substitutable with its symbolic name, i.e. replacing func("foo") to const foo = "foo"; func(foo); should behave the same

In order to make this work, you needed to define "its symbolic name" as "a name declared as a const binding".

In particular, this doesn't work:

type Event = "onmouseover" | "onmouseout";
function onmouseover(): Event { return "onmouseover"; }
function onmouseout(): Event { return "onmouseout"; }
function register(event: Event, callback: DOMCallback) { window.addEventListener(event, callback); }

let event = onmouseover();
register(event, e => { console.log(e); });

In this case, event is a symbolic name that we have declared as returning an Event (which is an abstraction over the precise literals we're talking about). We have other functions that also take Events, but because our "symbolic name" is technically mutable (even though we haven't mutated it), this code fails to type check.

The fact that const event = onmouseover() would be the only change needed to make the same code type check is incredibly unintuitive.

I agree with and accept the constraint about symbolic names, but I think a TypeScript user's plausible intuition doesn't differentiate between const and assigned-only-at-initialization let. Try to look at the above example with fresh eyes and see if it feels intuitive to you.

Types are always treated the same, regardless whether they are inferred or explicitly listed. This has been a principal that TS has operated under, and there are a lot of assumptions that are built on this, e.g. declaration file emit does not differentiate between inferred and explicit type annotations.

Seems reasonable, but perhaps not a hard constraint, given how tricky this problem is getting.

In particular, one approach that gets at the heart of the problem is to have .d.ts generators always emit the wider type if one is not specified. Because we have an opportunity to decide what should happen when we generate a .d.ts, and because we all agree that the inferred return type of a function shouldn't change based on usage, I don't think there's any problem with assuming that inferred types are always wider, while explicitly declared types are as-declared.

Identifying a declared type should be always possible. and once identified it should not be changed. To clarify the terminology, a declared type is either the explicit type annotation on a declaration, or the type of the initialization expression if one is not present, other wise any. This is what makes var x = 1 be a number.

I think this is too loose of a description of the semantic questions. Identified by who, in what context? Do you mean "when hovering over a variable in the IDE"? Or do you mean "once a function has a return type, it shouldn't change"?

In particular, I strongly agree with the decision that the TS made to avoid cross-function inference. So I agree that identifying the type of a function should always be possible, and once identified should not be changed unless the function body has changed.

I also agree that the type of a declared variable should never change from one type to an unrelated one. However, it is not at all obvious to me that those sources of agreement imply that automatic widening within a single function body should be disallowed. That question depends a lot on expected intuitions, as well as how effective we can make our error messages at guiding people in the right direction.

One final intuition, is that mutable locations (e.g. let, array element, or property declaration) are not literal type locations in general, mainly as literal types are not very useful in these conditions. e.g. var x = 0; assuming that the type of x is 0 is not very useful, and most likely the user intent is to have x be a number.

I simply disagree that this is an "intuition". I would accept that it might be "a heuristic we can teach people", but I strongly disagree that this behavior is something that somebody would intuit without instruction and memorization.

I also disagree that literal types are not very useful in these conditions, especially when type aliases are used:

type Event = "onmouseover" | "onmouseout";
function onmouseover(): Event { return "onmouseover"; }
function onmouseout(): Event { return "onmouseout"; }
function register(event: Event, callback: DOMCallback) { window.addEventListener(event, callback); }

let event = onmouseover();

if (someCond) {
  event = onmouseout();
}

register(event, e => { console.log(e); });

In this case, the literal is pretty useful and intuitive, and it's surprising that event becomes string simply because I used let.


With all that said, I'd like to take another tack at expressing my concern about giving literals special semantics.

Here's an example that's pretty similar conceptually to the problem:

function text(text: string, inline = false) {
  let el = document.createElement('div');

  if (inline) {
    el = document.createElement('span');
  }

  el.innerText = text;
}

As in the string literal situation:

  • a human author might very well expect document.createElement to return an Element, just as a human might expect a string to be string
  • HTMLDivElement is a subtype of Element, just as "onmouseover" is a subtype of string
  • a human author would find the error here very confusing ([ts] Type 'HTMLSpanElement' is not assignable to type 'HTMLDivElement'. Property 'align' is missing in type 'HTMLSpanElement'.) just as a human author might find the string error confusing.
  • a user might well want precisely an HTMLDivElement (if they were trying to use align 😉) just as a user might well want precisely an "onmouseover" (if calling a function that expects it).

A great part of the reason that this doesn't matter as much as expected in practice is that mutation to a different subtype of the same supertype is relatively rare. In the absence of mutation, passing the subtype to a function that expects the supertype works as expected.


Finally, I think it's important to acknowledge that the primary scenario driving the decision to treat const and let differently is this:

let x = "hello";

if (someCond) {
  x = "goodbye";
}

First, some examples to illustrate why it's the driving scenario:

function print(s: string) {
  console.log(s);
}

function hello(h: "hello") {
  helloworld(h, "world");
}

let x = "hello"; // let's assume we make x: "hello"
print(x); // type checks, because "hello" is a subtype of string
hello(x); // type checks, because "hello" is "hello"

However:

let x = "one";

if (someCond) {
  x = "two"; // we think it's problematic/unintuitive for this to be a type error
}

I think this is definitely a concern worth thinking about, because it's absolutely true that most people wouldn't expect the narrower type in this scenario.

However, I think solving it by differentiating between let and const is an overly broad hammer to attack this problem with.

Remember that the same problem, with the same intuitive hurdles, exists for other kinds of types:

let x = document.createElement('div');

if (someCond) {
  x = document.createElement('span'); // we aren't as concerned about making this a type error
  // [ts] Type 'HTMLSpanElement' is not assignable to type 'HTMLDivElement'.
  //       Property 'align' is missing in type 'HTMLSpanElement'.
}

We could use the same logic about mutable locations to argue that let x should produce a wide type (maybe the most general subtype of Object) and that users should write const x if they want the narrow type, but we don't (and I don't think we should either).


The most natural way to address the various constraints is to respect the types that users wrote down and use wider types if no narrower type is ever specified.

This addresses any scenario where an explicit type was declared for a literal anywhere:

function onmouseover(): Event { return "onmouseover"; }

let x = onmouseover(); // x: Event because that's what the function said

We can now address the exact literal case much more narrowly, with one of the following solutions:

Option 1. const and "only-assigned-at-initialization let to a literal" uses the narrower type, while actually-muated let uses the wider type. The user can use a type ascription on the original let to get a narrower type if they're mutating the let.

Option 2. all const and let to a string literal use the narrower type, and if the user mutates a let variable to a different subtype of the same primitive type, we instruct them to add : string to the original declaration (I can see why this might seem annoying, but it's consistent with how we handle Element, not so common, and a good error would go a long way).

Option 3. use internal-to-function-only inference to give mutated let variables a union type of all of the string literals that the variable is assigned to.

Important: any type produced by an abstraction gets the explicit return type specified by the function. function() { return "foo"; } has the inferred return type string.


I want to flesh out Option 3 because it has some non-obvious properties:

// inferred return type: `string`
function hello() {
  return "string";
}

// inferred return type: `string`
function world() {
  return hello();
}

type Food = "Hamburger" | "Hot Dog" | "Salad";
function burger(): Food {  // return type Food
  return "Hamburger";
}

function hotdog(): Food { // return type Food
  return "Hot Dog";
}

// inferred return type: Food, from burger()
// note: inferred return types are always fixed, but they are also inductive
function hamburger() {
  return burger();
}

function eat(food: Foo) {

}

let food = burger(); // inferred type: Food, supplied by burger();
eat(food); // type checks

function main() {
  let food = burger(); // inferred type: Food

  if (Math.random() > 0.5) {
    food = hotdog(); // type checks, food: Food, and hotdog(): Food
  }

  let name = "Yehuda"; // inferred type: "Yehuda" | "YEHUDA";

  if (Math.random() > 0.5) {
    name = "YEHUDA";
  }

  // inferred type: "TypeScript" | Food
  let language = "TypeScript";

  if (Math.random() > 0.5) {
    language = burger(); 
  }

  // inferred type: string
  let author = "Anders";

  if (Math.random() > 0.5) {
    author = hello(); // reminder -> hello(): string
  }
}

The interesting thing about these semantics are:

  • Determining the type of a mutable let can always be accomplished by unioning the type of all direct assignments to the mutable variable
  • A particular function always has a fixed return type, which is either the specified named type, the wider type, or inductively determined from the functions it calls.
  • In many common cases, it will be possible to determine the type of a mutable variable (when initially assigned to a subtype of a primitive) purely syntactically (if the string literal "foo" and "bar" are the only assignments to a mutable variable, its type is "foo" | "bar"). But in all cases, it is possible to determine the type of such a variable with a single hop to a function or slot with a fixed type.
  • The behavior of const is as desired (const x = "foo" -> const x: "foo" = "foo"), but it produces a shallower cliff to let.

There is also an alternative variant of these semantics that might be be even easier to implement and satisfy more constraints than the current design (but still have some pitfalls):

When the initial right-hand-side of a let variable is a string literal, its type is the union of all of the right-hand-sides in the same function, iff all of the rhs'es are string literals. Otherwise, the type of a let variable is the type of its rhs.

Applying the same inductive rules to this variant allows explicit types to be respected, and avoids a semantic difference between let and const (because const x = "foo" will trivially satisfy the string literal restriction). The main caveat is that let x = "Hamburger"; x = hotdog(); would produce x: string, but that's a rather minor caveat compared to the much larger cliffs we're facing today.


I apologize for how long this reply has gotten -- I didn't intend to write such a big a wall of text initially, and that may account for some incoherence between the beginning and ending of the comment. Please ask for clarifications or further fleshing out if something is confusing.

@wycats
Copy link

wycats commented Sep 21, 2016

Oh, one last thing. A very simple way to understand the confusion Godfrey had initially was that these two results seem incoherent and surprising:

let x: Event = foo();

// works differently from

let x = foo(); // where foo() is explicitly defined as returning Event

@DanielRosenwasser
Copy link
Member

DanielRosenwasser commented Sep 22, 2016

I still think that mutability widening tends to have surprising characteristics in a lot of instances like this. The idea that types don't transitively flow through declarations feels odd. We should ensure we keep an eye out for any more feedback on this.

@ahejlsberg ahejlsberg added Bug A bug in TypeScript Fixed A PR has been merged for this issue and removed Breaking Change Would introduce errors in existing code labels Sep 24, 2016
@ahejlsberg
Copy link
Member

@chancancode @wycats We have a fix for this in #11126. Explicit type annotations are now respected and your examples behave as expected. Thanks for the well considered feedback (even if it did arrive as a wall of text 😃).

@ahejlsberg ahejlsberg added this to the TypeScript 2.1 milestone Sep 24, 2016
@ahejlsberg ahejlsberg reopened this Sep 24, 2016
@normalser
Copy link

normalser commented Sep 24, 2016

I'm glad that #11126 helps with examples above - but wow - rules got complicated now. One need to think now in terms of literal types and widening literal types and combination of them and some cases that cancel the widening if there is non-widening form that gets priority :)

This example

declare let cond: boolean;
const c1 = cond ? "foo" : "bar";
const c2: "foo" | "bar" = c1;
const c3 = cond ? c1 : c2;
const c4 = cond ? c3 : "baz";
const c5: "foo" | "bar" | "baz" = c4;

let v1 = c1;  // string
let v2 = c2;  // "foo" | "bar"
let v3 = c3;  // "foo" | "bar"
let v4 = c4;  // string
let v5 = c5;  // "foo" | "bar" | "baz"

reminds me one of this famous CoffeeScript question - "What is the result of :"

foo bar and hello world

where users might need to compile the code first to check if it is foo(bar && hello(world)) or foo(bar) && hello(world)

I wish there was a way to satisfy both:

  • use cases above and
  • keeping the language easy to reason about - but maybe there is no such way.

One request though would be to add support to TSServer so that it could show users if literal type is detected as widening literal type, because right now for example above it will show:

image

and one could be surprised that v4 is of type string

so instead of

const c4: "foo" | "bar" | "baz"

if it could show:

const c4: "foo" | "bar" | widening "baz"

Thanks a lot for awesome TS

@ahejlsberg
Copy link
Member

ahejlsberg commented Sep 24, 2016

@wallverb Yeah, I specifically tried to capture as many subtleties as possible in a short example, and it definitely adds another layer to the rules. I think the intuition is reasonably simple though--as long as an explicit type annotation hasn't been encountered for a particular literal in an expression, that literal type will widen to string when inferred for a mutable location.

@ahejlsberg
Copy link
Member

@wallverb I'll think about the suggestion of showing widening (or some other indicator) in the hints. In general I'm not crazy about showing syntax that isn't supported in the language, and I don't think the distinction merits elevation to an actual type modifier.

@gcnew
Copy link
Contributor

gcnew commented Sep 25, 2016

I find it really unfortunate that now more annotations are needed for use-cases where the implicit intent is to use a literal union as a type. I can't formulate a formal proposal, but my strong feeling is that wherever an explicit literal type annotation is given, it should be preserved regardless of mutable or immutable locations.

For example:

type Rel = 'lt' | 'eq' | 'gt';

function compare(x: number, y: number): Rel {
    return (x === y) ? 'eq' : (x > y) ? 'gt' : 'lt';
}

const cmpRes = compare(1, 2); // `Rel`, ok
let cmpResMut = compare(1, 2); // `string`, what? when did I say that?
                               // for what reason will I ever assign 'hello' to a comparison result?

The second widening is like widening a subclass to the base or any value to Object (as @wycats pointed out). It allows for unintended values to be assigned and is arguably not intuitive. If this was the intent, than a natural (and a bless for the reader/maintainer) solution is to add the widened annotation by hand.

The same applies to object literals:

const x = {
    left: 1,
    right: 2,
    cmp: compare(1, 2) // why string?
}

class A {}
class B extends A {}
const y = { instance: new B() }; // it is correctly `{ instance: B }`

As a side note, this is also a regression in the display of types in IDEs, cmpRes used to be Rel now it is expanded to its members.

@ahejlsberg
Copy link
Member

@gcnew Agreed. That is exactly the fix we have in #11126.

@gcnew
Copy link
Contributor

gcnew commented Sep 26, 2016

Well if that's the case then it's good news, I couldn't get it from the examples. I still have the feeling that apparent literal types are inferred less often after #10676. With #11126 we'll have a mechanism for forcing them at the expense of verbosity (and unfortunately they are not transitive :().

@wycats
Copy link

wycats commented Sep 26, 2016

type Rel = 'lt' | 'eq' | 'gt';

function compare(x: number, y: number): Rel {
    return (x === y) ? 'eq' : (x > y) ? 'gt' : 'lt';
}

const cmpRes = compare(1, 2); // `Rel`, ok
let cmpResMut = compare(1, 2); // `string`, what? when did I say that?
                               // for what reason will I ever assign 'hello' to a comparison result?

Unless I understand wrong, both cmpRes and cmpResMut would have the type Rel under #11126, because compare() explicitly expresses a return type of Rel.

@ahejlsberg
Copy link
Member

ahejlsberg commented Sep 26, 2016

@gcnew Not sure what you mean by not transitive. One thing is certain though: With #10676 and #11126 in place, we'll never infer less specific types for string and numeric literals than we used to. Specifically, we behave just like before when you have explicit type annotations, and whereas we used to immediately widen un-annotated const declarations we now preserve string and numeric literal types from un-annotated const declarations for as long as possible. It's strictly a better thing.

@gcnew
Copy link
Contributor

gcnew commented Sep 27, 2016

My apologies. Literal types now work as expected, thanks :)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Bug A bug in TypeScript Fixed A PR has been merged for this issue
Projects
None yet
Development

No branches or pull requests

7 participants