Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve usage of outer context for inference #5699

Merged
merged 14 commits into from
Oct 2, 2018

Conversation

ilevkivskyi
Copy link
Member

@ilevkivskyi ilevkivskyi commented Sep 30, 2018

Fixes #4872
Fixes #3876
Fixes #2678
Fixes #5199
Fixes #5493
(It also fixes a bunch of similar issues previously closed as duplicates, except one, see below).

This PR fixes a problems when mypy commits to soon to using outer context for type inference. This is done by:

  • Postponing inference to inner (argument) context in situations where type inferred from outer (return) context doesn't satisfy bounds or constraints.
  • Adding a special case for situation where optional return is inferred against optional context. In such situation, unwrapping the optional is a better idea in 99% of cases. (Note: this doesn't affect type safety, only gives empirically more reasonable inferred types.)

In general, instead of adding a special case, it would be better to use inner and outer context at the same time, but this a big change (see comment in code), and using the simple special case fixes majority of issues. Among reported issues, only #5311 will stay unfixed.

@ilevkivskyi
Copy link
Member Author

Note about tests: you might notice that many added tests hit the special cases. However, I checked that all new tests still pass if I remove the old special cases.

@ilevkivskyi
Copy link
Member Author

This PR causes no new errors in the internal code bases.

Copy link
Collaborator

@Michael0x2a Michael0x2a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I have enough context on how we handle generics to say for certain whether or not this approach is the best one/give more global feedback, but FWIW these changes all look pretty reasonable to me.

I did have a few questions, but they're mostly about the old existing code that you were copying over -- I only had a few nitpicks for the actual changes you're making.

#
# See https://github.com/python/mypy/pull/5660#discussion_r219669409 for
# more context.
# TODO: Overloads only use outer context to infer type variables in a given ovelroad variant,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: ovelroad -> overload


outer(f(B()))
x: A = f(B())
[out]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit: do we need these [out] sections here? It seems slightly more concise to omit them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually like these empty sections because they make it easier to see where a test ends on a quick look. But it is not very strong preference.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably make a project-wide decision here (one way or the other). Now some PRs have empty [out] sections while others may remove them as redundant.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will skip them.

test-data/unit/check-inference-context.test Show resolved Hide resolved
test-data/unit/check-inference-context.test Outdated Show resolved Hide resolved
@@ -9,13 +9,17 @@


def apply_generic_arguments(callable: CallableType, orig_types: Sequence[Optional[Type]],
msg: MessageBuilder, context: Context) -> CallableType:
msg: MessageBuilder, context: Context,
only_allowed: bool = False) -> CallableType:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit: maybe a better variable name might be something like skip_unsatisfiable_bounds? It's wordier, but maybe it communicates the semantics more cleanly?

(I was initially a little confused by only_allowed + the description below because we never apply they given types if they don't satisfy the typevar bound or constraints whether or not this flag is True or False -- the main difference is whether we raise an error or just skip the application.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the current name is not perfect, but TBH I don't like the longer version either. Maybe just skip_unsatisfied?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just skip_unsatisfied?

👍

types[i] = None
else:
msg.incompatible_typevar_value(callable, type, callable.variables[i].name,
context)
upper_bound = callable.variables[i].upper_bound
if type and not mypy.subtypes.is_subtype(type, upper_bound):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: both your code and the old code seem to potentially check both the value constraints and the upper bound. Is that intentional? I was under the impression that if there are any value constraints, the upper bound will always be object, which means that if the above if statement runs, there'd be no need to check the upper bound.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as an optimization we can put this block in an else: branch.

# this *seems* to usually be the reasonable thing to do.
#
# See also github issues #462 and #360.
ret_type = NoneTyp()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it maybe be more correct to temporarily pretend the return type is either Any or the uninhabited type instead of None?

(I bring this up mostly because I'm not sure how exactly infer_type_arguments behaves in no-strict-optional vs strict-optional mode -- it feels like there might maybe be some subtle difference in behavior since None inhabits every type in the former mode and doesn't in the latter. So maybe we should set ret_type to something that behaves the same way in both modes? Not sure.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it doesn't really matter. Having anything that doesn't depend on function type variables will will infer no constraints. Probably the cleanest option is to just return at this point, I will check if this works.

@ilevkivskyi
Copy link
Member Author

@Michael0x2a Thanks for review! I fixed most comments, tomorrow I will add the negative tests as you suggested.

Copy link
Collaborator

@JukkaL JukkaL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad to see so many issues fixed! This looks reasonable, just left some minor comments.

Random idea: If type variable with a bound is used in an invariant context and we can't infer a value, we currently infer C[<nothing>]. What if we'd instead inferred C[<bound>]? This would generate better error messages. Alternatively, maybe we should generate an error right away.

"""Apply generic type arguments to a callable type.

For example, applying [int] to 'def [T] (T) -> T' results in
'def (int) -> int'.

Note that each type can be None; in this case, it will not be applied.

If `skip_unsatisfied` is True, only apply those types that satisfy type variable
bound or constraints (and replace the type with `None`), instead of giving an error.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this docstring correct? It seems to me that the type is not replaced at all if it's unsatisfied.

#
# See also github issues #462 and #360.
ret_type = NoneTyp()
if mypy.checker.is_optional(ret_type) and mypy.checker.is_optional(ctx):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that is_optional is used from two modules, I'd move it to somewhere else, such as mypy.types. remove_optional is similar.

#
# Give up and just use function arguments for type inference. As an exception,
# if the context is a generic instance type, actually use it as context, as
# this *seems* to usually be the reasonable thing to do.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is only reasonable if the generic instance type is invariant in some type arg? I wonder if that would be a bit more "principled".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I am not sure this is better. Currently, we ignore the return context for cases like T <: int but use it for T <: Iterable[int]. So the only way how only using T <: List[int] will be better, is because invariant containers statistically have less subtypes. As I understand the original problem this solves is like this:

def f(x: List[T]) -> T: ...
y: object = f([1, 2, 3])  # Incompatible argument type "List[int]", expected "List[object]"

In other words the problem appears when the return type appears as a variable in invariant context in one of argument types.

However, if you prefer to only use invariant ones, I can update this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant is that if the context is, say, List[object], List[int] wouldn't be valid due to invariance, so we are more likely to want to use the surrounding type context and not rely on arguments only. However, if the context is Iterable[object], Iterable[int] would be okay, due co covariance, so we ignore the context. Whether there are many subtypes or not doesn't seem significant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, this is actually exactly what I meant :-) Iterable[object] has more subtypes, such as Iterable[int]. While List[object] doesn't have such, so it is typically better to still use it as a context. OK, I will modify the special case to reflect this.


outer(f(B()))
x: A = f(B())
[out]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably make a project-wide decision here (one way or the other). Now some PRs have empty [out] sections while others may remove them as redundant.


y: B[int]
outer(f(y))
x: A[int] = f(y)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it looks like the special case for generic context is only important since A is invariant. Maybe also add a test case where A is covariant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point. I will add a test with a covariant type.

def f(x: List[T]) -> T: ...

# mypy infers List[<nothing>] here, and <nothing> is a subtype of str
y: str = f([])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes me feel a bit uneasy, but I guess this is fine.

T = TypeVar('T', bound=int)
def f(x: List[T]) -> List[T]: ...

# TODO: improve error message for such cases, see # 3283
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace # 3283 -> #3283.

#3283 is not just about improving error messages -- it could be fixed by hanging how type inference works, but this solution might not help here. Can you create a separate issue just about this error message -- error message should have something about the type variable bound.

@ilevkivskyi
Copy link
Member Author

@JukkaL Thanks for review! I have implemented all comments except the suggestion about special case. Do you think it is worth changing?

@ilevkivskyi
Copy link
Member Author

@JukkaL @Michael0x2a does this look good now?

Copy link
Collaborator

@Michael0x2a Michael0x2a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks good to me!

(but probably Jukka should have the final say here)

@ilevkivskyi ilevkivskyi merged commit 626ff68 into python:master Oct 2, 2018
@ilevkivskyi ilevkivskyi deleted the outer-context branch October 2, 2018 11:44
TV4Fun pushed a commit to TV4Fun/mypy that referenced this pull request Oct 4, 2018
Fixes python#4872 
Fixes python#3876
Fixes python#2678 
Fixes python#5199 
Fixes python#5493 
(It also fixes a bunch of similar issues previously closed as duplicates, except one, see below).

This PR fixes a problems when mypy commits to soon to using outer context for type inference. This is done by:
* Postponing inference to inner (argument) context in situations where type inferred from outer (return) context doesn't satisfy bounds or constraints.
* Adding a special case for situation where optional return is inferred against optional context. In such situation, unwrapping the optional is a better idea in 99% of cases. (Note: this doesn't affect type safety, only gives empirically more reasonable inferred types.)

In general, instead of adding a special case, it would be better to use inner and outer context at the same time, but this a big change (see comment in code), and using the simple special case fixes majority of issues. Among reported issues, only python#5311 will stay unfixed.
ilevkivskyi added a commit that referenced this pull request Dec 5, 2018
Mitigates #5738 by reverting heuristics for type variable return vs instance type outer context changed in #5699. Both versions are kind of unprincipled, but the old one _seems_ to work better in real codebases.

This doesn't touch the core fixes in #5699
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants