Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Property tests #14178

Merged
merged 5 commits into from
Dec 3, 2024
Merged

[red-knot] Property tests #14178

merged 5 commits into from
Dec 3, 2024

Conversation

sharkdp
Copy link
Contributor

@sharkdp sharkdp commented Nov 7, 2024

Summary

This PR adds a new property_tests module with quickcheck-based tests that verify certain properties of types. The following properties are currently checked:

  • is_equivalent_to:
    • is reflexive: T is equivalent to itself
  • is_subtype_of:
    • is reflexive: T is a subtype of T
    • is antisymmetric: if S <: T and T <: S, then S is equivalent to T
    • is transitive: S <: T & T <: U => S <: U
  • is_disjoint_from:
    • is irreflexive: T is not disjoint from T
    • is symmetric: S disjoint from T => T disjoint from S
  • is_assignable_to:
    • is reflexive
  • negate:
    • is an involution: T.negate().negate() is equivalent to T

There are also some tests that validate higher-level properties like:

  • S <: T implies that S is not disjoint from T
  • S <: T implies that S is assignable to T
  • A singleton type must also be single-valued

These tests found a few bugs so far:

Some additional notes:

  • Quickcheck-based property tests are non-deterministic and finding counter-examples might take an arbitrary long time. This makes them bad candidates for running in CI (for every PR). We can think of running them in a cron-job way from time to time, similar to fuzzing. But for now, it's only possible to run them locally (see instructions in source code).
  • Some tests currently find false positive "counterexamples" because our understanding of equivalence of types is not yet complete. We do not understand that int | str is the same as str | int, for example. These tests are in a separate property_tests::flaky module.
  • Properties can not be formulated in every way possible, due to the fact that is_disjoint_from and is_subtype_of can produce false negative answers.
  • The current shrinking implementation is very naive, which leads to counterexamples that are very long (str & Any & ~tuple[Any] & ~tuple[Unknown] & ~Literal[""] & ~Literal["a"] | str & int & ~tuple[Any] & ~tuple[Unknown]), requiring the developer to simplify manually. It has not been a major issue so far, but there is a comment in the code how this can be improved.
  • The tests are currently implemented using a macro. This is a single commit on top which can easily be reverted, if we prefer the plain code instead. With the macro:
    // `S <: T` implies that `S` can be assigned to `T`.
    type_property_test!(
        subtype_of_implies_assignable_to, db,
        forall types s, t. s.is_subtype_of(db, t) => s.is_assignable_to(db, t)
    );
    without the macro:
    /// `S <: T` implies that `S` can be assigned to `T`.
    #[quickcheck]
    fn subtype_of_implies_assignable_to(s: Ty, t: Ty) -> bool {
        let db = get_cached_db();
    
        let s = s.into_type(&db);
        let t = t.into_type(&db);
    
        !s.is_subtype_of(&*db, t) || s.is_assignable_to(&*db, t)
    }

Test Plan

while cargo test --release -p red_knot_python_semantic --features property_tests types::property_tests; do :; done

@AlexWaygood AlexWaygood added the red-knot Multi-file analysis & type inference label Nov 7, 2024
@sharkdp sharkdp force-pushed the david/property-tests branch 2 times, most recently from c1f05ab to 634b4f6 Compare November 7, 2024 22:17
sharkdp added a commit that referenced this pull request Nov 7, 2024
## Summary

Minor fix to `Type::is_subtype_of` to make sure that Boolean literals
are subtypes of `int`, to match runtime semantics.

Found this while doing some property-testing experiments [1].

[1] #14178

## Test Plan

New unit test.

This comment was marked as off-topic.

sharkdp added a commit that referenced this pull request Nov 8, 2024
)

## Summary

Another bug found using [property
testing](#14178).

## Test Plan

New unit test
@carljm
Copy link
Contributor

carljm commented Nov 8, 2024

This is awesome! Thank you for doing this, I've been wanting to explore this. I think property testing is very well suited to testing type relation invariants, and I do think we should move forward with actually landing this.

@sharkdp sharkdp force-pushed the david/property-tests branch from c1086b8 to 87c67df Compare November 9, 2024 19:21
@sharkdp sharkdp force-pushed the david/property-tests branch 3 times, most recently from 40de0e3 to 207eb44 Compare December 2, 2024 14:45
@sharkdp sharkdp force-pushed the david/property-tests branch 2 times, most recently from 7ccc8c6 to 6c44442 Compare December 2, 2024 15:42
@sharkdp sharkdp marked this pull request as ready for review December 2, 2024 15:43
@sharkdp

This comment was marked as resolved.

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you!

db,
t,
t.negate(db).negate(db).is_equivalent_to(db, t)
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another useful property we could test here when we fix the issue about is_subtype_of and non-fully-static types is that a non-fully-static type never participates in subtyping. (This assumes that we add Type::is_fully_static so we can check this predicate.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added as a task in #14524

@MichaReiser
Copy link
Member

I think you can just ignore them and they can then be run by name

@sharkdp
Copy link
Contributor Author

sharkdp commented Dec 2, 2024

I think you can just ignore them and they can then be run by name

That's an option, but I wanted to mark some of them as #[ignore]d for another reason as well (because they yield many false positives right now). I guess I can also just comment them out for now.. but I don't like commented out committed code :-/

@MichaReiser
Copy link
Member

I'd prefer ignoring them with different reasons. Commented out code defeats the reason why we're merging the tests in the first place

sharkdp added a commit that referenced this pull request Dec 3, 2024
## Summary

Simplify tuples containing `Never` to `Never`:

```py
from typing import Never

def never() -> Never: ...

reveal_type((1, never(), "foo"))  # revealed: Never
```

I should note that mypy and pyright do *not* perform this
simplification. I don't know why.


There is [only one
place](https://github.com/astral-sh/ruff/blob/5137fcc9c81610f687b6cb45413ef83c2c5eea73/crates/red_knot_python_semantic/src/types/infer.rs#L1477-L1484)
where we use `TupleType::new` directly (instead of `Type::tuple`, which
changes behavior here). This appears when creating `TypeVar`
constraints, and it looks to me like it should stay this way, because
we're using `TupleType` to store a list of constraints there, instead of
an actual type. We also store `tuple[constraint1, constraint2, …]` as
the type for the `constraint1, constraint2, …` tuple expression. This
would mean that we infer a type of `tuple[str, Never]` for the following
type variable constraints, without simplifying it to `Never`. This seems
like a weird edge case that's maybe not worth looking further into?!
```py
from typing import Never

#         vvvvvvvvvv
def f[T: (str, Never)](x: T):
    pass
```

## Test Plan

- Added a new unit test. Did not add additional Markdown tests as that
seems superfluous.
- Tested the example above using red knot, mypy, pyright.
- Verified that this allows us to remove `contains_never` from the
property tests
(#14178 (comment))
@sharkdp sharkdp force-pushed the david/property-tests branch from 6c44442 to 92ee882 Compare December 3, 2024 07:55
@sharkdp sharkdp force-pushed the david/property-tests branch from 92ee882 to 89c4c07 Compare December 3, 2024 08:07
Cargo.lock Outdated Show resolved Hide resolved
db.lock().unwrap()
}

macro_rules! type_property_test {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the macro fairly hard to read. What I understand is that their primary benefit is to reduce the repetition of creating the cached db and dereferencing it.

  • How much faster is reusing the db over creating a new database?
  • Have you considered adding a snapshot method to the database instead of locking it? Databases are thread safe and can be shared (for as long as you don't update the db with a &mut Db reference
        pub fn snapshot(&self) -> Self {
            Self {
                storage: self.storage.clone(),
                files: self.files.snapshot(),
                system: self.system.clone(),
                vendored: self.vendored.clone(),
                events: Arc::clone(&self.events),
            }
        }

A test then becomes

    #[test]
    #[quickcheck_macros::quickcheck]
    fn equivalent_to_is_reflexive2(t: crate::types::tests::Ty) -> bool {
        let db = get_cached_db();
        let t = t.into_type(&db);
        t.is_equivalent_to(&db, t)
    }

Which I don't find too bad

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much faster is reusing the db over creating a new database?

Three orders of magnitude. It's essential. I found some bugs only after I added this optimization.

  • Have you considered adding a snapshot method to the database instead of locking it

I have not. Would the benefit be to get rid of the let db = &*db;, or do you think that would be faster?

I find the macro fairly hard to read

Because macros are hard to read, or because this particular macro is hard to read? 😄

I don't disagree, but it's also not like we're hiding lots of complexity in there. It's really just a boilerplate-reduction macro. I personally care much more about the readability of the individual tests. The macro is ugly, yes. But I won't have to read the macro code again when writing new tests.

A test then becomes [...] Which I don't find too bad

Well, you picked the easiest example there ;-). With just one type and no logical implication. Did you see the macro-vs-non-macro comparison in the PR description? Or this example, which has three types:

#[quickcheck]
fn subtype_of_is_transitive(s: Ty, t: Ty, u: Ty) -> bool {
    let db = get_cached_db();
    let db = &*db;
    let s = s.into_type(db);
    let t = t.into_type(db);
    let u = u.into_type(db);

    !(s.is_subtype_of(db, t) && t.is_subtype_of(db, u)) || s.is_subtype_of(db, u)
}
type_property_test!(
    subtype_of_is_transitive,
    db,
    (s, t, u),
    s.is_subtype_of(db, t) && t.is_subtype_of(db, u) => s.is_subtype_of(db, u)
);

their primary benefit is to reduce the repetition of creating the cached db and dereferencing it

It also removes the necessity for lots of let t = t.into_type(db); lines, and allows us to write logical implications using premise => conclusion instead of !(premise) || (conclusion).

Let me know what you think. I'm going to be okay if we remove it again 😄.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not. Would the benefit be to get rid of the let db = &*db;, or do you think that would be faster?

It would allow you to remove the mutex guard and the deref (you still need to use &db when calling any function)

Because macros are hard to read, or because this particular macro is hard to read? 😄

It's somewhat both. I had to expand the macro to understand what the different parts map to. Like what's up with the (s, t, u)? which part is even the assertion? Some extra documentation might help.

Did you see the macro-vs-non-macro comparison in the PR description? Or this example, which has three types:

I admit, I did not! I see how it simplifies the code a bit, but it doesn't significantly reduce the number of lines.

Could we implement into_type for tuples so that you can keep writing (s, t, u) = (s, t, u).into_types(db)?

It also removes the necessity for lots of let t = t.into_type(db); lines, and allows us to write logical implications using premise => conclusion instead of !(premise) || (conclusion).

I do like that. Maybe we can limit the macro part to just that?

I'm not opposed to keeping the macros. I just had to expand them, and I still spent about 5 minutes trying to figure out what was going on. But I'm also not sure if it's worth it if you spend more time on it.

Copy link
Contributor Author

@sharkdp sharkdp Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like what's up with the (s, t, u)? which part is even the assertion?

These are valid concerns. I now pushed another commit where I change the syntax to the following:

type_property_test!(
    subtype_of_implies_assignable_to, db,
    forall types s, t. s.is_subtype_of(db, t) => s.is_assignable_to(db, t)
);

The intention is that it reads like a mathematical property ∀ S, T. ….

I also simplified the macro definition (only two clauses instead of four), clarified the variable names and added some documentation.

I kind of like how this looks, and would merge it for now. I'm happy to revisit it though if I get more feedback on it (from you or others).

It would allow you to remove the mutex guard and the deref (you still need to use &db when calling any function)

I'll note it down as a task for me (low priority), thanks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I like this much more!

@sharkdp sharkdp merged commit 7430900 into main Dec 3, 2024
21 checks passed
@sharkdp sharkdp deleted the david/property-tests branch December 3, 2024 12:54
dcreager added a commit that referenced this pull request Dec 3, 2024
* main:
  [`ruff`] Extend unnecessary-regular-expression to non-literal strings (`RUF055`) (#14679)
  Minor followups to RUF052 (#14755)
  [red-knot] Property tests (#14178)
  [red-knot] `is_subtype_of` fix for `KnownInstance` types (#14750)
  Improve docs for flake8-use-pathlib rules (#14741)
  [`ruff`] Implemented `used-dummy-variable` (`RUF052`) (#14611)
  [red-knot] Simplify tuples containing `Never` (#14744)
  Possible fix for flaky file watching test (#14543)
  [`flake8-import-conventions`] Improve syntax check for aliases supplied in configuration for `unconventional-import-alias (ICN001)` (#14745)
  [red-knot] Deeper understanding of `LiteralString` (#14649)
  red-knot: support narrowing for bool(E) (#14668)
  [`refurb`] Handle non-finite decimals in `verbose-decimal-constructor (FURB157)` (#14596)
  [red-knot] Re-enable linter corpus tests (#14736)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants