Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DISC: Use of assertions in user-facing code #37350

Closed
dsaxton opened this issue Oct 22, 2020 · 7 comments
Closed

DISC: Use of assertions in user-facing code #37350

dsaxton opened this issue Oct 22, 2020 · 7 comments
Labels
Error Reporting Incorrect or improved errors from pandas Needs Discussion Requires discussion from core team before further action

Comments

@dsaxton
Copy link
Member

dsaxton commented Oct 22, 2020

There are a variety of places in the code base where we have some bare assertions that can be somewhat confusing when encountered by end-users (e.g., #35509). As @WillAyd pointed out these are sometimes useful in convincing mypy of something's type, but is this nonetheless something we want to avoid as a general rule? If indeed the assertion is valid then it shouldn't actually raise, but if it can then probably we could come up with a more appropriate error type / message than an empty assertion.

@dsaxton dsaxton added Error Reporting Incorrect or improved errors from pandas Needs Discussion Requires discussion from core team before further action labels Oct 22, 2020
@jbrockmendel
Copy link
Member

IIUC these are largely mypy-motivated. on those im going to follow @simonjayhawkins's lead

@simonjayhawkins
Copy link
Member

simonjayhawkins commented Oct 26, 2020

see also #32785 for related discussion or asserts for mypy. (we have had a few discussions on this in the past in different places.)

IIUC these are largely mypy-motivated.

in the issue referenced, it appears to not be a mypy related assert, as a assert/cast is not necessary for type narrowing in this case.

but is this nonetheless something we want to avoid as a general rule?

the assertion did cause as regression, but at the same time did reveal that a wrong path was being taken (#37023 (comment)) and a gap in our testing.

In this case the assert was helpful (even if not for end users) since although the assert added should not have been necessary since the slobj argument of _slice is typed as slice and we would expect mypy to report this at the call site.

The revealed type of indexer in __getitem__ is 'Any' ( 'reveal_type' always outputs 'Any' in unchecked functions)

As we add more types (and refine the return types) asserts and casts could become less necessary, however there are other issues such as is_* functions not narrowing types.

If indeed the assertion is valid then it shouldn't actually raise, but if it can then probably we could come up with a more appropriate error type / message than an empty assertion.

These assertions are generally to detect bugs in internal pandas code and not users code, so in many cases not necessary imo.

@dsaxton
Copy link
Member Author

dsaxton commented Oct 26, 2020

I can see how these could be useful in finding paths that aren't being tested directly, but am more wondering whether it's appropriate to release them into the wild (I don't know if there's an easy way to remove assertions in release code).

If there are any undetected bugs at release time then I would say the assertions have not succeeded in finding them, at least during development. To me it seems allowing these bugs to then break end user code through a bare assertion (and then hopefully leading to an issue being created on GitHub) may not be the best approach, especially since the bug might not be critical for what the user is trying to accomplish.

@WillAyd
Copy link
Member

WillAyd commented Oct 26, 2020 via email

@rhshadrach
Copy link
Member

rhshadrach commented Oct 27, 2020

IMO asserts should be used to check an assumption that means there is a bug when incorrect. As an end user, if I see code raising an assert, it makes me more aware that I may have stumbled on a bug rather than incorrect usage. Asserts can be incorrectly placed resulting in unwelcome behavior, but that's the same with any line of code.

@dsaxton dsaxton closed this as completed Oct 31, 2020
@jorisvandenbossche
Copy link
Member

General typing question: why not use cast instead of assert isinstance ?
(or stated differently: what are the differences and when should we use which of those two generally speaking?)

@WillAyd
Copy link
Member

WillAyd commented Dec 7, 2020 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Error Reporting Incorrect or improved errors from pandas Needs Discussion Requires discussion from core team before further action
Projects
None yet
Development

No branches or pull requests

6 participants