No row index provided in failure_cases["index"] after type check failure #1080

Closed
2 of 3 tasks
shanetorres opened this issue Jan 25, 2023 · 4 comments · Fixed by #1106
Labels
bug Something isn't working

Comments

shanetorres commented Jan 25, 2023

Describe the bug
If a column fails the initial type check, no row index is reported in failure_cases["index"], so I am unable to drop the failing row from the original dataframe.

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of pandera.
  • (optional) I have confirmed this bug exists on the master branch of pandera.

Note: Please read this guide detailing how to provide the necessary information for us to reproduce your bug.

Code Sample, a copy-pastable example

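import pandas as pd
import pandera as pa

# Col1 must contain strings
schema: pa.DataFrameSchema = pa.DataFrameSchema(columns={
    "Col1": pa.Column(str),
})

# The third value is an int, so it should fail the type check
df: pd.DataFrame = pd.DataFrame({
    "Col1": ["1", "2", 3]
})

try:
    schema(df, lazy=True)
except pa.errors.SchemaErrors as exc:
    # Drop every row whose index is reported in failure_cases
    filtered_df = df[~df.index.isin(exc.failure_cases["index"])]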

Expected behavior

I would expect the failure cases index to include the index of the third row, which contains an integer in Col1 instead of a string. That way, I could drop the failing row and continue processing rather than halt the application. However, no index is present. Printing exc.failure_cases["index"] gives:

Out[1]: 
0    None
Name: index, dtype: object
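
As a stopgap while the index is missing, the offending rows can be located with plain pandas, independent of pandera. This is only a minimal sketch, assuming the goal is simply to keep rows whose Col1 value is a Python str; the helper name drop_non_str_rows is hypothetical and not part of pandera:

```python
import pandas as pd


def drop_non_str_rows(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Keep only the rows whose value in `column` is a Python str.

    Hypothetical helper for illustration; not part of pandera.
    """
    # Element-wise isinstance check, since the column has object dtype
    is_str = df[column].map(lambda value: isinstance(value, str))
    return df[is_str]


df = pd.DataFrame({"Col1": ["1", "2", 3]})
filtered_df = drop_non_str_rows(df, "Col1")
print(filtered_df.index.tolist())  # [0, 1]
```

This sidesteps failure_cases entirely, so it only covers the plain "value is not a str" case and not any other checks the schema might define.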

Desktop (please complete the following information):

  • OS: Windows

@KarthikKothareddy

Hello, any update on this bug?

@cosmicBboy
Collaborator

Hi @shanetorres, which version of pandera are you using?

I'm seeing a different issue: validation passes, which is a bug in the way str types are validated. Can you try doing a development installation of the main branch and re-running your code?
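
One way to report the installed version is to read the package metadata with the standard library. This is a rough sketch; the helper name installed_version is made up for illustration, and it returns None when the package is not installed:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if pandera is installed, otherwise None
print(installed_version("pandera"))
```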

@cosmicBboy
Collaborator

Created this PR to fix the issue I was seeing on the main branch. @shanetorres, let me know if you can confirm that it fixes the issue you're seeing in your code.

@patelnets

Sorry for reopening this, but I'm still seeing this issue.
