This repository has been archived by the owner on Mar 9, 2018. It is now read-only.

Add possibility to check values to index validator #36

Open

markusbaden

Sometimes we want to validate that a DataFrame contains certain columns, without worrying about the contents of those columns.

In my case, I am parsing a file as part of an ETL process and want to check the result. I expect the file to always have the same columns in the same order. If the expected columns are ['a', 'b'], then

pd.DataFrame(
    [
        [1, 2], 
        [3, 4],
    ],
    columns=['a', 'b'],
)

would be valid, while both

pd.DataFrame(
    [
        [2, 1], 
        [4, 3],
    ],
    columns=['b', 'a'],
)

and

pd.DataFrame(
    [
        [1, 2], 
        [3, 4],
    ],
    columns=['x', 'y'],
)

would not be valid. This check is very strict (i.e. the order of the columns matters), and we might want to relax it in future iterations.
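To illustrate the strict check described above, here is a minimal sketch in plain pandas. It is not necessarily how this pull request implements it; the helper name `has_exact_columns` is hypothetical and not part of this library's API.

```python
import pandas as pd


def has_exact_columns(df, expected):
    """Return True only if df's column labels equal `expected`, in the same order.

    Hypothetical helper for illustration; not part of this library.
    """
    return list(df.columns) == list(expected)


expected = ['a', 'b']

# The three frames from the description above.
valid = pd.DataFrame([[1, 2], [3, 4]], columns=['a', 'b'])
reordered = pd.DataFrame([[2, 1], [4, 3]], columns=['b', 'a'])
renamed = pd.DataFrame([[1, 2], [3, 4]], columns=['x', 'y'])

results = [has_exact_columns(df, expected) for df in (valid, reordered, renamed)]
print(results)  # [True, False, False]
```

A relaxed, order-insensitive variant could compare `set(df.columns) == set(expected)` instead, at the cost of no longer detecting reordered columns.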
