option to specify default value upon validation coercion #502
Comments
Thanks @bfmcneill this is a good idea! This highlights the difference between parsing and validation: parsing modifies values to fulfill certain assumptions while validations just check that those assumptions are true, e.g. pydantic is primarily a parsing library while pandera is primarily a validation library. The only parsing that pandera does today is through the Providing an interface for specifying custom parsers will need a little more thought, but I think a I was a little hesitant to expand the parsing capabilities of pandera, the main concern being that it encroaches on the concern of data manipulation (which the the Let me know if you have other thoughts/would have capacity to make a contribution! |
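To make the parsing-versus-validation distinction above concrete, here is a minimal, library-free sketch. The helper names (`is_null`, `validate_no_nulls`, `parse_with_default`) are hypothetical and are not part of pandera or pydantic; they only illustrate that a validator checks an assumption while a parser changes the data so the assumption holds.

```python
import math


def is_null(value):
    """Treat None and float NaN as null-like (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def validate_no_nulls(values):
    """Validation: report whether the assumption holds; the data is not modified."""
    return not any(is_null(v) for v in values)


def parse_with_default(values, default):
    """Parsing: return a modified copy in which null-like entries become `default`."""
    return [default if is_null(v) else v for v in values]


raw = [1.0, float("nan"), None, 4.0]
parsed = parse_with_default(raw, default=0.0)
```

Here `validate_no_nulls(raw)` fails, while the parsed copy passes the same check because the nulls were replaced rather than merely detected.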
I need to specify a default value for a |
I think |
Hi, I would be happy to have a go at this. I had a look at the other issues that @cosmicBboy tagged. The nullable column would also be a good thing.
In my case I would like it to work like Now there is the question of the implementation itself. I have done something that mirrors the implementation of the Again happy to help. Let me know what you think |
Hey @rbeucher thanks! I'm currently doing a major overhaul of pandera's internals, which should make adding new features like this easier: #381 You can check out progress on this branch: https://github.com/unionai-oss/pandera/tree/core-schema I'm gonna try to get this done by the end of November, I'll ping you when it's ready! |
Great. Looking forward to it. I haven't looked at the other branch, but I did see that you mentioned some refactoring in other posts. |
Thanks to all Pandera contributors! I would love a |
What is the status of this issue, still open? Would love to see a |
hi @jtlz2 this work is currently blocked by the completion of the pandera internals re-write, just need to clean up a few things before releasing |
@cosmicBboy I'd be interested 👌 |
@cosmicBboy https://github.com/unionai-oss/pandera/tree/core-schema gives me a 404, this may mean the PR has been merged and branch deleted? If so would be keen to get started on this work. @rbeucher are you still interested? |
Hi @kykyi yes At a high level, here's what needs to happen:
Please check out the contribution guide for the process of making a PR, and feel free to ask me any questions here! |
Thanks @cosmicBboy I'll get started and ask questions here as I go 🙏 🚀 !! |
Yes. I am still very much interested. I have not looked at 0.14 yet. |
Hey @cosmicBboy do you mind please giving some early feedback on my fork? Bit of an open-source n00b so may need some hand holding 😄 |
Hey @cosmicBboy running |
@kykyi looks like codecov was yanked from PyPI: https://twitter.com/hynek/status/1646162688676974594. We will need to spend some time migrating to the new language-independent codecov uploader binary: https://docs.codecov.com/docs/codecov-uploader |
fixed by #1136 |
Describe the solution you'd like
During schema validation, it would be helpful not only to coerce the data type but also to have the option to fill in NaN values.
Describe alternatives you've considered
This could be achieved through pandas dataframe manipulation, but it would be pretty slick to have an option to set a default column value as part of the schema validation.
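For reference, the pandas-only alternative might look roughly like the sketch below. The helper `fill_then_coerce` and the `quantity` column are made up for illustration; only `DataFrame.copy`, `Series.fillna`, and `DataFrame.astype` are real pandas calls. The defaults are filled in first because casting a column that still contains NaN to an integer dtype raises an error.

```python
import pandas as pd


def fill_then_coerce(df, defaults, dtypes):
    """Hypothetical helper: fill per-column defaults, then coerce dtypes.

    `defaults` maps column name -> fill value, `dtypes` maps column name -> dtype.
    """
    out = df.copy()  # leave the caller's dataframe untouched
    for column, default in defaults.items():
        out[column] = out[column].fillna(default)
    return out.astype(dtypes)


df = pd.DataFrame({"quantity": [1.0, None, 3.0]})
clean = fill_then_coerce(df, defaults={"quantity": 0}, dtypes={"quantity": "int64"})
```

The resulting frame could then be passed to schema validation as usual; the request in this issue is to fold the fill step into the schema itself.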
Additional context
Perhaps there is a better way to achieve this which you might recommend?