feat: Report all fields which fail registry transformation #496
if len(errs) > 0:
    raise RegistryTransformerException(f"Could not upsert record {name}:" + "\n".join(errs))
-if len(errs) > 0:
-    raise RegistryTransformerException(f"Could not upsert record {name}:" + "\n".join(errs))
+if len(errs) > 0:
+    if len(errs) > 9:
+        rest = len(errs) - 9
+        errs = errs[:9] + [f"({rest} error(s) hidden)"]
+    # textwrap.indent requires a prefix argument; two spaces are assumed here
+    raise RegistryTransformerException(f"Could not upsert record {name}:" + textwrap.indent("\n".join(errs), "  "))
Hey @ayushkamat
Thanks for the feedback. I've incorporated the above suggestion, but I'd really prefer to present an error message that includes all transformation failures.
This exception is raised when uploading records to the Registry. This often occurs in the final stages of a workflow, so if a subset of errors is obscured, the user must re-execute the entire workflow to discover the remaining errors. This can be time-consuming and expensive.
The hope of this PR was to expose all errors at once to avoid this. With that in mind, would you be willing to keep the original implementation?
I do like the suggestion to use textwrap, and I'll incorporate that regardless.
Thanks!
I see your point. My counter-argument is that typically there are very few "structurally unique" errors when doing bulk inserts (e.g. if you have an error like "LatchFile" cannot be assigned to type "str" or something, it's likely that there are 300 more errors with the exact same message). Because of this, I think there is limited value in printing everything out all of the time, and we should still limit the output of this to avoid spam.
That being said, for now it's fine to print everything out, given it will be some work to separate out all of the unique errors.
Feel free to leave it as is and print everything, but please add a TODO to filter this (you can assign me to it).
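For reference, one possible shape for that filtering is sketched below. It is only an illustration: summarize_errors and its limit parameter are hypothetical names that do not appear in this PR, and exact-match grouping is a simplification of "structurally unique", since messages that differ only in a field name or value would still count as distinct.

```python
from collections import Counter


def summarize_errors(errs, limit=9):
    """Collapse repeated messages and cap the number of distinct ones shown.

    Hypothetical helper, not part of the PR. Messages are grouped by exact
    string equality only.
    """
    counts = Counter(errs)
    lines = []
    # most_common orders by count, highest first
    for msg, n in counts.most_common(limit):
        lines.append(f"{msg} (x{n})" if n > 1 else msg)
    hidden = len(counts) - len(lines)
    if hidden > 0:
        lines.append(f"({hidden} more unique error(s) hidden)")
    return lines
```

Grouping before truncating means a message repeated hundreds of times takes up one line, so the cap applies to distinct failures rather than to raw output length.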
I see what you're saying.
To be clear - this message is reporting all the errors associated with the insert of a single record, not a batch insert. While it is still unlikely that a single record would yield more than 10 errors (hopefully 🙂 ), I think in this context it would be appropriate to leave the set of errors unfiltered.
I agree that it's sensible to filter error messaging for a bulk insert, but I think that would happen elsewhere upstream.
Co-authored-by: Ayush Kamat <[email protected]>
Hi,
I ran into a bug that was a bit challenging to troubleshoot because the error reporting in to_registry_literal() does not include the name of the failing field, and only sometimes includes the failing value.
The name of each transformed field is only in scope within upsert_record(), so I thought it most sensible to catch the thrown exception there and add the relevant field and value to the error message.
This pattern has the added benefit of validating all of the upserted values before erroring out, so in the event multiple fields are malformatted, the user can see all malformatted fields at once instead of one malformatted field per execution.
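The pattern described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in the PR: the signatures of to_registry_literal() and upsert_record(), the schema argument, and the type check are all assumptions made so the example runs on its own. Only the function names, the exception name, and the "Could not upsert record" message come from this thread.

```python
import textwrap


class RegistryTransformerException(ValueError):
    """Stand-in for the exception named in the PR; the real class may differ."""


def to_registry_literal(value, expected_type):
    # Hypothetical stand-in: the real function's signature is not shown here.
    if not isinstance(value, expected_type):
        raise RegistryTransformerException(
            f"cannot convert {type(value).__name__} to {expected_type.__name__}"
        )
    return value


def upsert_record(name, schema, **values):
    """Transform every field first, then raise once with all failures."""
    literals = {}
    errs = []
    for field, value in values.items():
        try:
            literals[field] = to_registry_literal(value, schema[field])
        except RegistryTransformerException as e:
            # The field name is only in scope here, so attach it (and the
            # offending value) to the message at this level.
            errs.append(f"{field}={value!r}: {e}")
    if len(errs) > 0:
        raise RegistryTransformerException(
            f"Could not upsert record {name}:\n"
            + textwrap.indent("\n".join(errs), "  ")
        )
    return literals
```

Because the loop keeps going after a failure, a record with two bad fields produces one exception that names both, rather than one failure per execution.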