-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Gdt 82 transformer class refactor #106
Conversation
* Rename Transformer > XmlTransformer to prepare for new base class * Rename xml arg to source_record for XmlTransformer methods
Why these changes are being introduced: * A format-agnostic Transformer base class is needed for deriving both XmlTransformer and JsonTransformer format classes How this addresses that need: * Create a Transformer base class * Add JSON type for validation * Refactor XmlTransformer to derive from Transformer class * Rename arg xml > source_record and update docstrings Side effects of this change: * None Relevant ticket(s): * https://mitlibraries.atlassian.net/browse/GDT-82
Why these changes are being introduced: * parse_xml_records is a function in the helpers module that will be replaced by a method in the Transformer base class How this addresses that need: * Add parse_source_records method to Transformer base class as an abstractmethod * Add parse_source_records to XmlTransformer class and corresponding unit test Side effects of this change: * None Relevant ticket(s): * https://mitlibraries.atlassian.net/browse/GDT-82
* Remove parse_xml_records function from helpers module and corresponding unit test * Replace all calls of that function with the appropriate class method
* Shift get_transformer from config module to Transformer class staticmethod along with corresponding unit tests * Update CLI to call new method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I think this looks fantastic.
The requested changes are mostly quite small, just some rough edges from a fairly big refactoring process.
And, acknowledging per conversations that some more changes may be coming related to getting the appropriate transformer class type and parsing input records.
* Remove instance from transformer_instance variable name * Update write_timdex_records_to_json type hinting * Update get_transformer type hinting * Update docstrings
* Add load and write_timdex_records_to_json methods to Transformer class * Remove write_timdex_records_to_json function from helpers module * Update CLI command with new methods * Update variable names and docstrings for consistency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved!
I did have one question, but perhaps this is slated for a future PR. Any reason to not move helpers.write_deleted_records_to_file()
(link) to a Transformer
method as well?
If so, I think it calls into question the nature of the method refactored method Transformer.write_timdex_records_to_json()
.
I wonder if a higher level method like Transformer.write_output_files()
could encapsulate the logic and processing of both? It could:
- write all normal records to
output_file
- if deleted records, rename
output_file
and write - log all information that was formerly in CLI
Still marking as approved, as I think this PR represents a big and good change, but just floating that idea for future PRs.
Exactly what I was thinking, saving for a future PR since this was already so big! |
Helpful background context
The addition of the
Transformer
base class and associated refactoring will be followed by a newJsonTransformer
format class that will be used for upcomingaardvark
transforms.How can a reviewer manually see the effects of these changes?
Run a transform on a test fixture to ensure that the base class and associated refactor hasn't broken the application
What are the relevant tickets?
Developer
Code Reviewer
(not just this pull request message)
Includes new or updated dependencies?
YES