-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Timx 285 aardvark fmr #198
Conversation
* Reorder Aardvark unit tests to match method order * Add get_timdex_record_id unit test * Remove unnecessary non-field method unit test
Why these changes are being introduced: * Update Aardvark class to use current field method conventions How this addresses that need: * Add field methods and associated private methods for alternate_titles, content_type, contributors, and dates * Update unit tests with current conventions Side effects of this change: * None Relevant ticket(s): * https://mitlibraries.atlassian.net/browse/TIMX-285
tests/sources/json/test_aardvark.py
Outdated
def test_aardvark_get_alternate_titles_success(aardvark_record_all_fields): | ||
assert MITAardvark.get_alternate_titles(next(aardvark_record_all_fields)) == [ | ||
def test_aardvark_get_alternate_titles_success(aardvark_records): | ||
source_record = next(aardvark_records) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stub records are a lot different with JSON vs. XML so I don't fully approve of this particular approach but figure we would discuss in this PR before I go further. All suggestions welcome so we can properly set our JSON source testing conventions going forward!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd agree that JSON records are easier to manipulate, given that once was parsed it's just a dictionary we can easily modify.
Instead of using the overhead of an iterator and next(...)
, what about a similar stub method -- for symmetry -- that just returns a barebones dictionary? Then the test is free to add/remove keys as needed.
While I think a test could just do:
source_record = {}
The benefit of a stub might be that we could set some shared defaults, maybe bits and bobs that we know are commonly accessed. But at the end of the day, it is just a dictionary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct about the value of symmetry, I'll create a method!
def _issued_dates(cls, source_record: dict) -> Iterator[timdex.Date]: | ||
if issued_date := source_record.get("dct_issued_s"): # noqa: SIM102 | ||
if validate_date(issued_date, cls.get_source_record_id(source_record)): | ||
yield (timdex.Date(value=issued_date, kind="Issued")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm shifting the date validation before the Date
instance is created, which seems more inline with how we're doing it in other transforms but again, I want to hear everyone's thoughts on this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with this pattern! This is very easy to parse and understand. With a nod to the oustanding date parsing issue, I think this looks great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than a couple of conversational comments, overall looks great! Thinks it's a testament to the quality of the original pass.
tests/sources/json/test_aardvark.py
Outdated
def test_aardvark_get_alternate_titles_success(aardvark_record_all_fields): | ||
assert MITAardvark.get_alternate_titles(next(aardvark_record_all_fields)) == [ | ||
def test_aardvark_get_alternate_titles_success(aardvark_records): | ||
source_record = next(aardvark_records) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd agree that JSON records are easier to manipulate, given that once was parsed it's just a dictionary we can easily modify.
Instead of using the overhead of an iterator and next(...)
, what about a similar stub method -- for symmetry -- that just returns a barebones dictionary? Then the test is free to add/remove keys as needed.
While I think a test could just do:
source_record = {}
The benefit of a stub might be that we could set some shared defaults, maybe bits and bobs that we know are commonly accessed. But at the end of the day, it is just a dictionary.
def _issued_dates(cls, source_record: dict) -> Iterator[timdex.Date]: | ||
if issued_date := source_record.get("dct_issued_s"): # noqa: SIM102 | ||
if validate_date(issued_date, cls.get_source_record_id(source_record)): | ||
yield (timdex.Date(value=issued_date, kind="Issued")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm good with this pattern! This is very easy to parse and understand. With a nod to the oustanding date parsing issue, I think this looks great.
* Add create_aardvark_source_record_stub function * Update unit tests to call stub function * Refactor Aardvark.record_is_deleted method and update corresponding unit test
@ghukill @jonavellecuerdo Pushed a new commit and re-requesting review |
def create_aardvark_source_record_stub() -> dict: | ||
return {"id": "123", "dct_title_s": "Test title 1"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
if isinstance(source_record["gbl_suppressed_b"], bool): | ||
if isinstance(source_record.get("gbl_suppressed_b"), bool): | ||
return source_record["gbl_suppressed_b"] | ||
|
||
message = ( | ||
f"Record ID '{cls.get_source_record_id(source_record)}': " | ||
"'gbl_suppressed_b' value is not a boolean" | ||
"'gbl_suppressed_b' value is not a boolean or missing" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice updates here! Good catch.
Purpose and background context
Refactors the first set of
Aardvark
field methods to align with the conventions developed since this class was initially created.How can a reviewer manually see the effects of these changes?
Run the following command to see that the
Aardvark
class transform still transforms a source file:Includes new or updated dependencies?
NO
Changes expectations for external applications?
NO
What are the relevant tickets?
Developer
Code Reviewer(s)