Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug Fix: metadata_location to be optional in TableResponse #1321

Merged
merged 4 commits into from
Nov 14, 2024

Conversation

sungwy
Copy link
Collaborator

@sungwy sungwy commented Nov 14, 2024

While testing out the iceberg-rest-catalog docker image, I ran into the following error when parsing the following example TableResponse that is created in tests/integration/test_writes/test_writes.py::test_create_table_transaction[session_catalog-2]

{'metadata': {'format-version': 2, 'table-uuid': 'f269bf0e-f8c9-4443-a341-e5bd4936892a', 'location': 's3://warehouse/default/arrow_create_table_transaction_local_2', 'last-sequence-number': 0, 'last-updated-ms': 1731553376222, 'last-column-id': 1, 'current-schema-id': 0, 'schemas': [{'type': 'struct', 'schema-id': 0, 'fields': [{'id': 1, 'name': 'foo', 'required': False, 'type': 'string'}]}], 'default-spec-id': 0, 'partition-specs': [{'spec-id': 0, 'fields': []}], 'last-partition-id': 999, 'default-sort-order-id': 0, 'sort-orders': [{'order-id': 0, 'fields': []}], 'properties': {'created-at': '2024-11-14T03:02:56.222721210Z', 'write.parquet.compression-codec': 'zstd'}, 'current-snapshot-id': -1, 'refs': {}, 'snapshots': [], 'statistics': [], 'partition-statistics': [], 'snapshot-log': [], 'metadata-log': []}}

Error trace

E       pydantic_core._pydantic_core.ValidationError: 1 validation error for TableResponse
E       metadata-location
E         Field required [type=missing, input_value={'metadata': {'format-ver...[], 'metadata-log': []}}, input_type=dict]
E           For further information visit https://errors.pydantic.dev/2.9/v/missing

The following is noted from pydantic docs:

A field annotated as typing.Optional[T] will be required, and will allow for a value of None. It does not mean that the field has a default value of None. (This is a breaking change from V1.)

@kevinjqliu I think this is a bug that we should consider in cutting out a new RC for the ongoing 0.8.0 release. The implication of not fixing this bug, is that TableResponse from staged tables cannot be parsed by PyIceberg and will lead to a fatal exception :(

According to the Iceberg Rest Catalog Spec, the metadata-location is an optional field, and may be absent from the TableResponse

https://github.com/apache/iceberg/blob/3659ded18d50206576985339bd55cd82f5e200cc/open-api/rest-catalog-open-api.yaml#L3201-L3208

@sungwy sungwy changed the title Bug Fix: metadata_location optional Bug Fix: metadata_location to be optional in TableResponse Nov 14, 2024
Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @sungwy this is a great catch! Since @kevinjqliu is the release manager, we can either follow up with this in 0.8.1 or cut another RC with this fix included.

I think it would be good to have a test to cover this. It should be pretty straightforward by adding another fixture that covers this case:

https://github.com/apache/iceberg-python/blob/2ba86b5c852cb0c8f4b4ba95d54fb5a6bfa00fa2/tests/catalog/test_rest.py#L70-L79

WDYT?

@sungwy
Copy link
Collaborator Author

sungwy commented Nov 14, 2024

I think it would be good to have a test to cover this. It should be pretty straightforward by adding another fixture that covers this case:

Just a test for testing the creation of a staged table @Fokko - thank you for the suggestion!

Hey @sungwy this is a great catch! Since @kevinjqliu is the release manager, we can either follow up with this in 0.8.1 or cut another RC with this fix included.

And yes, I agree it'll be fine either way. I'll leave it up to the captain 🚢

Copy link
Contributor

@kevinjqliu kevinjqliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Lets cut another RC for this

iceberg Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: whats this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good question - I'll get it removed

@kevinjqliu kevinjqliu merged commit 3ccdc44 into apache:main Nov 14, 2024
7 checks passed
@sungwy sungwy deleted the bf-table-response branch November 14, 2024 20:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants