feat: Introduce schema definition. #19
}

/// Set the field's write default value.
pub fn with_write_default(mut self, value: impl ToString) -> Self {
It seems write_default and initial_default should each be a value?
Yes, but Value is not defined yet 🤪
Mostly LGTM! That said, I feel the use of once_cell here is a bit over-designed. Could we later need to add new fields to StructType, for example in schema evaluation?
The Python and Java implementations treat schemas as immutable data structures, and I think we should follow that. Mutability complicates things, especially once we have indexes, for example the name index that will follow.
Got it, makes sense now.
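To illustrate the immutability argument above, here is a minimal sketch of a struct type whose name index is built lazily on first lookup. It uses `std::sync::OnceLock` from the standard library as a stand-in for the once_cell crate mentioned in the review, and the `Field`, `StructType`, and `field_by_name` definitions are simplified, hypothetical versions rather than the ones in this PR.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

/// Hypothetical, simplified field; not the definition from this PR.
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub id: i32,
    pub name: String,
}

/// Immutable struct type: the fields are fixed at construction,
/// so a lazily built name index can never go stale.
#[derive(Debug)]
pub struct StructType {
    fields: Vec<Field>,
    // Filled on the first lookup; maps a field name to its position in `fields`.
    name_index: OnceLock<HashMap<String, usize>>,
}

impl StructType {
    pub fn new(fields: Vec<Field>) -> Self {
        Self {
            fields,
            name_index: OnceLock::new(),
        }
    }

    /// Look up a field by name, building the index once on first use.
    pub fn field_by_name(&self, name: &str) -> Option<&Field> {
        let index = self.name_index.get_or_init(|| {
            self.fields
                .iter()
                .enumerate()
                .map(|(pos, field)| (field.name.clone(), pos))
                .collect()
        });
        index.get(name).map(|&pos| &self.fields[pos])
    }
}
```

Because nothing can push to `fields` after `new`, the cached index never needs invalidation. A mutable `StructType` would have to rebuild or patch the index on every change.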
Should we also include identifier-field-ids?
I just took a look at it. We need the visitor pattern to verify them (for example, their types), so I want to postpone it until after we introduce the schema visitor.
I don't entirely understand. Could you elaborate on why we would need the visitor pattern for the identifier-field-ids? I was thinking about something similar to the serialized representation.
I mean the verification check; you can find the implementation in Java: https://github.com/apache/iceberg/blob/1bb853191fd378fb1dfda5a5cb297475b7fc204b/api/src/main/java/org/apache/iceberg/Schema.java#L104 cc @Fokko @JanKaul Why I don't include
Some small comments, but looks good 👍🏻
r#struct: StructType,
schema_id: i32,
highest_field_id: i32,
}
I think the identifier-field-ids are missing: https://iceberg.apache.org/spec/#identifier-field-ids
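As a rough illustration of the kind of verification discussed in this thread, the sketch below checks whether a field id may serve as an identifier field. It assumes my reading of the rules in the linked spec section: the field must be required, must be a primitive type other than float or double, and must not be nested inside an optional struct or a list. It uses plain recursion in place of a schema visitor, omits maps for brevity, and every type and function name here (`Type`, `Field`, `check_identifier_field`) is hypothetical rather than taken from this PR.

```rust
/// Hypothetical, heavily simplified type model (maps omitted for brevity).
#[derive(Debug, Clone, PartialEq)]
pub enum Type {
    Int,
    Long,
    Float,
    Double,
    String,
    Struct(Vec<Field>),
    List(Box<Field>),
}

#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub id: i32,
    pub required: bool,
    pub ty: Type,
}

/// Returns Ok(()) if the field with id `target` may be an identifier field.
pub fn check_identifier_field(fields: &[Field], target: i32) -> Result<(), String> {
    find(fields, target, true).unwrap_or_else(|| Err(format!("field {} not found", target)))
}

// Walk the tree; `path_ok` is false once we pass through an optional
// struct or a list, both of which disqualify every field beneath them.
fn find(fields: &[Field], target: i32, path_ok: bool) -> Option<Result<(), String>> {
    for field in fields {
        if field.id == target {
            return Some(validate(field, path_ok));
        }
        let nested = match &field.ty {
            Type::Struct(children) => find(children, target, path_ok && field.required),
            Type::List(element) => find(std::slice::from_ref(&**element), target, false),
            _ => None,
        };
        if nested.is_some() {
            return nested;
        }
    }
    None
}

fn validate(field: &Field, path_ok: bool) -> Result<(), String> {
    if !path_ok {
        return Err(format!("field {} is nested in an optional struct or a list", field.id));
    }
    if !field.required {
        return Err(format!("field {} is optional", field.id));
    }
    match &field.ty {
        Type::Float | Type::Double => Err(format!("field {} is a floating-point type", field.id)),
        Type::Struct(_) | Type::List(_) => Err(format!("field {} is not a primitive type", field.id)),
        _ => Ok(()),
    }
}
```

The check needs each field's type, its required flag, and the properties of its ancestors, which is why it is more naturally expressed as a traversal than as a lookup on a flat list of ids.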
Initial schema definition.
SchemaVisitor and name indexes will come later.