Core: complete task JSON serialization for other types (like data task, manifest task) #9597
Comments
Regarding
The main question is how to serialize the |
I think we could make this similar to how it's done for a In this case we could have a I haven't looked what challenges we'd have with serializing all subclasses of |
@nastra thanks for the comments! Regarding the JSON format, we are on the same page of adding a new Are you suggesting adding a new API Class renaming/moving around can be a problem although I don't know if practically we should do that. We can add a unit test to assert the class's FQCN didn't change. If renamed/relocated, the parser needs to be updated to track both old and new names. What's your take on the problem of serializing the |
I was suggesting an enum type at the JSON level, not at the API level (similar to how it's done for Regarding |
@nastra
If I understand you correctly, we can define the enum type inside the |
@stevenzwu this is only because |
yeah. we are on the same page now. regarding the
I am thinking maybe we should implement a
|
@nastra @aokolnychyi any feedback on the proposal of adding a As for the
|
Regarding the
We can add a new package private constructor
|
I think it is reasonable to have a field in JSON that would indicate the task type. I'd also avoid any changes in task APIs, we can leverage that enum only in the parser. I doubt using FQCN is a good idea as it would make the implementation specific to Java. Can we reuse |
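A minimal sketch of the parser-only enum idea from this comment. The enum name, its constants, and the string values ("file-scan-task", "data-task") are hypothetical names chosen for illustration. The point is that the JSON carries a short, language-neutral name instead of a Java FQCN, so renaming or moving a class does not change the wire format.

```java
import java.util.Locale;

public class TaskTypeEnumSketch {
  // Hypothetical enum that lives only in the parser, not in the task API.
  public enum TaskType {
    FILE_SCAN_TASK("file-scan-task"),
    DATA_TASK("data-task");

    private final String typeName;

    TaskType(String typeName) {
      this.typeName = typeName;
    }

    // The language-neutral name that would be written to JSON.
    public String typeName() {
      return typeName;
    }

    // Maps a JSON string back to the enum, rejecting unknown names.
    public static TaskType fromTypeName(String value) {
      if (value == null) {
        throw new IllegalArgumentException("Task type must not be null");
      }
      String normalized = value.toLowerCase(Locale.ROOT);
      for (TaskType type : values()) {
        if (type.typeName.equals(normalized)) {
          return type;
        }
      }
      throw new IllegalArgumentException("Unknown task type: " + value);
    }
  }

  public static void main(String[] args) {
    System.out.println(TaskType.fromTypeName("data-task"));
    System.out.println(TaskType.FILE_SCAN_TASK.typeName());
  }
}
```

A parser could switch on the resolved enum value to pick the matching per-type serializer, which keeps the task interfaces themselves untouched.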
Given that the task serialization is part of the spec, shall we raise this discussion during the sync? |
I think we should definitely use single-value serialization for the values in the structs when we convert to JSON. I probably wouldn't use objects, though. We could use a list and send values by position instead. |
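To illustrate the "list by position" suggestion, here is a rough sketch that writes one row of values as a JSON array instead of a JSON object keyed by field name. The class and method names are hypothetical, and the value handling is deliberately simplified (numbers, booleans, strings, and null only); it does not implement the spec's single-value serialization for every type.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

public class PositionalRowSketch {
  // Serializes row values by position: the reader is assumed to know the
  // schema and to match each array element to the field at the same index.
  public static String toJsonArray(List<Object> values) {
    StringJoiner joiner = new StringJoiner(",", "[", "]");
    for (Object value : values) {
      if (value == null) {
        joiner.add("null");
      } else if (value instanceof Number || value instanceof Boolean) {
        joiner.add(value.toString());
      } else {
        // Simplified escaping: only backslash and double quote are handled.
        String escaped = value.toString().replace("\\", "\\\\").replace("\"", "\\\"");
        joiner.add("\"" + escaped + "\"");
      }
    }
    return joiner.toString();
  }

  public static void main(String[] args) {
    // Arrays.asList is used because List.of rejects null elements.
    System.out.println(toJsonArray(Arrays.asList(1, "a", null)));
  }
}
```

Compared with an object per row, the positional form avoids repeating field names in every row, at the cost of requiring the schema to interpret the values.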
today @rdblue are you suggesting a different behavior than the current |
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible. |
This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale' |
Feature Request / Improvement
Right now, the FileScanTaskParser JSON serializer only handles BaseFileScanTask for data files (see issue #1698). There are other FileScanTask impl classes that are not covered: StaticDataTask, AllManifestsTable$ManifestListReadTask, BaseEntriesTable$ManifestReadTask, BaseFilesTable$ManifestReadTask. This was discovered while I was working on Flink FLIP-27 IcebergSource unit tests with metadata tables.

I propose that we add a type (or impl) field to the JSON format that captures the FQCN of the FileScanTask implementation class. Fortunately, with the JSON format, this can be a backward-compatible change: if the type field is not present, the implementation class defaults to BaseFileScanTask.

We can also incrementally add the missing implementations. The next one to tackle should be StaticDataTask.

cc @nastra @rdblue @aokolnychyi @pvary
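The backward-compatible default described in the proposal could look roughly like the sketch below. It works on an already-parsed JSON object represented as a plain Map, so no JSON library is needed. The field name "type", the helper method, and the fully qualified class names are assumptions made for illustration, not the actual parser code.

```java
import java.util.Map;

public class TaskTypeFieldSketch {
  // Hypothetical field name; the proposal mentions "type" (or "impl").
  public static final String TYPE_FIELD = "type";

  // Assumed FQCN of the default implementation class.
  public static final String DEFAULT_IMPL = "org.apache.iceberg.BaseFileScanTask";

  // Returns the implementation name recorded in the JSON object, falling
  // back to the default when the field is absent, as in older payloads.
  public static String resolveImpl(Map<String, Object> json) {
    Object value = json.get(TYPE_FIELD);
    return value == null ? DEFAULT_IMPL : value.toString();
  }

  public static void main(String[] args) {
    // Payload written before the field existed: resolves to the default.
    System.out.println(resolveImpl(Map.of("spec", "...")));
    // Payload that names its implementation explicitly.
    System.out.println(resolveImpl(Map.of(TYPE_FIELD, "org.apache.iceberg.StaticDataTask")));
  }
}
```

Because a missing field maps to the existing behavior, JSON produced before the change would still parse, and new task types can be added one at a time.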
Query engine
None