Is your feature request related to a problem?
Currently, refreshing a Flint index depends on polling inside the Spark FileStreamSource operator. This can cause performance problems, especially when the source table contains a large number of partitions and files, because each poll has to list the files in S3 again.
What solution would you like?
The proposal is to allow users to provide an SNS topic for the S3 data source. This way, the streaming execution can find the "delta" (the list of changed files) efficiently from notifications instead of listing files.
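To illustrate what the "delta" could look like, here is a minimal Python sketch (not Flint code) that extracts the changed file list from one S3 event notification delivered through SNS. It assumes the standard SNS envelope, where the S3 event is a JSON string in the `Message` field, and the standard S3 event layout with a `Records` array. The function name `changed_files` is hypothetical.

```python
import json
from typing import List, Tuple
from urllib.parse import unquote_plus


def changed_files(sns_envelope: str) -> List[Tuple[str, str]]:
    """Return (event_name, s3_uri) pairs from one SNS-wrapped S3 event.

    Hypothetical helper for illustration only; not part of Flint or Spark.
    """
    envelope = json.loads(sns_envelope)
    # SNS carries the S3 event as a JSON string in the "Message" field.
    s3_event = json.loads(envelope["Message"])
    delta = []
    # Test events sent by S3 have no "Records" key, so default to empty.
    for record in s3_event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 event notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        delta.append((record["eventName"], f"s3://{bucket}/{key}"))
    return delta
```

A streaming source could consume such pairs to decide which files to read in the next micro-batch, rather than re-listing every partition.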
Questions to think about:
Should this option be provided on the source table or in the Flint index DDL statement?
Do we handle only new changes via notifications, or can we also load cold (pre-existing) data?
What alternatives have you considered?
Provide some way for users to refresh source table metadata periodically. However, it is unclear how to do this because:
Spark Hive table: the MSCK REPAIR TABLE statement works for this purpose, but Hive tables don't support Spark Structured Streaming.
Spark data source table: as mentioned above, FileStreamSource polls the S3 file list.
Do you have any additional context?
N/A