Investigate using S3 Select #48
Comments
What are you trying to achieve? It looks like SELECT only queries JSON structures?
It was raised on Slack (https://the-asf.slack.com/archives/C01QUFS30TD/p1645989728729579?thread_ts=1645245240.528129&cid=C01QUFS30TD). I don't have any particular insight at this stage; I just created this to log the request and will look into it later when I have more time.
If I recall correctly, S3 Select works on CSV, JSON, and Parquet, but I read about it a while ago, so don't hold me to that. Doing zero research, I thought maybe we could add something like a Honestly, though, I haven't used it before or had time to look into this, so I'll come back to it or see if someone else (maybe the person who raised it) looks into it.
Hi, I raised this as an idea only.
As of 2022-02, from source
S3 Select supports aggregation pushdown and predicate pushdown, so it could improve performance depending on the use case. See, e.g., "Using S3 Select Pushdown with Presto to improve performance".
I am now looking into this. Let me share my investigation and opinion.
S3 Select itself
How to achieve the S3 Select acceleration
As per the previous two examples, we should integrate S3 Select into the CSV scan. Since we need to pass predicates and build the SQL query from them, I believe it's not an Actually, I did the implementation as a
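To illustrate the "pass predicates and build the SQL query from them" step, here is a minimal sketch in Rust. It is not the implementation mentioned above: the `Predicate` and `Literal` types and the `build_s3_select_sql` function are hypothetical stand-ins, not DataFusion or aws-sdk-rust APIs. It assumes the CSV object has a header row (so columns can be addressed by name through the `S3Object s` alias) and that integer comparisons need an explicit `CAST`, since S3 Select reads CSV fields as strings.

```rust
/// Hypothetical literal type for this sketch (not a DataFusion type).
#[derive(Debug, Clone, PartialEq)]
pub enum Literal {
    Int(i64),
    Str(String),
}

/// Hypothetical, simplified filter expression (not DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
pub enum Predicate {
    Eq(String, Literal),
    Gt(String, Literal),
    Lt(String, Literal),
    And(Box<Predicate>, Box<Predicate>),
}

/// Refer to a CSV header column through the `s` alias.
/// Assumption: column names contain no double quotes.
fn column_sql(name: &str) -> String {
    format!("s.\"{}\"", name)
}

/// Render one comparison. Integer literals cast the column first,
/// because CSV fields are treated as strings by default.
fn comparison_sql(column: &str, op: &str, value: &Literal) -> String {
    match value {
        Literal::Int(v) => format!("CAST({} AS INT) {} {}", column_sql(column), op, v),
        // Assumption: single quotes are escaped by doubling, as in standard SQL.
        Literal::Str(v) => format!("{} {} '{}'", column_sql(column), op, v.replace('\'', "''")),
    }
}

/// Recursively render a predicate tree as a WHERE-clause fragment.
fn predicate_sql(predicate: &Predicate) -> String {
    match predicate {
        Predicate::Eq(c, v) => comparison_sql(c, "=", v),
        Predicate::Gt(c, v) => comparison_sql(c, ">", v),
        Predicate::Lt(c, v) => comparison_sql(c, "<", v),
        Predicate::And(l, r) => format!("({} AND {})", predicate_sql(l), predicate_sql(r)),
    }
}

/// Build the SQL expression string that would be sent with the S3 Select request.
/// An empty projection selects every column.
pub fn build_s3_select_sql(projection: &[&str], filter: Option<&Predicate>) -> String {
    let columns = if projection.is_empty() {
        "*".to_string()
    } else {
        projection
            .iter()
            .map(|c| column_sql(c))
            .collect::<Vec<_>>()
            .join(", ")
    };
    let mut sql = format!("SELECT {} FROM S3Object s", columns);
    if let Some(p) = filter {
        sql.push_str(" WHERE ");
        sql.push_str(&predicate_sql(p));
    }
    sql
}
```

A scan would call `build_s3_select_sql(&["name", "age"], Some(&filter))` with the pushed-down projection and filter, then pass the resulting string as the query expression of the S3 Select request. Predicates that cannot be expressed this way would have to stay in the engine and be applied after the scan.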
@Licht-T Hi, thanks for raising this. This repo will be archived soon. There is now object_store, which is preferred. I recommend raising this request there.
It seems support was added for this based on https://github.com/awslabs/aws-sdk-rust/releases/tag/v0.0.17-alpha
Look into integrating this into S3FileSystem or using it to create a TableProvider.