
Implement more scale testing #1461

Open
felipeblazing opened this issue Apr 12, 2021 · 0 comments

We want to test at various dataset sizes, but for now we are still assuming only one- and two-node execution. I think we can test on 1GB and 10GB datasets without running into too many scale issues, even on very complex queries.

For each of the backends and file formats we use for storing data, we need to upload 1GB and 10GB versions of the datasets (we already have these in S3 for Parquet, for example) and run a subset of queries against those files to make sure that we can still run queries at scale.

Look at #1460 to see the various places where we will need to upload these datasets.

As a start, I would pick CSV and Parquet as the file formats we make available.

In addition, someone needs to modify the e2e testing framework so that these tests can be run while specifying the scale and file format to use for the scale tests.
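A minimal sketch of what parameterizing the e2e framework by scale and format could look like. All names here (`scale_test_matrix`, `dataset_path`, the S3 root, the directory layout) are hypothetical and only illustrate the idea of generating one test run per (scale, format) combination; they are not the actual e2e framework API.

```python
# Hypothetical sketch: enumerate every (scale, format) combination
# that the e2e scale tests should cover, along with the dataset
# location each run would read from. Paths and names are illustrative.
import itertools

SCALES = ["1GB", "10GB"]
FORMATS = ["csv", "parquet"]

def dataset_path(root, scale, fmt):
    """Build the dataset location for a given scale and file format
    under an assumed <root>/<scale>/<format> layout."""
    return f"{root}/{scale}/{fmt}"

def scale_test_matrix(root):
    """Yield (scale, format, path) tuples, one per test configuration."""
    for scale, fmt in itertools.product(SCALES, FORMATS):
        yield scale, fmt, dataset_path(root, scale, fmt)
```

A runner could then iterate over `scale_test_matrix(...)` and execute the chosen query subset once per configuration, so adding a new scale or format only means extending the two lists.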

@felipeblazing felipeblazing added the ? - Needs Triage needs team to review and classify label Apr 12, 2021
@kharoc kharoc removed the ? - Needs Triage needs team to review and classify label Apr 16, 2021