
Implement more scale testing #1461

Open
felipeblazing opened this issue Apr 12, 2021 · 0 comments

We want to test at various dataset sizes, but for now we are still assuming only one- and two-node execution. I think we can test on 1GB and 10GB datasets without running into too many scale issues, even on very complex queries.

For each of the backends and file formats we use for storing data, we need to upload 1GB and 10GB versions of the datasets (we already have these in S3 for Parquet, for example) and run a subset of queries against those files to make sure that we can still run queries at scale.

Look at #1460 to see the various places where we will need to upload these datasets.

As a start, I would pick CSV and Parquet as the file formats we make available.

In addition, someone needs to modify the e2e testing framework so that these tests can be run while specifying the scale and file format to use for the scale tests.
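A minimal sketch of what parameterizing the e2e framework by scale and format could look like. All names here (`scale_test_matrix`, `dataset_path`, the S3 root, the directory layout) are hypothetical and only illustrate the idea of generating one test run per (scale, format) combination; they are not the actual e2e framework API.

```python
# Hypothetical sketch: enumerate every (scale, format) combination
# that the e2e scale tests should cover, along with the dataset
# location each run would read from. Paths and names are illustrative.
import itertools

SCALES = ["1GB", "10GB"]
FORMATS = ["csv", "parquet"]

def dataset_path(root, scale, fmt):
    """Build the dataset location for a given scale and file format
    under an assumed <root>/<scale>/<format> layout."""
    return f"{root}/{scale}/{fmt}"

def scale_test_matrix(root):
    """Yield (scale, format, path) tuples, one per test configuration."""
    for scale, fmt in itertools.product(SCALES, FORMATS):
        yield scale, fmt, dataset_path(root, scale, fmt)
```

A runner could then iterate over `scale_test_matrix(...)` and execute the chosen query subset once per configuration, so adding a new scale or format only means extending the two lists.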

@felipeblazing felipeblazing added the ? - Needs Triage needs team to review and classify label Apr 12, 2021
@kharoc kharoc removed the ? - Needs Triage needs team to review and classify label Apr 16, 2021