Initial entry point to data generation for scale test #9054
Signed-off-by: Allen Xu <[email protected]>
datagen/src/main/scala/com/nvidia/rapids/tests/scaletest/TableGenerator.scala
datagen/src/main/scala/com/nvidia/rapids/tests/scaletest/DataGenEntry.scala
Resolved comments and added CorrelatedKeyGroup for key groups in tables.
Mostly looks good. Just a few nits
datagen/src/main/scala/com/nvidia/rapids/tests/scaletest/ScaleTestDataGen.scala
Resolved more comments.
Thanks for your patience in doing the rework. I think we are really close now and it looks really good.
Thanks
Thanks for the review! Added a flag argument "--overwrite" and resolved the remaining comments.
ScaleTest.md also appears to have some TODOs in it. Are there plans to fix that?
build
* Init entry point to data generation for scale test
  Signed-off-by: Allen Xu <[email protected]>
* add date range
  Signed-off-by: Allen Xu <[email protected]>
* add correlatedKeyGroup settings for key groups
  Signed-off-by: Allen Xu <[email protected]>
* refine NOTICE file
  Signed-off-by: Allen Xu <[email protected]>
* shorten the code when dealing with data format
  Signed-off-by: Allen Xu <[email protected]>
* resolve more comments and add doc for usage
  Signed-off-by: Allen Xu <[email protected]>
* add --overwrite argument and resolve some comments
  Signed-off-by: Allen Xu <[email protected]>
* style update, unblock CI
  Signed-off-by: Allen Xu <[email protected]>
---------
Signed-off-by: Allen Xu <[email protected]>
As titled.
Closes #8813
This PR aims to provide the initial entry point to the data generation application for scale test.
The design and user interface are described at #8813 (comment)
This is still a DRAFT version, posted for early review and feedback.
One example command to test it locally:
The following examples show the actual disk size the generated data takes, to give a basic impression:
For Scale=1, Complexity=1 and parquet file:
For Scale=1, Complexity=10 and parquet file:
For Scale=10, Complexity=10 and parquet file: