[e2eTESTING] V tests: Kerchunk vs Pyfive engines #191
On JASMIN/sci2CPU:
Kerchunk-based Pipeline
Result is 4677.8594 (stable)
Kerchunk indexing and JSON file writing times:
Pyfive-based pipeline
Question no 1
Answer
@bnlawrence suggests chunking, and he is correct: the field in the 2.8G file has 30 chunks, while the other field has 3400 chunks. That's the penalty factor right there!
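To make the chunk-count argument concrete, here is a minimal sketch of how the number of chunks follows from an array shape and a chunk shape. The shapes below are hypothetical (the real dataset and chunk shapes are not given in this issue); only the counts 30 and 3400 come from the numbers quoted above.

```python
from math import ceil, prod


def count_chunks(shape, chunk_shape):
    """Number of chunks needed to tile an array of `shape` with `chunk_shape`."""
    return prod(ceil(n / c) for n, c in zip(shape, chunk_shape))


# Hypothetical shapes, for illustration only (not the real datasets).
shape = (120, 64, 128)
coarse = count_chunks(shape, (40, 64, 128))  # few, large chunks
fine = count_chunks(shape, (1, 32, 64))      # many, small chunks

# Ratio of the chunk counts reported in this issue (3400 vs 30).
reported_ratio = 3400 / 30

print(coarse, fine, round(reported_ratio, 1))
```

Each chunk typically costs at least one request, so the per-chunk overhead is paid roughly that many more times. The actual slowdown need not match the ratio exactly.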
Use
So it's starting to look like this:
Kerchunk-based pipeline
Kerchunk indexer:
To (network) and at Reductionist
Total time
Local tests on V Computer
Test code:
Kerchunk is restricted to
Dataset of interest:
Chunks
Both Kerchunk and Pyfive send variable (give or take 5 or 10) numbers of chunks to Reductionist; order of magnitude is 3360 chunks.
Kerchunk-based Pipeline
Result is 4677.8594 (stable)
Kerchunk indexing and JSON file writing times:
Pyfive-based pipeline
Result is 4677.8594 (stable)
Sliced Kerchunk (slice [0:3, 4:6, 7:9])
Sliced Pyfive (slice [0:3, 4:6, 7:9])
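For the sliced runs, far fewer chunks should be involved than the ~3360 sent for the full field. Here is a rough sketch of how to estimate the number of chunks a slice such as [0:3, 4:6, 7:9] touches. The chunk shapes are hypothetical, since the real chunking is not stated here, and `chunks_touched` is an illustrative helper rather than part of Kerchunk, Pyfive, or Reductionist.

```python
from math import prod


def chunks_touched(selection, chunk_shape):
    """Count chunks intersecting a selection of (start, stop) pairs, one per axis."""
    per_axis = []
    for (start, stop), c in zip(selection, chunk_shape):
        first = start // c          # index of first chunk hit on this axis
        last = (stop - 1) // c      # index of last chunk hit on this axis
        per_axis.append(last - first + 1)
    return prod(per_axis)


# The slice used in the tests above: [0:3, 4:6, 7:9]
selection = [(0, 3), (4, 6), (7, 9)]

# Hypothetical chunk shapes, for illustration only.
small_chunks = chunks_touched(selection, (1, 4, 8))
large_chunks = chunks_touched(selection, (64, 64, 64))

print(small_chunks, large_chunks)
```

With small chunks the slice straddles several of them. With large chunks it can fall inside a single one.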