[TASK] Big Reliability Epic #1870
Labels
epic
Issue that encompasses a significant feature or body of work
P0
Must have for release
reliability
Features to improve reliability or bugs that severely impact the reliability of the plugin
task
Work required that improves the product but is not user facing
test
Only impacts tests
We recently had an issue where contiguousSplit started to fail on 2GB partitions. We know that there are some issues with similar limits in shuffle #45, but it is the unknown unknowns that are more problematic because we cannot make informed decisions about prioritizing fixes for them.
We need to come up with a test plan to really hammer on size limits in both cudf and this plugin so we can have a better understanding of what limits exist and so we can come up with a proper plan to address them.
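As a starting point for such a test plan, the sketch below enumerates byte sizes just below, at, and just above a suspected limit, and converts them to row counts for a fixed-width column. It assumes the 2GB failure is related to a signed 32-bit size or offset (2^31 - 1), which this issue does not confirm. The names boundary_sizes and row_counts_for are hypothetical helpers, not part of cudf or the plugin.

```python
# Hypothetical helpers for picking test sizes around a suspected size limit.
# Assumption: the limit of interest is the signed 32-bit maximum (2**31 - 1).
INT32_MAX = 2**31 - 1


def boundary_sizes(limit=INT32_MAX, deltas=(0, 1, 1024)):
    """Return sorted byte sizes just below, at, and just above `limit`."""
    sizes = set()
    for d in deltas:
        if limit - d > 0:
            sizes.add(limit - d)
        sizes.add(limit + d)
    return sorted(sizes)


def row_counts_for(size_bytes, row_width_bytes):
    """Rows of a fixed-width column needed to reach at least `size_bytes`."""
    return -(-size_bytes // row_width_bytes)  # ceiling division


if __name__ == "__main__":
    # Print each boundary size with the row count for an 8-byte-wide column.
    for size in boundary_sizes():
        print(size, row_counts_for(size, 8))
```

Each generated size could then drive a test that builds a partition of that size and runs the operation under test, so that any failure is pinned to a specific boundary rather than discovered in production.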
Avoid Crashes:
Highest priority:
Next on the list:
Test for new issues:
Auto Tune:
Better Error Reporting:
cudf::cuda_error
rapidsai/cudf#10553