Describe the bug
A clear and concise description of what the bug is.
Doctests in dataframe.rs take a very long time to run on the main branch. Moreover, the tests likely use so many resources that it is hard even to submit this issue or switch to another tab while they are running.
test src/dataframe.rs - dataframe::DataFrame (line 62) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::aggregate (line 189) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::cache (line 860) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::collect (line 471) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::collect_partitioned (line 550) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::count (line 438) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::distinct (line 287) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::except (line 719) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::execute_stream (line 530) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::execute_stream_partitioned (line 569) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::explain (line 658) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::filter (line 169) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::intersect (line 696) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::join (line 330) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::join_on (line 371) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::limit (line 221) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::registry (line 678) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::repartition (line 417) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::schema (line 590) has been running for over 60 seconds
test src/dataframe.rs - dataframe::DataFrame::select (line 124) has been running for over 60 seconds
To Reproduce
Steps to reproduce the behavior: cargo test --doc DataFrame
Expected behavior
A clear and concise description of what you expected to happen.
The tests should be faster and shouldn't cause my machine to hang.
Additional context
Add any other context about the problem here.
I'm actually on a pretty new and good Ubuntu 22.04/AMD64 machine.
Not sure what can be done here. Maybe reduce the number of doctests (not really ideal), or make it possible to omit the doctests from the default cargo test run?
These tests lock up because running them in parallel, which is cargo's default behavior, uses a very large amount of memory. My system peaked at over 100GB of memory utilization 🤯 ! I took a look through the dataframe doc tests, and I don't see any inherent reason for such extreme memory usage. I believe @Jefffrey is correct that the cause is Rust loading many copies of a large debug binary into memory.
I think a reasonable workaround to improve the developer experience would be to make cargo default to running these specific tests with a maximum parallelism somewhere in the 1-4 range, which should work on most systems.
You can do this manually by running cargo test --doc dataframe -- --test-threads 1
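For anyone who wants to script this, here is a rough sketch of a small wrapper that caps the thread count in the 1-4 range mentioned above and builds the same cargo invocation. The helper names (`capped_test_threads`, `doctest_command`) are made up for illustration and are not part of cargo or this repo. It only prints the command, so nothing heavy runs by accident.

```rust
use std::process::Command;
use std::thread;

/// Clamp the detected core count into the 1-4 range suggested above.
/// (Hypothetical helper, not part of cargo or the project.)
fn capped_test_threads(available: usize) -> usize {
    available.clamp(1, 4)
}

/// Build (but do not run) the doctest invocation with a thread cap.
/// `filter` is the test-name filter, e.g. "dataframe".
fn doctest_command(filter: &str, threads: usize) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.args(["test", "--doc", filter, "--"])
        .arg(format!("--test-threads={threads}"));
    cmd
}

fn main() {
    // Fall back to a single thread if the core count cannot be detected.
    let available = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    let threads = capped_test_threads(available);
    let cmd = doctest_command("dataframe", threads);
    // Print rather than execute; call `.status()` on a mutable
    // `Command` to actually run the tests.
    println!("{:?}", cmd);
}
```

The upper bound of 4 is just the top of the range proposed in this thread; on a machine with less memory a cap of 1 or 2 may be the safer choice.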