Describe the bug
The parquet-testing tests (parquet_testing_test.py) are skipped on Dataproc CI:
[2024-01-21T18:56:32.815Z] =============================== warnings summary ===============================
[2024-01-21T18:56:32.815Z] ../../src/main/python/parquet_testing_test.py:100
[2024-01-21T18:56:32.815Z] /home/sa_116163337916449219958/integration_tests/src/main/python/parquet_testing_test.py:100: UserWarning: Skipping parquet-testing tests.
Unable to locate data in any of: hdfs:/tmp/rapids_it/src/test/resources/parquet-testing/data/*.parquet, hdfs:/tmp/rapids_it/src/test/resources/parquet-testing/bad_data/*.parquet, /home/sa_116163337916449219958/thirdparty/parquet-testing/data/*.parquet, /home/sa_116163337916449219958/thirdparty/parquet-testing/bad_data/*.parquet
[2024-01-21T18:56:32.815Z] warnings.warn("Skipping parquet-testing tests. Unable to locate data in any of: " + locations)
Expected behavior
The parquet-testing tests should run on Dataproc CI instead of being skipped.
Additional context
The parquet test files appear to be packaged under the HDFS path
hdfs:/tmp/rapids_it/src/test/resources/parquet-testing
but the locate_parquet_testing_files function (https://github.com/NVIDIA/spark-rapids/blob/branch-24.02/integration_tests/src/main/python/parquet_testing_test.py#L91) only searches the local file system, so the test data cannot be found and the tests are skipped.
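One possible direction, as a rough sketch only and not the project's actual code: dispatch on the URI scheme of each candidate pattern, use `glob.glob` for local paths, and hand anything else (such as `hdfs:`) to a pluggable lister. The names `locate_files`, `hadoop_glob`, and `remote_glob` are hypothetical. `hadoop_glob` assumes a live SparkSession and reaches the Hadoop `FileSystem.globStatus` call through PySpark's internal `_jvm` and `_jsc` handles, so it would need checking on a real Dataproc cluster.

```python
import glob
from urllib.parse import urlparse


def hadoop_glob(spark, pattern):
    """Hypothetical helper: expand a glob via the Hadoop FileSystem API.

    Assumes `spark` is a live SparkSession; relies on PySpark's internal
    `_jvm` / `_jsc` attributes, which are not a public API.
    """
    path = spark._jvm.org.apache.hadoop.fs.Path(pattern)
    fs = path.getFileSystem(spark._jsc.hadoopConfiguration())
    statuses = fs.globStatus(path)
    if statuses is None:  # globStatus can return null for a missing path
        return []
    return [status.getPath().toString() for status in statuses]


def locate_files(patterns, remote_glob=None):
    """Hypothetical helper: collect matches for local and remote glob patterns.

    Patterns without a scheme (or with `file:`) are expanded on the local
    file system. Any other scheme is passed to `remote_glob`, a callable
    taking a pattern and returning a list of paths; without one, remote
    patterns simply yield no matches.
    """
    found = []
    for pattern in patterns:
        parsed = urlparse(pattern)
        if parsed.scheme == "":
            found.extend(sorted(glob.glob(pattern)))
        elif parsed.scheme == "file":
            found.extend(sorted(glob.glob(parsed.path)))
        elif remote_glob is not None:
            found.extend(sorted(remote_glob(pattern)))
    return found
```

In a test session this might be wired up as `locate_files(patterns, remote_glob=lambda p: hadoop_glob(spark, p))`, so the HDFS locations listed in the warning above are actually searched rather than treated as local paths.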