Cannot query parquet files generated by Apache Spark from datafusion-cli #1648
Labels: bug, good first issue, help wanted
Describe the bug
I have a data set created by Apache Spark and tried to query it from the DataFusion CLI. The query failed with an error saying that a Parquet file was corrupt.
I added some debug logging and found that it was actually trying to read the following file, which is not a Parquet file.
To Reproduce
Create a file that is not a Parquet file and does not have a .parquet extension, put it in a directory alongside some valid Parquet files, and then query that directory from datafusion-cli.
Expected behavior
Should only try to read files with the file extension .parquet.
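To illustrate the expected behavior, here is a minimal sketch of extension-based filtering using only the Rust standard library. It is not DataFusion's actual implementation or API; the helper names `has_parquet_extension` and `list_parquet_files` are hypothetical, and the sketch assumes that only files directly inside the directory (no subdirectories) should be considered.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of DataFusion): returns true only when the
/// final extension of `path` is exactly `parquet`.
fn has_parquet_extension(path: &Path) -> bool {
    path.extension().and_then(|ext| ext.to_str()) == Some("parquet")
}

/// Hypothetical helper: lists the regular files directly inside `dir` whose
/// extension is `.parquet`, sorted by path. Everything else is skipped
/// rather than being handed to a Parquet reader.
fn list_parquet_files(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_file() && has_parquet_extension(&path) {
            files.push(path);
        }
    }
    // read_dir gives no ordering guarantee, so sort for deterministic output.
    files.sort();
    Ok(files)
}
```

With a filter like this, a call such as `list_parquet_files(Path::new("/path/to/table"))` (the path is a placeholder) would return only the `.parquet` files, so any other file in the same directory would never be opened as Parquet.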
Additional context
None