I have a Delta table that I am accessing using DataFusion.
A SELECT * query works just fine, but any other query, such as selecting a single column or summing a column, does not print anything. No error or warning is raised.
Basically, the code is:
async fn run_from_delta_table(ctx: &SessionContext) -> Result<(), DeltaTableError> {
    let table = open_table("../data/delta-table")
        .await
        .unwrap();
    ctx.register_table("demo", Arc::new(table)).unwrap();

    let df = ctx
        .sql("SELECT * FROM demo").await?;
    df.show().await?; // prints to the console

    let df = ctx
        .sql("SELECT ViewCount FROM demo").await?;
    df.show().await?; // does not print to the console

    let df = ctx
        .sql("SELECT SUM(ViewCount) FROM demo").await?;
    df.show().await?; // does not print to the console

    Ok(())
}
It is worth mentioning that the same queries work as expected through the DataFrame API:
async fn run_df(ctx: &SessionContext) -> Result<(), DeltaTableError> {
    let table = open_table("../data/delta-table")
        .await
        .unwrap();
    let df = ctx.read_table(Arc::new(table))?;
    df.show().await?; // prints to the console

    let view_col = df.select(vec![col("ViewCount")])?;
    view_col.show().await?; // also prints to the console

    let view_sum = df
        .aggregate(vec![], vec![sum(col("ViewCount"))])?;
    view_sum.show().await?; // also prints to the console

    Ok(())
}
My Cargo.toml looks like this:
[dependencies]
datafusion = "15.0.0"
deltalake = {version="0.6.0", features = ["datafusion-ext"]}
tokio = {version="1.25.0", features = ["macros", "rt", "parking_lot"]}
To Reproduce
Create a project with the above-mentioned dependencies in Cargo.toml.
Have a Delta Lake table at a known path.
Run the two functions above, run_df and run_from_delta_table, after updating the path to point at the Delta Lake table.
Expected behavior
The SQL API should run the queries above and print their results, just as the DataFrame API does.
Additional context
Note: I am using datafusion 15.0.0 because this is the version compatible with deltalake.
It was a case of wrong capitalisation, which this portion of the documentation alluded to. After switching the column name to lowercase, or escaping it in the query, everything works fine.
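For anyone hitting the same thing: DataFusion's SQL layer lowercases unquoted identifiers, so ViewCount in the query is looked up as viewcount, while a double-quoted identifier keeps its case. Below is a minimal sketch of building the escaped queries. The quote_ident helper is hypothetical (it is not part of DataFusion or deltalake), and the table and column names are simply the ones from the issue.

```rust
/// Hypothetical helper (not a DataFusion API): wraps an identifier in double
/// quotes so the SQL parser preserves its case. Any embedded double quote is
/// doubled, which is the standard SQL way to escape it.
fn quote_ident(name: &str) -> String {
    format!("\"{}\"", name.replace('"', "\"\""))
}

fn main() {
    // Unquoted, ViewCount would be normalized to viewcount by the SQL layer.
    let select_sql = format!("SELECT {} FROM demo", quote_ident("ViewCount"));
    let sum_sql = format!("SELECT SUM({}) FROM demo", quote_ident("ViewCount"));

    println!("{select_sql}");
    println!("{sum_sql}");

    // Assumption: these strings would then be passed to the SessionContext
    // from the issue, e.g. ctx.sql(&select_sql).await?
}
```

Writing the quoted name directly in the string literal works just as well; in Rust that means escaping the quotes, as in "SELECT \"ViewCount\" FROM demo".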
It is unfortunate that there was no error that would have pointed you at the problem.