Improve doc on lowercase treatment of columns on SQL #3385

Merged 5 commits on Oct 12, 2022
5 changes: 5 additions & 0 deletions datafusion/core/tests/capitalized_example.csv
@@ -0,0 +1,5 @@
A,b,c
1,2,3
1,10,5
2,5,6
2,1,4
17 changes: 11 additions & 6 deletions docs/source/user-guide/example-usage.md
@@ -19,6 +19,10 @@

# Example Usage

In this example some simple processing is performed on a csv file. Please be aware that all identifiers are made lower-case in SQL, so if your csv file has capital letters (ex: Name) you should put your column name in double quotes or the example won't work.
Review comment (Contributor): This note is great
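
To make the note concrete, here is a minimal sketch of the identifier treatment it describes: unquoted names are folded to lower case, double-quoted names keep their case. `normalize_ident` is a hypothetical helper written for illustration only; it is not DataFusion's API and ignores details such as escaped quotes inside an identifier.

```rust
/// Hypothetical helper (not part of DataFusion): mimic how the SQL layer
/// described above treats a single column identifier.
fn normalize_ident(raw: &str) -> String {
    let is_quoted = raw.len() >= 2 && raw.starts_with('"') && raw.ends_with('"');
    if is_quoted {
        // Quoted identifier: drop the surrounding quotes, keep the case.
        raw[1..raw.len() - 1].to_string()
    } else {
        // Unquoted identifier: fold to lower case.
        raw.to_lowercase()
    }
}
```

Under this sketch, `Name` becomes `name` and no longer matches a CSV header spelled `Name`, while `"Name"` stays `Name`.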


The following example uses [this file](../../../datafusion/core/tests/capitalized_example.csv)

## Update `Cargo.toml`

Add the following to your `Cargo.toml` file:
@@ -37,10 +41,10 @@ use datafusion::prelude::*;
async fn main() -> datafusion::error::Result<()> {
// register the table
let ctx = SessionContext::new();
- ctx.register_csv("example", "tests/example.csv", CsvReadOptions::new()).await?;
+ ctx.register_csv("example", "tests/capitalized_example.csv", CsvReadOptions::new()).await?;

// create a plan to run a SQL query
- let df = ctx.sql("SELECT a, MIN(b) FROM example GROUP BY a LIMIT 100").await?;
+ let df = ctx.sql("SELECT \"A\", MIN(b) FROM example GROUP BY \"A\" LIMIT 100").await?;
alamb marked this conversation as resolved.

// execute and print results
df.show().await?;
@@ -57,10 +61,10 @@ use datafusion::prelude::*;
async fn main() -> datafusion::error::Result<()> {
// create the dataframe
let ctx = SessionContext::new();
- let df = ctx.read_csv("tests/example.csv", CsvReadOptions::new()).await?;
+ let df = ctx.read_csv("tests/capitalized_example.csv", CsvReadOptions::new()).await?;

- let df = df.filter(col("a").lt_eq(col("b")))?
- .aggregate(vec![col("a")], vec![min(col("b"))])?;
+ let df = df.filter(col("A").lt_eq(col("c")))?
+ .aggregate(vec![col("A")], vec![min(col("b"))])?;

// execute and print results
df.show_limit(100).await?;
@@ -72,8 +76,9 @@ async fn main() -> datafusion::error::Result<()> {

```text
+---+--------+
- | a | MIN(b) |
+ | A | MIN(b) |
+---+--------+
| 2 | 1 |
| 1 | 2 |
+---+--------+
```
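
As a sanity check on the expected output, the grouped minimum can be recomputed by hand from the five-line `capitalized_example.csv` added in this PR. The sketch below uses only the standard library; `ROWS` and `min_b_by_a` are hypothetical names for illustration and do not appear in DataFusion.

```rust
use std::collections::BTreeMap;

/// Data rows of capitalized_example.csv as (A, b, c).
const ROWS: [(i64, i64, i64); 4] = [(1, 2, 3), (1, 10, 5), (2, 5, 6), (2, 1, 4)];

/// Hypothetical helper: minimum of `b` for each distinct value of `A`.
fn min_b_by_a(rows: &[(i64, i64, i64)]) -> BTreeMap<i64, i64> {
    let mut mins = BTreeMap::new();
    for &(a, b, _c) in rows {
        mins.entry(a)
            // Keep the smaller of the stored minimum and the new value.
            .and_modify(|m: &mut i64| *m = (*m).min(b))
            .or_insert(b);
    }
    mins
}
```

For these rows the map holds `1 -> 2` and `2 -> 1`, which agrees with the table above. The DataFrame variant's `A <= c` filter keeps all four rows, so it yields the same groups.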
3 changes: 3 additions & 0 deletions docs/source/user-guide/sql/select.md
@@ -20,6 +20,9 @@
# SELECT syntax

The queries in DataFusion scan data from tables and return 0 or more rows.
Please be aware that column names in queries are made lower-case, but not on the inferred schema. Accordingly, if you
want to query against a capitalized field, make sure to use double quotes. Please see this
[example](https://arrow.apache.org/datafusion/user-guide/example-usage.html) for clarification.
In this documentation we describe the SQL syntax in DataFusion.
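
The mismatch described above (lower-cased query identifiers versus a case-preserving inferred schema) can be modelled in a few lines. This is a simplified sketch under stated assumptions: the `Ident` enum and `resolve` function are hypothetical and are not DataFusion types; they only illustrate a case-sensitive lookup performed after unquoted names have been folded to lower case.

```rust
/// Hypothetical model of a column reference after SQL parsing.
enum Ident {
    /// Written without quotes, e.g. `A`; folded to lower case before lookup.
    Unquoted(String),
    /// Written in double quotes, e.g. `"A"`; case is preserved.
    Quoted(String),
}

/// Hypothetical lookup: index of the schema field whose name matches
/// exactly (case-sensitive), or `None` if there is no such field.
fn resolve(schema: &[&str], ident: &Ident) -> Option<usize> {
    let wanted = match ident {
        Ident::Unquoted(name) => name.to_lowercase(),
        Ident::Quoted(name) => name.clone(),
    };
    schema.iter().position(|field| *field == wanted)
}
```

With the schema `["A", "b", "c"]` inferred from the example CSV, an unquoted `A` is looked up as `a` and finds nothing, while a quoted `"A"` resolves to the first field.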

DataFusion supports the following syntax for queries: