From 211f336d79398c4fd940394351b0dc26febbcbbb Mon Sep 17 00:00:00 2001
From: Cody Peterson <54814569+lostmygithubaccount@users.noreply.github.com>
Date: Thu, 7 Mar 2024 09:47:40 -0500
Subject: [PATCH] docs: add Python + SQL section to why ibis (#8526)

---
 docs/why.qmd | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/docs/why.qmd b/docs/why.qmd
index 0bfd8d75a516..c3410816f991 100644
--- a/docs/why.qmd
+++ b/docs/why.qmd
@@ -228,6 +228,74 @@ and robust framework for data manipulation in Python.
 In the long-term, we aim for a standard query plan Intermediate Representation
 (IR) like [Substrait](https://substrait.io) to simplify this further.
 
+## Python + SQL: better together
+
+For most backends, Ibis works by compiling Python expressions into SQL:
+
+```{python}
+g = t.group_by(["species", "island"]).agg(count=t.count()).order_by("count")
+ibis.to_sql(g)
+```
+
+You can mix and match Python and SQL code. Start with a SQL query as a string:
+
+```{python}
+sql = """
+SELECT
+  species,
+  island,
+  COUNT(*) AS count
+FROM penguins
+GROUP BY species, island
+""".strip()
+```
+
+::: {.panel-tabset}
+
+## DuckDB
+
+```{python}
+con = ibis.duckdb.connect()
+t = con.read_parquet("penguins.parquet")
+g = t.alias("penguins").sql(sql)
+g
+```
+
+```{python}
+g.order_by("count")
+```
+
+## DataFusion
+
+```{python}
+con = ibis.datafusion.connect()
+t = con.read_parquet("penguins.parquet")
+g = t.alias("penguins").sql(sql)
+g
+```
+
+```{python}
+g.order_by("count")
+```
+
+## PySpark
+
+```{python}
+con = ibis.connect("pyspark://")
+t = con.read_parquet("penguins.parquet")
+g = t.alias("penguins").sql(sql)
+g
+```
+
+```{python}
+g.order_by("count")
+```
+
+:::
+
+Mixing the two lets you combine the flexibility of Python with the scale and
+performance of modern SQL engines.
+
 ## Scaling up and out
 
 Out of the box, Ibis offers a great local experience for working with many file