docs(tutorial): add a tutorial for the Flink backend (#8085)

## Description of changes


Add a tutorial for the Flink backend.
chloeh13q authored Feb 12, 2024
1 parent 3f24b89 commit e2a3fb6
Showing 12 changed files with 432 additions and 1 deletion.


3 changes: 2 additions & 1 deletion docs/_quarto.yml
@@ -100,7 +100,8 @@ website:
contents:
- install.qmd
- auto: tutorials/*.qmd
- - auto: tutorials/data-platforms
+ - auto: tutorials/cloud-data-platforms
+ - auto: tutorials/open-source-software
- id: concepts
title: "Concepts"
style: "docked"
53 changes: 53 additions & 0 deletions docs/tutorials/open-source-software/apache-flink/0_setup.qmd
@@ -0,0 +1,53 @@
# Getting started

In this tutorial, you will learn how to set up and use Flink with Ibis. Once
the Flink backend is set up, we'll work through a real-life example in
[A real-life use case: fraud detection](1_single_feature.qmd).

## Set up and connect to Flink

Install the Flink backend for Ibis with `pip`:
```{python}
# | include: false
!pip install ibis-framework apache-flink
```

::: {.callout-warning}
You need to install the Flink backend for Ibis alongside
the `apache-flink` package. PyFlink is not available on conda-forge, so you
cannot install the Flink backend for Ibis with `conda`, `mamba`, or `pixi`.
:::
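
Before connecting, it can help to confirm that both packages are importable
in the current environment. The following is a minimal sketch rather than part
of the Ibis API: `missing_modules` is a hypothetical helper, and it assumes the
packages are imported as `ibis` and `pyflink`.

```python
import importlib.util

# Import names assumed for the two packages installed above.
REQUIRED_MODULES = ("ibis", "pyflink")


def missing_modules(names=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("Ibis and PyFlink are both importable.")
```

If either module is reported as missing, rerun the `pip` command above in the
same environment that runs your notebook or script.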


To connect to a Flink session, simply create a `pyflink.table.TableEnvironment`
and pass that to `ibis.flink.connect()`:

```{python}
from pyflink.table import EnvironmentSettings, TableEnvironment
import ibis

# Streaming mode suits unbounded sources
env_settings = EnvironmentSettings.in_streaming_mode()
table_env = TableEnvironment.create(env_settings)
# Hand the TableEnvironment to Ibis to get a backend connection
connection = ibis.flink.connect(table_env)
```

::: {.callout-tip}
If you’re working on a batch data pipeline, create the `EnvironmentSettings`
in batch mode before creating the `TableEnvironment` and connecting to it:
```{python}
env_settings = EnvironmentSettings.in_batch_mode()
```
:::
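
If you switch between the two modes often, the choice can be wrapped in a small
function. This is only a sketch: `connect_to_flink` is a hypothetical helper,
not something Ibis provides, and it uses just the calls shown above.

```python
def connect_to_flink(mode: str = "streaming"):
    """Hypothetical helper: build a TableEnvironment and connect Ibis to it."""
    if mode not in ("streaming", "batch"):
        raise ValueError(f"mode must be 'streaming' or 'batch', got {mode!r}")

    # Imported lazily so the argument check works even without PyFlink installed.
    from pyflink.table import EnvironmentSettings, TableEnvironment
    import ibis

    if mode == "streaming":
        env_settings = EnvironmentSettings.in_streaming_mode()
    else:
        env_settings = EnvironmentSettings.in_batch_mode()

    table_env = TableEnvironment.create(env_settings)
    return ibis.flink.connect(table_env)
```

Calling `connect_to_flink("batch")` would then return a connection backed by a
batch-mode `TableEnvironment`.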

Now you can connect to data sources, create transformations, and write the
results into sinks!
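
As a rough illustration of that source, transformation, and sink flow, the
sketch below uses Flink's built-in `datagen` and `print` connectors so that no
external system is needed. The table names, column names, and helper functions
are all hypothetical, and it assumes that the backend's `create_table` accepts
`schema` and `tbl_properties` arguments and that `insert` takes a table name
and an expression; check the Flink backend reference for your Ibis version.

```python
def datagen_properties(rows_per_second: int = 1) -> dict:
    """Connector options for Flink's built-in random data source."""
    if rows_per_second <= 0:
        raise ValueError("rows_per_second must be positive")
    return {"connector": "datagen", "rows-per-second": str(rows_per_second)}


# Flink's built-in sink that prints each row to standard output.
PRINT_SINK_PROPERTIES = {"connector": "print"}


def build_pipeline(connection):
    """Hypothetical pipeline: generated payments -> filter -> printed rows."""
    import ibis  # imported lazily; only needed once a connection exists

    # Illustrative schema; the column names are made up for this sketch.
    schema = ibis.schema({"user_id": "int64", "amount": "float64"})

    # Source: a table backed by the datagen connector.
    source = connection.create_table(
        "payments", schema=schema, tbl_properties=datagen_properties(5)
    )

    # Transformation: keep only payments above an arbitrary threshold.
    large = source.filter(source.amount > 100)

    # Sink: a table backed by the print connector.
    connection.create_table(
        "large_payments", schema=schema, tbl_properties=PRINT_SINK_PROPERTIES
    )
    return connection.insert("large_payments", large)
```

With the `connection` created earlier, `build_pipeline(connection)` would
submit the job. In streaming mode the `datagen` source is unbounded by default,
so the job keeps running until it is cancelled.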

## Next steps

Now that you're connected to Flink, you can [continue this tutorial with a
real-life fraud detection use case](1_single_feature.qmd) or query your own
data. For more detail, see the rest of the Ibis documentation or the
[Flink documentation](https://nightlies.apache.org/flink/flink-docs-stable/).
If you run into a problem, you can
[open an issue](https://github.com/ibis-project/ibis/issues/new/choose)!