docs(tutorial): add a tutorial for the Flink backend #8085

Merged

chloeh13q
Contributor

Description of changes

Add a tutorial for the Flink backend.

@chloeh13q chloeh13q force-pushed the docs/flink-tutorial-1 branch from 0a43c28 to 1d67bbf Compare January 24, 2024 09:16
@ncclementi
Contributor

@chloeh13q for the linting errors you are getting, you can run pre-commit and that should fix them. https://ibis-project.org/contribute/03_style

@ncclementi ncclementi added the docs-preview Add this label to trigger a docs preview label Jan 24, 2024
@ibis-docs-bot ibis-docs-bot bot removed the docs-preview Add this label to trigger a docs preview label Jan 24, 2024
@chloeh13q chloeh13q force-pushed the docs/flink-tutorial-1 branch 2 times, most recently from 1508f91 to 0305c5a Compare January 25, 2024 07:49
@ncclementi
Contributor

The remaining CI failure is because pyflink is missing in the docs dependencies. We should add it to the group using poetry

poetry add --group docs pyflink

Then we will want to run

poetry lock --no-update

to update the lock file, then push all those changes up. I think we also might need to re-create the requirements-dev.txt but I'm not sure, maybe @gforsyth can chime in here?

@gforsyth
Member

If you have https://github.com/casey/just installed, you can run just lock to handle the relock and the regeneration of the requirements-dev file. Otherwise, these are the commands to run:

poetry lock --no-update
poetry export --extras all --with dev --with test --with docs --without-hashes --no-ansi > requirements-dev.txt

@cpcloud
Member

cpcloud commented Jan 25, 2024

This is unlikely to work, due to docs being built with nix. I suggest freezing the output for now until we can get this working properly.

@@ -0,0 +1,35 @@
# Getting started
Member

Suggested change
# Getting started
---
execute:
  freeze: auto
---
# Getting started

To freeze the output as it is generated on your machine, add this to the top of each of the .qmd files.
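
Adding the front matter to every file by hand is easy to get wrong, so here is a minimal sketch of how it could be scripted. The function name `add_freeze_front_matter` is hypothetical, and the sketch assumes the tutorial files have no front matter yet: files that already start with `---` are skipped and reported so that existing settings are never overwritten.

```python
from pathlib import Path

# Front matter from the suggested change above (note the indentation of `freeze`).
FREEZE_FRONT_MATTER = "---\nexecute:\n  freeze: auto\n---\n"


def add_freeze_front_matter(directory: Path) -> tuple[list[Path], list[Path]]:
    """Prepend the freeze front matter to .qmd files that have none.

    Returns (updated, skipped). Files that already begin with a front matter
    block are skipped and need a manual edit instead.
    """
    updated, skipped = [], []
    for path in sorted(directory.rglob("*.qmd")):
        text = path.read_text(encoding="utf-8")
        if text.startswith("---"):
            skipped.append(path)
            continue
        path.write_text(FREEZE_FRONT_MATTER + text, encoding="utf-8")
        updated.append(path)
    return updated, skipped
```

Calling it with the tutorial directory, for example `add_freeze_front_matter(Path("docs/tutorials/open-source/apache-flink"))`, would update the files in place; review the result with `git diff` before committing.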

* JDK 11 release: Flink requires Java 11.
* Python 3.9 or 3.10.
* Follow [the instructions on the Ibis project page](https://ibis-project.org/install) to install the Flink backend for Ibis. For the tutorial below, we assume that you already have the Ibis package correctly installed in your environment.
* Clone the [example repository](https://github.com/claypotai/ibis-flink-example).
Contributor Author

We're hosting the example repo under our github account right now, but will most likely need to move this somewhere else (under ibis-project?), clean up some of the code, and figure out how best to structure multiple tutorials and blog posts.
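
For readers following the prerequisites in the diff above (JDK 11 and Python 3.9 or 3.10), a quick environment check might look like the sketch below. It is not part of the tutorial or the example repository; the helper names are hypothetical, and it assumes the `java` on `PATH` is the one Flink would use.

```python
import re
import shutil
import subprocess
import sys
from typing import Optional


def python_supported(version: tuple = sys.version_info[:2]) -> bool:
    """True for the Python versions listed in the tutorial (3.9 or 3.10)."""
    return tuple(version[:2]) in {(3, 9), (3, 10)}


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major version from `java -version` output.

    Handles the modern scheme ("11.0.21") and the legacy one ("1.8.0_392").
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major() -> Optional[int]:
    """Return the major version of the `java` on PATH, or None if absent."""
    if shutil.which("java") is None:
        return None
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=False
    )
    # Many JDKs print the version banner to stderr, so check both streams.
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    print("Python 3.9 or 3.10:", python_supported())
    print("Java 11:", installed_java_major() == 11)
```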

@chloeh13q chloeh13q force-pushed the docs/flink-tutorial-1 branch 3 times, most recently from c0d52d7 to 97438fa Compare January 25, 2024 19:56
@chloeh13q chloeh13q marked this pull request as ready for review January 25, 2024 21:03
@lostmygithubaccount lostmygithubaccount added the docs-preview Add this label to trigger a docs preview label Jan 25, 2024
@ibis-docs-bot ibis-docs-bot bot removed the docs-preview Add this label to trigger a docs preview label Jan 25, 2024
@ibis-docs-bot

ibis-docs-bot bot commented Jan 25, 2024

Member

@lostmygithubaccount lostmygithubaccount left a comment

in general, wrap lines. I would keep the tutorial document a bit more simple and focused on getting the user up and running

@chloeh13q chloeh13q force-pushed the docs/flink-tutorial-1 branch 6 times, most recently from 2ef9142 to e07325f Compare January 30, 2024 22:30
@chloeh13q
Contributor Author

chloeh13q commented Jan 30, 2024

@lostmygithubaccount Thanks for the review, I addressed all of your comments!

> I would keep the tutorial document a bit more simple and focused on getting the user up and running

The intention is to get to the point where the user can write a transformation. If this is too long, I can split it into two and have the first tutorial go up to the point where we write a simple over aggregation, and the second post cover window aggregation and what the differences are (but the setup code would be identical, i.e. connect to a source, connect to a sink). Alternatively, we can just focus on window aggregation and get rid of the code for over aggregation. We can provide the code for over aggregation in the example repo but not explicitly talk about the syntax in this tutorial. What do you think?
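
To make the distinction between the two kinds of aggregation concrete, here is a small sketch in plain Python. It is neither the Ibis nor the Flink API, and the events and function names are made up for illustration. The key difference is that an over aggregation emits one result per input row, computed over a trailing time range, while a tumbling window aggregation emits one result per fixed, non-overlapping window.

```python
from collections import defaultdict

# Hypothetical events: (timestamp in seconds, amount).
EVENTS = [(0, 10), (20, 5), (70, 7), (100, 3), (130, 8)]


def over_aggregation(events, range_seconds):
    """One output row per input row: sum over the trailing time range."""
    rows = []
    for ts, _ in events:
        total = sum(a for t, a in events if ts - range_seconds <= t <= ts)
        rows.append((ts, total))
    return rows


def tumbling_window_aggregation(events, size_seconds):
    """One output row per fixed, non-overlapping window."""
    totals = defaultdict(int)
    for ts, amount in events:
        window_start = (ts // size_seconds) * size_seconds
        totals[window_start] += amount
    return sorted(totals.items())


print(over_aggregation(EVENTS, 60))
# [(0, 10), (20, 15), (70, 12), (100, 10), (130, 18)]
print(tumbling_window_aggregation(EVENTS, 60))
# [(0, 15), (60, 10), (120, 8)]
```

Five input events produce five over-aggregation rows but only three window rows, which is why the two cases may deserve separate treatment even though the source and sink setup is identical.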

P.S. Still pending porting the complete code example to the new ibis-flink-example repo.

@lostmygithubaccount lostmygithubaccount added the docs-preview Add this label to trigger a docs preview label Jan 30, 2024
@ibis-docs-bot

ibis-docs-bot bot commented Feb 1, 2024

@chloeh13q chloeh13q force-pushed the docs/flink-tutorial-1 branch 2 times, most recently from dc555bd to 87e9613 Compare February 1, 2024 19:46
@lostmygithubaccount lostmygithubaccount added the docs-preview Add this label to trigger a docs preview label Feb 1, 2024
@ibis-docs-bot ibis-docs-bot bot removed the docs-preview Add this label to trigger a docs preview label Feb 1, 2024
@ibis-docs-bot

ibis-docs-bot bot commented Feb 1, 2024

@chloeh13q chloeh13q force-pushed the docs/flink-tutorial-1 branch from 87e9613 to 916b9ab Compare February 1, 2024 20:53
@lostmygithubaccount lostmygithubaccount added the docs-preview Add this label to trigger a docs preview label Feb 1, 2024
@ibis-docs-bot ibis-docs-bot bot removed the docs-preview Add this label to trigger a docs preview label Feb 1, 2024
@ibis-docs-bot

ibis-docs-bot bot commented Feb 1, 2024

@chloeh13q chloeh13q force-pushed the docs/flink-tutorial-1 branch from 9182e4e to 5b7cfc1 Compare February 2, 2024 22:54
Member

@lostmygithubaccount lostmygithubaccount left a comment

thanks!

@lostmygithubaccount
Member

this should be good to merge. #8197 is a follow up

@chloeh13q
Contributor Author

@lostmygithubaccount BTW, we're also working on adding a visualization dashboard to the demo. We think it adds some fun elements and could help make the demo more interactive and tangible. We will provide the complete code for the dashboard, so all the user needs to do is spin up the dashboard service (via a one-line command), and they should then be able to see plots on a local port. Does that sound reasonable to you? (We can put this in a separate PR because we're still tweaking some parts.)

@lostmygithubaccount
Member

sounds great! follow up PR seems fine for that

what kind of dashboard is this? Quarto?

@chloeh13q
Contributor Author

No, it's a Dash app. It's not part of the Quarto rendering: the tutorial would tell users to go to http://127.0.0.1:8050/ in the browser to see the dashboard. We're also thinking about adding a Grafana dashboard for more complex visualizations, but won't tackle that now.

@lostmygithubaccount
Member

why dash? if it's quarto we can put it right in the website (like the backend support matrix)

@chloeh13q
Contributor Author

chloeh13q commented Feb 5, 2024

From @mfatihaktas :

For the following reasons:

  • As far as I understand, Quarto is for reproducible technical reporting. It integrates very well with notebooks and allows for quick implementation of beautiful graphs. However, I could not find much information on how to use it as a "web service" in a containerized setting. The alternative would be to implement a "wrapper Python module" around the Quarto API and insert calls to it in the demo notebook. I thought that this would "pollute/inflate" the notebook and might distract from its goal: introducing Ibis/Flink.
  • Quarto's functionality seems to be limited in terms of developing dynamic graphs. It allows for responding to user input (e.g., selecting a button on the graphs); however, I could not find enough information on how to generate semi-runtime plots with it.
  • Dash requires a very small amount of boilerplate code to turn data into a dashboard. Thus, we can easily replace it with any other tool that allows for serving dashboards over HTTP.

We should be able to switch to any other viz tool quickly as long as the dashboard tool works in a containerized setting. Otherwise, we would need to figure out how to bring the dashboard into the demo notebook.
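
As a rough illustration of the "any tool that serves over HTTP" point, here is a stand-in sketch that uses only the Python standard library rather than Dash. It serves hypothetical aggregated results as JSON on a local port; the data, `ResultsHandler`, `start_server`, and `fetch_results` are all assumptions made for this example and are not taken from the demo code.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical aggregated results that a dashboard would plot.
LATEST_RESULTS = [
    {"window_start": 0, "total": 15},
    {"window_start": 60, "total": 10},
]


class ResultsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Every path returns the current results as JSON.
        body = json.dumps(LATEST_RESULTS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # Keep the sketch quiet.


def start_server(port: int = 0) -> HTTPServer:
    """Serve on 127.0.0.1 in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), ResultsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_results(port: int):
    """Fetch and decode the JSON payload, as a browser or plot client would."""
    with urllib.request.urlopen(f"http://127.0.0.1:{port}/") as response:
        return json.loads(response.read())
```

Because the consumer only depends on an HTTP endpoint, the serving side could be swapped for Dash, Grafana, or another tool without touching the notebook, which is the property the comment above relies on.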

@mfatihaktas
Contributor

Quick note. Dash apps can be embedded in Quarto easily as dash is powered by plotly.

@lostmygithubaccount lostmygithubaccount added docs Documentation related issues or PRs docs-preview Add this label to trigger a docs preview labels Feb 6, 2024
@ibis-docs-bot ibis-docs-bot bot removed the docs-preview Add this label to trigger a docs preview label Feb 6, 2024
@ibis-docs-bot

ibis-docs-bot bot commented Feb 6, 2024

Member

@lostmygithubaccount lostmygithubaccount left a comment

thanks!

@lostmygithubaccount lostmygithubaccount merged commit e2a3fb6 into ibis-project:main Feb 12, 2024
93 checks passed
Labels
docs Documentation related issues or PRs
6 participants