gitpodify Apache Airflow - online development workspace #19756
copy to @potiuk and David Brownkush
Nice and simple! :) This is what Breeze was created for :). My initial goal was to get started with Airflow in under 10 minutes, so 5 minutes is pretty damn good. Re: ports: I think all the ports that Breeze has comments about:
Re: mongo - not sure why there are problems. There are some problems with docker-compose2 for integrations (and networking), so it may be worth checking if we can configure
Questions @j143: I do not know Gitpod that much, but is there a way we could configure some "options" when starting such a VM? For example, it would be great if, when starting the VM, you could choose:
For integrations - maybe just some predefined sets of those would be enough: (
If all else fails - I think it would be possible with env variables. Breeze already supports reacting to environment variables, so you could pass them to your gp instance: https://www.gitpod.io/docs/environment-variables. Those will be:
The last one is the list of integrations enabled. I think for this one to be merged we need a separate "quick-start" (a short version on how to start and how to configure the env variables) in https://github.com/apache/airflow/blob/main/CONTRIBUTORS_QUICK_START.rst .
Two more things:
One more cool thing while we are adding it: what's interesting is this one: https://www.gitpod.io/docs/environment-variables#provide-env-vars-via-url. It should be a follow-up PR, but it would be great if we could add an option to replicate failed CI builds in the Gitpod environment - it seems that with this one it should be possible. So it should be possible to add instructions for the user on how to replicate a failed CI build in their Gitpod environment. Essentially, we should be able to craft a URL that creates a Gitpod environment for this specific build configuration:
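A minimal sketch of what crafting such a URL could look like, assuming Gitpod's documented `https://gitpod.io/#NAME=value,NAME2=value2/<repository URL>` form. The helper name `gitpod_url` is hypothetical, and the variable names `BACKEND` and `PYTHON_MAJOR_MINOR_VERSION` are illustrative assumptions only, since the exact list of variables Breeze would read is not given in this thread.

```shell
# Hypothetical helper: build a Gitpod URL that passes environment variables
# to the new workspace. Usage: gitpod_url <repo-url> NAME=value [NAME=value ...]
# Assumption: values are already URL-encoded (Gitpod expects that).
gitpod_url() {
    repo_url="$1"
    shift
    vars=""
    for pair in "$@"; do
        if [ -z "$vars" ]; then
            vars="$pair"
        else
            vars="$vars,$pair"
        fi
    done
    printf 'https://gitpod.io/#%s/%s\n' "$vars" "$repo_url"
}

# Illustrative variable names, not confirmed by this thread.
gitpod_url "https://github.com/apache/airflow" "BACKEND=mysql" "PYTHON_MAJOR_MINOR_VERSION=3.8"
```

A CI job could print such a URL in its failure summary, so that the user only has to click it to get a workspace configured like the failed build.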
I really like how simple it is to make the environment work with Gitpod + Breeze :). We'll do a very similar thing for Codespaces when they are publicly available.
.. code-block:: bash

  $ breeze --backend mysql --mysql-version 5.7 --python 3.8 --db-reset --test-type Core
This command gives an error for me: port 28080 is already allocated. I also have the container running (maybe that is the cause!).
error log
gitpod /workspace/airflow $ breeze --backend mysql --mysql-version 5.7 --python 3.8 --db-reset --test-type Core
Good version of docker 20.10.8.
Backend: mysql
MySQL version: 5.7
Python version: 3.8
Resetting the DB!
Selected test type: Core
mkdir: created directory '/workspace/airflow/.build/main/3.8'
mkdir: created directory '/workspace/airflow/.build/main/3.8/CI'
a69c30ae03621a0e7051da64fcf34eff62de9166c8538ef408793a4caa5af362
a69c30ae03621a0e7051da64fcf34eff62de9166c8538ef408793a4caa5af362
Use CI image.
Branch name: main
Docker image: ghcr.io/apache/airflow/main/ci/python3.8:latest
Airflow source version: 2.3.0.dev0
Python version: 3.8
Backend: mysql 5.7
####################################################################################################
Airflow Breeze CHEATSHEET
/workspace/airflow/breeze
####################################################################################################
Port forwarding:
Ports are forwarded to the running docker containers for webserver and database
* 12322 -> forwarded to Airflow ssh server -> airflow:22
* 28080 -> forwarded to Airflow webserver -> airflow:8080
* 25555 -> forwarded to Flower dashboard -> airflow:5555
* 25433 -> forwarded to Postgres database -> postgres:5432
* 23306 -> forwarded to MySQL database -> mysql:3306
* 21433 -> forwarded to MSSQL database -> mssql:1443
* 26379 -> forwarded to Redis broker -> redis:6379
Here are links to those services that you can use on host:
* ssh connection for remote debugging: ssh -p 12322 airflow@127.0.0.1 pw: airflow
* Webserver: http://127.0.0.1:28080
* Flower: http://127.0.0.1:25555
* Postgres: jdbc:postgresql://127.0.0.1:25433/airflow?user=postgres&password=airflow
* Mysql: jdbc:mysql://127.0.0.1:23306/airflow?user=root
* Redis: redis://127.0.0.1:26379/0
####################################################################################################
You can setup autocomplete by running 'breeze setup-autocomplete'
####################################################################################################
You can toggle ascii/cheatsheet by running:
* breeze toggle-suppress-cheatsheet
* breeze toggle-suppress-asciiart
####################################################################################################
Unable to find image 'ghcr.io/apache/airflow/main/ci/python3.8:latest' locally
ylatest: Pulling from apache/airflow/main/ci/python3.8
a10c77af2613: Already exists
...
Digest: sha256:e19ff75603a5e74a82f39f897426119ad4c61cbbb5f0035b00c216d83f39190e
Status: Downloaded newer image for ghcr.io/apache/airflow/main/ci/python3.8:latest
Checking resources.
* Memory available 63G. OK.
* CPUs available 16. OK.
WARNING!!!: Not enough Disk space available for Docker.
At least 40 GBs recommended. You have 23G
WARNING!!!: You have not enough resources to run Airflow (see above)!
Please follow the instructions to increase amount of resources available:
Please check https://github.com/apache/airflow/blob/main/BREEZE.rst#resources-required for details
Good version of docker-compose: 1.29.2
WARNING: The ENABLE_TEST_COVERAGE variable is not set. Defaulting to a blank string.
Pulling mysql (mysql:5.7)...
5.7: Pulling from library/mysql
2e35f83a12e9: Pull complete
Digest: sha256:7a3a7b7a29e6fbff433c339fc52245435fa2c308586481f2f92ab1df239d6a29
Status: Downloaded newer image for mysql:5.7
Creating docker-compose_mysql_1 ... done
Creating docker-compose_airflow_run ... done
Error response from daemon: driver failed programming external connectivity on endpoint docker-compose_airflow_run_f081fd6ac899 (0d04b319ee1c4e4ceaed9d626bc70df0cb1ae0699efdf9ccead83aaed72ca420): Bind for 0.0.0.0:28080 failed: port is already allocated
ERROR: 1
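To narrow down a failure like the one in the log above, one option is to pull the port number out of the daemon error and then ask Docker which container publishes it. This is only a sketch: the helper name `allocated_port` is hypothetical, and it assumes the error text has exactly the shape shown in the log.

```shell
# Hypothetical helper: read docker output on stdin and print the host port
# named in a "port is already allocated" error, if there is one.
allocated_port() {
    sed -n 's/.*Bind for [0-9.]*:\([0-9][0-9]*\) failed: port is already allocated.*/\1/p'
}

# Demo on the tail of the error line from the log above.
port=$(printf '%s\n' 'Bind for 0.0.0.0:28080 failed: port is already allocated' | allocated_port)
echo "$port"   # prints 28080

# With the port known, the container holding it can be listed with:
#   docker ps --filter "publish=$port"
```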
Yeah - fixed ports, unfortunately :(. I think it might be a good idea to run `breeze stop` before starting a new instance. It will also make sure that all the DB volumes are cleared and the databases will be "fresh like a daisy". You can also always start breeze with the `--db-reset` switch - this will make sure that every time you initialize the Gitpod environment the database will be recreated. This is a nice feature - especially if you plan to switch back and forth between branches and the environment will be preserved.
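The suggestion above could be wrapped in a small shell function, sketched here under stated assumptions: the function name `restart_breeze` and the `BREEZE` override variable are hypothetical and exist only for this example; `stop` and `--db-reset` are the command and switch mentioned in this comment.

```shell
# Hypothetical wrapper: stop any running Breeze instance (freeing the fixed
# ports), then start a new one with a freshly reset database.
# BREEZE is an assumed override for the path to the breeze script.
restart_breeze() {
    breeze_cmd="${BREEZE:-./breeze}"
    "$breeze_cmd" stop && "$breeze_cmd" --db-reset "$@"
}

# Example call from the repository root (not executed here):
#   restart_breeze --backend mysql --mysql-version 5.7 --python 3.8
```

Any extra arguments are passed through unchanged, so the same wrapper works for different backends and Python versions.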
1. Breeze is already initialized in one of the terminals in Gitpod

2. Once the breeze environment is initialized, create airflow tables and users from the breeze CLI. ``airflow db reset``
This might be gone if you use the `--db-reset` switch when starting Breeze (see the other comment).
.. code-block:: bash

  root@b76fcb399bb6:/opt/airflow# airflow db reset
  root@b76fcb399bb6:/opt/airflow# airflow users create --role Admin --username admin --password admin \
This should stay here even if you use `--db-reset`, but since it is only needed when you run the webserver, I think you can specify that it is needed only then.
I, for example, use the airflow webserver extremely rarely when developing Airflow, and while it is useful to have, it's mostly not needed to add a core feature or provider.
I love where it goes - a few more corrections and it should be good to go!
Hi Jarek, I have added the basic docs, and I have added the remaining tasks as a checklist at #16480. Thanks for the review and suggestions.
Oh yeah. That's ALMOST codespaces. Actually, I already have access to codespaces and I want to do the very same thing you did for gitpodify to make breeze start when you enter codespaces :)
BTW. Static checks are failing :). That's why it would have been great to integrate pre-commit from the get-go :D
How about the comments with
I thought I had added a comment on breeze stop. I have added these notes as far as I understood (fc45c3e); please provide any comments on that. :)
I think maybe (not 100% sure if you think it's a good idea) you should add
Is it OK if I skip this note and add it as a new task in #16480? I need to spend a little more time on this to understand it better. But if you are suggesting adding
* starts the workspace with ./breeze -y
* opens another terminal with bash
* add documentation for opening Gitpod workspace, creating a branch, making changes
* also, the instructions about setting up and working with `breeze`
* add workaround for setting PIP_USER=no variable
Hi @potiuk, I have recently rebased it into a single commit. I hope the main points were addressed. :)
@j143 - you might be interested to know that we have just started the project of rewriting You might want to contribute to it and eventually switch the gitpodified experience to it (and any comments/suggestions/improvements or contributions while we develop it are most welcome).
Thank you @uranusjr for the review. 😺
At present, the configuration starts up the IDE with `./breeze -y` to set up the breeze environment. It takes 5 mins to load all the docker images. 😸
How to test?
apache/airflow/pull/16498 would fire up the online ready-to-code workspace.
docker terminal; in the right you could run any tests with breeze
Terminals:
Testing:
pytest tests/core/test_core.py::TestCore::test_check_operators
Problems encountered:
"Error response from daemon: driver failed programming external connectivity on endpoint" while running `./breeze --integration mongo`
Which ports should be open to public/private? (Suggestions please.)
Related: #16480