Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate Amazon example DAGs to new design #22438

Closed
36 tasks done
potiuk opened this issue Mar 22, 2022 · 15 comments
Closed
36 tasks done

Migrate Amazon example DAGs to new design #22438

potiuk opened this issue Mar 22, 2022 · 15 comments
Labels
AIP-47 AIP-47 New Design of System Tests kind:meta High-level information important to the community

Comments

@potiuk
Copy link
Member

potiuk commented Mar 22, 2022

There is a new design of system tests that was introduced by the AIP-47.

All current example dags need to be migrated and converted into system tests, so they can be run in the CI process automatically before releases.

This is an aggregated issue for all example DAGs related to Amazon provider. It is created to track progress of their migration.

List of paths to example DAGs:

  • airflow/providers/amazon/aws/example_dags/example_s3_bucket_tagging.py
  • airflow/providers/amazon/aws/example_dags/example_emr_eks.py
  • airflow/providers/amazon/aws/example_dags/example_athena.py
  • airflow/providers/amazon/aws/example_dags/example_sqs.py
  • airflow/providers/amazon/aws/example_dags/example_eks_with_nodegroups.py
  • airflow/providers/amazon/aws/example_dags/example_dms_full_load_task.py
  • airflow/providers/amazon/aws/example_dags/example_emr.py
  • airflow/providers/amazon/aws/example_dags/example_sagemaker.py
  • airflow/providers/amazon/aws/example_dags/example_eks_with_nodegroup_in_one_step.py
  • airflow/providers/amazon/aws/example_dags/example_dynamodb_to_s3.py
  • airflow/providers/amazon/aws/example_dags/example_s3_to_sftp.py
  • airflow/providers/amazon/aws/example_dags/example_local_to_s3.py
  • airflow/providers/amazon/aws/example_dags/example_s3_bucket.py
  • airflow/providers/amazon/aws/example_dags/example_google_api_to_s3_transfer_basic.py
  • airflow/providers/amazon/aws/example_dags/example_redshift_cluster.py
  • airflow/providers/amazon/aws/example_dags/example_lambda.py
  • airflow/providers/amazon/aws/example_dags/example_s3_to_redshift.py
  • airflow/providers/amazon/aws/example_dags/example_imap_attachment_to_s3.py
  • airflow/providers/amazon/aws/example_dags/example_ecs_fargate.py
  • airflow/providers/amazon/aws/example_dags/example_rds.py
  • airflow/providers/amazon/aws/example_dags/example_eks_with_fargate_in_one_step.py
  • airflow/providers/amazon/aws/example_dags/example_batch.py
  • airflow/providers/amazon/aws/example_dags/example_datasync_2.py
  • airflow/providers/amazon/aws/example_dags/example_dynamodb_to_s3_segmented.py
  • airflow/providers/amazon/aws/example_dags/example_ecs_ec2.py
  • airflow/providers/amazon/aws/example_dags/example_eks_with_fargate_profile.py
  • airflow/providers/amazon/aws/example_dags/example_redshift_to_s3.py
  • airflow/providers/amazon/aws/example_dags/example_redshift_sql.py
  • airflow/providers/amazon/aws/example_dags/example_sns.py
  • airflow/providers/amazon/aws/example_dags/example_google_api_to_s3_transfer_advanced.py
  • airflow/providers/amazon/aws/example_dags/example_datasync_1.py
  • airflow/providers/amazon/aws/example_dags/example_glacier_to_gcs.py
  • airflow/providers/amazon/aws/example_dags/example_redshift_data_execute_sql.py
  • airflow/providers/amazon/aws/example_dags/example_eks_templated.py
  • airflow/providers/amazon/aws/example_dags/example_sftp_to_s3.py
  • airflow/providers/amazon/aws/example_dags/example_salesforce_to_s3.py
@potiuk potiuk added the AIP-47 AIP-47 New Design of System Tests label Mar 22, 2022
@potiuk potiuk added the kind:meta High-level information important to the community label Mar 22, 2022
@ferruzzi
Copy link
Contributor

ferruzzi commented Jun 3, 2022

@potiuk - your new handy dandy script will update that checklist auto-magically on a schedule, right? You merged Athena yesterday.

@potiuk
Copy link
Member Author

potiuk commented Jun 3, 2022

It should update it already. I guess Athena example is still there simply :)

@potiuk
Copy link
Member Author

potiuk commented Jun 3, 2022

Yep. It needs to be removed ...

@ferruzzi
Copy link
Contributor

ferruzzi commented Jun 3, 2022

Huh. I'm positive I did that but it's definitely still there. I'll do a PR for you in a sec and tag you in it.

@ferruzzi
Copy link
Contributor

Quick update, the following are not checked off above:

  • airflow/providers/amazon/aws/example_dags/example_emr.py

    • @syedahsn is working on this one right now.
  • airflow/providers/amazon/aws/example_dags/example_eks_templated.py

    • All of the operators in this DAG are tested in other system tests which have already been converted, so there is no added benefit to converting this one. It was initially a demo of how to use Jinja templating that I wrote and contributed when I was learning how to use it, so I do think it still has value as an example DAG. Should we delete this file, move it somewhere else, or just leave it? I feel like if we keep it, then "Example DAGs" is exactly where it should live. It uses AWS operators, but it's specifically an AWS example (if that makes sense?) so maybe move it to a more generic location?

All of the remaining files were deemed out of scope for the current conversion project since they require external services/accounts/servers/etc (for example: requires a MongoDB account, a GCP account, hosting an SFTP server, etc) and we won't be able to commit to running/maintaining them on a regular basis. We may add them at a later date, but they won't be part of this big push.

  • airflow/providers/amazon/aws/example_dags/example_s3_to_sftp.py
  • airflow/providers/amazon/aws/example_dags/example_sftp_to_s3.py
  • airflow/providers/amazon/aws/example_dags/example_imap_attachment_to_s3.py
    • I think this one goes here because it needed an SMTP server configured but I'm not certain?
  • airflow/providers/amazon/aws/example_dags/example_glacier_to_gcs.py
  • airflow/providers/amazon/aws/example_dags/example_salesforce_to_s3.py

@potiuk
Copy link
Member Author

potiuk commented Oct 24, 2022

ok. So I guess we should be able to close that one soon ? Can we just migrate the example_* files to tests/* anyway so that we do not have two places where to look for them ? Even if those will not be "regular" pytests, having them in tests/system (with a comment that they are not runnable yet) would make for a clean conversion and remove duplicated "examples" link in the documentation (legacy/new). They might even have different sub-folder or naming to clearly distinguish them from "converted" tests.

@ferruzzi
Copy link
Contributor

ferruzzi commented Oct 25, 2022

If they get moved there, then they will be run by our CI and always fail and cause alerts. Maybe a subdirectory? @vincbeck what do you think of adding an excluded directory in the system tests directory??

@vincbeck
Copy link
Contributor

Moving the example dags in tests directory will not make the CI run them so we are safe on that side. Though I am wondering if we want to have example dags in this directory. That would mix system tests and example dags together. I would be actually inclined to have a subdirectory (e.g. example_dags) just for clarification purposes. WDYT?

@potiuk
Copy link
Member Author

potiuk commented Oct 26, 2022

Moving the example dags in tests directory will not make the CI run them so we are safe on that side. Though I am wondering if we want to have example dags in this directory. That would mix system tests and example dags together. I would be actually inclined to have a subdirectory (e.g. example_dags) just for clarification purposes. WDYT?

Subdir is fine.

@eladkal
Copy link
Contributor

eladkal commented Jan 7, 2023

I think the current state where we have some examples dag under system tests and some in providers is very confusing.

@ferruzzi comment is reasonable and at least for now we can't have system tests for everything (accounts etc..) but the current state where example dags are in two separated parts of the code base is very confusing.

I wonder if we can convert everything to system tests yet mark the relevant ones with missing accounts or other issues as excluded ? Kinda like what we do with quarantined tests.

@shubham22
Copy link

+1 to the suggestion to move the remaining example DAGs to the system test directory. In the past month, I've had at least three people ask me why we got rid of the example DAGs, and they were not aware of the system tests.

I am not in favor of creating subdir as Airflow users may not understand why only certain ones are in example_dags directory. It might be less confusing if we could filter the System tests using a file name suffix, like example_xyz_test vs example_xyz, as this would provide a clear separation while still being easily understandable.

@potiuk
Copy link
Member Author

potiuk commented Jan 10, 2023

+1 too.

@ferruzzi
Copy link
Contributor

ferruzzi commented Jan 10, 2023

If I'm not mistaken, that would require changes to the system test runner which I think currently triggers on the "example" prefix. This was a holdout to when they were still example DAGs, so maybe the better solution would be to combine the example dags and the system tests into the tests tree, rename system test files something more obvious like {service}_system_test.py and use the new naming convention to trigger the CI tests.

That would require adjusting links to code snippets in the rst files. Any other side effects that I'm not thinking about?

@potiuk
Copy link
Member Author

potiuk commented Jan 19, 2023

If I'm not mistaken, that would require changes to the system test runner which I think currently triggers on the "example" prefix. This was a holdout to when they were still example DAGs, so maybe the better solution would be to combine the example dags and the system tests into the tests tree, rename system test files something more obvious like {service}_system_test.py and use the new naming convention to trigger the CI tests.

That would require adjusting links to code snippets in the rst files. Any other side effects that I'm not thinking about?

I think that. Any other side effects would be detected by failing CI (which is very comprehensive for those). This is not a "production" code, so I thinl even breaking small things there by going a bit ahead is fine. I am generally much more bold for test code than for production code (with the exception of making sure that the tests are run and actually test things) - any problems there are at most impacting developers, not users.

@potiuk
Copy link
Member Author

potiuk commented Jun 22, 2023

Closing as "done"

@potiuk potiuk closed this as completed Jun 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
AIP-47 AIP-47 New Design of System Tests kind:meta High-level information important to the community
Projects
None yet
Development

No branches or pull requests

5 participants