Fix links to sources for examples
The links to example sources in exampleinclude have been broken in a
number of providers and they were additionally broken by AIP-47.

This PR fixes it.

Fixes: apache#23632
Fixes: apache/airflow-site#536
potiuk committed Jun 13, 2022
1 parent 224285b commit e3745cc
Showing 231 changed files with 2,080 additions and 433 deletions.
21 changes: 18 additions & 3 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -422,11 +422,11 @@ repos:
- id: check-no-relative-imports
language: pygrep
name: No relative imports
description: Airflow style is to use absolute imports only
description: Airflow style is to use absolute imports only (except docs building)
entry: "^\\s*from\\s+\\."
pass_filenames: true
files: \.py$
exclude: ^tests/|^airflow/_vendor/
exclude: ^tests/|^airflow/_vendor/|^docs/
- id: check-for-inclusive-language
language: pygrep
name: Check for language that we do not accept as community
Expand Down Expand Up @@ -648,7 +648,7 @@ repos:
entry: ./scripts/ci/pre_commit/pre_commit_check_system_tests.py
language: python
files: ^tests/system/.*/example_[^/]*.py$
exclude: ^tests/system/providers/google/bigquery/example_bigquery_queries\.py$
exclude: ^tests/system/providers/google/cloud/bigquery/example_bigquery_queries\.py$
pass_filenames: true
additional_dependencies: ['rich>=12.4.4']
- id: lint-markdown
Expand Down Expand Up @@ -786,6 +786,21 @@ repos:
pass_filenames: true
files: ^docs/.*index\.rst$|^docs/.*example-dags\.rst$
additional_dependencies: ['rich>=12.4.4', 'pyyaml']
always_run: true
- id: check-system-tests-tocs
name: Check that system tests is properly added
entry: ./scripts/ci/pre_commit/pre_commit_check_system_tests_hidden_in_index.py
language: python
pass_filenames: true
files: ^docs/apache-airflow-providers-[^/]*/index\.rst$
additional_dependencies: ['rich>=12.4.4', 'pyyaml']
- id: create-missing-init-py-files-tests
name: Create missing init.py files in tests
entry: ./scripts/ci/pre_commit/pre_commit_check_init_in_tests.py
language: python
additional_dependencies: ['rich>=12.4.4']
pass_filenames: false
files: ^tests/.*\.py$
## ADD MOST PRE-COMMITS ABOVE THAT LINE
# The below pre-commits are those requiring CI image to be built
- id: run-mypy
Expand Down
2 changes: 1 addition & 1 deletion RELEASE_NOTES.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1249,7 +1249,7 @@ Logical date of a DAG run triggered from the web UI now have its sub-second comp

Due to a change in how the logical date (``execution_date``) is generated for a manual DAG run, a manual DAG run’s logical date may not match its time-of-trigger, but have its sub-second part zero-ed out. For example, a DAG run triggered on ``2021-10-11T12:34:56.78901`` would have its logical date set to ``2021-10-11T12:34:56.00000``.

This may affect some logic that expects on this quirk to detect whether a run is triggered manually or not. Note that ``dag_run.run_type`` is a more authoritative value for this purpose. Also, if you need this distinction between automated and manually-triggered rus for “next execution date” calculation, please also consider using the new data interval variables instead, which provide a more consistent behavior between the two run types.
This may affect some logic that expects on this quirk to detect whether a run is triggered manually or not. Note that ``dag_run.run_type`` is a more authoritative value for this purpose. Also, if you need this distinction between automated and manually-triggered run for “next execution date” calculation, please also consider using the new data interval variables instead, which provide a more consistent behavior between the two run types.

New Features
^^^^^^^^^^^^
Expand Down
4 changes: 4 additions & 0 deletions STATIC_CODE_CHECKS.rst
Original file line number Diff line number Diff line change
Expand Up @@ -209,10 +209,14 @@ require Breeze Docker image to be build locally.
+--------------------------------------------------------+------------------------------------------------------------------+---------+
| check-system-tests-present | Check if system tests have required segments of code | |
+--------------------------------------------------------+------------------------------------------------------------------+---------+
| check-system-tests-tocs | Check that system tests is properly added | |
+--------------------------------------------------------+------------------------------------------------------------------+---------+
| check-xml | Check XML files with xmllint | |
+--------------------------------------------------------+------------------------------------------------------------------+---------+
| codespell | Run codespell to check for common misspellings in files | |
+--------------------------------------------------------+------------------------------------------------------------------+---------+
| create-missing-init-py-files-tests | Create missing init.py files in tests | |
+--------------------------------------------------------+------------------------------------------------------------------+---------+
| debug-statements | Detect accidentally committed debug statements | |
+--------------------------------------------------------+------------------------------------------------------------------+---------+
| detect-private-key | Detect if private key is added to the repository | |
Expand Down
20 changes: 10 additions & 10 deletions airflow/example_dags/example_branch_datetime_operator.py
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@
from airflow.operators.datetime import BranchDateTimeOperator
from airflow.operators.empty import EmptyOperator

dag = DAG(
dag1 = DAG(
dag_id="example_branch_datetime_operator",
start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
catchup=False,
Expand All @@ -35,44 +35,44 @@
)

# [START howto_branch_datetime_operator]
empty_task_1 = EmptyOperator(task_id='date_in_range', dag=dag)
empty_task_2 = EmptyOperator(task_id='date_outside_range', dag=dag)
empty_task_11 = EmptyOperator(task_id='date_in_range', dag=dag1)
empty_task_21 = EmptyOperator(task_id='date_outside_range', dag=dag1)

cond1 = BranchDateTimeOperator(
task_id='datetime_branch',
follow_task_ids_if_true=['date_in_range'],
follow_task_ids_if_false=['date_outside_range'],
target_upper=pendulum.datetime(2020, 10, 10, 15, 0, 0),
target_lower=pendulum.datetime(2020, 10, 10, 14, 0, 0),
dag=dag,
dag=dag1,
)

# Run empty_task_1 if cond1 executes between 2020-10-10 14:00:00 and 2020-10-10 15:00:00
cond1 >> [empty_task_1, empty_task_2]
cond1 >> [empty_task_11, empty_task_21]
# [END howto_branch_datetime_operator]


dag = DAG(
dag2 = DAG(
dag_id="example_branch_datetime_operator_2",
start_date=pendulum.datetime(2021, 1, 1, tz="UTC"),
catchup=False,
tags=["example"],
schedule_interval="@daily",
)
# [START howto_branch_datetime_operator_next_day]
empty_task_1 = EmptyOperator(task_id='date_in_range', dag=dag)
empty_task_2 = EmptyOperator(task_id='date_outside_range', dag=dag)
empty_task_12 = EmptyOperator(task_id='date_in_range', dag=dag2)
empty_task_22 = EmptyOperator(task_id='date_outside_range', dag=dag2)

cond2 = BranchDateTimeOperator(
task_id='datetime_branch',
follow_task_ids_if_true=['date_in_range'],
follow_task_ids_if_false=['date_outside_range'],
target_upper=pendulum.time(0, 0, 0),
target_lower=pendulum.time(15, 0, 0),
dag=dag,
dag=dag2,
)

# Since target_lower happens after target_upper, target_upper will be moved to the following day
# Run empty_task_1 if cond2 executes between 15:00:00, and 00:00:00 of the following day
cond2 >> [empty_task_1, empty_task_2]
cond2 >> [empty_task_12, empty_task_22]
# [END howto_branch_datetime_operator_next_day]
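
The comment in the second example says that when ``target_lower`` is later than ``target_upper``, the upper bound is moved to the following day. The sketch below illustrates that wrap-around rule using only the standard library. It is not Airflow's implementation; the helper name ``in_time_window`` is hypothetical, and it assumes naive datetimes and an inclusive window.

```python
from datetime import datetime, time, timedelta


def in_time_window(now: datetime, lower: time, upper: time) -> bool:
    """Return True if `now` falls inside the [lower, upper] window (hypothetical helper)."""
    # Anchor both bounds to the calendar date of `now`.
    lower_dt = datetime.combine(now.date(), lower)
    upper_dt = datetime.combine(now.date(), upper)
    # If the upper bound is earlier than the lower one, treat it as the next day.
    if upper_dt < lower_dt:
        upper_dt += timedelta(days=1)
    return lower_dt <= now <= upper_dt


# Window 15:00:00 to 00:00:00 of the following day, as in the second example DAG.
print(in_time_window(datetime(2021, 1, 1, 16, 0), time(15, 0), time(0, 0)))  # True
print(in_time_window(datetime(2021, 1, 1, 14, 0), time(15, 0), time(0, 0)))  # False
```

With these bounds, a run at 16:00 would follow the ``date_in_range`` branch and a run at 14:00 would follow ``date_outside_range``.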
26 changes: 14 additions & 12 deletions airflow/example_dags/example_external_task_marker_dag.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,23 +18,25 @@

"""
Example DAG demonstrating setting up inter-DAG dependencies using ExternalTaskSensor and
ExternalTaskMarker
ExternalTaskMarker.
In this example, child_task1 in example_external_task_marker_child depends on parent_task in
example_external_task_marker_parent. When parent_task is cleared with "Recursive" selected,
the presence of ExternalTaskMarker tells Airflow to clear child_task1 and its
downstream tasks.
example_external_task_marker_parent. When parent_task is cleared with 'Recursive' selected,
the presence of ExternalTaskMarker tells Airflow to clear child_task1 and its downstream tasks.
ExternalTaskSensor will keep poking for the status of remote ExternalTaskMarker task at a regular
interval till one of the following will happen:
1. ExternalTaskMarker reaches the states mentioned in the allowed_states list
In this case, ExternalTaskSensor will exit with a success status code
2. ExternalTaskMarker reaches the states mentioned in the failed_states list
In this case, ExternalTaskSensor will raise an AirflowException and user need to handle this
with multiple downstream tasks
3. ExternalTaskSensor times out
In this case, ExternalTaskSensor will raise AirflowSkipException or AirflowSensorTimeout
exception
ExternalTaskMarker reaches the states mentioned in the allowed_states list.
In this case, ExternalTaskSensor will exit with a success status code
ExternalTaskMarker reaches the states mentioned in the failed_states list
In this case, ExternalTaskSensor will raise an AirflowException and user need to handle this
with multiple downstream tasks
ExternalTaskSensor times out. In this case, ExternalTaskSensor will raise AirflowSkipException
or AirflowSensorTimeout exception
"""

import pendulum
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,8 @@
Further information:
YOUTUBE_VIDEO_PUBLISHED_AFTER and YOUTUBE_VIDEO_PUBLISHED_BEFORE needs to be formatted
"YYYY-MM-DDThh:mm:ss.sZ". See https://developers.google.com/youtube/v3/docs/search/list for more information.
``YYYY-MM-DDThh:mm:ss.sZ``.
See https://developers.google.com/youtube/v3/docs/search/list for more information.
YOUTUBE_VIDEO_PARTS depends on the fields you pass via YOUTUBE_VIDEO_FIELDS. See
https://developers.google.com/youtube/v3/docs/videos/list#parameters for more information.
YOUTUBE_CONN_ID is optional for public videos. It does only need to authenticate when there are private videos
Expand Down
5 changes: 1 addition & 4 deletions airflow/providers/amazon/aws/example_dags/example_s3.py
Original file line number Diff line number Diff line change
Expand Up @@ -65,12 +65,9 @@
# [START howto_sensor_s3_key_function_definition]
def check_fn(files: List) -> bool:
"""
Example of custom check: check if all files are bigger than 1kB
Example of custom check: check if all files are bigger than ``1kB``
:param files: List of S3 object attributes.
Format: [{
'Size': int
}]
:return: true if the criteria is met
:rtype: bool
"""
Expand Down
4 changes: 2 additions & 2 deletions airflow/providers/arangodb/example_dags/example_arangodb.py
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@

# [START howto_aql_sensor_template_file_arangodb]

sensor = AQLSensor(
sensor2 = AQLSensor(
task_id="aql_sensor_template_file",
query="search_judy.sql",
timeout=60,
Expand All @@ -65,7 +65,7 @@

# [START howto_aql_operator_template_file_arangodb]

operator = AQLOperator(
operator2 = AQLOperator(
task_id='aql_operator_template_file',
dag=dag,
result_processor=lambda cursor: print([document["name"] for document in cursor]),
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -204,14 +204,14 @@ def get_target_column_spec(columns_specs: List[Dict], column_name: str) -> str:
catchup=False,
user_defined_macros={"extract_object_id": extract_object_id},
) as example_dag:
create_dataset_task = AutoMLCreateDatasetOperator(
create_dataset_task2 = AutoMLCreateDatasetOperator(
task_id="create_dataset_task",
dataset=DATASET,
location=GCP_AUTOML_LOCATION,
project_id=GCP_PROJECT_ID,
)

dataset_id = create_dataset_task.output['dataset_id']
dataset_id = create_dataset_task2.output['dataset_id']

import_dataset_task = AutoMLImportDataOperator(
task_id="import_dataset_task",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,17 +17,15 @@
# under the License.

"""
Example Airflow DAG that demonstrates interactions with Google Cloud Transfer.
Example Airflow DAG that demonstrates interactions with Google Cloud Transfer. This DAG relies on
the following OS environment variables
This DAG relies on the following OS environment variables
Note that you need to provide a large enough set of data so that operations do not execute too quickly.
Otherwise, DAG will fail.
* GCP_PROJECT_ID - Google Cloud Project to use for the Google Cloud Transfer Service.
* GCP_DESCRIPTION - Description of transfer job
* GCP_TRANSFER_SOURCE_AWS_BUCKET - Amazon Web Services Storage bucket from which files are copied.
.. warning::
You need to provide a large enough set of data so that operations do not execute too quickly.
Otherwise, DAG will fail.
* GCP_TRANSFER_SECOND_TARGET_BUCKET - Google Cloud Storage bucket to which files are copied
* WAIT_FOR_OPERATION_POKE_INTERVAL - interval of what to check the status of the operation
A smaller value than the default value accelerates the system test and ensures its correct execution with
Expand Down
8 changes: 4 additions & 4 deletions airflow/providers/google/cloud/example_dags/example_pubsub.py
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,7 @@
catchup=False,
) as example_sensor_dag:
# [START howto_operator_gcp_pubsub_create_topic]
create_topic = PubSubCreateTopicOperator(
create_topic1 = PubSubCreateTopicOperator(
task_id="create_topic", topic=TOPIC_FOR_SENSOR_DAG, project_id=GCP_PROJECT_ID, fail_if_exists=False
)
# [END howto_operator_gcp_pubsub_create_topic]
Expand Down Expand Up @@ -105,7 +105,7 @@
)
# [END howto_operator_gcp_pubsub_delete_topic]

create_topic >> subscribe_task >> publish_task
create_topic1 >> subscribe_task >> publish_task
pull_messages >> pull_messages_result >> unsubscribe_task >> delete_topic

# Task dependencies created via `XComArgs`:
Expand All @@ -120,7 +120,7 @@
catchup=False,
) as example_operator_dag:
# [START howto_operator_gcp_pubsub_create_topic]
create_topic = PubSubCreateTopicOperator(
create_topic2 = PubSubCreateTopicOperator(
task_id="create_topic", topic=TOPIC_FOR_OPERATOR_DAG, project_id=GCP_PROJECT_ID
)
# [END howto_operator_gcp_pubsub_create_topic]
Expand Down Expand Up @@ -170,7 +170,7 @@
# [END howto_operator_gcp_pubsub_delete_topic]

(
create_topic
create_topic2
>> subscribe_task
>> publish_task
>> pull_messages_operator
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,15 +26,16 @@
This DAG relies on the following OS environment variables:
* GCP_VERTEX_AI_BUCKET - Google Cloud Storage bucket where the model will be saved
after training process was finished.
after training process was finished.
* CUSTOM_CONTAINER_URI - path to container with model.
* PYTHON_PACKAGE_GSC_URI - path to test model in archive.
* LOCAL_TRAINING_SCRIPT_PATH - path to local training script.
* DATASET_ID - ID of dataset which will be used in training process.
* MODEL_ID - ID of model which will be used in predict process.
* MODEL_ARTIFACT_URI - The artifact_uri should be the path to a GCS directory containing saved model
artifacts.
artifacts.
"""

import os
from datetime import datetime
from uuid import uuid4
Expand Down
2 changes: 1 addition & 1 deletion airflow/providers/mongo/hooks/mongo.py
Original file line number Diff line number Diff line change
Expand Up @@ -266,7 +266,7 @@ def replace_many(
:param mongo_collection: The name of the collection to update.
:param docs: The new documents.
:param filter_docs: A list of queries that match the documents to replace.
Can be omitted; then the _id fields from docs will be used.
Can be omitted; then the _id fields from airflow.docs will be used.
:param mongo_db: The name of the database to use.
Can be omitted; then the database from the connection string is used.
:param upsert: If ``True``, perform an insert if no documents
Expand Down
2 changes: 2 additions & 0 deletions dev/breeze/src/airflow_breeze/pre_commit_ids.py
Original file line number Diff line number Diff line change
Expand Up @@ -61,8 +61,10 @@
'check-setup-order',
'check-start-date-not-used-in-defaults',
'check-system-tests-present',
'check-system-tests-tocs',
'check-xml',
'codespell',
'create-missing-init-py-files-tests',
'debug-statements',
'detect-private-key',
'doctoc',
Expand Down
16 changes: 16 additions & 0 deletions docs/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
6 changes: 6 additions & 0 deletions docs/apache-airflow-providers-alibaba/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,12 @@ Content

Python API <_api/airflow/providers/alibaba/index>

.. toctree::
:hidden:
:caption: System tests

System Tests <_api/tests/system/providers/alibaba/index>

.. toctree::
:maxdepth: 1
:caption: Resources
Expand Down
1 change: 1 addition & 0 deletions docs/apache-airflow-providers-amazon/example-dags.rst
Original file line number Diff line number Diff line change
Expand Up @@ -21,3 +21,4 @@ Example DAGs
You can learn how to use Amazon AWS integrations by analyzing the source code of the example DAGs:

* `Amazon AWS <https://github.com/apache/airflow/tree/providers-amazon/4.0.0/tests/system/providers/amazon/aws>`__
* `Amazon AWS (legacy) <https://github.com/apache/airflow/tree/providers-amazon/4.0.0/airflow/providers/amazon/aws/example_dags>`__
6 changes: 6 additions & 0 deletions docs/apache-airflow-providers-amazon/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,12 @@ Content

Python API <_api/airflow/providers/amazon/index>

.. toctree::
:hidden:
:caption: System tests

System Tests <_api/tests/system/providers/amazon/index>

.. toctree::
:maxdepth: 1
:caption: Resources
Expand Down
11 changes: 11 additions & 0 deletions docs/apache-airflow-providers-apache-beam/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,17 @@ Content
:caption: References

Python API <_api/airflow/providers/apache/beam/index>

.. toctree::
:hidden:
:caption: System tests

System Tests <_api/tests/system/providers/apache/beam/index>

.. toctree::
:maxdepth: 1
:caption: Resources

PyPI Repository <https://pypi.org/project/apache-airflow-providers-apache-beam/>
Example DAGs <https://github.com/apache/airflow/tree/providers-apache-beam/4.0.0/tests/system/providers/apache/beam>

Expand Down