Implement transfer for BigQuery native path S3/GCS #1850

Closed
wants to merge 12 commits

Conversation


@rajaths010494 (Contributor) commented on Mar 17, 2023

Please describe the feature you'd like to see

  • Add DataProvider for BigQuery
  • Add native transfer implementation for GCS to BigQuery (see the sketch after the acceptance criteria below)
  • Add native transfer implementation for S3 to BigQuery
  • Add example DAG
  • Add tests with 90% coverage

Acceptance Criteria

  • All checks and tests in the CI should pass
  • Unit tests (90% code coverage or more, once available)
  • Integration tests (if the feature relates to a new database or external service)
  • Example DAG
  • Docstrings in reStructuredText for each of the methods, classes, functions, and module-level attributes (including an example DAG showing how it should be used)
  • Exception handling in case of errors
  • Logging (are we exposing useful information to the user? e.g. source and destination)
  • Improve the documentation (README, Sphinx, and any other relevant docs)
  • How-to guide for the feature (example)
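
For illustration, a native GCS → BigQuery transfer task might look roughly like the sketch below. This is a minimal sketch assuming a `UniversalTransferOperator`-style API; the import paths, connection IDs, bucket, and table names are assumptions rather than code from this PR.

```python
from datetime import datetime

from airflow import DAG

# Assumed import paths; the module layout inside this repo may differ.
from universal_transfer_operator.constants import FileType, TransferMode
from universal_transfer_operator.datasets.file.base import File
from universal_transfer_operator.datasets.table import Metadata, Table
from universal_transfer_operator.universal_transfer_operator import UniversalTransferOperator

with DAG("example_native_transfer_to_bigquery", start_date=datetime(2023, 3, 1), schedule=None):
    # Native path: BigQuery runs the load job itself instead of the operator
    # streaming rows through the worker.
    UniversalTransferOperator(
        task_id="gcs_to_bigquery_native",
        source_dataset=File(
            path="gs://example-bucket/data.csv",  # hypothetical bucket and object
            conn_id="google_cloud_default",
            filetype=FileType.CSV,
        ),
        destination_dataset=Table(
            name="example_table",
            conn_id="google_cloud_default",
            metadata=Metadata(schema="example_dataset"),  # BigQuery dataset
        ),
        transfer_mode=TransferMode.NATIVE,
    )
```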

tatiana and others added 12 commits March 8, 2023 15:13
Allow users to use `run_raw_sql` to convert a pandas dataframe created
with `aql.dataframe` into a DuckDB in-memory table.

Previously we were creating different database and connection instances
within the base SQL operator unnecessarily.

Fix: #1831
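
For context, the usage this commit enables might look roughly like the following sketch: a dataframe produced by an `aql.dataframe` task is handed to `run_raw_sql`, which materialises it as a DuckDB in-memory table. The connection ID, sample data, and query are illustrative assumptions.

```python
import pandas as pd

from astro import sql as aql


@aql.dataframe
def build_dataframe() -> pd.DataFrame:
    # An ordinary pandas dataframe produced by an upstream task.
    return pd.DataFrame({"city": ["Lisbon", "Porto"], "population": [545_000, 238_000]})


# "duckdb_conn" is a hypothetical Airflow connection pointing at DuckDB.
@aql.run_raw_sql(conn_id="duckdb_conn", handler=lambda cursor: cursor.fetchall())
def count_rows(cities):
    # The dataframe argument is loaded into a DuckDB in-memory table and
    # referenced through templating.
    return "SELECT COUNT(*) FROM {{ cities }}"


# Inside a DAG, the two tasks would be wired together like:
# count_rows(cities=build_dataframe())
```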
https://tilt.dev/ is a powerful dev tool for working with containers.

It can automatically restart containers and sync files between the local
machine and the container.
# Description
## What is the current behavior?
Currently, `astro.sql.cleanup` deletes temporary tables once upstream
tasks are done, whether or not the DAG succeeded.

While usually desirable, finer-grained control is sometimes useful. For
example, during DAG development, I may not want to keep regenerating all
of the temporary tables while fixing bugs in failing tasks.

## What is the new behavior?

- `CleanupOperator` has a new optional argument `skip_on_failure` that
prevents table cleanup if any upstream task fails (see the sketch below).
- To mimic the current behavior, `skip_on_failure=False` by default.

This PR closes issue #1826. 

## Does this introduce a breaking change?
No

### Checklist
- [x] Created tests which fail without the change (if possible)
- [ ] Extended the README / documentation, if necessary
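
A rough usage sketch of the new argument (the import path, task ID, and surrounding wiring are assumptions, not code from this PR):

```python
# Assumed import path for the operator inside the Python SDK.
from astro.sql.operators.cleanup import CleanupOperator

# Skip deleting temporary tables when any upstream task failed, so they can be
# inspected while debugging. The default (False) keeps the current behavior.
cleanup = CleanupOperator(task_id="cleanup_tmp_tables", skip_on_failure=True)
```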
# Description
## What is the current behavior?
Currently, Codecov is broken because we don't generate the
codecoverage.xml file for the entire set of test cases; we generate it
for only a subset, such as the Python SDK, UTO, or the SQL CLI. Codecov
expects a coverage file covering all the test cases, and without one it
concludes that the code coverage went down.

Codecov offers flags and carryforward flags
(https://docs.codecov.com/docs/flags and
https://docs.codecov.com/docs/carryforward-flags), which can carry
forward the right coverage report from the correct commit when we
generate the coverage report for only a subset of the test cases.

## What is the new behavior?
Codecov's `Flags` feature is used to handle the monorepo use case.


## Does this introduce a breaking change?
Nope

### Checklist
- [ ] Created tests which fail without the change (if possible)
- [ ] Extended the README / documentation, if necessary
# Description
## What is the current behavior?
Fix S3 provider issue: `transfer_config_args` needs `{}` instead of
`None` in the config.
# Description
## What is the current behavior?
Add example DAGs as part of the integration tests.
Bump astro-runtime to version 7.4.0.
**Please describe the feature you'd like to see**
- Add `DataProvider` for BigQuery - read/write methods
- Add non-native transfer implementation for GCS to BigQuery
- Add non-native transfer implementation for S3 to BigQuery
- Add non-native transfer example DAG for BigQuery to SQLite
- Add non-native transfer example DAG for BigQuery to Snowflake (see the
sketch after the acceptance criteria below)
- Add example DAG
- Add tests with 90% coverage

**Acceptance Criteria**

- [ ] All checks and tests in the CI should pass
- [ ] Unit tests (90% code coverage or more, [once
available](#191))
- [ ] Integration tests (if the feature relates to a new database or
external service)
- [ ] Example DAG
- [ ] Docstrings in
[reStructuredText](https://peps.python.org/pep-0287/) for each of the
methods, classes, functions, and module-level attributes (including an
example DAG showing how it should be used)
- [ ] Exception handling in case of errors
- [ ] Logging (are we exposing useful information to the user? e.g.
source and destination)
- [ ] Improve the documentation (README, Sphinx, and any other relevant docs)
- [ ] How-to guide for the feature
([example](https://airflow.apache.org/docs/apache-airflow-providers-postgres/stable/operators/postgres_operator_howto_guide.html))
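
For illustration, a non-native table-to-table transfer such as BigQuery to Snowflake might look roughly like the sketch below; the import paths, connection IDs, and table names are assumptions rather than code from this PR.

```python
# Assumed import paths, mirroring the native example earlier in this thread.
from universal_transfer_operator.datasets.table import Table
from universal_transfer_operator.universal_transfer_operator import UniversalTransferOperator

# Non-native path: the operator itself reads rows from BigQuery and writes them
# to Snowflake, rather than delegating to a warehouse-native load job.
transfer_bigquery_to_snowflake = UniversalTransferOperator(
    task_id="bigquery_to_snowflake_non_native",
    source_dataset=Table(name="example_bigquery_table", conn_id="google_cloud_default"),
    destination_dataset=Table(name="example_snowflake_table", conn_id="snowflake_default"),
)
```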


closes: #1732 
closes: #1785
closes: #1730

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Utkarsh Sharma <[email protected]>
Co-authored-by: Phani Kumar <[email protected]>