Implement transfer for BigQuery native path S3/GCS #1850

Closed
wants to merge 12 commits

Conversation


@rajaths010494 (Contributor) commented on Mar 17, 2023

Please describe the feature you'd like to see

  • Add DataProvider for BigQuery
  • Add native transfer implementation for GCS to BigQuery (see the sketch after the acceptance criteria below)
  • Add native transfer implementation for S3 to BigQuery
  • Add example DAG
  • Add tests with 90% coverage

Acceptance Criteria

  • All checks and tests in the CI should pass
  • Unit tests (90% code coverage or more, once available)
  • Integration tests (if the feature relates to a new database or external service)
  • Example DAG
  • Docstrings in reStructuredText for each of the methods, classes, functions, and module-level attributes (including an example DAG showing how it should be used)
  • Exception handling in case of errors
  • Logging (are we exposing useful information to the user? e.g. source and destination)
  • Improve the documentation (README, Sphinx, and any other relevant docs)
  • How-to guide for the feature (example)
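
For illustration, a native GCS → BigQuery transfer task might look roughly like the sketch below. This is a minimal sketch assuming a `UniversalTransferOperator`-style API; the import paths, connection IDs, bucket, and table names are assumptions rather than code from this PR.

```python
from datetime import datetime

from airflow import DAG

# Assumed import paths; the module layout inside this repo may differ.
from universal_transfer_operator.constants import FileType, TransferMode
from universal_transfer_operator.datasets.file.base import File
from universal_transfer_operator.datasets.table import Metadata, Table
from universal_transfer_operator.universal_transfer_operator import UniversalTransferOperator

with DAG("example_native_transfer_to_bigquery", start_date=datetime(2023, 3, 1), schedule=None):
    # Native path: BigQuery runs the load job itself instead of the operator
    # streaming rows through the worker.
    UniversalTransferOperator(
        task_id="gcs_to_bigquery_native",
        source_dataset=File(
            path="gs://example-bucket/data.csv",  # hypothetical bucket and object
            conn_id="google_cloud_default",
            filetype=FileType.CSV,
        ),
        destination_dataset=Table(
            name="example_table",
            conn_id="google_cloud_default",
            metadata=Metadata(schema="example_dataset"),  # BigQuery dataset
        ),
        transfer_mode=TransferMode.NATIVE,
    )
```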

tatiana and others added 12 commits March 8, 2023 15:13
Allow users to use `run_raw_sql` to convert a pandas dataframe created
with `aql.dataframe` into a DuckDB in-memory table.

Previously we were creating different database and connection instances
within the base SQL operator unnecessarily.

Fix: #1831
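
For context, the usage this commit enables might look roughly like the following sketch: a dataframe produced by an `aql.dataframe` task is handed to `run_raw_sql`, which materialises it as a DuckDB in-memory table. The connection ID, sample data, and query are illustrative assumptions.

```python
import pandas as pd

from astro import sql as aql


@aql.dataframe
def build_dataframe() -> pd.DataFrame:
    # An ordinary pandas dataframe produced by an upstream task.
    return pd.DataFrame({"city": ["Lisbon", "Porto"], "population": [545_000, 238_000]})


# "duckdb_conn" is a hypothetical Airflow connection pointing at DuckDB.
@aql.run_raw_sql(conn_id="duckdb_conn", handler=lambda cursor: cursor.fetchall())
def count_rows(cities):
    # The dataframe argument is loaded into a DuckDB in-memory table and
    # referenced through templating.
    return "SELECT COUNT(*) FROM {{ cities }}"


# Inside a DAG, the two tasks would be wired together like:
# count_rows(cities=build_dataframe())
```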
https://tilt.dev/ is a powerful dev tool for working with containers.

It can automatically restart containers and sync files between the local
machine and the container.
# Description
## What is the current behavior?
Currently, `astro.sql.cleanup` deletes temporary tables once upstream
tasks are done, whether or not the DAG succeeded.

While usually desirable, finer-grained control is sometimes useful. For
example, during DAG development, I may not want to keep regenerating all
of the temporary tables while fixing bugs in failing tasks.

## What is the new behavior?

- `CleanupOperator` has a new optional argument `skip_on_failure` that
prevents table cleanup if any upstream task fails (see the sketch below).
- To mimic the current behavior, `skip_on_failure=False` by default.

This PR closes issue #1826. 

## Does this introduce a breaking change?
No

### Checklist
- [x] Created tests which fail without the change (if possible)
- [ ] Extended the README / documentation, if necessary
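
A rough usage sketch of the new argument (the import path, task ID, and surrounding wiring are assumptions, not code from this PR):

```python
# Assumed import path for the operator inside the Python SDK.
from astro.sql.operators.cleanup import CleanupOperator

# Skip deleting temporary tables when any upstream task failed, so they can be
# inspected while debugging. The default (False) keeps the current behavior.
cleanup = CleanupOperator(task_id="cleanup_tmp_tables", skip_on_failure=True)
```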
# Description
## What is the current behavior?
Currently, Codecov is broken because we don't generate the
codecoverage.xml file for the entire set of test cases; we generate it
for only a subset, such as the Python SDK, UTO, or the SQL CLI. Codecov
expects a coverage file covering all the test cases, and without one it
concludes that the code coverage went down.

Codecov offers flags and carryforward flags
(https://docs.codecov.com/docs/flags and
https://docs.codecov.com/docs/carryforward-flags), which can carry
forward the right coverage report from the correct commit when we
generate the coverage report for only a subset of the test cases.

## What is the new behavior?
Codecov's `Flags` feature is used to handle the monorepo use case.


## Does this introduce a breaking change?
Nope

### Checklist
- [ ] Created tests which fail without the change (if possible)
- [ ] Extended the README / documentation, if necessary
# Description
## What is the current behavior?
Fix S3 provider issue: `transfer_config_args` needs `{}` instead of
`None` in the config.
# Description
## What is the current behavior?
Add example DAGs as part of the integration tests.
Bump astro-runtime to version 7.4.0.
**Please describe the feature you'd like to see**
- Add `DataProvider` for BigQuery - read/write methods
- Add non-native transfer implementation for GCS to BigQuery
- Add non-native transfer implementation for S3 to BigQuery
- Add non-native transfer example DAG for BigQuery to SQLite
- Add non-native transfer example DAG for BigQuery to Snowflake (see the
sketch after the acceptance criteria below)
- Add example DAG
- Add tests with 90% coverage

**Acceptance Criteria**

- [ ] All checks and tests in the CI should pass
- [ ] Unit tests (90% code coverage or more, [once
available](#191))
- [ ] Integration tests (if the feature relates to a new database or
external service)
- [ ] Example DAG
- [ ] Docstrings in
[reStructuredText](https://peps.python.org/pep-0287/) for each of the
methods, classes, functions, and module-level attributes (including an
example DAG showing how it should be used)
- [ ] Exception handling in case of errors
- [ ] Logging (are we exposing useful information to the user? e.g.
source and destination)
- [ ] Improve the documentation (README, Sphinx, and any other relevant docs)
- [ ] How-to guide for the feature
([example](https://airflow.apache.org/docs/apache-airflow-providers-postgres/stable/operators/postgres_operator_howto_guide.html))
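
For illustration, a non-native table-to-table transfer such as BigQuery to Snowflake might look roughly like the sketch below; the import paths, connection IDs, and table names are assumptions rather than code from this PR.

```python
# Assumed import paths, mirroring the native example earlier in this thread.
from universal_transfer_operator.datasets.table import Table
from universal_transfer_operator.universal_transfer_operator import UniversalTransferOperator

# Non-native path: the operator itself reads rows from BigQuery and writes them
# to Snowflake, rather than delegating to a warehouse-native load job.
transfer_bigquery_to_snowflake = UniversalTransferOperator(
    task_id="bigquery_to_snowflake_non_native",
    source_dataset=Table(name="example_bigquery_table", conn_id="google_cloud_default"),
    destination_dataset=Table(name="example_snowflake_table", conn_id="snowflake_default"),
)
```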


closes: #1732 
closes: #1785
closes: #1730

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Utkarsh Sharma <[email protected]>
Co-authored-by: Phani Kumar <[email protected]>