From 328f2e3693e01738a51ea6439681a9e3178212de Mon Sep 17 00:00:00 2001
From: Matthew McKnight
Date: Thu, 20 Jan 2022 13:39:51 -0600
Subject: [PATCH 1/6] First draft of adding contributing.md to each adapter repo

---
 CONTRIBUTING.MD | 99 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 99 insertions(+)
 create mode 100644 CONTRIBUTING.MD

diff --git a/CONTRIBUTING.MD b/CONTRIBUTING.MD
new file mode 100644
index 000000000..08c69be2f
--- /dev/null
+++ b/CONTRIBUTING.MD
@@ -0,0 +1,99 @@
# Contributing to `dbt-spark`

1. [About this document](#about-this-document)
2. [Getting the code](#getting-the-code)
3. [Running `dbt-spark` in development](#running-dbt-spark-in-development)
4. [Testing](#testing)
5. [Updating Docs](#updating-docs)
6. [Submitting a Pull Request](#submitting-a-pull-request)

## About this document
This document is a guide intended for folks interested in contributing to `dbt-spark`. Below, we document the process by which members of the community should create issues and submit pull requests (PRs) in this repository. It is not intended as a guide for using `dbt-spark`, and it assumes a certain level of familiarity with Python concepts such as virtualenvs, `pip`, python modules, filesystems, and so on. This guide assumes you are using macOS or Linux and are comfortable with the command line.

For those wishing to contribute we highly suggest reading the [dbt-core](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md) if you haven't already. Almost all of the information there is applicable to contributing here, too!

### Signing the CLA

Please note that all contributors to `dbt-spark` must sign the [Contributor License Agreement](https://docs.getdbt.com/docs/contributor-license-agreements) to have their Pull Request merged into the `dbt-spark` codebase. If you are unable to sign the CLA, then the `dbt-spark` maintainers will unfortunately be unable to merge your Pull Request. You are, however, welcome to open issues and comment on existing ones.


## Getting the code

You will need `git` in order to download and modify the `dbt-spark` source code. On macOS, the best way to download git is to just install [Xcode](https://developer.apple.com/support/xcode/).

### External contributors

If you are not a member of the `dbt-labs` GitHub organization, you can contribute to `dbt-spark` by forking the `dbt-spark` repository. For a detailed overview on forking, check out the [GitHub docs on forking](https://help.github.com/en/articles/fork-a-repo). In short, you will need to:

1. fork the `dbt-spark` repository
2. clone your fork locally
3. check out a new branch for your proposed changes
4. push changes to your fork
5. open a pull request against `dbt-labs/dbt-spark` from your forked repository

### dbt Labs contributors

If you are a member of the `dbt Labs` GitHub organization, you will have push access to the `dbt-spark` repo. Rather than forking `dbt-spark` to make your changes, just clone the repository, check out a new branch, and push directly to that branch.


## Running `dbt-spark` in development

### Installation

First make sure that you set up your `virtualenv` as described in [Setting up an environment](https://github.com/dbt-labs/dbt-core/blob/HEAD/CONTRIBUTING.md#setting-up-an-environment). Ensure you have the latest version of pip installed with `pip install --upgrade pip`. Next, install `dbt-spark` and its development dependencies:

```sh
pip install -e . -r dev-requirements.txt
```
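As a quick sanity check that the editable install picked up your local checkout (rather than a published package), you can ask Python where the adapter is being imported from. This is a generic check with standard tooling, not a script provided by this repo; the `dbt.adapters.spark` import path is the conventional package layout for dbt adapter plugins:

```sh
# For an editable install, this path should resolve to (or point back at) your local clone,
# not a copy downloaded from PyPI into site-packages.
python -c "import dbt.adapters.spark; print(dbt.adapters.spark.__file__)"

# `pip show` also reports where pip thinks the package lives.
pip show dbt-spark
```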
When `dbt-spark` is installed this way, any changes you make to the `dbt-spark` source code will be reflected immediately in your next `dbt-spark` run.

To confirm you have the correct version of `dbt-core` installed, please run `dbt --version` and `which dbt`.


## Testing

### Initial Setup

`dbt-spark` uses test credentials specified in a `test.env` file in the root of the repository. This `test.env` file is git-ignored, but please be _extra_ careful to never check in credentials or other sensitive information when developing. To create your `test.env` file, copy the provided example file, then supply your relevant credentials.

```
cp test.env.example test.env
$EDITOR test.env
```
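If the tooling you use doesn't read `test.env` automatically, one low-tech option is to export its contents into your current shell before running tests. This is a generic shell sketch that assumes `test.env` contains plain `KEY=value` lines (as in the provided `test.env.example`); adapt it to however you prefer to manage environment variables:

```sh
# export every KEY=value pair in test.env into the current shell session
set -a          # automatically export any variable assigned from here on
source test.env
set +a          # stop auto-exporting

# spot-check that one of the expected variables is now set (empty output means it is unset)
echo "$DBT_DATABRICKS_HOST_NAME"
```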
### Test commands
There are a few methods for running tests locally.

#### `tox`
`tox` takes care of managing Python virtualenvs and installing dependencies in order to run tests. You can also run tests in parallel; for example, you can run unit tests for Python 3.7, Python 3.8, Python 3.9, and `flake8` checks in parallel with `tox -p`. Also, you can run unit tests for specific Python versions with `tox -e py37`. The configuration for these tests is located in `tox.ini`.

#### `pyteest`
Finally, you can also run a specific test or group of tests using `pytest` directly. With a Python virtualenv active and dev dependencies installed, you can do things like:

```sh
# run specific spark integration tests
python -m pytest -m profile_spark tests/integration/get_columns_in_relation
# run all unit tests in a file
python -m pytest tests/unit/test_adapter.py
# run a specific unit test
python -m pytest test/unit/test_adapter.py::TestSparkAdapter::test_profile_with_database
```
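Beyond the examples above, the usual `pytest` selection and reporting flags work here as they do anywhere else; these are standard pytest options rather than anything `dbt-spark`-specific, and the keyword below is only an illustration:

```sh
# verbose output, stopping at the first failure
python -m pytest -v -x tests/unit/test_adapter.py
# select tests by keyword expression instead of a full node ID (swap in whatever keyword you need)
python -m pytest -k "profile" tests/unit/test_adapter.py
```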
## Updating Docs

Many changes will require an update to the `dbt-spark` docs. Here are some useful resources:

- Docs are [here](https://docs.getdbt.com/).
- The docs repo for making changes is located [here](https://github.com/dbt-labs/docs.getdbt.com).
- The changes made are likely to impact one or both of [Spark Profile](https://docs.getdbt.com/reference/warehouse-profiles/spark-profile) or [Spark Configs](https://docs.getdbt.com/reference/resource-configs/spark-configs).
- We ask every community member who makes a user-facing change to open an issue or PR regarding doc changes.



## Submitting a Pull Request

dbt Labs provides a CI environment to test changes to the `dbt-spark` adapter, and periodic checks against the development version of `dbt-core` through GitHub Actions.

A `dbt-spark` maintainer will review your PR. They may suggest code revisions for style or clarity, or request that you add unit or integration test(s). These are good things! We believe that, with a little bit of help, anyone can contribute high-quality code.

Once all tests are passing and your PR has been approved, a `dbt-spark` maintainer will merge your changes into the active development branch. And that's it! Happy developing :tada:
\ No newline at end of file

From 46c014f41c16050f563f49a718d54e9618e95697 Mon Sep 17 00:00:00 2001
From: Matthew McKnight
Date: Wed, 9 Feb 2022 09:47:01 -0600
Subject: [PATCH 2/6] updates after kyle review, and minor changes regarding review process and CI as spark still uses CircleCI and not GHA

---
 CONTRIBUTING.MD | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/CONTRIBUTING.MD b/CONTRIBUTING.MD
index 08c69be2f..1447e254a 100644
--- a/CONTRIBUTING.MD
+++ b/CONTRIBUTING.MD
@@ -8,9 +8,9 @@
6. [Submitting a Pull Request](#submitting-a-pull-request)

## About this document
-This document is a guide intended for folks interested in contributing to `dbt-spark`. Below, we document the process by which members of the community should create issues and submit pull requests (PRs) in this repository. It is not intended as a guide for using `dbt-spark`, and it assumes a certain level of familiarity with Python concepts such as virtualenvs, `pip`, python modules, filesystems, and so on. This guide assumes you are using macOS or Linux and are comfortable with the command line.
+This document is a guide intended for folks interested in contributing to `dbt-spark`. Below, we document the process by which members of the community should create issues and submit pull requests (PRs) in this repository. It is not intended as a guide for using `dbt-spark`, and it assumes a certain level of familiarity with Python concepts such as virtualenvs, `pip`, Python modules, and so on. This guide assumes you are using macOS or Linux and are comfortable with the command line.

-For those wishing to contribute we highly suggest reading the [dbt-core](https://github.com/dbt-labs/dbt-core/blob/main/CONTRIBUTING.md) if you haven't already. Almost all of the information there is applicable to contributing here, too!
+For those wishing to contribute, we highly suggest reading dbt-core's [contribution guide](https://github.com/dbt-labs/dbt-core/blob/HEAD/CONTRIBUTING.md) if you haven't already. Almost all of the information there is applicable to contributing here, too!

### Signing the CLA
@@ -19,7 +19,7 @@ Please note that all contributors to `dbt-spark` must sign the [Contributor Lice

## Getting the code

-You will need `git` in order to download and modify the `dbt-spark` source code. On macOS, the best way to download git is to just install [Xcode](https://developer.apple.com/support/xcode/).
+You will need `git` in order to download and modify the `dbt-spark` source code. You can find directions [here](https://github.com/git-guides/install-git) on how to install `git`.

### External contributors
@@ -68,7 +68,7 @@ There are a few methods for running tests locally.
#### `tox`
`tox` takes care of managing Python virtualenvs and installing dependencies in order to run tests. You can also run tests in parallel; for example, you can run unit tests for Python 3.7, Python 3.8, Python 3.9, and `flake8` checks in parallel with `tox -p`. Also, you can run unit tests for specific Python versions with `tox -e py37`. The configuration for these tests is located in `tox.ini`.

-#### `pyteest`
+#### `pytest`
Finally, you can also run a specific test or group of tests using `pytest` directly. With a Python virtualenv active and dev dependencies installed, you can do things like:
@@ -88,8 +88,6 @@ Many changes will require an update to the `dbt-spark` docs. Here are some usefu
- The docs repo for making changes is located [here](https://github.com/dbt-labs/docs.getdbt.com).
- The changes made are likely to impact one or both of [Spark Profile](https://docs.getdbt.com/reference/warehouse-profiles/spark-profile) or [Spark Configs](https://docs.getdbt.com/reference/resource-configs/spark-configs).
- We ask every community member who makes a user-facing change to open an issue or PR regarding doc changes.
-
-

## Submitting a Pull Request

dbt Labs provides a CI environment to test changes to the `dbt-spark` adapter, and periodic checks against the development version of `dbt-core` through GitHub Actions.

From 3fe9a2e4109ef779ed434ca0ef1a4eb7c7f5b912 Mon Sep 17 00:00:00 2001
From: Matthew McKnight
Date: Wed, 9 Feb 2022 09:59:04 -0600
Subject: [PATCH 3/6] minor addition

---
 CONTRIBUTING.MD | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/CONTRIBUTING.MD b/CONTRIBUTING.MD
index 1447e254a..20dfd6e80 100644
--- a/CONTRIBUTING.MD
+++ b/CONTRIBUTING.MD
@@ -94,4 +94,6 @@ dbt Labs provides a CI environment to test changes to the `dbt-spark` adapter, a

A `dbt-spark` maintainer will review your PR. They may suggest code revisions for style or clarity, or request that you add unit or integration test(s). These are good things! We believe that, with a little bit of help, anyone can contribute high-quality code.

+Once all review comments and requests have been addressed, the `dbt-spark` maintainer can trigger CI testing.
+
Once all tests are passing and your PR has been approved, a `dbt-spark` maintainer will merge your changes into the active development branch. And that's it! Happy developing :tada:
\ No newline at end of file

From 882c1bb29b2c913d3d9e6f580ab5a401e7a84408 Mon Sep 17 00:00:00 2001
From: Matthew McKnight
Date: Tue, 3 May 2022 16:33:25 -0500
Subject: [PATCH 4/6] add test.env.example

---
 .gitignore       |  1 +
 test.env.example | 10 ++++++++++
 2 files changed, 11 insertions(+)
 create mode 100644 test.env.example

diff --git a/.gitignore b/.gitignore
index 4c05634f3..d04ebc1cf 100644
--- a/.gitignore
+++ b/.gitignore
@@ -8,6 +8,7 @@ __pycache__
.idea/
build/
dist/
+test.env
dbt-integration-tests
test/integration/.user.yml
.DS_Store

diff --git a/test.env.example b/test.env.example
new file mode 100644
index 000000000..863eb4fdc
--- /dev/null
+++ b/test.env.example
@@ -0,0 +1,10 @@
+# Cluster ID
+DBT_DATABRICKS_CLUSTER_NAME=
+# SQL Endpoint
+DBT_DATABRICKS_ENDPOINT=
+# Server Hostname value
+DBT_DATABRICKS_HOST_NAME=
+# personal token
+DBT_DATABRICKS_TOKEN=
+# file path to local ODBC driver
+ODBC_DRIVER=
\ No newline at end of file

From 35b747e80d1d76a09b93f03056a74228ad51e845 Mon Sep 17 00:00:00 2001
From: Matthew McKnight
Date: Wed, 1 Jun 2022 09:58:41 -0500
Subject: [PATCH 5/6] fix eof black errors

---
 CONTRIBUTING.MD  | 6 +++---
 test.env.example | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/CONTRIBUTING.MD b/CONTRIBUTING.MD
index 20dfd6e80..adf025874 100644
--- a/CONTRIBUTING.MD
+++ b/CONTRIBUTING.MD
@@ -17,7 +17,7 @@ For those wishing to contribute, we highly suggest reading dbt-core's [contri

Please note that all contributors to `dbt-spark` must sign the [Contributor License Agreement](https://docs.getdbt.com/docs/contributor-license-agreements) to have their Pull Request merged into the `dbt-spark` codebase. If you are unable to sign the CLA, then the `dbt-spark` maintainers will unfortunately be unable to merge your Pull Request. You are, however, welcome to open issues and comment on existing ones.

-## Getting the code
+## Getting the code

You will need `git` in order to download and modify the `dbt-spark` source code. You can find directions [here](https://github.com/git-guides/install-git) on how to install `git`.

@@ -90,10 +90,10 @@ Many changes will require an update to the `dbt-spark` docs. Here are some usefu

## Submitting a Pull Request

-dbt Labs provides a CI environment to test changes to the `dbt-spark` adapter, and periodic checks against the development version of `dbt-core` through GitHub Actions.
+dbt Labs provides a CI environment to test changes to the `dbt-spark` adapter, and periodic checks against the development version of `dbt-core` through GitHub Actions.

A `dbt-spark` maintainer will review your PR. They may suggest code revisions for style or clarity, or request that you add unit or integration test(s). These are good things! We believe that, with a little bit of help, anyone can contribute high-quality code.

Once all review comments and requests have been addressed, the `dbt-spark` maintainer can trigger CI testing.

-Once all tests are passing and your PR has been approved, a `dbt-spark` maintainer will merge your changes into the active development branch. And that's it! Happy developing :tada:
\ No newline at end of file
+Once all tests are passing and your PR has been approved, a `dbt-spark` maintainer will merge your changes into the active development branch. And that's it! Happy developing :tada:

diff --git a/test.env.example b/test.env.example
index 863eb4fdc..bf4cf2eee 100644
--- a/test.env.example
+++ b/test.env.example
@@ -7,4 +7,4 @@ DBT_DATABRICKS_HOST_NAME=
# personal token
DBT_DATABRICKS_TOKEN=
# file path to local ODBC driver
-ODBC_DRIVER=
\ No newline at end of file
+ODBC_DRIVER=

From aea62d70c534fb0750546d9eed063c951ab76054 Mon Sep 17 00:00:00 2001
From: Matthew McKnight
Date: Wed, 1 Jun 2022 10:47:10 -0500
Subject: [PATCH 6/6] added example for functional tests

---
 CONTRIBUTING.MD | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/CONTRIBUTING.MD b/CONTRIBUTING.MD
index adf025874..c0d9bb3d2 100644
--- a/CONTRIBUTING.MD
+++ b/CONTRIBUTING.MD
@@ -74,6 +74,8 @@ Finally, you can also run a specific test or group of tests using `pytest` direc
```sh
# run specific spark integration tests
python -m pytest -m profile_spark tests/integration/get_columns_in_relation
+# run specific functional tests
+python -m pytest --profile databricks_sql_endpoint tests/functional/adapter/test_basic.py
# run all unit tests in a file
python -m pytest tests/unit/test_adapter.py
# run a specific unit test
python -m pytest test/unit/test_adapter.py::TestSparkAdapter::test_profile_with_database
```
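One closing note on the new functional-test example: the `--profile` flag is not a built-in pytest option, so it presumably comes from this repo's own pytest configuration, and it composes with pytest's ordinary selection flags. A hedged sketch — the profile name and file path are copied from the example above, and the `-k` keyword is purely illustrative:

```sh
# run the functional basic suite against the SQL endpoint profile, with verbose output
python -m pytest -v --profile databricks_sql_endpoint tests/functional/adapter/test_basic.py
# narrow the run to tests whose names match a keyword (replace "base" with whatever you need)
python -m pytest --profile databricks_sql_endpoint -k "base" tests/functional/adapter/test_basic.py
```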