remove references to tox
benc-db committed Sep 14, 2023
1 parent d269f11 commit b10f624
Showing 5 changed files with 57 additions and 38 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/main.yml
@@ -81,7 +81,7 @@ jobs:
run: poetry install --no-interaction --no-root

- name: Run linting
run: poetry run nox -t lint
run: poetry run nox -t lint_check

unit:
name: unit test / python ${{ matrix.python-version }}
47 changes: 22 additions & 25 deletions CONTRIBUTING.MD
@@ -5,6 +5,7 @@ We happily welcome contributions to the `dbt-databricks` package. We use [GitHub
Contributions are licensed on a license-in/license-out basis.

## Communication

Before starting work on a major feature, please reach out to us via GitHub, Slack, email, etc. We will make sure no one else is already working on it and ask you to open a GitHub issue. A "major feature" is defined as any change that alters more than 100 LOC (not including tests) or changes any user-facing behavior.

We will use the GitHub issue to discuss the feature and come to agreement. This is to prevent your time being wasted, as well as ours. The GitHub review process for major features is also important so that organizations with commit access can come to agreement on design.
@@ -18,21 +19,20 @@ If it is appropriate to write a design document, the document must be hosted eit
1. [Run the unit tests](#unit-tests) (and the [integration tests](#functional--integration-tests) if you [can](#please-test-what-you-can))
1. [Sign your commits](#sign-your-work)
1. [Open a pull request](#pull-request-review-process)
- Answer the PR template questions as best as you can
- _Recommended: [Allow edits from Maintainers]_

## Pull request review process

dbt-databricks uses a **two-step review process** to merge PRs to `main`. We first squash the patch onto a staging branch so that we can securely run our full matrix of integration tests against a real Databricks workspace. Then we merge the staging branch to `main`.

> **Note:** When you create a pull request we recommend that you _[Allow Edits from Maintainers]_. This smooths our two-step process and also lets your reviewer easily commit minor fixes or changes.

A dbt-databricks maintainer will review your PR and may suggest changes for style and clarity, or they may request that you add unit or integration tests.

Once your patch is approved, a maintainer will create a staging branch and either you or the maintainer (if you allowed edits from maintainers) will change the base branch of your PR to the staging branch. Then a maintainer will squash and merge the PR into the staging branch.

dbt-databricks uses staging branches to run our full matrix of functional and integration tests via GitHub Actions. This extra step is required for security because GitHub Actions workflows that run on pull requests from forks can't access our testing Databricks workspace.

If the functional or integration tests fail as a result of your change, a maintainer will work with you to fix it _on your fork_ and then repeat this step.

@@ -46,19 +46,20 @@ See [docs/local-dev.md](docs/local-dev.md).

## Code Style

We follow [PEP 8](https://www.python.org/dev/peps/pep-0008/) with one exception: lines can be up to 100 characters in length, not 79. You can run [`tox` linter command](#linting) to automatically format the source code before committing your changes.
We follow [PEP 8](https://www.python.org/dev/peps/pep-0008/) with one exception: lines can be up to 100 characters in length, not 79. You can run the [`nox` lint command](#linting) to detect issues in your source code before committing your changes.

### Linting

This project uses [Black](https://pypi.org/project/black/), [flake8](https://flake8.pycqa.org/en/latest/), and [mypy](https://www.mypy-lang.org/) for linting and static type checks. Run all three with the `linter` command and commit before opening your pull request.
This project uses [Black](https://pypi.org/project/black/), [flake8](https://flake8.pycqa.org/en/latest/), and [mypy](https://www.mypy-lang.org/) for linting and static type checks. Run all three with the `lint` command and commit before opening your pull request.

```
tox -e linter
```zsh
poetry run nox -t lint
```

To simplify reviews, you can commit any formatting changes in a separate commit.

## Sign your work

The sign-off is a simple line at the end of the explanation for the patch. Your signature certifies that you wrote the patch or otherwise have the right to pass it on as an open-source patch. The rules are pretty simple: if you can certify the below (from developercertificate.org):

```
@@ -110,40 +111,36 @@ Use your real name (sorry, no pseudonyms or anonymous contributions.)

If you set your `user.name` and `user.email` git configs, you can sign your commit automatically with `git commit -s`.


## Unit tests

Unit tests do not require a Databricks account. Please confirm that your change passes our unit test suite before opening a pull request.

```bash
tox -e unit
```zsh
poetry run nox -s unit
```

## Functional & Integration Tests

Functional and integration tests require a Databricks account with access to a workspace containing four compute resources. These four comprise a matrix of multi-purpose cluster vs SQL warehouse with and without Unity Catalog enabled. The `tox` commands to run each set of these tests appear below:
Functional and integration tests require a Databricks account with access to a workspace containing four compute resources. These four comprise a matrix of multi-purpose cluster vs SQL warehouse with and without Unity Catalog enabled. The `nox` commands to run each set of these tests appear below:

|Compute Type |Unity Catalog |Command|
|-|-|-|
|SQL warehouse| Yes | `tox -e integration-databricks-uc-sql-endpoint` |
|SQL warehouse| No | `tox -e integration-databricks-sql-endpoint` |
|Multi-purpose| Yes | `tox -e integration-databricks-uc-cluster` |
|Multi-Purpose| No | `tox -e integration-databricks-cluster` |
| Compute Type | Unity Catalog | Command |
| ------------- | ------------- | ----------------------------------- |
| SQL warehouse | Yes | `poetry run nox -t uc_sql_endpoint` |
| Multi-purpose | Yes | `poetry run nox -t uc_cluster` |
| Multi-purpose | No            | `poetry run nox -t cluster`          |

These tests are configured with environment variables that `tox` reads from a file called [test.env](/test.env.example) which you can copy from the example:
These tests are configured with environment variables that `pytest` reads from a file called [test.env](/test.env.example) which you can copy from the example:

```sh
cp test.env.example test.env
```

Update `test.env` with the relevant HTTP paths and tokens.

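The tagged nox sessions that these commands invoke live in `noxfile.py` and are not shown in this diff. Purely as a hypothetical sketch of what such a session could look like (the session body, pytest arguments, and test path are assumptions, not the project's actual definitions):

```python
# Hypothetical sketch only: the tag matches the table above, but the body,
# pytest arguments, and test path are assumptions, not the real noxfile.py.
from nox_poetry import session


@session(tags=["uc_sql_endpoint"])
def uc_sql_endpoint(session):
    # Install the project and its dependencies through Poetry.
    session.run_always("poetry", "install", external=True)
    # Run the functional suite against the Unity Catalog SQL warehouse
    # described by the variables in test.env.
    session.run("pytest", "tests/functional")
```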
### Please test what you can

We understand that not every contributor will have all four types of compute resources in their Databricks workspace. For this reason, once a change has been reviewed and merged into a staging branch, we will run the full matrix of tests against our testing workspace at our expense (see the [pull request review process](#pull-request-review-process) for more detail).

That said, we ask that you include integration tests where relevant and that you indicate in your pull request description the environment type(s) you tested the change against.



[Allow Edits from Maintainers]: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork
8 changes: 7 additions & 1 deletion README.md
@@ -25,6 +25,7 @@ The `dbt-databricks` adapter contains all of the code enabling dbt to work with
- **Performance**. The adapter generates SQL expressions that are automatically accelerated by the native, vectorized [Photon](https://databricks.com/product/photon) execution engine.

## Choosing between dbt-databricks and dbt-spark

If you are developing a dbt project on Databricks, we recommend using `dbt-databricks` for the reasons noted above.

`dbt-spark` is an actively developed adapter that works with Databricks, as well as with Apache Spark wherever it is hosted, e.g. on AWS EMR.
@@ -34,11 +35,13 @@ If you are developing a dbt project on Databricks, we recommend using `dbt-datab
### Installation

Install using pip:

```nofmt
pip install dbt-databricks
```

Upgrade to the latest version:

```nofmt
pip install --upgrade dbt-databricks
```
@@ -61,6 +64,7 @@ your_profile_name:
### Quick Starts

The following quick starts will get you up and running with the `dbt-databricks` adapter:

- [Developing your first dbt project](https://github.com/databricks/dbt-databricks/blob/main/docs/local-dev.md)
- Using dbt Cloud with Databricks ([Azure](https://docs.microsoft.com/en-us/azure/databricks/integrations/prep/dbt-cloud) | [AWS](https://docs.databricks.com/integrations/prep/dbt-cloud.html))
- [Running dbt production jobs on Databricks Workflows](https://github.com/databricks/dbt-databricks/blob/main/docs/databricks-workflows.md)
@@ -73,11 +77,13 @@ These following quick starts will get you up and running with the `dbt-databrick

The `dbt-databricks` adapter has been tested:

- with Python 3.7 or above.
- with Python 3.8 or above.
- against `Databricks SQL` and `Databricks runtime releases 9.1 LTS` and later.

### Tips and Tricks

## Choosing compute for a Python model

You can override the compute used for a specific Python model by setting the `http_path` property in model configuration. This can be useful if, for example, you want to run a Python model on an All Purpose cluster, while running SQL models on a SQL Warehouse. Note that this capability is only available for Python models.

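The configuration example that follows this paragraph in the README is truncated in this view. Purely as an illustrative sketch (the `http_path` value and the upstream model name are placeholders), overriding the compute for a Python model could look like this:

```python
# Illustrative sketch only: the http_path value and model name are placeholders.
def model(dbt, session):
    # Route this Python model to an All Purpose cluster instead of the
    # SQL Warehouse used for the project's SQL models.
    dbt.config(http_path="sql/protocolv1/o/0000000000000000/0000-000000-placeholder")

    # Reference an upstream model and return a DataFrame as usual.
    return dbt.ref("my_upstream_model")
```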
30 changes: 23 additions & 7 deletions docs/local-dev.md
@@ -1,41 +1,55 @@
# Local development with dbt-databricks

This page describes how to develop a dbt project on your computer using `dbt-databricks`. We will create an empty dbt project with information on how to connect to Databricks. We will then run our first dbt models.

## Prerequisites

- Access to a Databricks workspace
- Ability to create a Personal Access Token (PAT)
- Python 3.8+
- dbt-core v1.1.0+
- dbt-databricks v1.1.0+
- [Poetry 1.6+](https://python-poetry.org/docs/)

To install the project and all its dependencies, run

```
poetry install
```

from within the project directory.

## Prepare to connect

### Collect connection information

Before you scaffold a new dbt project, you have to collect some information which dbt will use to connect to Databricks. Where you find this information depends on whether you are using Databricks Clusters or Databricks SQL endpoints. We recommend that you develop dbt models against Databricks SQL endpoints as they provide the latest SQL features and optimizations.

#### Databricks SQL endpoints

1. Log in to your Databricks workspace
2. Click the _SQL_ persona in the left navigation bar to switch to Databricks SQL
3. Click _SQL Endpoints_
4. Choose the SQL endpoint you want to connect to
5. Click _Connection details_
6. Copy the value of _Server hostname_. This will be the value of `host` when you scaffold a dbt project.
7. Copy the value of _HTTP path_. This will be the value of `http_path` when you scaffold a dbt project.

![image](/docs/img/sql-endpoint-connection-details.png "SQL endpoint connection details")

#### Databricks Clusters

1. Log in to your Databricks workspace
2. Click the _Data Science & Engineering_ persona in the left navigation bar
3. Click _Compute_
4. Click on the cluster you want to connect to
5. Near the bottom of the page, click _Advanced options_
6. Scroll down some more and click _JDBC/ODBC_
7. Copy the value of _Server Hostname_. This will be the value of `host` when you scaffold a dbt project.
7. Copy the value of _HTTP Path_. This will be the value of `http_path` when you scaffold a dbt project.
8. Copy the value of _HTTP Path_. This will be the value of `http_path` when you scaffold a dbt project.

![image](/docs/img/cluster-connection-details.png "Cluster connection details")

## Scaffold a new dbt project

Now, we are ready to scaffold a new dbt project. Switch to your terminal and type:

```nofmt
@@ -63,6 +77,7 @@ In `schema`, enter `databricks_demo`, which is the schema you created earlier.
Leave threads at `1` for now.

## Test connection

You are now ready to test the connection to Databricks. In the terminal, enter the following command:

```nofmt
@@ -72,6 +87,7 @@ dbt debug
If all goes well, you will see a successful connection. If you cannot connect to Databricks, double-check the PAT and update it accordingly in `~/.dbt/profiles.yml`.

## Run your first models

At this point, you simply run the demo models in the `models/example` directory. In your terminal, type:

```nofmt
8 changes: 4 additions & 4 deletions noxfile.py
@@ -1,27 +1,27 @@
from nox_poetry import session


@session(tags=["lint"])
@session(tags=["lint", "lint_check"])
def black(session):
session.install("black")
session.run("black", "--check", "dbt", "tests")


@session
@session(tags=["lint"])
def black_fix(session):
session.install("black")
session.run("black", "dbt", "tests")


@session(tags=["lint"])
@session(tags=["lint", "lint_check"])
def flake8(session):
session.install("flake8")
session.run(
"flake8", "--select=E,W,F", "--ignore=E203,W503", "--max-line-length=100", "dbt", "tests"
)


@session(tags=["lint"])
@session(tags=["lint", "lint_check"])
def mypy(session):
session.run_always("poetry", "install", external=True)
session.run("mypy", "--explicit-package-bases", "dbt", "tests")
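CONTRIBUTING.MD above refers to `poetry run nox -s unit`; that session sits outside the hunks shown here. As a hypothetical sketch only, following the same pattern as the lint sessions above (the test path is an assumption):

```python
# Hypothetical sketch only: the real `unit` session is not shown in this diff,
# and the test path is an assumption.
@session
def unit(session):
    # Install the project and its dependencies through Poetry.
    session.run_always("poetry", "install", external=True)
    # Run the unit test suite.
    session.run("pytest", "tests/unit")
```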
