Publicize the databricks delta lake destination connector #6043

Closed
5 of 6 tasks
tuliren opened this issue Sep 14, 2021 · 10 comments

@tuliren
Contributor

tuliren commented Sep 14, 2021

Tell us about the problem you're trying to solve

Currently the Databricks destination connector is private for legal reasons. We should find a way to make it public.

This relates to #2075, and is a follow-up issue from PR #5998.

TODO

tuliren added the type/enhancement (New feature or request) label on Sep 14, 2021
sherifnada added the area/connectors (Connector related issues) label on Nov 15, 2021
@nikhilalmeida

Just following up on this.

@nikhilalmeida

We do want to adopt Airbyte over RudderStack, but our destination of choice is not fully supported. Just following up on this.

@tuliren
Contributor Author

tuliren commented Feb 2, 2022

@nikhilalmeida, thank you for the interest in this connector.

Currently, to use the Databricks connector in your own deployment of Airbyte, you can build the image yourself and upload it to a private registry. The build process has been simplified and is now the same as for any other connector. See the developer doc for details.
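For readers following this route, here is a minimal sketch of the "upload it to a private registry" step. It assumes the local build produced an image named airbyte/destination-databricks:dev (an assumption about the build output, so check `docker images` first). The registry host registry.example.com, the TAG value, and the DRY_RUN switch are hypothetical placeholders, not part of Airbyte.

```shell
#!/bin/bash
# Sketch: retag a locally built connector image and push it to a private registry.
# LOCAL_IMAGE is an assumed name for the build output; REGISTRY and TAG are placeholders.
LOCAL_IMAGE="${LOCAL_IMAGE:-airbyte/destination-databricks:dev}"
REGISTRY="${REGISTRY:-registry.example.com}"
TAG="${TAG:-dev}"

# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Build the registry-qualified image reference from the local image name.
target_image() {
  local repo="${LOCAL_IMAGE%:*}"   # drop the local tag
  printf '%s/%s:%s\n' "$REGISTRY" "$repo" "$TAG"
}

# Tag the local image for the private registry, then push it.
publish() {
  local target
  target="$(target_image)"
  run docker tag "$LOCAL_IMAGE" "$target"
  run docker push "$target"
}

publish
```

With the defaults this only prints the two docker commands. Set DRY_RUN=0 (after `docker login` to your registry) to actually tag and push.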

@klogdog

klogdog commented Feb 6, 2022

+1 for adopting Airbyte and using the Databricks connector. I will follow @tuliren's build process for now, but we should definitely have a mechanism to pull the image after agreeing to the license agreement. I'll also bring this up with my Databricks CSR at the next call.

EDIT

I finally got the connector to build. For anyone trying, I used an AWS Lightsail server running Ubuntu 20.04 and ran this script that I made:

#!/bin/bash
echo "updating ubuntu"
sudo apt update
sudo DEBIAN_FRONTEND=noninteractive apt-get upgrade -y
echo "installing docker community edition"
sudo apt install apt-transport-https ca-certificates curl software-properties-common -y
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable"
apt-cache policy docker-ce
sudo apt install docker-ce -y
echo "installing docker compose"
sudo curl -L "https://github.com/docker/compose/releases/download/v2.2.3/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
echo "installing openjdk 17"
sudo apt install openjdk-17-jre-headless -y
echo "Installing postgresql"
sudo apt install postgresql -y
echo "installing jq"
sudo apt install jq -y
echo "installing docker"
sudo apt install docker.io -y
echo "downloading airbyte"
sudo mkdir source
cd source
sudo mkdir repos
cd repos 
sudo git clone https://github.com/airbytehq/airbyte.git
cd airbyte
sudo ./gradlew :airbyte-integrations:connectors:destination-databricks:build

@klogdog

klogdog commented Feb 8, 2022

When building the Databricks connector image as instructed, we are greeted with "Agree to the Databricks JDBC Driver Terms & Conditions" as a required radio button (we cannot proceed unless it is selected).

Is that not good enough to release as a public connector?

image

@tuliren
Contributor Author

tuliren commented May 16, 2022

@klogdog, sorry that I missed your question.

> Is that not good enough to release as a public connector?

No, unfortunately it is not. The license requires users to accept the terms before they can download the driver file. But to use the connector, users would technically first download the image, which already includes the driver file.


The good news is that Databricks released a new public JDBC driver earlier this month. I am working on migrating our Databricks connector and will make it public this week.

@tuliren
Contributor Author

tuliren commented May 17, 2022

The Databricks destination connector is now available for everyone. The next Airbyte version will include it by default. You can also add it manually. Its Docker image can be found here:
https://hub.docker.com/r/airbyte/destination-databricks

@jonathanneo
Contributor

@tuliren, I am attempting to use the latest connector you've pushed to Docker Hub.

I'm getting the following error when I try to Set up destination (I've only copied the relevant error lines below).

Could not connect to the warehouse with the provided configuration. [Databricks][DatabricksJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, infoMessages:[*org.apache.hive.service.cli.HiveSQLException:Configuration HiveServerType is not available.:48:47,
org.apache.spark.sql.AnalysisException:Configuration HiveServerType is not available.:133:86,
errorCode:0, errorMessage:Configuration HiveServerType is not available.), Query: create database if not exists airbyte.
  • I do have an existing database called airbyte, which points to my AWS S3 mount point dbfs:/mnt/airbyte.

@jonathanneo
Contributor

jonathanneo commented May 17, 2022

@tuliren, the issue in my comment above is resolved.

I was attempting to connect using a SQL Endpoint Cluster.

The issue was resolved when I used a Databricks Cluster (as per the documentation here: https://docs.airbyte.com/integrations/destinations/databricks#requirements).
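For anyone hitting the same error, one quick way to tell which kind of compute a configuration points at is the shape of the HTTP path. The sketch below assumes cluster paths look like sql/protocolv1/o/<workspace-id>/<cluster-id> and SQL endpoint paths look like /sql/1.0/endpoints/<id> (or /sql/1.0/warehouses/<id>); verify these patterns against your own workspace. The function name classify_http_path and the sample IDs are made up for illustration.

```shell
#!/bin/bash
# Sketch: guess the Databricks compute type from the HTTP path shape.
# The path patterns are assumptions; confirm them in your workspace's JDBC settings.
classify_http_path() {
  local path="${1#/}"   # tolerate a leading slash
  case "$path" in
    sql/protocolv1/o/*/*)
      echo "cluster" ;;
    sql/1.0/endpoints/*|sql/1.0/warehouses/*)
      echo "sql-endpoint" ;;
    *)
      echo "unknown" ;;
  esac
}

# Sample paths with made-up IDs
classify_http_path "sql/protocolv1/o/1234567890123456/0123-456789-abcde123"
classify_http_path "/sql/1.0/endpoints/abcdef1234567890"
```

If this reports sql-endpoint, the "Configuration HiveServerType is not available" error above is the likely outcome, and switching to a cluster's HTTP path may resolve it.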

@tuliren
Contributor Author

tuliren commented May 17, 2022

@jonathanneo, thank you for the update. I will update our documentation to include this in a troubleshooting section.
