[MacOSX] Workspace Example for Python Connector Hangs With No Errors #441

Open
caldempsey opened this issue Sep 17, 2024 · 1 comment

@caldempsey

caldempsey commented Sep 17, 2024

Hi,

I believe there is an issue specific to macOS users, as I encountered problems getting the workspace example to work correctly on my Mac. I found a similar issue reported by another macOS user at this GitHub issue. Additionally, when I tried using ODBC, I encountered further issues, but I was able to resolve them with macOS-specific configuration changes. I can see your E2E tests working and I respect that a lot, so I think this might be an issue specific to macOS worth testing.

Here’s a detailed diagnostic:

Using the Workspace Example:

from databricks import sql
import os

connection = sql.connect(
    server_hostname="<warehouse_name>.cloud.databricks.com",
    http_path="/sql/1.0/warehouses/<warehouse_id>",
    access_token="<access-token>",
)

cursor = connection.cursor()

cursor.execute("SELECT * from range(10)")
print(cursor.fetchall())

cursor.close()
connection.close()

The connection hangs with no error message when using the latest version of the SQL connector. This is not the case for the Go SQL Connector, which we regularly use. While the Go SQL Connector works fine, the Python implementation neither starts my SQL Warehouse nor connects to an already running one.
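For anyone trying to narrow down a silent hang like this one, here is a minimal sketch that uses only the standard library. The helper name `run_with_hang_dump` is hypothetical and not part of the connector. It wraps any callable and, if the call has not returned after a timeout, writes the stack of every thread to a file so you can see which call is blocking. It assumes nothing about the connector's internals.

```python
import faulthandler
import sys


def run_with_hang_dump(func, timeout_seconds=30.0, file=None):
    """Call func(); if it is still running after timeout_seconds,
    dump every thread's traceback to `file` (stderr by default).

    The process is not killed: the dump only shows where it is stuck.
    `file` must be backed by a real file descriptor.
    """
    target = file if file is not None else sys.stderr
    faulthandler.dump_traceback_later(timeout_seconds, exit=False, file=target)
    try:
        return func()
    finally:
        # Cancel the pending dump once the call returns or raises.
        faulthandler.cancel_dump_traceback_later()
```

Wrapping the connect call from the workspace example as `run_with_hang_dump(lambda: sql.connect(...))` should print a traceback once the timeout passes. Calling `logging.basicConfig(level=logging.DEBUG)` beforehand may also surface retry activity, assuming the connector and its HTTP stack log through the standard `logging` module.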

When Downgrading:

I encountered this error:

Failed to connect to the database: HTTPSConnectionPool(host='.....cloud.databricks.com', port=443): Max retries exceeded with url: /sql/1.0/warehouses/.... (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1006)')))
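An error like this can be checked outside the connector. The sketch below uses only the standard library and attempts a verified TLS handshake against the workspace host, then reports how it failed. The function names `probe_tls` and `describe_tls_failure` are hypothetical. One caveat: Python's default context may use a different trust store than the connector's HTTP stack, so a pass or fail here is a hint rather than proof.

```python
import socket
import ssl


def describe_tls_failure(exc):
    """Turn a connection exception into a short, readable diagnosis."""
    if isinstance(exc, ssl.SSLCertVerificationError):
        # verify_code / verify_message are set when the ssl module raises.
        code = getattr(exc, "verify_code", None)
        message = getattr(exc, "verify_message", None) or str(exc)
        return f"certificate verification failed (code {code}): {message}"
    if isinstance(exc, ssl.SSLError):
        return f"TLS error: {exc}"
    return f"network error: {exc}"


def probe_tls(hostname, port=443, timeout=10.0):
    """Attempt a verified TLS handshake and describe the outcome."""
    context = ssl.create_default_context()  # verifies chain and hostname
    try:
        with socket.create_connection((hostname, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=hostname) as tls:
                return f"handshake OK ({tls.version()})"
    except OSError as exc:  # ssl.SSLError is a subclass of OSError
        return describe_tls_failure(exc)
```

Running `print(probe_tls("<warehouse_name>.cloud.databricks.com"))` with the real hostname filled in shows whether verification also fails outside the connector. A "self-signed certificate in certificate chain" result often points to a TLS-inspecting proxy or a corporate root CA missing from the trust store, though that is only one possible cause.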

ODBC Attempt:

When I dug deeper I found the ODBC driver package provided by Databricks doesn't point to the right lib-path.


Here's the key adjustment I made: point the Simba ODBC driver at the ODBC installer library, which on macOS is located at /usr/lib/libiodbcinst.2.dylib. The necessary update for the Simba Spark ODBC client is:

File: /Library/simba/spark/lib/simba.sparkodbc.ini

[Driver]
ErrorMessagesPath=/Library/simba/spark/ErrorMessages/
LogLevel=0
LogPath=
SwapFilePath=/tmp
ODBCInstLib=libiodbcinst.2.dylib

For other macOS users facing similar challenges, I hope this helps resolve your issues.

@caldempsey changed the title from "Workspace Example for Python Connector Hangs With No Errors" to "MacOSX Specific? Workspace Example for Python Connector Hangs With No Errors" on Sep 17, 2024
@caldempsey changed the title from "MacOSX Specific? Workspace Example for Python Connector Hangs With No Errors" to "[MacOSX] Workspace Example for Python Connector Hangs With No Errors" on Sep 17, 2024
@susodapop
Contributor

Interesting find. In general we don't want to point people to the ODBC driver with pyodbc, since databricks-sql-connector is meant specifically as a mechanism for running queries in pure Python without the ODBC dependency. Your specific error suggests there is an issue with your certificate chain. You can validate this by running your script with _tls_no_verify=True passed to sql.connect(...). If that works, then the root cause is your certificates, not your OS or the connector itself.

For reference, when I was maintaining this package we used Apple Silicon Macs to run all of the tests without issues.
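The check suggested above could look roughly like the sketch below. The helper names and the environment variable names are hypothetical; only `sql.connect(...)` and the `_tls_no_verify=True` argument come from this thread. Disabling verification is a diagnostic step only and should not be left in place.

```python
import os


def build_connect_kwargs(verify_tls=True):
    """Assemble keyword arguments for sql.connect(...).

    The environment variable names are illustrative; the placeholder
    values match the workspace example and must be replaced.
    """
    kwargs = {
        "server_hostname": os.environ.get(
            "DATABRICKS_SERVER_HOSTNAME", "<warehouse_name>.cloud.databricks.com"
        ),
        "http_path": os.environ.get(
            "DATABRICKS_HTTP_PATH", "/sql/1.0/warehouses/<warehouse_id>"
        ),
        "access_token": os.environ.get("DATABRICKS_TOKEN", "<access-token>"),
    }
    if not verify_tls:
        # Diagnostic only: skips certificate verification entirely.
        kwargs["_tls_no_verify"] = True
    return kwargs


def run_probe_query(verify_tls=True):
    """Run a trivial query, optionally with TLS verification disabled."""
    from databricks import sql  # imported lazily; needs databricks-sql-connector

    connection = sql.connect(**build_connect_kwargs(verify_tls))
    try:
        cursor = connection.cursor()
        try:
            cursor.execute("SELECT 1")
            return cursor.fetchall()
        finally:
            cursor.close()
    finally:
        connection.close()
```

If `run_probe_query(verify_tls=False)` returns a row while `run_probe_query(verify_tls=True)` fails or hangs, that points at the certificate chain rather than macOS or the connector.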
