Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How do I specify a catalog if I'm defining mutiple tables using the sqlalchemy ORM style? #446

Open
sshevlyagin opened this issue Oct 1, 2024 · 6 comments

Comments

@sshevlyagin
Copy link

I have two tables I want to query from, the fully qualified names are foo_catalog.foo_schema.foo and bar_catalog.bar_schema.bar. I can specify a schema and a tablename in the table definition but I only get a chance to specify the catalog once in the connection.

How do I specify a catalog using declarative base style?

If I was writing raw sql I could query accross both catalogs no problem.

class Base(declarative_base(metaclass=DeclarativeABCMeta)):
    __abstract__ = True

class Foo(Base):
    __tablename__ = "foo"
    __table_args__ = { "schema": "foo_schema"}

    id: Mapped[int] = mapped_column(name="id", type_=INTEGER,  primary_key=True)
class Bar(Base):
    __tablename__ = "bar"
    __table_args__ = { "schema": "bar_schema"}

    id: Mapped[int] = mapped_column(name="id", type_=INTEGER,  primary_key=True)
@susodapop
Copy link
Contributor

What happens when you attempt to query with those definitions? It should "just work". If you needed to join data across catalogs that could lead to some issues — although I'm still curious what happens when you try it.

@sshevlyagin
Copy link
Author

I get the following error if I try to query a Table who'se catalog doesn't match. E.g using foo_catalog connection to look for bar I get
sqlalchemy.exc.DatabaseError: (databricks.sql.exc.ServerOperationError) [TABLE_OR_VIEW_NOT_FOUND] The table or view bar_schema.bar cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a schema, verify the current_schema() output, or qualify the name with the correct schema and catalog.

@susodapop
Copy link
Contributor

Interesting! Does the same thing happen if you try query the table directly using the SQLAlchemy engine versus using ORM? Your answer to this makes it easier to determine where a fix would need to be applied.

@sshevlyagin
Copy link
Author

What would the syntax be for querying the table directly using the SQLAlchemy engine? I wnat to make sure I'm testing the right thing :)

@susodapop
Copy link
Contributor

with engine.begin() as conn:
    cursor = conn.execute("SELECT field FROM catalog1.schema1.table1 LEFT JOIN catalog2.schema2.table2 ON ...")
    result = cursor.fetchall()

@sshevlyagin
Copy link
Author

That works with no error (probably because the sql text has a fully qualified table name that includes catalog and schema)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants