[Improvement] use simple catalog name in Trino connector #2433

Closed
Tracked by #2418
xiacongling opened this issue Mar 5, 2024 · 6 comments · Fixed by #2547
@xiacongling
Contributor

What would you like to be improved?

When using the Gravitino Trino connector to access data, users need to use quoted catalog names like "{metalake}.{catalog}". This may lead to compatibility problems when migrating file-based catalogs to a Gravitino metalake. The connector should keep the original catalog names (usually without the {metalake}. prefix) for Trino.

How should we improve?

No response

@xiacongling xiacongling added the improvement Improvements on everything label Mar 5, 2024
@jerryshao
Contributor

Thanks @xiacongling for your valid feedback. We're working on this; I think @shaofengshi may have more to tell.

@shaofengshi
Contributor

@xiacongling, @diqiu50 do you want to work on this?

@xiacongling
Contributor Author

Hi, @shaofengshi, @diqiu50. I'd be happy to contribute. We are trying to have our file-based catalogs managed by Gravitino, and I'll be working on it in the next few days.

@diqiu50
Contributor

diqiu50 commented Mar 15, 2024

@xiacongling OK, thank you for the contribution. Let's first discuss the solution.

@xiacongling
Contributor Author

Well, @diqiu50. The idea is pretty straightforward. Simplify the name when Gravitino invokes Trino's API and canonicalize it when Trino invokes Gravitino.

  • When Gravitino creates and drops a catalog via com.datastrato.gravitino.trino.connector.catalog.CatalogInjector#injectCatalogConnector and com.datastrato.gravitino.trino.connector.catalog.CatalogInjector#removeCatalogConnector, the metalake prefix of catalogName needs to be removed when these methods are invoked;
  • When Trino creates its internal catalog, it gets a delegated connector instance
    via com.datastrato.gravitino.trino.connector.catalog.CatalogConnectorManager#getCatalogConnector. Here, the metalake prefix should be added back.

The metalake name to be omitted is provided by the configuration property gravitino.metalake, and an additional property gravitino.simplify-catalog-names (`false` by default) will be added to enable this feature on demand.
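To make the two directions concrete, here is a minimal sketch of the name mapping. The `CatalogNames` class and its method names are hypothetical and are not taken from the Gravitino code base; it also assumes that catalog names themselves contain no dots, so the `{metalake}.` prefix is unambiguous.

```java
import java.util.Objects;

/** Hypothetical helper for illustration only; not part of Gravitino. */
final class CatalogNames {
    private final String prefix;     // "{metalake}."
    private final boolean simplify;  // mirrors gravitino.simplify-catalog-names

    CatalogNames(String metalake, boolean simplify) {
        this.prefix = Objects.requireNonNull(metalake) + ".";
        this.simplify = simplify;
    }

    /** Gravitino -> Trino: strip the metalake prefix before the catalog is injected or removed. */
    String toTrinoName(String gravitinoName) {
        if (simplify && gravitinoName.startsWith(prefix)) {
            return gravitinoName.substring(prefix.length());
        }
        return gravitinoName;
    }

    /** Trino -> Gravitino: add the metalake prefix back before looking up the connector. */
    String toGravitinoName(String trinoName) {
        return simplify ? prefix + trinoName : trinoName;
    }

    public static void main(String[] args) {
        CatalogNames names = new CatalogNames("some_metalake", true);
        System.out.println(names.toTrinoName("some_metalake.some_catalog"));
        System.out.println(names.toGravitinoName("some_catalog"));
    }
}
```

With the flag off, both methods return their input unchanged, which matches the "`false` by default" behavior described above.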

Not sure if there's anything else I'm missing here. Please point it out if so. Thanks!

@xiacongling
Contributor Author

@diqiu50 here is a preview: #2547

yuqi1129 pushed a commit that referenced this issue Apr 15, 2024
…2547)


### What changes were proposed in this pull request?

Support omitting the metalake prefix in the Trino connector.

### Why are the changes needed?

Gravitino registers a dynamic catalog with a name like
`some_metalake.some_catalog`. It is long and must be quoted to use.
Besides, if one wants to manage file-based catalogs with Gravitino,
users need to adjust their SQL to match the changed catalog names. With
this patch, Trino admins can add a Gravitino connector property
`gravitino.simplify-catalog-names=true` to keep the catalog name as it
is, without the `some_metalake.` prefix.

Fix: #2433

### Does this PR introduce _any_ user-facing change?

No for default settings. If `gravitino.simplify-catalog-names=true` is
set, the catalog names will change when using Trino with Gravitino.
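
As an illustration of the user-facing effect, the sketch below reads the two connector properties named above and computes the catalog name a Trino user would see. The `VisibleCatalogName` class is hypothetical and only models the behavior described here; it is not how the connector itself loads its configuration.

```java
import java.util.Properties;

/** Hypothetical illustration; not the connector's actual configuration code. */
final class VisibleCatalogName {
    /** Returns the catalog name as seen from Trino for a catalog in the configured metalake. */
    static String visibleName(Properties config, String catalog) {
        String metalake = config.getProperty("gravitino.metalake");
        // The flag is off unless explicitly set, so default settings keep the prefixed name.
        boolean simplify = Boolean.parseBoolean(
                config.getProperty("gravitino.simplify-catalog-names", "false"));
        return simplify ? catalog : metalake + "." + catalog;
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("gravitino.metalake", "some_metalake");
        System.out.println(visibleName(config, "some_catalog"));

        config.setProperty("gravitino.simplify-catalog-names", "true");
        System.out.println(visibleName(config, "some_catalog"));
    }
}
```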

### How was this patch tested?

Unit tests (UT)

---------

Co-authored-by: yuhui <[email protected]>