Multitenant datasources #11861

gavinking · 2020-09-03T14:37:22Z

Currently the Hibernate extension features some integration with Hibernate multitenancy. Multitenancy in Hibernate essentially boils down to letting the program set a tenant id in a certain context, and having the datasource used by Hibernate selected automatically on the basis of that id. Where "automatically" means you the user can write code to select it.

It seems to me that this concern would be better handled in Quarkus at the level of the Quarkus datasource itself. There's no particular advantage to doing this in Hibernate, other than before this we had control over Hibernate but not over the environment in which Hibernate operated. And there are disadvantages, primarily that non-Hibernate code can't take advantage of the feature.

So I propose that we provide a way to map a contextual tenant id to a schema name, catalog name, or database URI, as a feature of the Quarkus datasource, and remove this feature from the Hibernate extension.

Thoughts?

sberyozkin · 2020-09-04T10:45:36Z

@gavinking Would it make sense to link OIDC Multi-tenancy with Hibrernate Multi-tenancy closer ?
There was a blog post awhile back.

OIDC extension runs first, and when its mult-tenancy is activated, a tenant id is available, can be saved in RoutingContext if needed.
Then Hibernate extension would pick it up and use it to choose its own tenant configuration.

What I'm not sure about is if it makes sense to use the same tenant id to 1) select an OIDC configuration (how to verify the token etc) and then 2) to use the same id to to select the configuration to read the data. Perhaps these are conceptually different activities.

@DorianMaliszewski, can you comment please ?

DorianMaliszewski · 2020-09-04T11:36:50Z

Hi,

Maybe this feature can be implemented by adding a filter after OIDC tenant resolver and in application.properties just quarkus.hibernate.use-oidc-tenant-resolver=true or something close.

It can be a good thing to implement an easy way to link hibernate to oidc tenant resolver. 👍

gavinking · 2020-09-04T12:46:55Z

@sberyozkin yes, sure, it sounds like it would be great to integrate with that stuff as a source of tenant ids, though my main concern here in this particular issue is that I want it to be the datasource layer, not Hibernate itself, which is responsible for interpreting the tenant ids.

But yes, it sounds like a good idea.

sberyozkin · 2020-09-04T15:37:44Z

Hi @DorianMaliszewski thanks, yes, I was thinking of getting this ID automatically, without even having to do something, OIDC would set the ID in the the request context and it will be picked up if other Hibernate or lower level resolver is not available.

sberyozkin · 2020-09-04T15:39:39Z

Hi @gavinking thanks; sorry, I think I'd create a dedicated issue to discuss linking OIDC and Hibernate/datasource tenants, so as to not to overload this issue :-)

gavinking · 2020-09-04T17:33:36Z

Hi @gavinking thanks; sorry, I think I'd create a dedicated issue to discuss linking OIDC and Hibernate/datasource tenants, so as to not to overload this issue :-)

Don't apologize, it's a great idea; please open the issue!

michael-schnell · 2020-09-06T06:48:27Z

It seems to me that this concern would be better handled in Quarkus at the level of the Quarkus datasource itself. There's no particular advantage to doing this in Hibernate, other than before this we had control over Hibernate but not over the environment in which Hibernate operated. And there are disadvantages, primarily that non-Hibernate code can't take advantage of the feature.

At the time the Multitenancy feature was implemented, Hibernate ORM extension was the only place to integrate this in a short amount of time. Unfortunately Hibernate Multitenancy itself isn't really finalized since years. As stated in the manual "The JPA expert group is in the process of defining multitenancy support for an upcoming version of the specification". So this was a best effort approach to rely on what Hibernate offered, even if it is not much.

In general I agree, that the current solution is mainly a way to select the right datasource that is used for the current tenant.

michael-schnell · 2020-09-06T06:54:42Z

Hi,

Maybe this feature can be implemented by adding a filter after OIDC tenant resolver and in application.properties just quarkus.hibernate.use-oidc-tenant-resolver=true or something close.

It can be a good thing to implement an easy way to link hibernate to oidc tenant resolver. 👍

In general the mechanism for selecting the tenant is pretty much the same in the Hibernate ORM Multitenancy and the ODIC extension. It is very easy to integrate both parts using CDI.

gavinking · 2020-09-06T08:13:46Z

So I've done a bit more research into how this stuff has been done in Quarkus, and here's a more concrete proposal to kick about.

Introduce a parent/child relationship between "base" datasources and "tenant" datasources.
Tenant datasources inherit the configuration of the base datasource, except for any setting they explicitly override.
The base datasource uses the current tenant identifier (if it is set) to forward connection requests to the corresponding tenant datasource.
Remove the Hibernate-specific code.

Now, configuring a multitenant datasource in Quarkus would look like this:

# Default tenant 'base'
quarkus.datasource.base.db-kind=postgresql
quarkus.datasource.base.username=quarkus_test
quarkus.datasource.base.password=quarkus_test
quarkus.datasource.base.jdbc.url=jdbc:postgresql://localhost:5432/quarkus_test

# Tenant 'mycompany'
quarkus.datasource.mycompany.inherit=base
quarkus.datasource.mycompany.jdbc.url=jdbc:postgresql://localhost:5433/mycompany

Or:

# Default tenant 'base'
quarkus.datasource.base.db-kind=postgresql
quarkus.datasource.base.username=quarkus_test
quarkus.datasource.base.password=quarkus_test
quarkus.datasource.base.jdbc.url=jdbc:postgresql://localhost:5432/quarkus_test

# Tenant 'mycompany'
quarkus.datasource.mycompany.inherit=base
quarkus.datasource.mycompany.username=mycompany
quarkus.datasource.mycompany.password=secret

The advantages of this approach include:

Multitenancy is available as a feature to all Quarkus code, not just Hibernate.
Repetitive datasource configuration is minimized.
Things other than just the catalog and schema name can be configured per-tenant (e.g. the user id above).

Now, an open question in my mind is whether there is a real need to distinguish the schema and database strategies, since AFAIK most databases let you select a schema via JDBC connection properties. However, I must admit I'm not certain about this at all, and so perhaps there's some reason to have this as a whole different strategy.

gavinking · 2020-09-06T08:16:29Z

@Sanne I'm still waiting to hear your take on all this.

gavinking · 2020-09-06T08:29:40Z

Now, an open question in my mind is whether there is a real need to distinguish the schema and database strategies, since AFAIK most databases let you select a schema via JDBC connection properties.

Well, even if there is a good reason to use setSchema() instead of the JDBC connection properties, we can still abstract that as a property of the datasource via .schema (and for completeness .catalog) properties:

# Default tenant 'base'
quarkus.datasource.base.db-kind=postgresql
quarkus.datasource.base.username=quarkus_test
quarkus.datasource.base.password=quarkus_test
quarkus.datasource.base.jdbc.url=jdbc:postgresql://localhost:5432/quarkus_test

# Tenant 'mycompany'
quarkus.datasource.mycompany.inherit=base
quarkus.datasource.mycompany.schema=mycompany

michael-schnell · 2020-09-06T10:58:38Z

Now, an open question in my mind is whether there is a real need to distinguish the schema and database strategies, since AFAIK most databases let you select a schema via JDBC connection properties.

Well, even if there is a good reason to use setSchema() instead of the JDBC connection properties, we can still abstract that as a property of the datasource via .schema (and for completeness .catalog) properties:
# Default tenant 'base'
quarkus.datasource.base.db-kind=postgresql
quarkus.datasource.base.username=quarkus_test
quarkus.datasource.base.password=quarkus_test
quarkus.datasource.base.jdbc.url=jdbc:postgresql://localhost:5432/quarkus_test

# Tenant 'mycompany'
quarkus.datasource.mycompany.inherit=base
quarkus.datasource.mycompany.schema=mycompany

If I remember it correctly, "setSchema(..)" didn't work either on MariaDB or on PostgreSQL. Some databases like MariaDB/MySQL do not support database schemas. In these cases you have to use the DATABASE approach.

michael-schnell · 2020-09-06T11:09:54Z

So I've done a bit more research into how this stuff has been done in Quarkus, and here's a more concrete proposal to kick about.

Introduce a parent/child relationship between "base" datasources and "tenant" datasources.

Tenant datasources inherit the configuration of the base datasource, except for any setting they explicitly override.

The base datasource uses the current tenant identifier (if it is set) to forward connection requests to the corresponding tenant datasource.

Remove the Hibernate-specific code.

Now, configuring a multitenant datasource in Quarkus would look like this:
# Default tenant 'base'
quarkus.datasource.base.db-kind=postgresql
quarkus.datasource.base.username=quarkus_test
quarkus.datasource.base.password=quarkus_test
quarkus.datasource.base.jdbc.url=jdbc:postgresql://localhost:5432/quarkus_test

# Tenant 'mycompany'
quarkus.datasource.mycompany.inherit=base
quarkus.datasource.mycompany.jdbc.url=jdbc:postgresql://localhost:5433/mycompany
Or:
# Default tenant 'base'
quarkus.datasource.base.db-kind=postgresql
quarkus.datasource.base.username=quarkus_test
quarkus.datasource.base.password=quarkus_test
quarkus.datasource.base.jdbc.url=jdbc:postgresql://localhost:5432/quarkus_test

# Tenant 'mycompany'
quarkus.datasource.mycompany.inherit=base
quarkus.datasource.mycompany.username=mycompany
quarkus.datasource.mycompany.password=secret
The advantages of this approach include:

Multitenancy is available as a feature to all Quarkus code, not just Hibernate.

Repetitive datasource configuration is minimized.

Things other than just the catalog and schema name can be configured per-tenant (e.g. the user id above).

Now, an open question in my mind is whether there is a real need to distinguish the schema and database strategies, since AFAIK most databases let you select a schema via JDBC connection properties. However, I must admit I'm not certain about this at all, and so perhaps there's some reason to have this as a whole different strategy.

Sounds like a good approach, but seems mainly to optimize the configuration setup.

How do you tell Hibernate which is the currently selected tenant connection/datasource to use? Most likely this is connected to a request scope.

gavinking · 2020-09-06T14:04:19Z

@michael-schnell

How do you tell Hibernate which is the currently selected tenant connection/datasource to use?

Well you don't need to. That's the whole idea. You just point Hibernate at the base datasource which takes care of redirecting the connection requests to the correct contextual tenant datasource, just as it would do for any other client.

Hibernate multitenancy doesn't need to be involved in this at all.

gavinking · 2020-09-06T14:05:47Z

@michael-schnell

If I remember it correctly, "setSchema(..)" didn't work either on MariaDB or on PostgreSQL.

Eyeballing the code of the Postgres JDBC driver, I believe Postgres does support setSchema(), but no, of course, MySQL (and derivative MariaDB) don't have schemas, so they don't support it.

venki157 · 2020-11-18T12:18:16Z

Anything concluded on this multitenency issue. When can we expect this fix?

DorianMaliszewski · 2020-11-18T13:25:04Z

Refer to this issue please : #11949

michael-schnell · 2020-11-18T20:34:41Z

I would love to do it, but at the moment I have no time. Hopefully there is some spare time during the Christmas holidays. As suggested by Gavin the idea is to move the functionality out of Hibernate extension into the datasources extension. Definitely I will also need support from the datasources maintainers for this task.

ltsallas · 2022-03-04T19:01:09Z

Other than the shared database, different schema multitenancy, there can be a shared database/shared schema approach with Postgres Row Level Security and a session parameter as discussed in the "Alternative approach" section in this article

I made a question on stackoverflow asking for a Quarkus solution for this case - I didn't find a straight forward way to do so. I am just sharing it here since the - outside hibernate- multitenant datasources solution you are discussing so far might take into account the shared db & schema multitenancy as well

gavinking added area/hibernate-orm Hibernate ORM kind/new-feature area/persistence OBSOLETE, DO NOT USE labels Sep 3, 2020

gavinking mentioned this issue Sep 3, 2020

Feature or documentation for Programmatically Resolving Tenants Connections #11729

Open

gavinking mentioned this issue Sep 6, 2020

Support multitenancy with multiple persistence units #11772

Merged

sberyozkin mentioned this issue Sep 7, 2020

Simplify tenant resolution for the OIDC/Hibernate Multitenancy combination #11949

Closed

gavinking mentioned this issue Oct 11, 2022

Multitenancy Extension #27942

Open

1 task

yrodiere mentioned this issue Jul 15, 2024

Panache: Simplify setting the tenant identifier #41805

Open

yrodiere mentioned this issue Oct 17, 2024

Flyway Build Item (prerequisite for multitenant extension) #43892

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multitenant datasources #11861

Multitenant datasources #11861

gavinking commented Sep 3, 2020 •

edited

Loading

sberyozkin commented Sep 4, 2020 •

edited

Loading

DorianMaliszewski commented Sep 4, 2020

gavinking commented Sep 4, 2020

sberyozkin commented Sep 4, 2020

sberyozkin commented Sep 4, 2020

gavinking commented Sep 4, 2020

michael-schnell commented Sep 6, 2020

michael-schnell commented Sep 6, 2020

gavinking commented Sep 6, 2020 •

edited

Loading

gavinking commented Sep 6, 2020

gavinking commented Sep 6, 2020 •

edited

Loading

michael-schnell commented Sep 6, 2020

michael-schnell commented Sep 6, 2020

gavinking commented Sep 6, 2020 •

edited

Loading

gavinking commented Sep 6, 2020 •

edited

Loading

venki157 commented Nov 18, 2020

DorianMaliszewski commented Nov 18, 2020

michael-schnell commented Nov 18, 2020

ltsallas commented Mar 4, 2022

Multitenant datasources #11861

Multitenant datasources #11861

Comments

gavinking commented Sep 3, 2020 • edited Loading

sberyozkin commented Sep 4, 2020 • edited Loading

DorianMaliszewski commented Sep 4, 2020

gavinking commented Sep 4, 2020

sberyozkin commented Sep 4, 2020

sberyozkin commented Sep 4, 2020

gavinking commented Sep 4, 2020

michael-schnell commented Sep 6, 2020

michael-schnell commented Sep 6, 2020

gavinking commented Sep 6, 2020 • edited Loading

gavinking commented Sep 6, 2020

gavinking commented Sep 6, 2020 • edited Loading

michael-schnell commented Sep 6, 2020

michael-schnell commented Sep 6, 2020

gavinking commented Sep 6, 2020 • edited Loading

gavinking commented Sep 6, 2020 • edited Loading

venki157 commented Nov 18, 2020

DorianMaliszewski commented Nov 18, 2020

michael-schnell commented Nov 18, 2020

ltsallas commented Mar 4, 2022

gavinking commented Sep 3, 2020 •

edited

Loading

sberyozkin commented Sep 4, 2020 •

edited

Loading

gavinking commented Sep 6, 2020 •

edited

Loading

gavinking commented Sep 6, 2020 •

edited

Loading

gavinking commented Sep 6, 2020 •

edited

Loading

gavinking commented Sep 6, 2020 •

edited

Loading