Multi-tenancy and dynamic schema selection #5056
Comments
I don't have the bandwidth to look into this topic in full, but I would like to dump a few ideas and pointers to make it easier for others to do the research. Conceptually, querying a database table involves the following actors:
In a typical LB4 application, each model is associated with the same Repository class and the same DataSource instance - the wiring is static. To enable multi-tenancy, we want to make this wiring dynamic. Depending on the current user, we want to use a different repository and/or datasource configuration.

Lightweight tenant isolation using schemas

In this setup, the authentication layer and all tenants share the same database name and use the same credentials (database user) to access the data. We have 1+N schemas defined in the database: the first schema is used by the authentication layer, plus we have one schema for each tenant. All database queries will use the same LB datasource and thus share the same connection pool.

Implementation-wise, we need to tweak the way an LB4 model is registered with a datasource. Instead of creating the same backing juggler model for all users, we want to create tenant-specific juggler models. Conceptually, this can be accomplished by tweaking the Repository constructor.

export class ProductRepository extends DefaultCrudRepository<
  Product,
  typeof Product.prototype.id
> {
  constructor(
    @inject('datasources.db') dataSource: juggler.DataSource,
    @inject(SecurityBindings.USER) currentUser: UserProfile,
  ) {
    super(
      // model constructor
      Product,
      // datasource to use
      dataSource,
      // new feature to be implemented in @loopback/repository:
      // allow repository users to overwrite model settings
      {schema: currentUser.name},
    );
  }
}

Datasource-based tenant isolation

If schema-based isolation is not good enough (or not supported by the target database), or if we don't want tenants to share the same database connection pool, then we can wire our application to use a different datasource for each tenant. This approach unlocks new options for tenant isolation; for example, it's possible to use a different database and credentials for each tenant.

LB4 applications already use Dependency Injection to obtain the datasource instance to be provided to Repository constructors. By default, a datasource is bound in a static way and configured as a singleton.

To support multi-tenancy, we need to rework the resolution of datasources to be dynamic, based on the current user. Let's start from the outside. To make it easy to inject the tenant-specific datasource, let's keep the same datasource name (binding key).

import {inject, Provider} from '@loopback/core';
import {juggler} from '@loopback/repository';
import {SecurityBindings, UserProfile} from '@loopback/security';

const config = {
  name: 'tenantData',
  connector: 'postgresql',
  // ...
};

export class TenantDataSourceProvider implements Provider<TenantDataSource> {
  // Placeholder cache of per-tenant instances. A static Map is only one
  // simple option; I am leaving the choice of caching strategy as something
  // to figure out as part of the research.
  private static cache = new Map<string, TenantDataSource>();

  constructor(
    @inject('datasources.config.tenant', {optional: true})
    private dsConfig: object = config,
    @inject(SecurityBindings.USER)
    private currentUser: UserProfile,
  ) {}

  value() {
    const config = {
      ...this.dsConfig,
      // apply tenant-specific settings
      schema: this.currentUser.name,
    };

    // Because we are using the same binding key for multiple datasource instances,
    // we need to implement our own caching behavior to support SINGLETON scope.
    const tenant = String(this.currentUser.name);
    const cached = TenantDataSourceProvider.cache.get(tenant); // look up existing DS instance
    if (cached) return cached;

    const ds = new TenantDataSource(config);
    // store the instance in the cache
    TenantDataSourceProvider.cache.set(tenant, ds);
    return ds;
  }
}

export class TenantDataSource extends juggler.DataSource {
  static dataSourceName = 'tenant';
  // constructor is not needed, we can use the inherited one.
  // start/stop methods are needed, I am skipping them for brevity
}

There are different ways to implement caching of per-tenant datasources. Ideally, I would like to reuse Context for that. It turns out this is pretty simple!

We want each tenant datasource to have its own datasource name and binding key. To allow repositories to obtain the datasource via

import {Application, Context, CoreBindings} from '@loopback/core';

export class TenantDataSourceProvider implements Provider<TenantDataSource> {
  private dataSourceName: string;
  private bindingKey: string;

  constructor(
    @inject('datasources.config.tenant', {optional: true})
    private dsConfig: object = config,
    @inject(SecurityBindings.USER)
    private currentUser: UserProfile,
    @inject.context()
    private currentContext: Context,
    @inject(CoreBindings.APPLICATION_INSTANCE)
    private app: Application,
  ) {
    this.dataSourceName = `tenant-${this.currentUser.name}`;
    this.bindingKey = `datasources.${this.dataSourceName}`;
  }

  value() {
    if (!this.currentContext.isBound(this.bindingKey)) {
      this.setupDataSource();
    }
    return this.currentContext.get<juggler.DataSource>(this.bindingKey);
  }

  private setupDataSource() {
    const resolvedConfig = {
      ...this.dsConfig,
      // apply tenant-specific settings
      schema: this.currentUser.name,
    };
    const ds = new TenantDataSource(resolvedConfig);
    // Important! We need to bind the datasource to the root (application-level)
    // context to reuse the same datasource instance for all requests.
    this.app.bind(this.bindingKey).to(ds).tag({
      name: this.dataSourceName,
      type: 'datasource',
      namespace: 'datasources',
    });
  }
}

export class TenantDataSource extends juggler.DataSource {
  // no static members like `dataSourceName`
  // constructor is not needed, we can use the inherited one.
  // start/stop methods are needed, I am skipping them for brevity
}

The code example above creates a per-tenant datasource automatically when the first request is made by each tenant. This should provide faster app startup and possibly less pressure on the database when most tenants use the app only infrequently. On the other hand, any problems with a tenant-specific database connection will be discovered only after the first request is made, which may be too late. If you prefer to establish (and check) all tenant database connections right at startup, you can move the code from

Open questions:
A possible solution is to enhance |
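The lazy-versus-eager trade-off described in the comment above can be illustrated without any LoopBack dependency. The following is only a sketch: `FakeDataSource`, `TenantDataSourceRegistry` and `warmUp` are hypothetical names invented for this illustration, and a real application would create `juggler.DataSource` instances and bind them to the application context instead.

```typescript
// Stand-in for a datasource; a real app would use juggler.DataSource instead.
interface FakeDataSource {
  name: string;
  schema: string;
}

// Hypothetical registry: creates one datasource per tenant and reuses it.
class TenantDataSourceRegistry {
  private readonly instances = new Map<string, FakeDataSource>();
  public created = 0; // counts factory calls, for illustration only

  constructor(
    private readonly factory: (tenant: string) => FakeDataSource,
  ) {}

  // Lazy path: create the datasource on the tenant's first request.
  get(tenant: string): FakeDataSource {
    let ds = this.instances.get(tenant);
    if (!ds) {
      ds = this.factory(tenant);
      this.created++;
      this.instances.set(tenant, ds);
    }
    return ds;
  }

  // Eager path: create datasources for all known tenants at startup,
  // so that connection problems surface before the first request.
  warmUp(tenants: string[]): void {
    for (const tenant of tenants) this.get(tenant);
  }
}

const registry = new TenantDataSourceRegistry(tenant => ({
  name: `tenant-${tenant}`,
  schema: tenant,
}));

registry.warmUp(['acme']); // eager creation for a known tenant
const first = registry.get('globex'); // lazy creation for a new tenant
const second = registry.get('globex'); // served from the cache
```

Both paths go through the same `get` method, so switching between lazy and eager creation is a matter of whether `warmUp` is called at startup.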
@bajtos , thank you for starting this discussion. Great start. |
A few more comments on the examples provided in my previous comment:

I assumed that the name of the current user is the tenant id. In a real app, we will need to map users to tenants first.

In the first example, where I am passing custom model settings to base repository constructors, we will need to include a unique model name to use, in addition to the custom schema. Otherwise all tenants would share the same backing model.

const tenant = currentUser.name; // for simplicity
super(
  // model constructor
  Product,
  // datasource to use
  dataSource,
  // new feature to be implemented in @loopback/repository:
  // allow repository users to overwrite model settings
  {name: `Product_${tenant}`, schema: tenant},
); |
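The two points in the comment above (mapping users to tenants, and giving each tenant a unique model name) could look roughly like the sketch below. Everything here is an assumption made for illustration: the `UserLike` shape, the in-memory `tenantByUserId` lookup, and the identifier pattern used to validate tenant ids are not part of LoopBack.

```typescript
// Minimal stand-in for the authenticated user profile (hypothetical shape).
interface UserLike {
  id: string;
  name: string;
}

// Hypothetical user-to-tenant lookup; a real app might query a table instead.
const tenantByUserId = new Map<string, string>([
  ['u1', 'acme'],
  ['u2', 'acme'],
  ['u3', 'globex'],
]);

function resolveTenant(user: UserLike): string {
  const tenant = tenantByUserId.get(user.id);
  if (!tenant) throw new Error(`No tenant for user ${user.id}`);
  // The tenant id ends up in a schema name, so restrict it to a safe
  // identifier pattern (an assumption; adjust to your database's rules).
  if (!/^[a-z_][a-z0-9_]*$/.test(tenant)) {
    throw new Error(`Invalid tenant id: ${tenant}`);
  }
  return tenant;
}

// Builds the settings object passed as the third constructor argument in
// the example above: a unique model name plus the tenant's schema.
function tenantModelSettings(modelName: string, tenant: string) {
  return {name: `${modelName}_${tenant}`, schema: tenant};
}

const settings = tenantModelSettings(
  'Product',
  resolveTenant({id: 'u3', name: 'Roy'}),
);
```

Validating the tenant id before it is used as a schema name is a defensive choice, since the value is derived from user data.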
I see a few tiers/components to enforce multi-tenancy.
|
@bajtos Thanks for responding to my DM on Twitter and starting this conversation. Mind sharing how these data sources are injected into repositories since |
FYI: I just built an example application to illustrate multi-tenancy for LoopBack 4 - #5087. |
IIUC, you are interested in Datasource-based tenant isolation. The idea is to bind a static datasource key to

For example, in the app constructor:

this.bind('datasources.tenant').toProvider(TenantDataSourceProvider);

Then you can inject the datasource the usual way, for example:

@inject('datasources.tenant')
dataSource: TenantDataSource |
Thank you for chiming in and adding wider perspective to this discussion 👍 ❤️ |
Hello,
In the beginning, while running some GET/POST tests, I started to receive data from and save data to tenant schemas other than the user.tenantId schema. After a year, with millions of records saved to the database, most of which should be in the same tenant_5 schema, I still have some records (roughly 1-2k) that were not saved to the correct schema, creating some "noise" issues. My question: do you think that the @raymondfeng example solution is "bulletproof" with regard to this connection pool issue? Using different tenant datasources, can I raise connectionLimit to 5? Best regards and thanks for lb4! |
@fredvhansen Your solution is problematic.
My example multi-tenancy application has completely isolated datasources for each tenant. The action in the sequence/interceptor can enforce the tenancy by setting different bindings to control which datasources are used. If overhead is a concern, there is a possible solution for pooling datasources - see #5681 |
@raymondfeng , personal doubt: I don't understand how
binds to Db1DataSource with "datasources.config.db1" But this works! |
@bajtos This is great. The Datasource-based tenant isolation is exactly what I was looking for. However, I want to understand why you're checking for the cached datasource in the I just hope I'm not missing something related to the binding scope since all we're intending for the cached datasources is for them to be |
+1 for providing this kind of functionality in LoopBack in a defined way to guide the user's implementation. It could eventually support different implementations such as
An example of logical isolation is the $owner role in LB3. However, I consider it incomplete, since it only applies to instance methods through the usage of modelId. Extra work is needed to isolate generic CRUD queries (find, update, create, etc.) using a common API that automatically filters responses and interactions according to the logged-in user's token. This could be a really innovative solution that is up to today's standards. |
As it complains, the base class only accepts two args. If you meant to configure the dataSource with |
@raymondfeng I can call dataSource before super: |
You can define a function such as:

function updateDataSource(dataSource: juggler.DataSource) {
  // adjust the datasource here, then return it so it can be passed to super()
  return dataSource;
}

Then in the constructor:

super(entityClass, updateDataSource(dataSource)); |
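The suggestion above works because a plain function call inside the `super(...)` argument list does not touch `this`. Here is a dependency-free sketch of the pattern; `FakeDataSource` and `BaseRepository` are hypothetical stand-ins for `juggler.DataSource` and a two-argument base repository, and the extra `schema` parameter is an assumption added for the illustration.

```typescript
// Stand-ins for juggler.DataSource and a two-argument base repository.
class FakeDataSource {
  constructor(public settings: {schema?: string}) {}
}

class BaseRepository {
  constructor(
    public entityClass: new () => object,
    public dataSource: FakeDataSource,
  ) {}
}

// A plain function can run inside the super() argument list because it does
// not touch `this`. It returns the datasource so the call can be used inline.
function updateDataSource(dataSource: FakeDataSource, schema: string) {
  dataSource.settings.schema = schema;
  return dataSource;
}

class Product {}

class ProductRepository extends BaseRepository {
  constructor(dataSource: FakeDataSource, tenant: string) {
    // The helper runs before the base constructor receives its arguments.
    super(Product, updateDataSource(dataSource, tenant));
  }
}

const repo = new ProductRepository(new FakeDataSource({}), 'acme');
```

Note that the helper in this sketch mutates the datasource it receives, so it only makes sense when each tenant has its own datasource instance; with a shared instance the change would be visible to every tenant.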
@raymondfeng I tried to run your example, but it didn't store
{
  "ids": {
    "User": 4
  },
  "models": {
    "User": {
      "1": "{\"tenantId\":\"\",\"name\":\"Tom\",\"id\":\"1\"}",
      "2": "{\"tenantId\":\"\",\"name\":\"Red\",\"id\":\"2\"}",
      "3": "{\"tenantId\":\"\",\"name\":\"Roy\",\"id\":\"3\"}"
    }
  }
} |
Has a final decision been made on enabling multi-tenancy in LB? |
@bajtos Thank you for the above inputs. But how can we use the same datasource to connect to multiple schemas (with schema selection done at runtime), so that the same connection pool is used and the number of connections to the database is limited? We have 500+ schemas to connect to. |
Multi-tenancy and dynamic datasource can be handled by datasource based tenant isolation. Check the loopback4-multi-tenancy package. It may help. |
Recently, several people asked about implementing dynamic schema selection to enable schema-based multi-tenancy, where tenant isolation is achieved via DDL schemas. (If you are not familiar with DDL schemas, you can learn the basics e.g. in the PostgreSQL docs.)
I am opening this Epic to discuss possible solutions, implement necessary improvements and document how to implement multi-tenancy, possibly including an example app.
Related discussions:
Aspects to consider:
lb4 datasource
@loopback/boot
(if needed) or move the registration to a different place (e.g. from a datasource file to a boot script)