Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make use of TypeORM transactions #242

Closed
michaelbromley opened this issue Jan 8, 2020 · 12 comments
Closed

Make use of TypeORM transactions #242

michaelbromley opened this issue Jan 8, 2020 · 12 comments
Milestone

Comments

@michaelbromley
Copy link
Member

There are several places in the core service layer where transactions should be used to keep data consistent in the event of one of a series of logically-linked DB operations failing. An example is this code:

const oldIdentifier = user.identifier;
user.identifier = newEmailAddress;
customer.emailAddress = newEmailAddress;
await this.connection.getRepository(User).save(user, { reload: false });
await this.connection.getRepository(Customer).save(customer, { reload: false });
this.eventBus.publish(new IdentifierChangeEvent(ctx, user, oldIdentifier));

If the Customer save operation on line 206 fails, the result will be an inconsistent state where the user and customer have different email addresses, which is bad.

TypeORM comes with transaction support which would prevent this from occurring.

Each service should be reviewed and any such location in need of transactions should be identified and remedied.

@michaelbromley
Copy link
Member Author

michaelbromley commented Jan 23, 2020

After doing some research on this topic, it turns out that the fix is not as simple as it first appeared (surprise surprise 😂).

Consider:

class AdministratorService {

    // ...

    async create(input: CreateAdministratorInput): Promise<Administrator> {
        const administrator = new Administrator(input);
        administrator.user = await this.userService.createAdminUser(input.emailAddress, input.password);
        let createdAdministrator = await this.connection.manager.save(administrator);
        for (const roleId of input.roleIds) {
            createdAdministrator = await this.assignRole(createdAdministrator.id, roleId);
        }
        return createdAdministrator;
    }

In the above code from the AdministratorService, we would want to wrap the creation of the User entity and the creation of the Administrator entity in a transaction, so that if either one fails, no DB changes are persisted.

However, since the creation of the User entity is done in a separate service (UserService), we cannot simply use the TypeORM transaction mechanism since there is no way to pass the created transactionalEntityManager around between disparate services.

This problem is discussed at length in this issue: typeorm/typeorm#1895

The solution given there is to use the library typeorm-transactional-cls-hooked which allows you to set variables in a context which is scoped to the current async call stack. Kind of like how Zone.js is used in Angular.

My issue with this lib:

  • It is based on https://github.com/Jeff-Lewis/cls-hooked, which itself makes use of experimental Node APIs which may be removed. Plus looking over the issues, the lib seems to have some rough edges & perf issues. Would likely increase maintenance cost and bug surface area.
  • It would probably not work out-of-the-box with Nest's TypeORM module anyway, since we use Nest's @InjectConnection() decorator which would not know about the local transactionalEntityManager anyway.

Possible solutions:

  1. Do nothing. Not ideal because the lack of transactions can lead to inconsistent DB states as outlined in the original issue.
  2. Use the built-in TypeORM transaction mechanism where possible, i.e. within a single method of a Service.
  3. Create some kind of custom implementation, possibly based on Nest's Request injection scope, which allows a transaction to be passed along via the Nest DI system without the need for hacks like CLS.

I am tending towards option 3, but will postpone this work for now, as it will likely require quite a major refactoring of the entire service layer.

@Szbuli
Copy link

Szbuli commented Feb 24, 2020

Passing an optional entityManager parameter is not an option?
This would enable the use of transactions when needed.

This would mean a lot of boiler plate code... but would not need architectural change.

@bablukid
Copy link

No news about this ? I'm struggling to find a proper way to use transactions with NestJS and TypeORM...

@buruss
Copy link

buruss commented Aug 27, 2020

@michaelbromley
Copy link
Member Author

@buruss That looks like a very nice solution, thanks for the link! I'll have a go at implementing this when I have some time.

@michaelbromley michaelbromley added this to the v0.16.0 milestone Aug 31, 2020
@michaelbromley
Copy link
Member Author

Wow I just implemented a proof-of-concept for the solution from the above blog post, and it looks like this is gonna work really nicely!

Next I'll have a go at implementing it for all mutations and I'm interested to see how it impacts performance, since it relies on request-scoped providers, which Nest docs warn can degrade perf. I'll benchmark before and after.

@michaelbromley
Copy link
Member Author

Update: I ran into a pretty big issue with this solution: it relies on making the database connection (used by every service) request-scoped. This has the effect of making every service then request-scoped (as per https://docs.nestjs.com/fundamentals/injection-scopes#scope-hierarchy).

This means that on each request, every service is re-instantiated. Not only is this not acceptable performance-wise, it also breaks several other parts of the app which assume singleton services which get created at bootstrap time.

So it unfortunately looks like we need a different approach. Back to the drawing board...

@michaelbromley michaelbromley pinned this issue Sep 2, 2020
@frenchtoast747
Copy link

frenchtoast747 commented Sep 4, 2020

Hey @michaelbromley I'm sorry to hear that the solution is not working 😞 I'm still racking my brain to think of another solution 🤔 The best I can come up with (only thinking on it for a few minutes) is to create a separate, lightweight service only where transactions are needed and it's allowed to be instantiated per-request.

Not only is this not acceptable performance-wise

Just curious here, do you have evidence that the per-request scoping causes a significant degradation in performance? I'm aware of the NestJS docs mentioning a potential performance hit, but I've never noticed anything myself. Sadly, I've never had a significant load of traffic to be able to measure anything 😞 I will admit, I'm not doing any sort of expensive initialization code in my services, though, so maybe that's it?

Due to the fact that you can't have multiple, concurrent transactions on the same DB connection (at least for Postgres), you'll definitely have to have a separate DB connection per-request which makes me think that the only way around this is either making the EntityManager a required parameter everywhere or use CLS coupled with a singleton UnitOfWork so that it creates the new connection on the same thread. At least, this was the conclusion I came to months ago and for the same reasons I wanted to avoid CLS.

Good luck with this! I will definitely keep thinking on it and watching this issue so I can update my post if a better solution is found 🙏 Cheers!

@michaelbromley
Copy link
Member Author

Hey @frenchtoast747! I really appreciate you weighing in on this issue. I was also trying to think of some way to create a request-bound connection service only when needed. E.g. a regular singleton service which, when needed, can get access to the current request object and attach an EntityManager to it. But as far as I can tell this is not possible. If you want to inject the request object into the constructor, then the provider must be request-scoped.

Just curious here, do you have evidence that the per-request scoping causes a significant degradation in performance?

Great question - no. But I do have some init code in several of my services. I could architect this away of course, but there are still numerous issues which arise when my service layer are lazily instantiated on each request. Even if the performance hit was negligible, things get overly complex if I cannot reliably know when a provider will get created. For example, I have an auth guard which in turn depends on SessionService to create a session in the DB. This breaks because the auth guard runs before the SessionService is instantiated.

So the solution I have arrived at now is in essence "pass around the EntityManager everywhere". In Vendure we have a RequestContext object which is already passed around almost everywhere, and I'm piggy-backing on that and attaching the EntityManager to it. But only if the current operation is decorated with a custom @Transaction decorator (which uses a Nest Interceptor to patch the EntityManager on to the RequestContext object).

Due to the fact that you can't have multiple, concurrent transactions on the same DB connection (at least for Postgres), you'll definitely have to have a separate DB connection per-request which makes me think that the only way around this is either making the EntityManager a required parameter everywhere or use CLS coupled with a singleton UnitOfWork so that it creates the new connection on the same thread. At least, this was the conclusion I came to months ago and for the same reasons I wanted to avoid CLS.

This is interesting and I hadn't considered that yet. So does that mean that if I use a singleton provider into which I inject the TypeORM connection, and then use that connection to execute mutations in a transaction, then concurrent operations will be blocked by the transaction?

@frenchtoast747
Copy link

I could be wrong here as I'm still learning the ins and outs of TypeORM itself, but from what I gather, the getConnection() function is what's used when the injections occur and that's not a single connection, rather a wrapper around a Connection Pool. Your singleton provider should still be getting a new (or re-use an existing) connection under the hood, so any transactions should be safe. I also think that getConnection() will block if all of the connections in the connection pool are used up. This is all based on various things I've read, but pleerock (one of the maintainers) had this to say, so I think you should be safe.

michaelbromley added a commit that referenced this issue Sep 11, 2020
Relates to #242

BREAKING CHANGE: The TypeORM `Connection` should no longer be directly used. Instead, inject the new `TransactionalConnection` class, which wraps the TypeORM connection and enables database transactions to be used in conjunction with the new `@Transaction` decorator.

The `getEntityOrThrow()` and `findOneInChannel()` helper functions have been deprecated and replaced by methods with the same name (but slightly different signature) on the TransactionalConnection class.
@michaelbromley
Copy link
Member Author

In the transactions branch I've implemented transactions on all mutations and got all e2e tests passing.

However, I've started running load tests and I think there are some significant issues arising with deadlocks when running many concurrent mutations, as in the search-and-checkout load testing script.

Need to dig deeper into this, but perhaps we need more fine-grained control over the transactions that is afforded by the decorator approack.

@michaelbromley
Copy link
Member Author

Re-worked some details and now the perf is comparable to without transactions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants