Transactions & Pipeline: the current issue and ideas to fix it #681

gvolpe · 2022-03-31T07:53:29Z

Introduction

I've had this on my mind for a very long time, so let me see if I can communicate the issue so that others may help.

The current implementation of transactions is almost right, but it can't make any guarantees about commands running in order, because that's up to the runtime system (i.e. the scheduler). Thus, I've been recommending people not to use it.

Note this issue also applies to pipelining, as these share the same core implementation (see runner.scala).

Problem

Say we have a transaction mixing SET and GET commands, just as the following (taken from examples):

commandsApi.use { redis =>
  val ops =
    redis.set(key1, "sad") :: redis.set(key2, "windows") :: redis.get(key1) ::
        redis.set(key1, "nix") :: redis.set(key2, "linux") :: redis.get(key1) :: HNil

  //type Res = Unit :: Unit :: Option[String] :: Unit :: Unit :: Option[String] :: HNil
  val prog =
    RedisTransaction(cmd)
      .exec(operations)
      .flatMap {
        case _ ~: _ ~: res1 ~: _ ~: _ ~: res2 ~: HNil =>
          IO.println(s"res1: $res1, res2: $res2")
      }
      .onError {
        case TransactionAborted =>
          IO.println("[Error] - Transaction Aborted")
        case TransactionDiscarded =>
          IO.println("[Error] - Transaction Discarded")
        case _: TimeoutException =>
          IO.println("[Error] - Timeout")
        case e =>
          IO.println(s"[Error] - ${e.getMessage}")
      }

  prog *> IO.println("keep doing stuff...")
}

Most of the time, this will work, but it's unreliable. And the reason lies deep down at the core of the Java implementation, and how redis4cats models it.

A transaction consists of any set of supported commands that are queued (not run) between a call to MULTI followed by either EXEC or DISCARD. Just like database transactions are modeled via BEGIN / COMMIT / ROLLBACK.

This library models this problem as a Resource, where the acquisition means MULTI, and the release means either EXEC or DISCARD, depending on the result of the inner use block.

Because all the Java commands are basically CompletableFutures lifted into F via Async, once evaluated after a MULTI call, it will never yield a result as these are queued on Redis waiting for the transaction to be committed or rolled back.

So we fork every command via .start in order (see runner). Once the Resource hits the release part, fibers are joined or canceled to avoid leaks, depending on the transaction result.

The MAIN ISSUE with this approach, is that we cannot guarantee the commands will run in order, even if we fork them in order. This is because the execution of every fiber is up to the scheduler, not to us. It is part of the runtime system.

Another implication of transactions, is that every command needs to be run in the same thread, otherwise it would be detected as part of a different transaction (or not even a transaction). To try and mitigate this issue, we introduce RedisExecutor, which simply operates on a fixed thread-pool of size one. This is wrong on so many levels, because it affects EVERYTHING, not only transactions.

This means that by default, the library will run all your commands on a single thread, affecting the overall throughput of your application. It can be configured to a higher value, but it is 1 by default.

Solutions (mainly ideas)

Custom Scheduler

Last night (for some odd reason) this issue appeared in my brain, and made me think about schedulers. Since the main issue is the scheduling of fibers, it seems all we need is a custom scheduler used for Redis transactions, no?

It is very low level, but the issue lays right there on the runtime system, so we need to tweak the JVM scheduler to ensure commands are run sequentially.

Is this doable? I don't know, but I'm laying down the idea here. Perhaps, at some point in the future, I'll give it a try if nobody does first.

Sync API

Another idea that should work — albeit less ideal — is to re-implement transactions using the underlying synchronous API, in which we don't need to deal with CompletableFutures.

I'm pretty sure this one would work, but it would involve a lot of code duplication, and who really wants to do that work? Definitely not me.

Remove support

Not really a solution, but I'd rather delete all this code instead of having a broken implementation. I guess this should be the last resort.

How can you help?

This issue is quite fundamental, so it would require knowledgeable people who like to tinker with code and runtimes a lot, but I believe there's also a lot to learn from the experience of diving deep into this interesting feature.

Another simple issue in which anyone can help is in updating the documentation for transactions and pipelining. We should add a big fat warning sign at the top, telling people not to use it until this problem is resolved (we can link this issue).

If you would like to help, simply comment below when you're ready to take on the issue. I don't really have much time at the moment, but I can probably provide a bit of guidance. Come chat on Matrix / Gitter.

The text was updated successfully, but these errors were encountered:

kyri-petrou · 2022-03-31T09:15:48Z

I'm not very knowledgable so feel free to ignore me or tell me off if any of my suggestions are terrible / don't make sense 😄

Since Redis 2.6.5 you don't have to call DISCARD on error within a transaction. If any command errors, Redis will keep ignoring all commands within the transaction until EXEC is called, at which point it will raise an error. Would that help simplify the design in any way? (ref)
Since all of the commands are being forked to a separate fiber already, would it make sense to create an executor / execution context with a threadpool of 1 at the start of that transaction, use .startOn(ec) instead of .start and shut it down at the end of it or would that be too expensive / bad?

Again, feel free to ignore me or tell me off if any of these doesn't make sense, just thought they might help spark an idea or something

joroKr21 · 2022-03-31T09:22:54Z

It sounds like modelling the individual operations with IO might not be appropriate. E.g. Doobie has a special ConnectionIO for that purpose. Or maybe IO[IO[A]] so that you can start them independently from waiting for the result. Or just some free data structure RedisOp that you interpret later.

gvolpe · 2022-03-31T09:47:31Z

@kyri-petrou

That's good to know, but doesn't change much, besides removing the call to DISCARD.
I don't think that would work, as it is pretty much what we are doing now. The issue is that multiple calls to startOn(ec) does not guarantee the running order (that is up to the scheduler). The idea is good, though, we could use startOn(ec), where ec needs a custom scheduler.

@joroKr21

I believe it is doable to keep things within the same effect. I'm aware of Doobie's design using the Free Monad, that could definitely work (and I have considered it in the past), but it's the complexity and maintenance burden of such model that puts me off tbh. Also UX is compromised (see Doobie vs Skunk).

joroKr21 · 2022-03-31T10:20:47Z

Maybe you can use IO.syncStep if you can make sure that starting an operation is synchronous.

gvolpe · 2022-04-18T17:40:42Z

@kyri-petrou your suggestion seems to work pretty well (see #685). Still on early stages but already showing promising results.

I think we'll still have the issue of not being able to guarantee the running order, but I guess that's a compromise we can make in favor of reliable execution of transactions.

kyri-petrou · 2022-04-18T21:38:01Z

@gvolpe that's good news! I'm not too familiar with the cats / scala inners, but for running order would it make any difference if you used .foldLeft over the list instead of .traverse here? (with the necessary refactoring to make that work)

https://github.com/profunktor/redis4cats/pull/685/files#diff-c658d7aade2cc8ce11a81a5895799e8b68b7536fa836dd8745e2990aeabcf010R47

gvolpe · 2022-04-19T06:21:04Z

@kyri-petrou both traverse and foldLeft will go through the list in order. My ordering concern goes deep into the scheduler: i.e. what happens when I use start / startOn? If I start multiple operations in order, is the order of execution respected? I'd need to investigate this further, but my understanding was that the JVM scheduler doesn't make any guarantees about the order of execution when a task is "submitted" to it.

kyri-petrou · 2022-04-19T12:39:39Z

@gvolpe ah okay, I see your concern. I'm afraid I know very little on CE / JVM scheduling to have an opinion

What gave me the idea for my suggestion above came from Lettuce documentation and how they use the async API for transactions: https://github.com/lettuce-io/lettuce-core/wiki/Transactions#transactions-using-the-asynchronous-api

I might be extrapolating here but it seems to suggest that as long as you call their async api in order in a single thread it will be executed in order? Again my understanding is not too great so there might be missing other aspects I'm not aware of like the CE scheduling, but maybe this helps in some way

gvolpe · 2022-04-20T15:10:13Z

@kyri-petrou I think your understanding is correct. After all, you can only execute one transaction at a time per RedisCommands, and so far, it seems to work reliably. Perhaps the issue was always something else and I never really got it 🤷🏽

I will go ahead and merge it, I'm happy enough with the current UX, and I want to fix pipelining next.

gvolpe · 2022-04-23T20:42:41Z

FYI: #689 improves UX by a large margin 💪🏽

kyri-petrou · 2022-04-25T04:33:34Z

FYI: #689 improves UX by a large margin 💪🏽

😮 that's some great stuff there, well done!

gvolpe added bug Something isn't working help wanted Extra attention is needed pinned labels Mar 31, 2022

gvolpe mentioned this issue Apr 18, 2022

feature: transactions with single-threaded execution context #685

Merged

gvolpe closed this as completed in #685 Apr 20, 2022

gvolpe mentioned this issue Apr 20, 2022

pipelines: reliable new implementation #687

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Transactions & Pipeline: the current issue and ideas to fix it #681

Transactions & Pipeline: the current issue and ideas to fix it #681

gvolpe commented Mar 31, 2022

kyri-petrou commented Mar 31, 2022

joroKr21 commented Mar 31, 2022

gvolpe commented Mar 31, 2022 •

edited

Loading

joroKr21 commented Mar 31, 2022

gvolpe commented Apr 18, 2022

kyri-petrou commented Apr 18, 2022 •

edited

Loading

gvolpe commented Apr 19, 2022

kyri-petrou commented Apr 19, 2022

gvolpe commented Apr 20, 2022

gvolpe commented Apr 23, 2022

kyri-petrou commented Apr 25, 2022

Transactions & Pipeline: the current issue and ideas to fix it #681

Transactions & Pipeline: the current issue and ideas to fix it #681

Comments

gvolpe commented Mar 31, 2022

Introduction

Problem

Solutions (mainly ideas)

Custom Scheduler

Sync API

Remove support

How can you help?

kyri-petrou commented Mar 31, 2022

joroKr21 commented Mar 31, 2022

gvolpe commented Mar 31, 2022 • edited Loading

joroKr21 commented Mar 31, 2022

gvolpe commented Apr 18, 2022

kyri-petrou commented Apr 18, 2022 • edited Loading

gvolpe commented Apr 19, 2022

kyri-petrou commented Apr 19, 2022

gvolpe commented Apr 20, 2022

gvolpe commented Apr 23, 2022

kyri-petrou commented Apr 25, 2022

gvolpe commented Mar 31, 2022 •

edited

Loading

kyri-petrou commented Apr 18, 2022 •

edited

Loading