Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Subtask] Survey transaction implementation on key-value store and remove RocskDB transactional dependency #617

Closed
Tracked by #603
yuqi1129 opened this issue Oct 27, 2023 · 5 comments · Fixed by #710
Assignees

Comments

@yuqi1129
Copy link
Contributor

Describe the subtask

Currently, our storage layer heavily relies on the transaction mechanism provided by RocksDB to ensure reliability. However some key-value pair databases do not support transaction operations, making it challenging for Gravitino to adapt to other
KV databases such as Redis, Cassandra, Hbase, and so on.

We need to eliminate transactional dependency and come up with alternative solutions.

Parent issue

#603

@yuqi1129
Copy link
Contributor Author

yuqi1129 commented Nov 3, 2023

@jerryshao https://docs.google.com/document/d/1DPvaBcUvmA4-9WOnJ_4cPabZXxlyjK7kF-lBLPiB-CY/edit#heading=h.xe72prsp3h4e
The preliminary documentation is here, and I need to reevaluate today.

@yuqi1129
Copy link
Contributor Author

yuqi1129 commented Nov 7, 2023

@jerryshao Hi, we discussed the document last night and have reached the following points.

  • The first step is to determine if we need to support transactions in Gravitino at this time. @xunliu has confirmed that KV databases used in eBay supports transaction and there's no urgency to support transaction in Gravitino.
  • Developing and implementing a mature transaction framework is a challenging task that may take months or even years. The solution I proposed in the document is also difficult to implement quickly.

Based on this, please help verify the following :

  • Whether we really need a transaction framework in Gravitino.
  • Whether the solution outlined in the document is reasonable.

I'm eagerly awaiting your response to this issue.

@jerryshao
Copy link
Contributor

I think you totally misunderstood what I'm saying. I'm not saying that we want to build a transaction layer of our own on top of kv store. I'm saying to get rid of current transactions and use other ways instead.

Thinking of the current operations we support, we are mostly doing put, get, delete, update, list on entity and entities. For most of the operations like put and get, they're operating on a single object, so we don't need transactions.

For update, currently we use transactions, I think we can change to use MVCC (copy-write-commit) mechanism somehow. For bulk delete operations, we can use mark-and-sweep mechanism instead.

Basically what I mean is that: for most of the kv stores, they support atomic operation on a single object, so we should leverage this ability to change our current layout to support single object operation, convert the several-step-operations to a single operation, using atomicity with lock to achieve transactions. If we only depend on the simple actions of kv store, so we can support different stores with no prerequisites.

We don't wanner a transaction layer at all, all we want is a new storage layout that doesn't require transactions to do the operations I mentioned above.

Please think in this way.

@jerryshao
Copy link
Contributor

Please schedule a time slot for the design discussion with us. The current design is too simple to understand the details.

@yuqi1129
Copy link
Contributor Author

Please schedule a time slot for the design discussion with us. The current design is too simple to understand the details.

OK

jerryshao pushed a commit that referenced this issue Nov 24, 2023
…anism provided by underlying storage (#710)

### What changes were proposed in this pull request?

Introducing a general 2PC transaction implementation to replace the
current transaction mechanism backed by underlying storage.

### Why are the changes needed?

Some KV databases may not support transactions on their own, so we
should not rely on them to provide transactions.

Fix: #617 

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

New test class `TestKvTransactionManager` was added.
jerryshao pushed a commit that referenced this issue Nov 29, 2023
### What changes were proposed in this pull request?
1. Clean data that is potentially uncommitted and invisible.
2. Clear out old versions of data that have reached their TTL and need
to be recycled for storage space.

### Why are the changes needed?

#617 introduces the MVCC feature, which brings the problem that a key
may contain multiple versions of data, as time goes by, we should
collect old versions of data to reduce storage pressure.

Fix: #782 

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

New UT class `testCaseSensitive` was introduced.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants