[v2.0.0] Token-aware routing #131
Comments
If I am not mistaken, that would be a fairly complex task.
cc-ing @tupshin @AlexPikalov, what do you think? |
@harrydevnull @kw217 I apologise, these days I have very limited access to the internet. Next week I'll take a closer look at this. |
Hello, I'm also interested in token-aware routing. I would like to implement a high-traffic query service in Rust, but the lack of this feature actually makes Go (which has it) more effective. Is this likely to be implemented sometime in 2020? Sincerely, Artem |
Hello @Russoturisto, Yes, it must be one of the highest-priority tasks in 2020. |
Did a bit of digging in the ScyllaDB Go driver. I have no idea what I'm talking about, but hopefully this will help: a Murmur3 hash appears to be used: https://github.com/gocql/gocql/blob/master/token.go On a separate note, here is the ScyllaDB protocol extension that connects to the right shard (not just node): And the pull request that makes it happen (again in Go): I don't have enough Rust + Cassandra/Scylla knowledge to help right now, but I will try next autumn (8 months from now) if nothing happens by then. Either way I'm implementing my read services in Rust (I'll have an LRU cache and can't afford the extra heap space required by Go garbage collection). Hope this helps! Artem |
Hi @Russoturisto , |
Hi, I'm from ScyllaDB, but this post is from my private spare time, so no warranty :) First of all, in order to have a token-aware policy, the driver indeed needs to store information about the cluster. In the Java driver, the policy uses cluster metadata, which is kept up to date with an additional control connection. The connection is used to fetch info from
With this, for each query, we can extract its partition key (if it's provided) and compute its token: e.g. if the cluster uses the Murmur3 partitioner, we compute a murmur3 hash of the key's data. Then, since the driver knows which tokens are most likely owned by which nodes, it can pick the correct one. Also, note that a token-aware policy needs a child policy to fall back to, e.g. if we don't have enough information to compute the correct node. It's also possible to have different strategies for load balancing inside the token-aware policy, more info here: https://github.com/scylladb/java-driver/blob/3.7.1-scylla/driver-core/src/main/java/com/datastax/driver/core/policies/TokenAwarePolicy.java#L62

As for a very important Scylla-specific optimisation, shard awareness: we store more information about every node (e.g. the number of its shards), and also try to create a separate connection for every shard. Then, given a partition key, we can leverage this information to compute which shard it belongs to on a specific node, and send the request to the correct shard, which results in better performance.

So, in short, in order to have token-aware routing, the first important thing is to have a way of fetching information about the cluster from the cluster itself. Then, once we know which partitioner is used and which nodes own which tokens, we can compute appropriate tokens and route queries in a more optimized way. I see that Rust already has several crates that offer murmur3 hashing, so that's convenient :) |
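To make the routing step above a bit more concrete, here is a minimal Rust sketch of the lookup, not cdrs code. It assumes the token has already been computed from the partition key (the hashing itself is left out), and that the ring was built from metadata fetched over a control connection. `TokenRing`, `owner` and `pick_node` are hypothetical names made up for illustration, and the fallback is a plain round-robin standing in for the child policy. It only returns the primary owner of a token and ignores replication.

```rust
use std::collections::BTreeMap;

/// Hypothetical token ring: maps each node's token to that node's address.
pub struct TokenRing {
    ring: BTreeMap<i64, String>,
}

impl TokenRing {
    /// Builds the ring from (token, node address) pairs.
    pub fn new(entries: impl IntoIterator<Item = (i64, String)>) -> Self {
        TokenRing {
            ring: entries.into_iter().collect(),
        }
    }

    /// A node with token T owns the range (previous token, T], so the owner
    /// of a key's token is the first ring entry whose token is >= that token,
    /// wrapping around to the smallest token when none is found.
    pub fn owner(&self, token: i64) -> Option<&str> {
        self.ring
            .range(token..)
            .next()
            .or_else(|| self.ring.iter().next())
            .map(|(_, node)| node.as_str())
    }
}

/// Picks a node for a query: the ring owner when the token is known,
/// otherwise a round-robin choice from `fallback` (the assumed child policy).
pub fn pick_node<'a>(
    ring: &'a TokenRing,
    token: Option<i64>,
    fallback: &'a [String],
    counter: usize,
) -> Option<&'a str> {
    token.and_then(|t| ring.owner(t)).or_else(|| {
        if fallback.is_empty() {
            None
        } else {
            Some(fallback[counter % fallback.len()].as_str())
        }
    })
}
```

With a ring holding tokens -100, 0 and 100, a key whose token is -50 would go to the node that owns token 0, and a token above 100 wraps around to the node that owns -100. When no partition key is available, `pick_node` is called with `None` and simply defers to the fallback list.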
Also, as an exercise for myself, I wrote a quite useless (at least for now) snippet that reads Scylla sharding info from a node and prints it: psarna@1b25aff. Perhaps it can be used one day as a template for implementing shard awareness in cdrs on top of token awareness. It's also literally my first code in Rust, so don't judge :) |
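Once the sharding info has been read from a node, mapping a token to a shard could look roughly like the sketch below. This follows my understanding of the algorithm described in Scylla's protocol extension documentation (bias the token to unsigned, drop the ignored most significant bits, then scale by the shard count), so please treat it as an assumption to verify against those docs rather than as a reference implementation. The function name `shard_of` and its parameter names are hypothetical.

```rust
/// Sketch: computes the shard that owns `token` on a node with `nr_shards`
/// shards, where `ignore_msb` is the number of most significant token bits
/// the node reports as ignored. Assumed to mirror Scylla's documented scheme.
pub fn shard_of(token: i64, nr_shards: u32, ignore_msb: u8) -> u32 {
    // Shift the signed token into the unsigned range [0, 2^64).
    let biased = (token as u64).wrapping_add(1u64 << 63);
    // Discard the ignored top bits; a shift of 64 or more leaves nothing.
    let adjusted = biased.checked_shl(u32::from(ignore_msb)).unwrap_or(0);
    // Scale into [0, nr_shards) by keeping the high 64 bits of the product.
    ((u128::from(adjusted) * u128::from(nr_shards)) >> 64) as u32
}
```

The result is always smaller than `nr_shards`, so it can be used directly as an index into a per-shard connection list, assuming the driver keeps one connection per shard as described above.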
Hi @psarna, |
Thanks. I'm a little short on time, but I'll try to figure out Rust's conditional compilation by myself, and perhaps one day I'll push something more substantial. And if token-aware routing makes it into cdrs one day, I could definitely help integrate Scylla's shard awareness on top of it. |
Please could CDRS implement token-aware routing? This is important for performance, because it reduces network hops and also reduces load on the Cassandra cluster.
(How difficult would this be to add?)