The limit problem #1142
Comments
Can you try using the http api like this (at the same time, which would reduce late-arriving spans from skewing things)?
$ curl -s 'localhost:9411/api/v1/traces?serviceName=ycf-search&limit=10' | jq .
$ curl -s 'localhost:9411/api/v1/traces?serviceName=ycf-search&limit=20' | jq .
The API always returns results in descending timestamp order. Looking at the output, you might be able to explain something.
I found that if no request comes, …
Not all Zipkin storage backends include arbitrary ordering capabilities. The server-side sort order is a fairly well documented and tested part of … I'd recommend using the API, and returning the JSON you mention that works.
FWIW, I know exactly why LIMIT is not working correctly with Cassandra. In MySQL all the data is in one place, so however complex the query is, it is first satisfied against all AND clauses and then a limit is applied. With Cassandra, each AND condition may need to be resolved against a different index table, by doing a direct shard key lookup. So instead of …
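To make that contrast concrete, here's a rough, self-contained sketch (plain Java with made-up numbers, not Zipkin's actual query code) of limiting after all AND clauses are satisfied versus limiting each index lookup before intersecting:

```java
import java.util.*;
import java.util.stream.*;

// Rough sketch (not Zipkin code): why applying LIMIT per index lookup in
// Cassandra can return fewer rows than applying it after all AND clauses,
// as a single MySQL query would.
public class PerIndexLimitSketch {
  public static void main(String[] args) {
    // Hypothetical index contents: trace ids matching each condition.
    List<Long> byServiceName = LongStream.rangeClosed(1, 100).boxed().collect(Collectors.toList());
    List<Long> byAnnotation  = LongStream.rangeClosed(50, 150).boxed().collect(Collectors.toList());
    int limit = 10;

    // MySQL-style: intersect all conditions first, then limit.
    Set<Long> afterAll = new LinkedHashSet<>(byServiceName);
    afterAll.retainAll(byAnnotation);
    long mysqlStyle = afterAll.stream().limit(limit).count(); // 10

    // Cassandra-style: limit each index lookup first, then intersect.
    Set<Long> fromService = byServiceName.stream().limit(limit)
        .collect(Collectors.toCollection(LinkedHashSet::new)); // ids 1..10
    Set<Long> fromAnnotation = byAnnotation.stream().limit(limit)
        .collect(Collectors.toCollection(LinkedHashSet::new)); // ids 50..59
    fromService.retainAll(fromAnnotation);
    long cassandraStyle = fromService.size(); // 0 -- nothing survives the intersection

    System.out.println("limit after all clauses: " + mysqlStyle);
    System.out.println("limit per index lookup:  " + cassandraStyle);
  }
}
```

With the per-index limit, the limited sets may not even overlap, so the final result can be far smaller than the requested limit.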
@yurishkuro You are right to mention the above when multiple conditions exist. We should document this somewhere besides the code, probably in cassandra's README. We should also make a failing test that we can skip in the cassandra module. All that said, I still think we need failing JSON for the issue as reported, because it isn't a complex query. Unless you know otherwise, it still seems unexpected to return fewer than limit when the query is simple, right? In the above screenshots, there are no query conditions except serviceName, and the cassandra logic appears to only use limit once. For example, here's a trace for a simple serviceName query in cassandra: …
Yes, making the LIMIT issue reproducible would be nice. On a simple query, I wonder if this is because the same trace ID gets returned multiple times. Most index tables in Cassandra allow dups of (search_key -> trace_id) because timestamps are used to differentiate the records. Doing otherwise would've resulted in lots of tombstones, degrading performance. The LIMIT clause does not know that the same trace_id is being returned.
I think you may be onto something, and I think this is testable. For example, store RPC spans like they would arrive from instrumentation (server, then client). This would ensure that mid-tier spans would all have 2 span blobs. If this repeats the limit issue on a simple query, we can test any remedy.
For us to see this in JSON, we would have needed the "raw" query parameter, as that would show separate span parts.
So it looks like I can reproduce this issue as it came up here: #1141 (comment)
Here's the summary of what I "think" is going on. The service_name_index will store only one trace_id per (bucket, timestamp (millisecond), service_name). This means it can miss traces that happen against the same service in the same millisecond. @luoyongjiee can you check your data to see if this is the case? I'm able to reproduce this by issuing identical spans that vary only on ids and timestamps (ex in #1141). I use the following query to validate: …
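The validation query itself didn't come through above. As a stand-in illustration of just the collision, here's a small Java sketch; the key layout and the 10 time buckets are assumptions based on this thread, not the real schema code:

```java
import java.util.HashMap;
import java.util.Map;

// Illustration of the collision described above (hypothetical key layout):
// when the index row key has no trace_id component, writes in the same
// millisecond overwrite each other, leaving at most one trace id per
// (service_name, bucket, millisecond).
public class SameMillisecondCollision {
  public static void main(String[] args) {
    Map<String, Long> serviceNameIndex = new HashMap<>(); // row key -> trace_id
    String serviceName = "ycf-search";
    long tsMillis = 1469000000000L; // every trace lands in the same millisecond

    for (long traceId = 1; traceId <= 1000; traceId++) {
      int bucket = (int) (traceId % 10); // assume 10 time buckets, as in the schema fix below
      String rowKey = serviceName + "|" + bucket + "|" + tsMillis;
      serviceNameIndex.put(rowKey, traceId); // overwrites any previous trace id for this key
    }

    // Only 10 of the 1000 trace ids survive: one per bucket.
    System.out.println("indexed trace ids: " + serviceNameIndex.size());
  }
}
```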
Yep, sounds right, given … Would've been better with this key: …
In Cassandra primary keys, timeuuid should be used instead of timestamp. Regarding having to query individual partitions to get limits for each, this feature has been introduced in newer versions of Cassandra with the `… PER PARTITION LIMIT x` CQL syntax.
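For reference, a minimal sketch of what that suggestion could look like with the DataStax Java driver against a Cassandra 3.6+ cluster; the keyspace, table, and column names here are assumptions based on the index names in this thread, not a verified schema:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

// Rough sketch of the PER PARTITION LIMIT suggestion (Cassandra 3.6+).
// Keyspace/table/column names are assumptions, not verified schema.
public class PerPartitionLimitSketch {
  public static void main(String[] args) {
    try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
         Session session = cluster.connect("zipkin")) {
      // At most 20 index rows per (service_name, bucket) partition.
      ResultSet rs = session.execute(
          "SELECT service_name, ts, trace_id FROM service_name_index PER PARTITION LIMIT 20");
      for (Row row : rs) {
        System.out.println(row.getString("service_name")
            + " " + row.getTimestamp("ts")
            + " " + row.getLong("trace_id"));
      }
    }
  }
}
```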
Agree that timeuuid is the typical pattern in cassandra for this. However, since the timestamps and trace_ids here are application-assigned and cannot be changed, in my opinion it would be better to promote the …
@luoyongjiee update. I've reproduced this problem in a unit test (important so it doesn't creep back in). Yuri's suggestion works fine, but the index needs to be created. We'll have a release with the fix out by tomorrow.
A schema bug resulted in Cassandra not indexing more than bucket count (10) trace ids per millisecond+search input. This manifested as fewer traces retrieved by UI search or API query than expected. For example, if you had 1000 traces that happened on the same service in the same millisecond, only 10 would return. The indexes affected are `service_span_name_index`, `service_name_index` and `annotations_index`, and this was a schema-only change. Those with existing zipkin installations should recreate these indexes to solve the problem. Fixes #1142
Reverted 0d51d90 as it needs more work. We need the query to return only unique trace ids, or repeat the query up to limit. Ex. …
So… I'll wait for experts to get an idea. I want to get …
I think this is the last update I have on this issue.
It is a fool's errand to attempt to deduplicate trace ids on the (cassandra) server side because trace ids aren't a partition key. We can only do distinct clauses on partition keys. The only way left is to compensate on the (cassandra) client side: zipkin in this case. Here are two concrete proposals:

Change CassandraSpanConsumer to cache trace id indexes locally. By caching trace id indexes locally, we can ensure that at least in the same collector shard, we don't write the same unique input more than once per trace. This will be most effective for those who run a single collector or consistently route trace ids to collector instances. Even randomly routed collectors should see smaller indexes when spans in the same trace are bundled when reported from tracers.

Change CassandraSpanStore to fetch more trace ids than …
The previous code had a mechanism to reduce writes to two indexes: `service_name_index` and `service_span_name_index`. This mechanism would prevent writing the same names multiple times. However, it is only effective on a per-thread basis (as names were stored in thread locals). In practice, this code is invoked at collection, and collectors have many request threads per transport. By changing to a shared loading cache, we can extend the deduplication to all threads. By extracting a class to do this, we can test the edge cases and make it available for future work, such as the other indexes. TODO: write unit tests. See #1142
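A minimal sketch of that idea, assuming a Guava cache shared across request threads; the class and method names here are illustrative, not the ones extracted in the actual change:

```java
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.TimeUnit;

// Rough sketch of shared deduplication: a cache visible to all request
// threads that reports whether an index entry was written recently.
public class IndexDeduplicator {
  private final Cache<String, Boolean> recentlyWritten = CacheBuilder.newBuilder()
      .maximumSize(100_000)                // bound memory
      .expireAfterWrite(1, TimeUnit.HOURS) // allow rewrites eventually (e.g. new time buckets)
      .build();

  /** Returns true the first time this key is seen, false while it is still cached. */
  public boolean shouldWrite(String serviceName, String spanName) {
    String key = serviceName + "\u0000" + spanName;
    return recentlyWritten.asMap().putIfAbsent(key, Boolean.TRUE) == null;
  }

  public static void main(String[] args) {
    IndexDeduplicator dedupe = new IndexDeduplicator();
    System.out.println(dedupe.shouldWrite("ycf-search", "get")); // true: write the index row
    System.out.println(dedupe.shouldWrite("ycf-search", "get")); // false: skip the duplicate write
  }
}
```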
For the second remediation (fetch more ids), I've thought a lot and I think the best way is to have a multiplier. For example, fetch 10x more ids than you want (this relates to how much variance in span data there is per trace). This is nice because system people can adjust it as they see fit and break the pattern of users always asking for more than they need.
While in some sites it will need to be higher, the lowest multiplier I could find that leads to unsurprising results is 3.
Even when optimized, cassandra indexes will have more rows than the distinct (trace_id, timestamp) pairs needed to satisfy query requests. The side effect in most cases is that users get fewer than `QueryRequest.limit` results back. Lacking the ability to do any deduplication server-side, the only opportunity left is to address this client-side. This over-fetches by a multiplier `CASSANDRA_INDEX_FETCH_MULTIPLIER`, which defaults to 3. For example, if a user requests 10 traces, 30 rows are requested from indexes, but only 10 distinct trace ids are queried for span data. To disable this feature, set `CASSANDRA_INDEX_FETCH_MULTIPLIER=1`. Fixes #1142
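A minimal sketch of the over-fetch behaviour described here, illustrative only and operating on a plain list of index rows rather than real Cassandra results:

```java
import java.util.*;
import java.util.stream.*;

// Rough sketch of the over-fetch idea (not the code merged in #1177):
// ask the index for limit * multiplier rows, then keep only distinct
// trace ids, up to the user's limit.
public class OverFetchSketch {
  static List<Long> distinctTraceIds(List<Long> indexRows, int limit, int multiplier) {
    return indexRows.stream()
        .limit((long) limit * multiplier) // what we actually request from the index
        .distinct()                       // collapse duplicate trace ids
        .limit(limit)                     // what the user asked for
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Index rows in descending timestamp order; each trace id appears 3 times
    // because several spans of the same trace were indexed.
    List<Long> indexRows = new ArrayList<>();
    for (long traceId = 100; traceId > 0; traceId--) {
      indexRows.add(traceId); indexRows.add(traceId); indexRows.add(traceId);
    }
    System.out.println(distinctTraceIds(indexRows, 10, 1).size()); // 4: the old behaviour
    System.out.println(distinctTraceIds(indexRows, 10, 3).size()); // 10: the requested limit
  }
}
```

With a multiplier of 1, ten index rows collapse to only four distinct trace ids; with a multiplier of 3, the user gets the full ten.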
Final change is in for this issue. Will cut a release post-merge, probably tomorrow: #1177
Merged #1177 (Over-fetches cassandra trace indexes to improve UX), which also includes the fix for Cassandra indexes that lost traces in the same millisecond (#1153). Fixes #1142
Hello, when I query in zipkin-ui (the data is stored in cassandra):
if the limit param is 10, it shows Showing: 6 of 6;
if the limit param is 20, it shows Showing: 12 of 12.
What is the problem? Thank you!