
Linear response time on services query in elasticsearch #1526

Closed

codefromthecrypt opened this issue Feb 24, 2017 · 27 comments

@codefromthecrypt
Member

Getting some reports of eventual timeouts on service and span name queries on elasticsearch. For example, at 80GB/day you will eventually time out (reported by @pauldraper).

Ultimately, one big issue is the shape of the JSON, which requires nested queries to access the service name of a span. The new reporting format (#1499) will eventually let us fix this part, but we can't pin the fix for a performance problem to a particular span format!
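To make this concrete, here's a rough sketch (not Zipkin's actual code; it assumes the stock index template where annotations and binaryAnnotations are nested objects, and uses the elasticsearch Python client) of the kind of request a service-name lookup currently requires:

# Sketch of the slow path: nested terms aggregations over every span document in
# the time range. Field names follow the stock template; host and window are
# illustrative.
import time
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

end_millis = int(time.time() * 1000)
begin_millis = end_millis - 24 * 60 * 60 * 1000  # look back one day

body = {
    "size": 0,
    "query": {"range": {"timestamp_millis": {"gte": begin_millis, "lte": end_millis}}},
    "aggs": {
        "annotations": {
            "nested": {"path": "annotations"},
            "aggs": {"serviceName": {"terms": {
                "field": "annotations.endpoint.serviceName", "size": 2147483647}}},
        },
        "binaryAnnotations": {
            "nested": {"path": "binaryAnnotations"},
            "aggs": {"serviceName": {"terms": {
                "field": "binaryAnnotations.endpoint.serviceName", "size": 2147483647}}},
        },
    },
}

# Work scales with the number of spans (and tags) in range, not with the handful
# of distinct names we actually want back.
response = es.search(index="zipkin-*", doc_type="span", body=body)
for agg in ("annotations", "binaryAnnotations"):
    for bucket in response["aggregations"][agg]["serviceName"]["buckets"]:
        print(bucket["key"])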

In cassandra3, we started with a materialized view, but had to switch to manual indexing for similar performance reasons (#1374). The downside is that it amplifies writes and also is less straightforward for those who might be simply writing json to ES (as opposed to via a collector).

Regardless of what we do, it would be nice to have a perf test which can showcase evidence of linear latency patterns on service or span name queries. This would catch any system having indexes which aren't ideal.
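As a starting point, such a perf test could be as simple as the following sketch (not something in the repo; the span shape and endpoints follow the v1 HTTP API, while the host, batch size, and service count are made up): write batches of synthetic spans, then time the services query after each batch and check whether latency stays flat as data grows.

# Perf-test sketch: POST synthetic v1 spans, then time /api/v1/services after each
# batch. With ideal indexes the timings should not grow with the span count.
import json
import random
import time

import requests

ZIPKIN = "http://localhost:9411"

def fake_span(i):
    ts = int(time.time() * 1000000)  # v1 timestamps are epoch microseconds
    return {
        "traceId": "%016x" % random.getrandbits(64),
        "id": "%016x" % random.getrandbits(64),
        "name": "get",
        "timestamp": ts,
        "duration": 1000,
        "annotations": [{
            "timestamp": ts,
            "value": "sr",
            "endpoint": {"serviceName": "service-%03d" % (i % 250)},
        }],
    }

for batch in range(20):
    spans = [fake_span(i) for i in range(10000)]
    requests.post(ZIPKIN + "/api/v1/spans", data=json.dumps(spans),
                  headers={"Content-Type": "application/json"})
    time.sleep(10)  # give the collector time to flush and ES time to refresh
    start = time.time()
    requests.get(ZIPKIN + "/api/v1/services")
    elapsed_ms = (time.time() - start) * 1000
    print("after ~%d spans: /api/v1/services took %.0f ms" % ((batch + 1) * 10000, elapsed_ms))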

We have some next steps to consider...

  • firstly, properly diagnose the issue. As ES is primarily a cache, maybe there's some tuning of the index template to help with nested queries on service and span names
  • next, see what needs to be done in the collector directly vs somewhere else. For example, if we have to do manual indexing we could consider an ES pipeline stage to compose that.

cc @openzipkin/elasticsearch

@codefromthecrypt
Member Author

Another thing I thought of: we should collect notes about what sort of data users are putting into zipkin. For example, historically spans had very few tags and people used sampling. OpenTracing users in particular tend to put a lot of tags (each of which ends up in the subquery for service names). It will be interesting to know how many spans there are and how big they tend to be when users experience problems.

@semyonslepov

Here are some statistics about our "bad" experience:

  • ~6 GB of traces in each daily index (keeping the last 5 indices)
  • ~1 KB per span
  • 3-4 tags per span

(running on m3.medium.elasticsearch in AWS, if it matters)

The main workload is writing; some users occasionally come to the web interface to look at their traces.

@codefromthecrypt
Member Author

codefromthecrypt commented Apr 11, 2017 via email

@codefromthecrypt
Member Author

I have been thinking about this and believe adding a new index for service and span names is the best option. This would be just like cassandra, where we keep a separate table for service/span lookup:

CREATE TABLE IF NOT EXISTS zipkin3.span_name_by_service (
    service_name text,
    span_name    text,
    PRIMARY KEY (service_name, span_name)
)

Backfilling this index would be relatively simple and could be done in any number of ways. Unlike adding new data to the span index, this approach doesn't risk people accidentally clobbering their spans to support it. It also doesn't depend on any specific span representation (current or future).

Implementation-wise, the easiest would be a daily index like we use for dependency links. We could set the document ID to the service and span name so that the data doesn't grow unbounded. If daily is too coarse a granularity, we could make the ID naming convention include the hour of the day or the minute of the hour and still significantly bound the data size.
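To illustrate the deterministic-ID idea, a minimal sketch (the "servicespan" type and serviceName/spanName fields follow this proposal; the ID convention and client usage are assumptions, not a committed design):

# Sketch: one "servicespan" document per (service, span) pair in the daily index,
# keyed by a deterministic _id so repeated writes overwrite instead of piling up.
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])

def record(index, service_name, span_name):
    doc_id = "%s|%s" % (service_name, span_name)
    # If daily is too coarse, the hour (or minute) could be folded into the ID,
    # e.g. "%s|%s|%02d" % (service_name, span_name, hour_of_day), still bounding growth.
    es.index(index=index, doc_type="servicespan", id=doc_id,
             body={"serviceName": service_name, "spanName": span_name})

record("zipkin-2017-04-11", "frontend", "get /api")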

We could make the read side look at both, and only execute the slow nested query when the ideal index contains no data. This means users don't have to do anything to start using it.

Operationally, I like this approach: being similar to cassandra means less cognitive overhead for cross-storage maintainers.

@openzipkin/elasticsearch @semyonslepov wdyt?

@mansu

mansu commented Apr 11, 2017

This sounds great! A couple of questions:

  • When is this new index populated? Is it when a span is written to storage?
  • Having the ability to set the granularity of the index name would be awesome; it would be nice to allow weekly or monthly as well. That way, we may not need to clean up this index if we so desire.

@codefromthecrypt
Member Author

codefromthecrypt commented Apr 11, 2017 via email

@mansu

mansu commented Apr 11, 2017

Thanks for the explanation. +1 in this case.

I was asking about a longer-term index since it would mean cleaning up one more index when we purge the data. But if this is added as an additional type on the daily zipkin index, daily granularity is an awesome solution.

@codefromthecrypt
Member Author

codefromthecrypt commented Apr 11, 2017 via email

@semyonslepov

> Is it safe to do math to extrapolate span count from this?

Yes, it's fair enough. Actual counts may differ, but it's correct on average.

About the indexes: sounds good and transparent for users. Am I right that we will use the same index for span name lookups, and that those will be faster too?

@codefromthecrypt
Member Author

codefromthecrypt commented Apr 11, 2017 via email

@semyonslepov

We still don't have a huge amount of data, but of course I will run some measurements on the current and new versions and post the results.

@tramchamploo

tramchamploo commented Apr 11, 2017

+1 for a new index, but is there any extra load from an existence check when persisting service/span names?
Also, would changing the shape of spans be a long-term consideration that would help overall performance?

@codefromthecrypt
Member Author

codefromthecrypt commented Apr 11, 2017 via email

@tramchamploo

Great, this improvement to the service/span name queries will greatly improve the user experience.

@codefromthecrypt
Member Author

codefromthecrypt commented Apr 12, 2017 via email

@codefromthecrypt
Member Author

It will take a bit to refactor properly, but the more important code change at query time will look something like this...

service name query

-    SearchRequest.Filters filters = new SearchRequest.Filters();
-    filters.addRange("timestamp_millis", beginMillis, endMillis);
-    SearchRequest request = SearchRequest.forIndicesAndType(indices, SPAN)
-        .filters(filters)
-        .addAggregation(Aggregation.nestedTerms("annotations.endpoint.serviceName"))
-        .addAggregation(Aggregation.nestedTerms("binaryAnnotations.endpoint.serviceName"));
+    SearchRequest request = SearchRequest.forIndicesAndType(indices, SERVICE_SPAN)
+        .addAggregation(Aggregation.terms("serviceName", Integer.MAX_VALUE));

span name query

-    SearchRequest.Filters filters = new SearchRequest.Filters();
-    filters.addRange("timestamp_millis", beginMillis, endMillis);
-    filters.addNestedTerms(asList(
-        "annotations.endpoint.serviceName",
-        "binaryAnnotations.endpoint.serviceName"
-    ), serviceName.toLowerCase(Locale.ROOT));
-    SearchRequest request = SearchRequest.forIndicesAndType(indices, SPAN)
-        .filters(filters)
-        .addAggregation(Aggregation.terms("name", Integer.MAX_VALUE));
+    SearchRequest request = SearchRequest.forIndicesAndType(indices, SERVICE_SPAN)
+        .term("serviceName", serviceName.toLowerCase(Locale.ROOT))
+        .addAggregation(Aggregation.terms("spanName", Integer.MAX_VALUE));

@codefromthecrypt
Member Author

#1560 is ready, just looking to make a backport script or at least pseudocode for adding "servicespan" into old indexes
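Not that script, but a rough sketch of what such a backfill might look like (the span source fields follow the v1 schema; the client usage and "service|span" ID convention are assumptions): scan the existing span documents in a daily index, collect the (serviceName, spanName) pairs, and bulk-index them as "servicespan" docs.

# Backfill sketch: walk the old span documents, collect (serviceName, spanName)
# pairs, and bulk index them with deterministic IDs so reruns are idempotent.
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(["http://localhost:9200"])

def backfill(index):
    pairs = set()
    query = {"_source": ["name", "annotations.endpoint.serviceName",
                         "binaryAnnotations.endpoint.serviceName"]}
    for hit in helpers.scan(es, query=query, index=index, doc_type="span"):
        src = hit["_source"]
        span_name = src.get("name", "")
        for key in ("annotations", "binaryAnnotations"):
            for annotation in src.get(key, []):
                service = (annotation.get("endpoint") or {}).get("serviceName")
                if service:
                    pairs.add((service, span_name))
    helpers.bulk(es, ({
        "_index": index,
        "_type": "servicespan",
        "_id": "%s|%s" % (service, span),
        "_source": {"serviceName": service, "spanName": span},
    } for service, span in pairs))

backfill("zipkin-2017-02-24")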

@codefromthecrypt
Member Author

If I can get a shipit or change requests on #1562 and #1560, I will release the newly optimized stuff tomorrow.

@codefromthecrypt
Member Author

codefromthecrypt commented Apr 14, 2017

@semyonslepov @tramchamploo @mansu @devinsba @jcarres-mdsol can any of you try zipkin 1.23? I'd like to get someone to use it in real life before propagating to other projects.

@codefromthecrypt
Member Author

If any of you do get a chance, along with response times on /api/v1/services and /api/v1/spans, please also report back any differences in storage usage. A lot of mapping options are now turned off, so it should be less, though I don't know what the percentage will end up being in real-life usage.

@devinsba
Member

devinsba commented Apr 14, 2017

I'll need to update the zipkin-aws image to be based off this in order for me to try it, gonna do a local build some time today

@codefromthecrypt
Member Author

@devinsba try 0.2.2 :)

@semyonslepov

semyonslepov commented Apr 18, 2017

Here are some results from my tests:

/api/v1/services (1 RPS):

Test 1, ~1,000,000 documents in the index, ~250 service names with 1 span name each:
  zipkin 1.21.0: p98 200 ms, p50 150 ms
  zipkin 1.23.0: p98 50 ms, p50 20 ms

Test 2, ~1,500,000 documents in the index, ~450 service names with 1 span name each:
  zipkin 1.21.0: p98 250 ms, p50 200 ms
  zipkin 1.23.0: p98 50 ms, p50 20 ms (same as Test 1)

As the amount of data increases, the difference grows.

/api/v1/spans (10 RPS):

One test, ~5,000,000 documents in the index, ~250 service names, 1-2 span names per service:
  zipkin 1.21.0: p98 150 ms, p50 50 ms
  zipkin 1.23.0: p98 50 ms, p50 20 ms

I saw no significant difference in storage usage under my test conditions.
But there is a significant change in CPU usage on the Zipkin hosts: I ran two tests writing 300 spans per second on a free-tier AWS t2.micro instance and got ~25% CPU load with 1.21.0, while with 1.23.0 at some point (3-4 minutes into the test) it hit 100% and requests were dropped (ok, "dropped" isn't quite accurate - I set a 10s timeout for requests, so my test tool just got no answers within 10s).

@codefromthecrypt
Member Author

@semyonslepov thanks for the feedback! fair to say overall better? :)

yeah I wouldn't expect zipkin CPU to go down based on this change, though I would expect elasticsearch's CPU to go down. zipkin is actually doing slightly more, but if its CPU load when consuming becomes an issue we can probably profile a bit.

@semyonslepov

Made some more tests today to check the earlier theories.
Yes, we can say that the service names query is much better (the span names query too, though less significantly). And it's awesome!
(ES CPU maxes out at more than 3-4 /api/v1/services queries per second on our setup. Given that the browser caches service names, we don't get that many requests from our users to this API handler, but we have to keep it in mind.)

Regarding ES CPU, I don't see it going down with the 1.23.0 release. If anything, in my tests it goes up on the same workload (sending ~1500 spans per second). (I'm not sure how clean my tests are, maybe there is some noise, but I ran them twice for each release; it would be good to hear from other users.) On the same ES configuration I saw roughly 50-55% ES CPU load with the 1.21.0 release and 70% with 1.23.0 (70% vs 95% in another test).

Actually, Zipkin CPU doesn't increase as quickly on a more powerful configuration (an m3.large instance with 2 CPU cores and 7.5GB RAM) as it did in yesterday's tests (a t2.micro instance with 1 CPU core and 1GB RAM; today's CPUs are slightly more powerful). So this difference is not so critical.

@devinsba
Member

devinsba commented Apr 19, 2017

For contrast, at ~800 RPS peak on our ES cluster I am seeing no discernible change in CPU usage (maybe a 1% increase, likely just increased load); our max CPU is only around 15%, though, because we have more instances than we need. No change on the app either.

We are ES 2.3 with 5 m3.large data nodes

@codefromthecrypt
Member Author

codefromthecrypt commented Apr 20, 2017 via email
