Use span name table instead of materialized view #1374

Merged 1 commit on Nov 4, 2016

Conversation

@llinder (Member) commented Oct 31, 2016

  • This switches span name and service name lookups to use a table instead of a materialized view. This change will fix Empty span name list in UI with Cassandra3 storage component #1360 (see the sketch after this list).
  • Change the limit logic to limit on a set of trace IDs instead of limiting on the provided collection first. In practice this didn't make a noticeable difference in the results, but it seems like the intended logic.
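
A minimal sketch of the table-based lookup this describes, assuming the DataStax Java driver; the keyspace name and the exact columns of span_name_by_service shown here are assumptions for illustration, not the PR's actual schema:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

import java.util.ArrayList;
import java.util.List;

/** Illustrative only: span names are denormalized into a plain table and read back directly. */
public class SpanNameLookupSketch {
  private final Session session;
  private final PreparedStatement insertSpanName;
  private final PreparedStatement selectSpanNames;

  SpanNameLookupSketch(Session session) {
    this.session = session;
    // On the write path, each span's name is written to a table keyed by service name.
    this.insertSpanName = session.prepare(
        "INSERT INTO span_name_by_service (service_name, span_name) VALUES (?, ?)");
    // On the read path, the span-name lookup queries that table instead of a materialized view.
    this.selectSpanNames = session.prepare(
        "SELECT span_name FROM span_name_by_service WHERE service_name = ?");
  }

  void storeSpanName(String serviceName, String spanName) {
    session.execute(insertSpanName.bind(serviceName, spanName));
  }

  List<String> getSpanNames(String serviceName) {
    ResultSet rs = session.execute(selectSpanNames.bind(serviceName));
    List<String> names = new ArrayList<>();
    for (Row row : rs) {
      names.add(row.getString("span_name"));
    }
    return names;
  }

  public static void main(String[] args) {
    // "zipkin3" is assumed here; substitute the keyspace your deployment uses.
    try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
         Session session = cluster.connect("zipkin3")) {
      SpanNameLookupSketch lookup = new SpanNameLookupSketch(session);
      lookup.storeSpanName("frontend", "get /api");
      System.out.println(lookup.getSpanNames("frontend"));
    }
  }
}
```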

@codefromthecrypt (Member)

Merging tomorrow unless someone says not to. cc @openzipkin/cassandra

@codefromthecrypt (Member) left a comment:

might want to check the README to ensure nothing is invalidated there

@michaelsembwever (Member)

Please hold @adriancole, I would like to check this.

@codefromthecrypt (Member)

@michaelsembwever no problemo

@michaelsembwever (Member)

This looks good. It's a shame that the MV didn't work out, but I can't see a way around it, compared to the simplicity of a manually denormalised table.

I am updating the stress yaml files appropriately for the schema change.

Looking into compatibility too. I see no incompatibility, although the MV should be dropped, and we need to make sure the table gets created within the existing keyspace.
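
To make that compatibility step concrete, here is a minimal sketch assuming the DataStax Java driver; the keyspace name, view name, and table definition below are illustrative assumptions, not the PR's actual DDL:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

/** Illustrative only: drop the old view and create the replacement table in an existing keyspace. */
public class ApplySchemaChangeSketch {
  public static void main(String[] args) {
    // "zipkin3" is assumed here; use whatever keyspace the deployment already has.
    try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
         Session session = cluster.connect("zipkin3")) {
      // Remove the materialized view the new table replaces (a no-op if it was never created).
      session.execute("DROP MATERIALIZED VIEW IF EXISTS span_name_by_service");
      // Create the denormalized lookup table inside the existing keyspace.
      session.execute("CREATE TABLE IF NOT EXISTS span_name_by_service ("
          + " service_name text,"
          + " span_name text,"
          + " PRIMARY KEY (service_name, span_name))");
    }
  }
}
```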

@michaelsembwever (Member)

Stress results

Very rough runs on a dual-core laptop.
15k partitions were entered into each table over 5 minutes, in parallel.
The stress yaml profiles from the project were used, following their documentation.

Write Only

Existing

traces
Latency mean              :    2.9 ms [insert: 2.9 ms]
Latency median            :    2.1 ms [insert: 2.1 ms]
Latency 95th percentile   :    5.9 ms [insert: 5.9 ms]
Latency 99th percentile   :   28.5 ms [insert: 28.5 ms]
trace_by_service_span
Latency mean              :    3.6 ms [insert: 3.6 ms]
Latency median            :    0.8 ms [insert: 0.8 ms]
Latency 95th percentile   :   12.2 ms [insert: 12.2 ms]
Latency 99th percentile   :   43.4 ms [insert: 43.4 ms]

Changes here

traces
Latency mean              :    4.4 ms [insert: 4.4 ms]
Latency median            :    3.0 ms [insert: 3.0 ms]
Latency 95th percentile   :    8.6 ms [insert: 8.6 ms]
Latency 99th percentile   :   35.1 ms [insert: 35.1 ms]
span_name_by_service
Latency mean              :    2.1 ms [insert: 2.1 ms]
Latency median            :    0.6 ms [insert: 0.6 ms]
Latency 95th percentile   :    3.5 ms [insert: 3.5 ms]
Latency 99th percentile   :   30.7 ms [insert: 30.7 ms]
trace_by_service_span
Latency mean              :    1.9 ms [insert: 1.9 ms]
Latency median            :    0.8 ms [insert: 0.8 ms]
Latency 95th percentile   :    2.8 ms [insert: 2.8 ms]
Latency 99th percentile   :   29.8 ms [insert: 29.8 ms]

Write&Read

Existing

traces
Latency mean              :    1.8 ms [by_annotation: 2.0 ms, by_trace: 1.1 ms, by_trace_ts_id: 1.2 ms, insert: 2.9 ms]
Latency median            :    1.0 ms [by_annotation: 1.3 ms, by_trace: 0.5 ms, by_trace_ts_id: 0.5 ms, insert: 2.2 ms]
Latency 95th percentile   :    4.8 ms [by_annotation: 4.5 ms, by_trace: 3.5 ms, by_trace_ts_id: 3.5 ms, insert: 7.4 ms]
Latency 99th percentile   :   14.5 ms [by_annotation: 15.4 ms, by_trace: 10.6 ms, by_trace_ts_id: 12.0 ms, insert: 17.7 ms]
trace_by_service_span
Latency mean              :    1.6 ms [by_duration: 3.0 ms, insert: 0.9 ms, select: 0.8 ms]
Latency median            :    0.7 ms [by_duration: 2.4 ms, insert: 0.6 ms, select: 0.5 ms]
Latency 95th percentile   :    4.4 ms [by_duration: 6.0 ms, insert: 1.4 ms, select: 1.2 ms]
Latency 99th percentile   :    9.2 ms [by_duration: 15.5 ms, insert: 5.9 ms, select: 4.2 ms]

Changes here

traces
Latency mean              :    3.1 ms [by_annotation: 3.9 ms, by_trace: 1.7 ms, by_trace_ts_id: 1.7 ms, insert: 5.0 ms]
Latency median            :    1.5 ms [by_annotation: 3.1 ms, by_trace: 1.1 ms, by_trace_ts_id: 1.1 ms, insert: 4.6 ms]
Latency 95th percentile   :    7.9 ms [by_annotation: 6.6 ms, by_trace: 3.2 ms, by_trace_ts_id: 3.1 ms, insert: 9.9 ms]
Latency 99th percentile   :   18.9 ms [by_annotation: 18.0 ms, by_trace: 14.2 ms, by_trace_ts_id: 15.2 ms, insert: 23.6 ms]
span_name_by_service
Latency mean              :    2.0 ms [insert: 1.4 ms, select: 3.1 ms, select_span_names: 1.5 ms]
Latency median            :    1.0 ms [insert: 0.8 ms, select: 2.5 ms, select_span_names: 0.9 ms]
Latency 95th percentile   :    3.7 ms [insert: 2.4 ms, select: 4.6 ms, select_span_names: 2.2 ms]
Latency 99th percentile   :   11.7 ms [insert: 8.7 ms, select: 14.1 ms, select_span_names: 10.1 ms]
trace_by_service_span
Latency mean              :    2.6 ms [by_duration: 5.0 ms, insert: 1.3 ms, select: 1.5 ms]
Latency median            :    1.0 ms [by_duration: 3.8 ms, insert: 0.8 ms, select: 0.9 ms]
Latency 95th percentile   :    8.0 ms [by_duration: 10.1 ms, insert: 1.5 ms, select: 1.7 ms]
Latency 99th percentile   :   15.3 ms [by_duration: 20.1 ms, insert: 7.4 ms, select: 10.2 ms]

@michaelsembwever (Member)

A concern I've had with the cassandra3 schema is that the service_name partition keys in both trace_by_service_span and span_name_by_service are a bit broad, and susceptible to hot spots and wide partitions.

This problem can be addressed, if need be, with the schema proposed here by re-introducing the DeduplicatingExecutor (see the previous cassandra storage code).
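
To illustrate the idea (this is not the project's DeduplicatingExecutor; it is a sketch assuming a Guava cache and a hypothetical Writer interface), a write can simply be skipped when the same key was sent recently:

```java
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

import java.util.concurrent.TimeUnit;

/** Illustrative only: suppresses redundant writes of (service, span) pairs seen recently. */
public class DedupingWriterSketch {

  /** Hypothetical downstream writer, e.g. the code issuing the span_name_by_service insert. */
  interface Writer {
    void write(String serviceName, String spanName);
  }

  private final Writer delegate;

  // Bounded cache of recently written keys; expiry ensures rows are eventually re-written.
  private final Cache<String, Boolean> recentlyWritten = CacheBuilder.newBuilder()
      .maximumSize(100_000)
      .expireAfterWrite(1, TimeUnit.HOURS)
      .build();

  DedupingWriterSketch(Writer delegate) {
    this.delegate = delegate;
  }

  void maybeWrite(String serviceName, String spanName) {
    String key = serviceName + '\0' + spanName;
    // Only the first caller for a given key within the expiry window performs the insert.
    if (recentlyWritten.asMap().putIfAbsent(key, Boolean.TRUE) == null) {
      delegate.write(serviceName, spanName);
    }
  }
}
```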

@michaelsembwever (Member)

The stress tests are interesting in that they (potentially) show that the schema proposed here is slower than the existing materialised view. I did not expect this. I wonder if some of the complexity of the materialised view, and the latency it adds, is hidden from these results. I have no objections to accepting the new schema and any (potentially) small loss in performance.

@michaelsembwever (Member) commented Nov 3, 2016

I pushed my commit to this branch containing the updates to the stress yaml profiles.

I didn't actually mean to push it to your fork. I had no idea I could do that… @llinder

Inline review comment (Member) on:

    }
  }

  static BigInteger traceId(Span span) {

needs a rebase on master, as this shifted

(Member) replied:

basically this is in CassandraUtil now

@codefromthecrypt (Member) left a comment:

Change looks necessary to keep the experience up. We can re-introduce the deduper as a separate change or add it here. If we add it here, let's make sure the test is carried over, too.

Just needs a rebase to buff out the trace id thing.

@llinder (Member, Author) commented Nov 3, 2016

@michaelsembwever thanks for updating the stress yaml profiles and running a comparison test. It is indeed interesting to see the results of the stress test. I really wish the MV had worked out, especially given it appears to perform better. I will review the DeduplicatingExecutor code and get this rebased today, or tomorrow at the latest.

Regarding the schema change: is the preference to create a new CQL file that contains the drop/add statements?

Thanks for the detailed review and comments!

- This switches span name and service name lookups to use a table instead of a materialized view. This change will fix openzipkin#1360
- Change the limit logic to limit on a set of trace IDs instead of limiting on the provided collection first. In practice this didn't make a noticeable difference in the results, but it seems like the intended logic.

@llinder (Member, Author) commented Nov 3, 2016

Rebased onto the latest upstream changes on master. Also added the deduper on writes to trace_by_service_span and span_name_by_service.

I think the only outstanding thing to address is the schema changes. Given that this is still marked experimental, do we want to start maintaining schema change sets? If that's what is preferred, I will pull the table/view changes out into a separate file.

@michaelsembwever (Member)

> I will review the DeduplicatingExecutor code…

Let's leave the DeduplicatingExecutor out for now.
Feel free to squash my additional commit into your commit; I would appreciate having just one commit in the branch before merging.

> Regarding the schema change: is the preference to create a new CQL file that contains the drop/add statements?

Check out how the Schema class in the old cassandra storage dealt with upgrades.

@codefromthecrypt (Member)

I don't think we should be dealing with upgrades at this point. One of the reasons we merged earlier was under the assumption that this is experimental. With that in mind, there should be no expectations around schema upgrade logic (which tends to clutter the code considerably when in place).

Inline review comment (Member) on:

    } catch (RuntimeException ex) {
      return Futures.immediateFailedFuture(ex);
    }
  }

  static BigInteger traceId(Span span) {

there's still some drift here, but I'll take care of it post merge

@codefromthecrypt merged commit 64abfdd into openzipkin:master on Nov 4, 2016
codefromthecrypt pushed a commit that referenced this pull request Nov 5, 2016
This reverts #1374 and associated cleanup as the tests are failing.
It wasn't noticed earlier that circleci wasn't running cassandra3
tests!

https://circleci.com/gh/openzipkin/zipkin/374

This reverts commit 64abfdd.
This reverts commit f9c7af1.
This reverts commit b2e3425.
@codefromthecrypt (Member)

@llinder @michaelsembwever @abesto I think there's something wrong with our CircleCI setup, as the cassandra3 tests are skipped. This led me to merge on a false green. Today I noticed locally that the tests fail, which is bad, so I reverted this commit and the associated cleanups.

10b8552

sorry about the confusion. Once it is working locally, please raise another PR. Probably best to start by reverting the revert commit above, so you can start with cleanups applied.

@michaelsembwever (Member)

> One of the reasons we merged earlier was under the assumption that this is experimental.

+1

@michaelsembwever (Member)

I'm not too sure what happened with this merge, but it did not include my commit.

michaelsembwever pushed a commit that referenced this pull request Nov 6, 2016
- This switches span name and service name lookups to use a table instead of a materialized view.  This change will fix #1360
- Change the limit logic to limit on a set of trace IDs instead of limiting on the provided collection first. In practice this didn't make a noticeable difference in the results, but it seems like the intended logic.
- update stress profiles for new span_name_by_service tables

references:
 - #1392
 - #1360
 - #1374
@michaelsembwever (Member)

Re-created in #1392

codefromthecrypt pushed a commit that referenced this pull request Nov 6, 2016 (same commit message as above).