
Performance test for integrated span collection #1148

Open
codefromthecrypt opened this issue Jun 27, 2016 · 3 comments


@codefromthecrypt (Member)

In the old repository, we mentioned we needed some work to facilitate integration benchmarking. For example, we need to be able to invoke reporters on-demand, regardless of whether that code lives here or in brave.

While many things need benchmarking, one can reasonably argue that span collection is the most critical. For example, there are far more applications reporting data than end users of zipkin's UI or API. By benchmarking collection, we can help identify bugs or limitations that impact zipkin's ability to perform its most basic function: storing spans.

It is important that this benchmark be something others can run, as laptops often aren't representative. For example, under higher load there are likely multiple collector shards, and each may have different timeout, thread pool, and heap size configuration than the defaults.

We could have a test that produces spans over HTTP, Kafka, or Scribe and knows how to analyze collector stats to see how many actually arrived. For example, it could read the collector metrics until all messages are accepted, then poll the traces query until all spans are processed or a timeout elapses. On timeout, it could verify in a storage-specific way how many spans landed. All of this is needed because storage operations are async.
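A rough sketch of that loop follows, assuming a local zipkin-server and its HTTP transport. The /prometheus path, the zipkin_collector_messages_total metric name, and the bracket-counting "parse" of the traces response are assumptions and simplifications to adjust per zipkin version; a kafka or scribe scenario would swap out postMessage() and the metric's transport label, but the two polling phases stay the same.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.UUID;

public class CollectionBenchmark {
  static final String ZIPKIN = "http://localhost:9411";
  static final HttpClient CLIENT = HttpClient.newHttpClient();

  public static void main(String[] args) throws Exception {
    int messages = 100; // scenario knobs like spans/message elided for brevity
    for (int m = 0; m < messages; m++) {
      // one single-span message per iteration; ids must be 16-char lower hex
      String id = UUID.randomUUID().toString().replace("-", "").substring(0, 16);
      postMessage("[{\"traceId\":\"" + id + "\",\"id\":\"" + id
          + "\",\"name\":\"get\",\"timestamp\":" + System.currentTimeMillis() * 1000
          + ",\"duration\":150,\"localEndpoint\":{\"serviceName\":\"bench\"}}]");
    }

    long deadline = System.currentTimeMillis() + 60_000;

    // Phase 1: read collector metrics until all messages are accepted.
    while (acceptedMessages() < messages && System.currentTimeMillis() < deadline) {
      Thread.sleep(500);
    }

    // Phase 2: poll the traces query until all spans are visible or we time
    // out. On timeout, a storage-specific count (e.g. CQL against cassandra)
    // would reveal how many spans actually landed, since writes are async.
    while (storedTraces() < messages && System.currentTimeMillis() < deadline) {
      Thread.sleep(500);
    }
    System.out.println("traces visible via query api: " + storedTraces());
  }

  static void postMessage(String json) throws Exception {
    HttpRequest req = HttpRequest.newBuilder(URI.create(ZIPKIN + "/api/v2/spans"))
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(json)).build();
    CLIENT.send(req, HttpResponse.BodyHandlers.discarding());
  }

  // Crude scrape of an assumed prometheus counter of accepted messages.
  static long acceptedMessages() throws Exception {
    for (String line : get("/prometheus").split("\n")) {
      if (line.startsWith("zipkin_collector_messages_total")) { // assumed name
        return (long) Double.parseDouble(line.substring(line.lastIndexOf(' ') + 1));
      }
    }
    return 0;
  }

  // Counts traces in the query response; each trace in the returned list of
  // traces opens with "[{", so counting that token stands in for JSON parsing.
  static long storedTraces() throws Exception {
    String body = get("/api/v2/traces?serviceName=bench&limit=10000");
    return body.split("\\[\\{", -1).length - 1;
  }

  static String get(String path) throws Exception {
    HttpRequest req = HttpRequest.newBuilder(URI.create(ZIPKIN + path)).GET().build();
    return CLIENT.send(req, HttpResponse.BodyHandlers.ofString()).body();
  }
}
```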

Using such a tool, admins can sample or throttle writes to match the performance characteristics of their integrated zipkin system. For example, they can set the collector sample rate accordingly, or use something like zipkin-zookeeper to ensure writes don't exceed the capacity of the system.
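For the sampling knob specifically, here is a minimal sketch assuming the zipkin2.collector API (zipkin-server surfaces the same rate as the COLLECTOR_SAMPLE_RATE variable); the rate and trace id below are illustrative:

```java
import zipkin2.collector.CollectorSampler;

public class SampleRateDemo {
  public static void main(String[] args) {
    // Keep roughly 20% of traces. Sampling is consistent by trace id, so a
    // trace is stored whole or dropped whole, even across collector shards.
    CollectorSampler sampler = CollectorSampler.create(0.2f);

    long traceId = 0x86154a4ba6e91385L; // lower 64 bits of an example trace id
    System.out.println("store this trace? " + sampler.isSampled(traceId, false));
  }
}
```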

At minimum, the following scenarios should be tested:
Reusing the same assets we use for benchmarks, vary span count, spans/message, and messages/second. It is important that these spans have unique timestamps and ids, and that the timestamps vary across days. By using the same assets as our benchmarks, we can more consistently test improvements that may be library-specific (see the generator sketch below).
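For illustration, a generator sketch using the zipkin2 library; the service name, operation name, and day spread are made up for the example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import zipkin2.Endpoint;
import zipkin2.Span;

public class ScenarioSpans {
  static final long MICROS_PER_DAY = 86_400_000_000L;

  // Builds one message of spansPerMessage spans; daysToSpread must be >= 1.
  static List<Span> message(int spansPerMessage, int daysToSpread) {
    List<Span> spans = new ArrayList<>();
    long nowMicros = System.currentTimeMillis() * 1000;
    ThreadLocalRandom random = ThreadLocalRandom.current();
    for (int i = 0; i < spansPerMessage; i++) {
      long id = random.nextLong() | 1; // unique, and never the invalid all-zero id
      spans.add(Span.newBuilder()
          .traceId(String.format("%016x", id))
          .id(String.format("%016x", id))
          .name("get")
          // spread timestamps across days so time-bucketed storage is exercised
          .timestamp(nowMicros - random.nextLong(daysToSpread) * MICROS_PER_DAY)
          .duration(150L)
          .localEndpoint(Endpoint.newBuilder().serviceName("bench").build())
          .build());
    }
    return spans;
  }
}
```

A message built this way can be encoded for any transport, e.g. SpanBytesEncoder.JSON_V2.encodeList(spans) for an HTTP POST body.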

See #1142 #961 #444


@jorgheymans (Contributor)

Not sure the view that span collection is the most performance-critical part still applies. A laggy and slow UI makes users turn away from zipkin and not want to use it...

The idea was insightful at the time; observability was novel territory back then. I doubt, though, that we can easily provide something that will work for all sites. Most low-traffic sites won't need this, and high-traffic sites are most likely well-staffed enough to take care of it themselves. We would just end up with a whole range of support issues we don't want to get involved in.

Closing this one; if you still feel there's merit in pursuing this, feel free to reopen.

@codefromthecrypt (Member Author)

@anuraaga did the bulk of this with testcontainers. I think the general process could be lifted by someone else into a multi-node setup somehow: https://github.com/openzipkin/zipkin/blob/master/benchmarks/src/test/java/zipkin2/server/ServerIntegratedBenchmark.java

@jorgheymans reopened this Nov 2, 2020