Data model #17

Merged
merged 4 commits into from
Apr 10, 2016
12 changes: 6 additions & 6 deletions Gemfile.lock
@@ -21,14 +21,14 @@ GEM
multipart-post (>= 1.2, < 3)
ffi (1.9.10)
gemoji (2.1.0)
github-pages (66)
github-pages (67)
RedCloth (= 4.2.9)
github-pages-health-check (= 1.1.0)
jekyll (= 3.0.3)
jekyll-coffeescript (= 1.0.1)
jekyll-feed (= 0.4.0)
jekyll-gist (= 1.4.0)
jekyll-github-metadata (= 1.9.0)
jekyll-github-metadata (= 1.10.0)
jekyll-mentions (= 1.1.2)
jekyll-paginate (= 1.1.0)
jekyll-redirect-from (= 0.10.0)
@@ -68,7 +68,7 @@ GEM
jekyll-feed (0.4.0)
jekyll-gist (1.4.0)
octokit (~> 4.2)
jekyll-github-metadata (1.9.0)
jekyll-github-metadata (1.10.0)
octokit (~> 4.0)
jekyll-mentions (1.1.2)
html-pipeline (~> 2.3)
@@ -112,7 +112,7 @@ GEM
redcarpet (3.3.3)
rouge (1.10.1)
safe_yaml (1.0.4)
sass (3.4.21)
sass (3.4.22)
sawyer (0.7.0)
addressable (>= 2.3.5, < 2.5)
faraday (~> 0.8, < 0.10)
@@ -127,8 +127,8 @@ PLATFORMS
ruby

DEPENDENCIES
github-pages (= 66)
github-pages (= 67)
json

BUNDLED WITH
1.11.2
1.12.0.pre.2
1 change: 1 addition & 0 deletions _config.yml
@@ -15,3 +15,4 @@ defaults:
layout: page
weight: 100 # Used to sort navbar items. Lower weight goes higher on the list.

highlighter: rouge
9 changes: 2 additions & 7 deletions index.md
@@ -7,16 +7,11 @@ weight: 0

Zipkin is a distributed tracing system. It helps gather timing data needed to
troubleshoot latency problems in microservice architectures. It manages both the
collection and lookup of this data through a Collector and a Query service.
collection and lookup of this data.
Zipkin’s design is based on the
[Google Dapper](http://research.google.com/pubs/pub36356.html) paper.

Collecting traces helps developers gain deeper knowledge about how certain
requests perform in a distributed system. Let’s say we’re having problems with
user requests timing out. We can look up traced requests that timed out and
display it in the web UI. We’ll be able to quickly find the service responsible
for adding the unexpected response time. If the service has been annotated
adequately we can also find out where in that service the issue is happening.
Applications are instrumented to report timing data to Zipkin. The Zipkin UI also presents a Dependency diagram showing how many traced requests went through each application. If you are troubleshooting latency problems or errors, you can filter or sort all traces based on the application, length of trace, annotation, or timestamp. Once you select a trace, you can see the percentage of the total trace time each span takes which allows you to identify the problem application.

## Where to go next?

49 changes: 23 additions & 26 deletions pages/architecture.md
@@ -4,53 +4,50 @@ weight: 2
---


These are the components that make up a fully fledged tracing system.

![Architecture overview]({{ site.github.url }}/public/img/architecture-0.png)

Instrumented libraries
----------------------

Tracing information is collected on each host using the instrumented libraries
and sent to Zipkin. When the host makes a request to another service, it passes
a few tracing identifers along with the request so we can later tie the data
a few tracing identifiers along with the request so we can later tie the data
together.
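
As a rough illustration of how identifiers might travel with a request, here is a minimal Python sketch. The header names follow the B3 convention used by Zipkin instrumentations, but the text above does not name them, so treat them as an assumption; `new_id` and `outgoing_headers` are hypothetical helpers, and a plain dict stands in for real request headers.

```python
import secrets

# Assumed header names (B3 convention); the surrounding text does not specify them.
TRACE_ID = "X-B3-TraceId"
SPAN_ID = "X-B3-SpanId"
PARENT_SPAN_ID = "X-B3-ParentSpanId"


def new_id() -> str:
    """Return a random 64-bit identifier as 16 lowercase hex characters."""
    return secrets.token_hex(8)


def outgoing_headers(incoming: dict) -> dict:
    """Build tracing headers for a request to another service.

    The trace ID is reused so every span joins the same trace. A fresh span ID
    is generated for the outgoing call, and the current span (if any) is
    recorded as its parent.
    """
    trace_id = incoming.get(TRACE_ID) or new_id()
    headers = {TRACE_ID: trace_id, SPAN_ID: new_id()}
    current_span = incoming.get(SPAN_ID)
    if current_span:
        headers[PARENT_SPAN_ID] = current_span
    return headers
```

Reusing the trace ID while minting a new span ID per hop is what lets the collected data be tied together later.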

![Instrumentation architecture]({{ site.github.url }}/public/img/architecture-1.png)

To see if an instrumentation library already exists for your platform, see the
list of [existing instrumentations]({{ site.github.url
}}/pages/existing_instrumentations).
list of [existing instrumentations]({{ site.github.url}}/pages/existing_instrumentations).

![Architecture overview]({{ site.github.url }}/public/img/architecture-0.png)

Transport
---------

Spans must be transported from the services being traced to Zipkin collectors.
There are two primary transports, Scribe and Kafka. Scribe is deprecated.
Spans sent by the instrumented library must be transported from the services being traced to Zipkin collectors.
There are two primary transports: Scribe and Kafka. Scribe is shown in the figure. See [Span Receivers]({{ site.github.url }}/pages/span_receivers) for more information.
Member

HTTP is also a primary transport


There are 4 components that make up Zipkin:
* collector
* storage
* search
* web UI

Zipkin Collector
----------------
### Zipkin Collector

Once the trace data arrives at the Zipkin collector daemon we check that it's
valid, store it and the index it for lookups.
Once the trace data arrives at the Zipkin collector daemon, it is validated, stored, and indexed for lookups.

Storage
-------
### Storage

We originally built Zipkin on Cassandra for storage. It's scalable, has a
Zipkin was initially built to store data on Cassandra since Cassandra is scalable, has a
flexible schema, and is heavily used within Twitter. However, we made this
component pluggable, and we now have support for Redis and MySQL.
component pluggable. In addition to Cassandra, we support Redis and MySQL.
Member

Redis has been abandoned. Elasticsearch support exists.


Zipkin Query Service
--------------------
### Zipkin Query Service

Once the data is stored and indexed we need a way to extract it. This is where
the query daemon comes in, providing a simple JSON api for finding and retrieving
traces. The primary consumer of this api is the Web UI.
Once the data is stored and indexed, we need a way to extract it. The query daemon provides a simple JSON API for finding and retrieving traces. The primary consumer of this API is the Web UI.

Web UI
------
### Web UI

A GUI that presents a nice face for viewing traces. The web UI provides a
method for viewing traces based on service, time, and annotations. Note
that there is no built in authentication in the UI.
We created a GUI that presents a nice interface for viewing traces. The web UI provides a
method for viewing traces based on service, time, and annotations.
Note: there is no built-in authentication in the UI!
222 changes: 222 additions & 0 deletions pages/data_model.md
@@ -0,0 +1,222 @@
---
title: Data Model
---

In order to illustrate the tracing data that Zipkin displays, let's relate it to the equivalent information in the Zipkin data model. By comparing these, we see that
Member

This is a good start


+ inbound and outbound requests are in different spans
+ spans that include `cs` can log an `sa` annotation of where they are going
+ This helps when the destination protocol isn't Zipkin instrumented, such as MySQL.
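
The second point can be sketched in Python. This is only an illustration under assumptions: `remote_destination` is a hypothetical helper, not part of Zipkin, and the span is a trimmed copy of one of the `query` spans from the trace on this page.

```python
def remote_destination(span: dict):
    """Return the service name logged in an `sa` binary annotation, or None.

    Only spans that carry a `cs` (client send) annotation are considered,
    since `sa` describes where a client request is going.
    """
    is_client = any(a.get("value") == "cs" for a in span.get("annotations", []))
    if not is_client:
        return None
    for binary in span.get("binaryAnnotations", []):
        if binary.get("key") == "sa":
            return binary.get("endpoint", {}).get("serviceName")
    return None


# Trimmed client span: the caller is zipkin-query, the destination is MySQL.
client_span = {
    "name": "query",
    "annotations": [
        {"value": "cs", "endpoint": {"serviceName": "zipkin-query"}},
        {"value": "cr", "endpoint": {"serviceName": "zipkin-query"}},
    ],
    "binaryAnnotations": [
        {
            "key": "sa",
            "value": True,
            "endpoint": {"serviceName": "spanstore-jdbc", "ipv4": "127.0.0.1", "port": 3306},
        }
    ],
}

print(remote_destination(client_span))
```

Because MySQL itself reports nothing to Zipkin, the `sa` endpoint is the only place the destination appears.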

First, we see one trace as shown in the Zipkin trace viewer:

![Zipkin Screenshot]({{ site.github.url }}/public/img/json_zipkin_screenshot.png)

And the same trace in the data model of Zipkin:

{% highlight json %}
[
{
"traceId": "bd7a977555f6b982",
"name": "get",
"id": "bd7a977555f6b982",
"timestamp": 1458702548467000,
"duration": 386000,
"annotations": [
{
"endpoint": {
"serviceName": "zipkin-query",
"ipv4": "192.168.1.2",
"port": 9411
},
"timestamp": 1458702548467000,
"value": "sr"
},
{
"endpoint": {
"serviceName": "zipkin-query",
"ipv4": "192.168.1.2",
"port": 9411
},
"timestamp": 1458702548853000,
"value": "ss"
}
],
"binaryAnnotations": []
},
{
"traceId": "bd7a977555f6b982",
"name": "get-traces",
"id": "ebf33e1a81dc6f71",
"parentId": "bd7a977555f6b982",
"timestamp": 1458702548478000,
"duration": 354374,
"annotations": [],
"binaryAnnotations": [
{
"key": "lc",
"value": "JDBCSpanStore",
"endpoint": {
"serviceName": "zipkin-query",
"ipv4": "192.168.1.2",
"port": 9411
}
},
{
"key": "request",
"value": "QueryRequest{serviceName=zipkin-query, spanName=null, annotations=[], binaryAnnotations={}, minDuration=null, maxDuration=null, endTs=1458702548478, lookback=86400000, limit=1}",
"endpoint": {
"serviceName": "zipkin-query",
"ipv4": "192.168.1.2",
"port": 9411
}
}
]
},
{
"traceId": "bd7a977555f6b982",
"name": "query",
"id": "be2d01e33cc78d97",
"parentId": "ebf33e1a81dc6f71",
"timestamp": 1458702548786000,
"duration": 13000,
"annotations": [
{
"endpoint": {
"serviceName": "zipkin-query",
"ipv4": "192.168.1.2",
"port": 9411
},
"timestamp": 1458702548786000,
"value": "cs"
},
{
"endpoint": {
"serviceName": "zipkin-query",
"ipv4": "192.168.1.2",
"port": 9411
},
"timestamp": 1458702548799000,
"value": "cr"
}
],
"binaryAnnotations": [
{
"key": "jdbc.query",
"value": "select distinct `zipkin_spans`.`trace_id` from `zipkin_spans` join `zipkin_annotations` on (`zipkin_spans`.`trace_id` = `zipkin_annotations`.`trace_id` and `zipkin_spans`.`id` = `zipkin_annotations`.`span_id`) where (`zipkin_annotations`.`endpoint_service_name` = ? and `zipkin_spans`.`start_ts` between ? and ?) order by `zipkin_spans`.`start_ts` desc limit ?",
"endpoint": {
"serviceName": "zipkin-query",
"ipv4": "192.168.1.2",
"port": 9411
}
},
{
"key": "sa",
"value": true,
"endpoint": {
"serviceName": "spanstore-jdbc",
"ipv4": "127.0.0.1",
"port": 3306
}
}
]
},
{
"traceId": "bd7a977555f6b982",
"name": "query",
"id": "13038c5fee5a2f2e",
"parentId": "ebf33e1a81dc6f71",
"timestamp": 1458702548817000,
"duration": 1000,
"annotations": [
{
"endpoint": {
"serviceName": "zipkin-query",
"ipv4": "192.168.1.2",
"port": 9411
},
"timestamp": 1458702548817000,
"value": "cs"
},
{
"endpoint": {
"serviceName": "zipkin-query",
"ipv4": "192.168.1.2",
"port": 9411
},
"timestamp": 1458702548818000,
"value": "cr"
}
],
"binaryAnnotations": [
{
"key": "jdbc.query",
"value": "select `zipkin_spans`.`trace_id`, `zipkin_spans`.`id`, `zipkin_spans`.`name`, `zipkin_spans`.`parent_id`, `zipkin_spans`.`debug`, `zipkin_spans`.`start_ts`, `zipkin_spans`.`duration` from `zipkin_spans` where `zipkin_spans`.`trace_id` in (?)",
"endpoint": {
"serviceName": "zipkin-query",
"ipv4": "192.168.1.2",
"port": 9411
}
},
{
"key": "sa",
"value": true,
"endpoint": {
"serviceName": "spanstore-jdbc",
"ipv4": "127.0.0.1",
"port": 3306
}
}
]
},
{
"traceId": "bd7a977555f6b982",
"name": "query",
"id": "37ee55f3d3a94336",
"parentId": "ebf33e1a81dc6f71",
"timestamp": 1458702548827000,
"duration": 2000,
"annotations": [
{
"endpoint": {
"serviceName": "zipkin-query",
"ipv4": "192.168.1.2",
"port": 9411
},
"timestamp": 1458702548827000,
"value": "cs"
},
{
"endpoint": {
"serviceName": "zipkin-query",
"ipv4": "192.168.1.2",
"port": 9411
},
"timestamp": 1458702548829000,
"value": "cr"
}
],
"binaryAnnotations": [
{
"key": "jdbc.query",
"value": "select `zipkin_annotations`.`trace_id`, `zipkin_annotations`.`span_id`, `zipkin_annotations`.`a_key`, `zipkin_annotations`.`a_value`, `zipkin_annotations`.`a_type`, `zipkin_annotations`.`a_timestamp`, `zipkin_annotations`.`endpoint_ipv4`, `zipkin_annotations`.`endpoint_port`, `zipkin_annotations`.`endpoint_service_name` from `zipkin_annotations` where `zipkin_annotations`.`trace_id` in (?) order by `zipkin_annotations`.`a_timestamp` asc, `zipkin_annotations`.`a_key` asc",
"endpoint": {
"serviceName": "zipkin-query",
"ipv4": "192.168.1.2",
"port": 9411
}
},
{
"key": "sa",
"value": true,
"endpoint": {
"serviceName": "spanstore-jdbc",
"ipv4": "127.0.0.1",
"port": 3306
}
}
]
}
]
{% endhighlight %}
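
To show how `id` and `parentId` relate the spans above, here is a minimal Python sketch that rebuilds the span tree and reports each span's share of the root duration. The `render_tree` helper is hypothetical and only the `id`, `parentId`, `name`, and `duration` fields are copied from the trace; it is not how the Zipkin UI is implemented.

```python
from collections import defaultdict

# id, parentId, name, and duration (microseconds) copied from the trace above.
SPANS = [
    {"id": "bd7a977555f6b982", "name": "get", "duration": 386000},
    {"id": "ebf33e1a81dc6f71", "parentId": "bd7a977555f6b982", "name": "get-traces", "duration": 354374},
    {"id": "be2d01e33cc78d97", "parentId": "ebf33e1a81dc6f71", "name": "query", "duration": 13000},
    {"id": "13038c5fee5a2f2e", "parentId": "ebf33e1a81dc6f71", "name": "query", "duration": 1000},
    {"id": "37ee55f3d3a94336", "parentId": "ebf33e1a81dc6f71", "name": "query", "duration": 2000},
]


def render_tree(spans):
    """Return one indented line per span with its share of the root duration."""
    children = defaultdict(list)
    root = None
    for span in spans:
        parent = span.get("parentId")
        if parent is None:
            root = span  # the span without a parentId is the root of the trace
        else:
            children[parent].append(span)
    if root is None:
        raise ValueError("trace has no root span")

    lines = []

    def walk(span, depth):
        share = 100.0 * span["duration"] / root["duration"]
        lines.append(f"{'  ' * depth}{span['name']} {span['duration']}us ({share:.1f}%)")
        for child in children[span["id"]]:
            walk(child, depth + 1)

    walk(root, 0)
    return lines


for line in render_tree(SPANS):
    print(line)
```

The root span has no `parentId`, and its `id` equals the `traceId`; every other span hangs off its parent, which gives the nesting seen in the trace viewer.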
4 changes: 2 additions & 2 deletions pages/existing_instrumentations.md
@@ -4,8 +4,8 @@ weight: 3
---

Tracing information is collected on each host using the instrumented libraries
and sent to Zipkin. When the host makes a request to another service, it propagates
a few tracing identifiers along with the request so we can later tie the data
and sent to Zipkin. When the host makes a request to another application, it passes
a few tracing identifiers along with the request to Zipkin so we can later tie the data
together into spans.

The following libraries exist to provide instrumentation on various platforms.