
Save trace to permanent store #1747

Closed · wants to merge 24 commits

Conversation

@naoman (Contributor) commented Sep 27, 2017

Ref: #1093
Quick (and dirty) implementation of save trace. Sending this out for review to get some feedback.

Following @adriancole's suggestion of deploying Zipkin Server twice: one backed by the regular datastore and one by a permanent datastore.

This adds a Save Trace button to the UI. Clicking this button makes an AJAX request to the backend API /api/v1/save/trace/<trace_id>; on receiving this request, the Zipkin server pulls the spans for that trace and POSTs them to the permanent store using the /api/v1/spans API.

The address of the permanent datastore server can be configured via the property zipkin.permanent.store.
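A rough client-side sketch of the flow described above (the save endpoint shape comes from this description; the commented-out handler wiring is a hypothetical illustration, not the PR's actual code):

```javascript
// Builds the proposed save endpoint for a trace id.
// encodeURIComponent guards against unexpected characters in the id.
function saveTraceUrl(traceId) {
  return `/api/v1/save/trace/${encodeURIComponent(traceId)}`;
}

// Hypothetical wiring inside the trace-view component:
// this.on('click', '.save-trace', e => {
//   e.preventDefault();
//   $.ajax(saveTraceUrl(this.attr.traceId))
//     .done(() => alert('Trace saved to permanent store'))
//     .fail(() => alert('Failed to save trace'));
// });
```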

@codefromthecrypt (Member):

Add a screenshot?

PS: use the v2 API, as this is a new feature? Easier than having to migrate this to v2 later.

@codefromthecrypt (Member) left a comment:

interesting.. I was thinking this was going to happen in javascript. maybe it should? that way it doesn't need to expose a new server-side endpoint.

@naoman (Contributor, Author) commented Sep 28, 2017

[screenshot: screen shot 2017-09-28 at 2 57 55 pm]
[screenshot: screen shot 2017-09-28 at 2 58 13 pm]
[screenshot: screen shot 2017-09-28 at 3 01 58 pm]

> use v2 api as this is a new feature? Easier than having to do a migration of this to v2 later

We have not migrated to V2 yet. Our datastore is in V1 format, so we have to read via the V1 API. For writing to the permanent store, either V1 or V2 is fine. It looks like the Zipkin server writes in V2 format even when we call the V1 API, which seems a bit strange.
Ideally we should be able to run the server in V1 mode or V2 mode, with all APIs working accordingly. Do we have a config property that can do this?

> interesting.. I was thinking this was going to happen in javascript. maybe it should? that way it doesn't need to expose a new server-side endpoint.

I wanted to keep the client light. Also, if it's on the server side, other clients/scripts can use it too. We can move it to JavaScript if needed.

@codefromthecrypt (Member):

I don't want to make an API like this part of the server API, especially not under the v1 namespace. It doesn't impact the UI to do the blocking here or there.

In order to deploy this UI change, you will inherit the v2 API anyway, unless you are somehow hosting static assets independently... is that the case?

@naoman (Contributor, Author) commented Sep 29, 2017

> I don't want to make an api like this as a part of the server api, especially not under v1 namespace. it doesn't impact the UI to do the blocking here or there.

So you want to go with the JavaScript solution, where the client (UI) reads from one zipkin-server and saves to the permanent zipkin-server? I'll make the change.

> In order to deploy this UI change, you will inherit the v2 api anyway, unless you are somehow hosting static assets independently.. is that the case?

Not sure what you mean. What I'm thinking is that the UI/JavaScript will read raw spans for a trace using /api/v1/trace/<trace_id> and POST them to http://<permanent_zipkin_address>/api/v1/spans

```javascript
e.preventDefault();
const traceId = this.attr.traceId;

$.ajax(`/api/v1/save/trace/${traceId}`).done(result => {
```
(Member) commented on the diff:

so basically I'd suggest having an alternative that reads from config.json and, if a key like "permanentZipkinUrl" is present, shows the link; on clicking the link, copy the result of /api/v2/trace/ID to a POST to ${permanentZipkinEndpoint} (ex http://foo/bar/api/v2/spans)

you don't need to process the body in any way or even know about the v2 model. The server hosting the permanent repo doesn't need to be using v2 native storage, though obviously in the case of Elasticsearch it should, as v2 is a better investment for a concept like permanent storage.

For the handling, just copy the bytes from the first response to the request body of the second. The code will be easy and the results easy to reason about. The URL associated with this task could be defined in crossroads.

If this isn't clear etc, lemme know!
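The byte-copy suggestion above could be sketched like this (the fetch implementation is injected so the flow is easy to unit-test; the endpoint paths follow the comment, while the function name and error handling are assumptions):

```javascript
// Reads a trace from the local server and forwards the body verbatim
// to the archive server's span endpoint. No parsing of the v2 model
// is needed -- the bytes are copied as-is.
async function archiveTrace(traceId, archiveEndpoint, fetchImpl) {
  // 1. Read the trace in v2 format from the local server.
  const res = await fetchImpl(`/api/v2/trace/${encodeURIComponent(traceId)}`);
  if (!res.ok) throw new Error(`could not read trace ${traceId}`);

  // 2. Copy the response bytes without interpreting them.
  const body = await res.text();

  // 3. POST the same bytes to the archive server.
  const post = await fetchImpl(archiveEndpoint, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body,
  });
  return post.ok;
}
```

In the browser, `fetchImpl` would simply be `window.fetch` and `archiveEndpoint` the configured `permanentZipkinUrl` plus `/api/v2/spans`.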

(Member) commented:

one other consideration is to add a tag to the first span (easy in the v2 model): just insert one into the tags dict.
ex tags.outage='disaster happened in JIRA FOO-1234', then in your permanent store you can search for "outage" and get this trace. Such a feature could be worth a little JavaScript to make it work.

This will allow you to label the trace so that you can look it up somehow in your long-term storage. Otherwise, eventually, these favorite traces will become annoying to browse.
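The tag insertion could look like this: a sketch over the Zipkin v2 JSON list-of-spans shape, where the function name is an assumption and the `outage` key/value are the example from the comment above.

```javascript
// Inserts a tag into the first span of a trace before archiving it.
// Spans follow the Zipkin v2 JSON model: a list of objects with an
// optional "tags" dict. The input is not mutated.
function tagFirstSpan(spans, key, value) {
  if (!Array.isArray(spans) || spans.length === 0) return spans;
  const [first, ...rest] = spans;
  // Merge the new tag into any existing tags on the first span.
  const tags = Object.assign({}, first.tags, { [key]: value });
  return [Object.assign({}, first, { tags }), ...rest];
}
```

Applied just before the POST, e.g. `JSON.stringify(tagFirstSpan(JSON.parse(body), 'outage', 'disaster happened in JIRA FOO-1234'))`.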

@codefromthecrypt (Member) commented Sep 29, 2017 via email

@codefromthecrypt (Member) commented Sep 29, 2017 via email

@naoman (Contributor, Author) commented Sep 29, 2017

> Ex we have a saved trace, but this UI wont be able to find saved traces.. Is that ok? If so, why? If not, why not?

Ideally the same UI should provide access to saved traces, but I think it's okay to have a separate UI if it helps keep the system simple and maintainable. It could be considered an archiving feature, and in most systems retrieving archived data is a separate process from accessing active data.

Also, if a user is trying to access a trace that's not available in the regular datastore, the UI can share a link to the archive/permanent store, or even pull the trace from there. Search will be a bit trickier, but we can handle that too by picking the correct datastore based on the TTL of the regular store.

@codefromthecrypt (Member) commented Sep 30, 2017 via email

@naoman (Contributor, Author) commented Oct 3, 2017

> ... and so far I think archive is it. do you agree?

Yes.

[screenshot: screen shot 2017-10-03 at 4 10 25 pm]

Error message:
[screenshot: screen shot 2017-10-03 at 4 10 38 pm]

Success message, when zipkin.ui.archive-read-endpoint is not specified:
[screenshot: screen shot 2017-10-03 at 4 11 20 pm]

Success message, when zipkin.ui.archive-read-endpoint is specified:
[screenshot: screen shot 2017-10-03 at 4 12 04 pm]

@codefromthecrypt (Member) left a comment:

I like it. Great docs, too!

Can you poke around the zipkin-js tests section to see if there's any low-hanging fruit (ex: a test case that already exists that you can pop a related unit test into)?

@@ -161,6 +161,11 @@
</dependency>

<dependency>
<groupId>org.springframework</groupId>
(Member) commented on the diff:

not needed right?

(Contributor, Author) replied:

I was getting this exception while posting spans to /api/v2/spans

2017-10-03 11:08:28.605 ERROR 46979 --- [nio-9412-exec-2] o.a.c.c.C.[.[.[/].[dispatcherServlet]    : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Handler dispatch failed; nested exception is java.lang.NoClassDefFoundError: org/springframework/util/concurrent/SettableListenableFuture] with root cause

java.lang.NoClassDefFoundError: org/springframework/util/concurrent/SettableListenableFuture
	at zipkin.server.ZipkinHttpCollector.validateAndStoreSpans(ZipkinHttpCollector.java:88) ~[classes!/:na]
	at zipkin.server.ZipkinHttpCollector.uploadSpansJson2(ZipkinHttpCollector.java:67) ~[classes!/:na]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_101]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_101]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_101]

This Stack Overflow answer suggested this solution. I couldn't get it working without adding this dependency.

@codefromthecrypt (Member):

you'll want to rebase this on master, as the thing that allows proxies to work was merged

shakuzen and others added 9 commits October 3, 2017 19:42
This adds a RabbitMQ collector module along with its corresponding auto-configuration.
…openzipkin#1752)

For safety, create the stress schema in a different keyspace.
…l_dc has been defined, otherwise use ONE (openzipkin#1753)

Mark all writes as idempotent.
Update the retry policy, honouring idempotence, consistency level, and applying appropriate back-pressure.
* Support for running Zipkin behind a reverse proxy
 - subfolder 'zipkin' is enforced
 - rewrite of the 'base' tag in the 'index.html' file is required

* Fixed 'redirectedHeaderUsesOriginalHostAndPort' test to ensure that a relative URI is returned for a redirect to 'zipkin/' subpath.

* Using 'contextRoot' as a code style conforming reference to the '__webpack_public_path__' global variable.

* Using relative URIs for AJAX calls (which works thanks to the 'head>base' html).
Appending '/' to the 'contextRoot' if not present.

* Added readme regarding running behind a reverse proxy.
Fixed running in a 'devServer' mode. Assuming the 'zipkin-server' runs at 'localhost:9411', execute:

    proxy=http://localhost:9411/zipkin/ ./npm.sh run dev
@codefromthecrypt (Member):

@jcarres-mdsol so we have had a number of requests for how to achieve permanent storage without complicating other things. I've not found a simpler way than this..

The notes would be that if you want permanent storage, set up Zipkin somewhere (or a Zipkin proxy like Stackdriver), then point the env variables at it. If using a normal Zipkin server, the archival one should probably set the environment name to "archive" (something you can configure in the UI).

Does this clarify the setup? Indeed it would be new for a lot of folks, but fairly easy to provision (add a permanent namespace to storage, and keep a server running which needn't deal with high volume as it is only archival).

@jorgheymans (Contributor):

will the UI have an href somewhere then to the archive endpoint?

@naoman (Contributor, Author) commented Oct 5, 2017

@jorgheymans when a trace is archived, the UI will show a link to the archived trace in the confirmation dialog. Screenshot attached above.

> @naoman just curious how many traces do you save to the permanent store? Is it a regularly occurring thing?

Not many, maybe just a few traces per day. Because of our high trace volume, the regular datastore has a TTL of 10 days. This feature will let users save "interesting" traces, especially ones related to a bug, for a longer duration.

@jorgheymans (Contributor):

@naoman what I meant was just a top-level link on the UI somewhere that points to the archiving endpoint if it is configured. That way people only have to remember (or bookmark) the "main" Zipkin instance.

@naoman (Contributor, Author) commented Oct 6, 2017

@jorgheymans Yes, this can be useful. But at this point I want to keep the code simple and clean, since this feature will not be used by many users. In my experience, when archiving a trace, users save the link to the trace itself, since they are archiving it to keep or share it.

I would say we should go ahead with the current code and revisit based on user feedback.

@codefromthecrypt (Member) commented Oct 7, 2017 via email

@naoman (Contributor, Author) commented Oct 7, 2017

Like I mentioned in my last message, this change is ready and fully functional. Adding a link would be an improvement on top of this change that can be made in a separate pull request. Let's merge this change, get the feature out there so people can start using it, get user feedback, and then improve iteratively.

@codefromthecrypt (Member) commented Oct 7, 2017 via email

@naoman (Contributor, Author) commented Oct 7, 2017

> We don't merge single user stuff, so consider it lucky someone else is interested!

Not sure what you mean. Do you think I'm trying to push something that's Pinterest-specific and not needed by other users?

Let me try to explain one more time. While the request to add an additional URL link is valid, it's an add-on feature for this change and does not block it.

Can you please explain why these two changes cannot be made in two separate pull requests?

@jorgheymans (Contributor) commented Oct 7, 2017 via email

@codefromthecrypt (Member) commented Oct 7, 2017 via email

@codefromthecrypt (Member):

@openzipkin/core I personally don't want to go around in explanation circles on this. I actually am ok with the change in general, but not motivated to continue this line of inquiry. If someone else is ok driving this, please do. I'm going to unsubscribe for the time being

@naoman (Contributor, Author) commented Oct 7, 2017

@jorgheymans I've added a link to the archive server on the menu bar. Let me know what you think. Below is the screenshot.

[screenshot: screen shot 2017-10-07 at 1 03 43 pm]

> Seems a hacky solution, most people will not have this setup.

@jcarres-mdsol can you please elaborate on why you feel this solution is hacky and how it can be improved?

> Would the button appear in the UI by default? It may set expectations it would work when in most cases would not

@jcarres-mdsol No, it would not. I've added this information to the README as well. Let me know if you think it needs more explanation there.

@jorgheymans (Contributor) commented Oct 7, 2017 via email

@eirslett (Contributor) commented Oct 7, 2017

Zipkin already has a feature to download a trace as a JSON file (and I think also one to upload a JSON file and render it in the UI). Wouldn't that cover the use case? You could just store the JSON file wherever you want: a hard drive, Dropbox, etc.

I do agree with @adriancole's comments in general.

@eirslett (Contributor) commented Oct 7, 2017

Another solution is to run a caching proxy like Varnish in front of Zipkin and make it cache results for 30 days. (Assuming that the amount of Zipkin data developers look at is very small relative to the total amount of Zipkin data, so it could comfortably fit on a cache instance.)

@naoman (Contributor, Author) commented Oct 7, 2017

@eirslett Yes, that's doable, but it won't be very scalable or convenient. Consider the use case where we want to save a trace related to a bug. Without this feature, we'd first have to download the trace, attach it to the bug ticket, and upload it back to a Zipkin server to look at the trace. This feature provides a convenient way to archive traces with a single click.

@eirslett (Contributor) commented Oct 7, 2017

> Without this feature, we'll first have to download the trace, attach it to the bug ticket, and upload it back to the Zipkin server to look at the trace. This feature provides a convenient way to archive traces with a single click.

With the Varnish solution, I guess you could just copy/paste a link to the trace into the bug report?

@naoman (Contributor, Author) commented Oct 7, 2017

@eirslett a Varnish-proxy-based cache will have its own limitations. Below are a few:

  • What if we want to keep data longer than the cache TTL?
  • How do we enable data replication?
  • What if other tools (service dependency graph, trace analyzer scripts, etc.) want to pull this data?

@jcarres-mdsol (Contributor):

@naoman I think it is hacky because to add this new functionality, I need to double my infrastructure.

@naoman (Contributor, Author) commented Oct 16, 2017

> I think it is hacky because to add this new functionality, I need to double my infrastructure.

@jcarres-mdsol Can you please elaborate on this? This solution does require some additional infrastructure setup, but by no means would it double the infrastructure. For example, here is what an Elasticsearch-based solution would look like:

  • A new index for archived traces in the same Elasticsearch cluster.
  • A new Zipkin server instance for archived traces. This instance can run on the same hosts as the regular Zipkin server.
  • A load balancer in front of the Zipkin archive server. This is the only additional infrastructure component needed for this solution.

If you still feel that this is too much hassle for archiving traces, can you propose a better solution?

@openzipkin openzipkin deleted a comment Nov 15, 2017
abesto pushed a commit to abesto/zipkin that referenced this pull request Sep 10, 2019
This is an alternative to saving on the backend or via a proxy

Closes openzipkin#1747

9 participants