-
Notifications
You must be signed in to change notification settings - Fork 70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Elasticsearch: Make the span range configurable #91
Elasticsearch: Make the span range configurable #91
Conversation
99f12ce
to
b8d4e6b
Compare
Looks like those tests are also failing in master https://travis-ci.org/github/jaegertracing/spark-dependencies/builds/607266470, testing the image manually at the moment. |
README.md
Outdated
Defaults to false | ||
* `ES_INDEX_PREFIX`: index prefix of Jaeger indices. By default unset. | ||
* `ES_SPAN_RANGE`: How far into the past should the job look, the maximum and default is 24h. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a link to ES docs that lists what values can be used in this query?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we rename this to ES_TIME_RANGE
? And document that span start time is used for the comparison.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added, I struggled a bit with wording. How does the change sound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks good 👍
@@ -219,7 +228,9 @@ void run(String[] spanIndices, String[] depIndices,String peerServiceTag) { | |||
String spanIndex = spanIndices[i]; | |||
String depIndex = depIndices[i]; | |||
log.info("Running Dependencies job for {}, reading from {} index, result storing to {}", day, spanIndex, depIndex); | |||
JavaPairRDD<String, Iterable<Span>> traces = JavaEsSpark.esJsonRDD(sc, spanIndex) | |||
// Send raw query to ES to select only the docs / spans we want to consider for this job | |||
String esQuery = String.format("{\"range\": {\"startTimeMillis\": { \"gte\": \"now-%s\" }}}", spanRange); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment in the sense that it does not change the default behavior bc the daily indices are max 24h.
e686110
to
1d37d18
Compare
Reopened the PR to trigger the travis. If the travis does not start could you please change the commit SHA to trigger it again? |
1d37d18
to
75d722b
Compare
Signed-off-by: Johannes Würbach <[email protected]>
75d722b
to
9e52bee
Compare
Pushed a few times before noticing that it actually seems to trigger a CI run https://travis-ci.org/github/jaegertracing/spark-dependencies/pull_requests :-(, but the status isn't reflected here. |
@pavolloffay looks like travis passed now 😍 |
Travis had a good day finally 😄 Thanks for the PR @johanneswuerbach |
@pavolloffay any chance the new version could be pushed to docker hub? That would awesome 🙇 |
It should be there. There is only |
I already checked the docker hub, but it seems the last build was 8 months ago https://hub.docker.com/r/jaegertracing/spark-dependencies/builds When I pull the latest image I also don't see those changes. Maybe there is something broken with the integration? |
Probably I will look into it |
Which problem is this PR solving?
We don't want to load the multiple GBs/day into memory a day to determine our dependency graph. Instead we are fine with an approximation based on the connections we saw in the past 2h.
Short description of the changes
Make the loaded span range configurable. Query taken from #86 ❤️