Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for ElasticSearch alias to read spans from (Resolves #68) #86

Merged
merged 1 commit into from
May 24, 2022

Conversation

frittentheke
Copy link
Contributor

@frittentheke frittentheke commented Mar 25, 2020

Which problem is this PR solving?

Currently the spark job expects the spans to always come from date-postfixed indices, even though the Jaeger Collector can be configured to use ElasticSearch aliases and to make use of the rollover API (see:jaegertracing/jaeger#1309)

If that is the case the dependencies can simply not be extracted.
See issue: #68

Short description of the changes

  • Added a ES_USE_ALIASES parameter - just like the other Jaeger components have
  • Added a range query to only return the spans of the last 24 hours (lust like per day index would do)

Creds also go to my mate @msp14 who patiently explained Spark to me so I was able to create this PR.

@pavolloffay
Copy link
Member

@frittentheke could you please sign off the commits?

@frittentheke
Copy link
Contributor Author

@pavolloffay sorry about that, missed this on the second commit I pushed this morning.
Should all be signed-off now.

Copy link
Member

@pavolloffay pavolloffay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The store path will have to also change. The data should be stored in *-write index.

Before going forward with this we should resolve the support in Jaeger and then have an integration test here.

// When using rollover indices there might be none or even multiple per day - thus Jaeger expects an alias to be available
// pointing to all potential indices.
// Always applying this "filter" does not do any harm even for the indexDated ones
String esQuery = "{\"range\": {\"startTimeMillis\": { \"gte\": \"now-24h\" }}}";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 this looks good.

@frittentheke
Copy link
Contributor Author

@pavolloffay FTAL (again).

@christine-gong
Copy link

@frittentheke I'm waiting on this feature! thanks for your effort! I saw your tests did pass on es7. is there any diff in es7 and non-es7 that missed here?

@frittentheke
Copy link
Contributor Author

@christine-gong I just tried to replicate the issues locally, but apparently the error is not solid. I had success sometimes. Maybe some sort of timing issues / race condition somewhere?

I updated the ES version from 5.x to 6.8.8 as part of commit 48b6321 though, as ES 5.x is End of Life for a while now (https://www.elastic.co/pt/support/eol)

@pavolloffay
Copy link
Member

Let's keep ES5 compatibility and be aligned with Jaeger ES version support. I don't think there is anything that should prevent us from using ES5.

@frittentheke
Copy link
Contributor Author

Let's keep ES5 compatibility and be aligned with Jaeger ES version support. I don't think there is anything that should prevent us from using ES5.

@pavolloffay allright - I changed that commit to a simple update to the latest point releases for ES 5 and 7.

@frittentheke
Copy link
Contributor Author

@pavolloffay do you mind having a look at the failing CI jobs. Looking at my changes again - they don't seem to affect the code path for the tests.
I also replicated them locally and they succeed or fail randomly there as well. This seems to be the case for previous jobs (not related to this PR) as well (https://travis-ci.org/github/jaegertracing/spark-dependencies/jobs/572258243).

@christine-gong
Copy link

@pavolloffay @frittentheke another thing I observed when I tried this changeset locally: jaeger ui uses api/dependencies to fetch dependencies while https://github.com/jaegertracing/jaeger/blob/5ed586c114152dfdb9f9643b9303fd40d45cc6e5/plugin/storage/es/dependencystore/storage.go#L86 relies on dated dependency indices too. To make it work with the minimum change, i updated the (prefix)-jaeger-dependencies-write to dated one (prefix)-jaeger-dependencies-yyyy-mm-dd

@frittentheke
Copy link
Contributor Author

@christine-gong
Copy link

@frittentheke got it! Sorry I missed that! Will check that tomorrow! Btw would current solution be able to run multiple times per day? E.g. I ran the spark job twice in a hour, Ui showed the summed up call counts instead of the latest one (time stamp fields in es both were today’s date)

@Leinad0033
Copy link

Any chance of this getting updated to a state where is can be merged into master?

@frittentheke
Copy link
Contributor Author

Any chance of this getting updated to a state where is can be merged into master?

@pavolloffay ?

@pavolloffay
Copy link
Member

The prerequisite to merge this is to implement aliases support for the dependency index in the main repo jaegertracing/jaeger#2144.

As in intermediate solution the dependencies could be stored into daily indices to keep the current compatibility and then once jaegertracing/jaeger#2144 lands add a second conf property to store the results to the write alias. If we go this route we should rename ES_USE_ALIASES to include span name e.g. ES_USE_SPAN_READ_ALIAS or just ES_USE_READ_ALIAS.

The PR has to be rebased.

@alexanderccc
Copy link

Are there any updates on this? I've also started using ILM with our jaeger indices in elasticsearch which made the dependencies job unusable as they're searching for data in the timestamped indices. Are there any plans on finishing this and having It integrated in the dependency jobs?

@frittentheke frittentheke force-pushed the esUseAliases branch 3 times, most recently from 649e6e9 to 792bad0 Compare October 10, 2020 14:46
@frittentheke
Copy link
Contributor Author

The prerequisite to merge this is to implement aliases support for the dependency index in the main repo jaegertracing/jaeger#2144.

As in intermediate solution the dependencies could be stored into daily indices to keep the current compatibility and then once jaegertracing/jaeger#2144 lands add a second conf property to store the results to the write alias. If we go this route we should rename ES_USE_ALIASES to include span name e.g. ES_USE_SPAN_READ_ALIAS or just ES_USE_READ_ALIAS.

The PR has to be rebased.

I added ES_USE_READ_ALIAS and ES_USE_WRITE_ALIAS with two individual commits now.
This allows for the intermediate solution of reading from an alias and independently switch to using a write alias.
I also rebased to current master.

Could you kindly PTAL @pavolloffay .

@alexanderccc FYI ^^

@Leinad0033
Copy link

@pavolloffay Any possibility of getting a review for this?

@Korzi
Copy link

Korzi commented Apr 13, 2021

Hi, is it going to be reviewed/merged? Or is there any other solution for spark job to use aliases?

@frittentheke
Copy link
Contributor Author

@Leinad0033 @Korzi where you able to verify if my changes work for you then?

@pavolloffay I know you are the only one maintaining this repo, but I just refreshed my PR jaegertracing/jaeger#2144 and maybe we can get them both reviewed one merged. They have been open so long - and both only add the little bit of functionality to be able to use ES aliases.

@frittentheke
Copy link
Contributor Author

@pavolloffay I know you are the only one maintaining this repo, but I just refreshed my PR jaegertracing/jaeger#2144 and maybe we can get them both reviewed one merged. They have been open so long - and both only add the little bit of functionality to be able to use ES aliases.

@pavolloffay @albertteoh with jaegertracing/jaeger#2144 now merged, could you maybe PTAL at this PR here again? With all ES indices behind rollover aliases it just makes sense to support this for the reporting as well.

run(indexDate("jaeger-span"), indexDate("jaeger-dependencies") ,peerServiceTag);

String[] readIndices;
String[] writeIndex;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: writeIndices since it's an array?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True but also not really intuitive then.

The data type is an array, just because the indexDate helper does return an array (

). But this is only for read indices.

I gladly clean this up by either splitting the helpers for read and write or by removing (breaking) compatibility with the old prefixBefore19 format, but it's changing more than the feature I wanted to add. Just let me know that you'd rather like

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that makes sense, given there'd only be one write index.

@albertteoh
Copy link
Collaborator

@frittentheke looks like there are merge conflicts that need addressing.

@frittentheke
Copy link
Contributor Author

@albertteoh ... Thanks for your lightning fast response and remarks!

To be brutally honest: I did not bother to take another look at the code (yet), until I knew you kindly would look at it.
I shall be fixing those issues and rebase the PR later.

Copy link
Collaborator

@albertteoh albertteoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

run(indexDate("jaeger-span"), indexDate("jaeger-dependencies") ,peerServiceTag);

String[] readIndices;
String[] writeIndex;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that makes sense, given there'd only be one write index.

@AhHa45
Copy link

AhHa45 commented Feb 22, 2022

bump @pavolloffay

@Karth54
Copy link

Karth54 commented Mar 2, 2022

Eagerly awaiting the release of this feature!


// write dependencies to alias index name
if (this.useWriteAlias) {
writeIndex = new String[] {prefix(indexPrefix) + "jaeger-dependencies-write"};

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heya, just raising a bug I found from this - I enabled readAlias and writeAlias and realised that when it finds no dependency on the first loop, it gives me this error

 22/04/13 20:05:46 INFO ElasticsearchDependenciesJob: Running Dependencies job for 2022-04-13T00:00Z, reading from jaeger-span-read index, result storing to jaeger-dependencies-write     │
│ 22/04/13 20:05:56 INFO ElasticsearchDependenciesJob: Done, 0 dependency objects created                                                                                                   │
│ Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1

It is because writeIndex is one element in the array but readIndices has two.

Will need to do

Suggested change
writeIndex = new String[] {prefix(indexPrefix) + "jaeger-dependencies-write"};
writeIndex = new String[] {prefix(indexPrefix) + "jaeger-dependencies-write", prefixBefore19(indexPrefix) + "jaeger-dependencies-write"};

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your feedback and the suggested change.
I, as you might have guessed, did not use the older prefixBefore19 style so I did not run into this issue.
But fair point. I rebased my PR and added your suggested change.

At some point it might be sensible to drop support for two styles and cleanup the code in this regard.
@albertteoh Do you mind taking another look and maybe provide a path to have this whole PR merged?

@frittentheke
Copy link
Contributor Author

@pavolloffay @albertteoh could you kindly take another peek at this and either suggest more changes, merge it or close it?

@Karth54
Copy link

Karth54 commented May 1, 2022

In an attempt to get this working for our production jaeger instance, I took your fork @frittentheke, and created and published a separate public docker image to use as a temporary solution. I didn't pass anything special when I built it. After deploying via helm, the job has been running for over 15 hours now. The environment variables that are passed are:

STORAGE=elasticsearch
ES_NODES=http://elasticsearch-master:9200
ES_NODES_WAN_ONLY='false'
ES_USE_READ_ALIAS='true'
ES_USE_WRITE_ALIAS='true'

Am I missing something? The only logs I see in the cronjob are:

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/app/jaeger-spark-dependencies-0.0.1-SNAPSHOT.jar) to method java.nio.Bits.unaligned()
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
22/04/30 22:50:10 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
22/04/30 22:50:10 INFO ElasticsearchDependenciesJob: Running Dependencies job for 2022-04-30T00:00Z, reading from jaeger-span-read index, result storing to jaeger-dependencies-write

@Karth54
Copy link

Karth54 commented May 9, 2022

Okay so totally user error 😅 I hadn't upgraded the Jaeger collector/query version yet. But, what I did notice is that in the System Architecture section, I only see Force Directed Graph and not the other DAG tab. Could that be related to this or me just missing a change in the UI? Or something else?
Screen Shot 2022-05-09 at 2 43 43 PM

@frittentheke
Copy link
Contributor Author

@albertteoh @pavolloffay ... any chance somebody can get this PR merged?
Is more and more people using this by manually applying this really the right way to go?

…ollover indices

 * Added ES_USE_ALIASES flag uses 'jaeger-span-read' index alias instead of indexDate postfixed one and
   'jaeger-dependencies-write' to store the dependencies
 * Added range query to only fetch the spans for the last 24 hours (based on startTimeMillies field)

Signed-off-by: Christian Rohmann <[email protected]>
@frittentheke
Copy link
Contributor Author

@pavolloffay PTAL

@frittentheke
Copy link
Contributor Author

@albertteoh @pavolloffay is there anybody else who you take a peek. I'd really love to have this off my list.

@albertteoh albertteoh merged commit 278726d into jaegertracing:main May 24, 2022
@frittentheke frittentheke deleted the esUseAliases branch May 24, 2022 18:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants