-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementing dropwizard emitter for druid #7363
Conversation
@jihoonson @b-slim: Can you please help with the review here ? |
## Introduction | ||
|
||
This extension integrates [Dropwizard](http://metrics.dropwizard.io/3.1.0/getting-started/#) metrics library with druid so that dropwizard users can easily absorb druid into their monitoring ecosystem. | ||
It accumulates druid metrics as dropwizard metrics, and emits them to various sinks via dropwizard supported reporters. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is scary, how much it accumulates is this onheap or offheap? is there a config to bound this ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the accumulation does not mean it queues things up, it stores the latest values in the metrics registry, the memory usage depends on the number of metrics and metric types.
Note: We are not handling all individual query metrics here, it's just the metrics defined in the extension config which are counters, gauges and timers. So have minimal impact on heap usage.e.g. for a counter it uses a single AtomicLong, for gauge it gives an instantaneous value and we are caching that in the metrics extensions impl.
https://metrics.dropwizard.io/4.0.0/getting-started.html has more details on the individual event types.
|--------|-----------|---------|-------| | ||
|`druid.emitter.dropwizard.reporters`|List of dropwizard reporters to be used.|yes|none| | ||
|`druid.emitter.dropwizard.prefix`|Optional prefix to be used for metrics name|no|none| | ||
|`druid.emitter.dropwizard.includeHost`|Flag to include the hostname as part of the metric name.|no|yes| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this includes host and ports ? or only host?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
host and port both, It is essentially ServiceMetricEvent.getHost() value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you please make the doc more clear that it includes both host and port?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
|`druid.emitter.dropwizard.reporters`|List of dropwizard reporters to be used.|yes|none| | ||
|`druid.emitter.dropwizard.prefix`|Optional prefix to be used for metrics name|no|none| | ||
|`druid.emitter.dropwizard.includeHost`|Flag to include the hostname as part of the metric name.|no|yes| | ||
|`druid.emitter.dropwizard.includeDimensionNames`|Flag to include the dimension names as part of the metric name.|no|yes| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
am trying to understand here, why a used would like to get a metric without dimension name? am not sure am getting this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is mostly a copy of other stats-d emitter extension and left over of that, Agree, It is not much useful for dropwizard, will remove
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed.
|`druid.emitter.dropwizard.prefix`|Optional prefix to be used for metrics name|no|none| | ||
|`druid.emitter.dropwizard.includeHost`|Flag to include the hostname as part of the metric name.|no|yes| | ||
|`druid.emitter.dropwizard.includeDimensionNames`|Flag to include the dimension names as part of the metric name.|no|yes| | ||
|`druid.emitter.dropwizard.dimensionMapPath`|Path to JSON file defining the StatsD type, and desired dimensions for every Druid metric|no|Default mapping provided. See below.| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is statsD a typo mistake or it is actually called statsD ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left over of copy and paste, fixed.
|
||
Each metric emitted using Dropwizard must specify a type, one of `[timer, counter, guage, meter, histogram, ]`. Dropwizard Emitter expects this mapping to | ||
be provided as a JSON file. Additionally, this mapping specifies which dimensions should be included for each metric. | ||
If the user does not specify their own JSON file, a default mapping is used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add a link to the default mapping ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added.
Each metric emitted using Dropwizard must specify a type, one of `[timer, counter, guage, meter, histogram, ]`. Dropwizard Emitter expects this mapping to | ||
be provided as a JSON file. Additionally, this mapping specifies which dimensions should be included for each metric. | ||
If the user does not specify their own JSON file, a default mapping is used. | ||
All metrics are expected to be mapped. Metrics which are not mapped will log an error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about ignored or warning ? seems like error is too much, especially in Druid we add metrics pretty often.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to ignored.
metricMap = readMap(mapper, dimensionMapPath); | ||
} | ||
|
||
@Nullable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please add docs on how and why this will return null ? plus i see this nullable annotation all over the Druid code, am not sure what is used for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added javadoc. null is returned when there is no mapping for a metric.
String service, | ||
String metric, | ||
Map<String, Object> userDims, | ||
Map<String, String> builder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reading the signature of this method i can not tell what is actually doing and why a map object is called a builder ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed the builder to filteredDimensions and mentioned it in javadoc.
This is because some metrics are reported differently, but with the same name, from different services. | ||
*/ | ||
DropwizardMetricSpec metricSpec = null; | ||
if (metricMap.containsKey(metric)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we do this in one get and a null check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
} | ||
return metricSpec; | ||
} else { | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is the meaning of null ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it means no mapping found for metric
{ | ||
}).readValue(is); | ||
} | ||
catch (IOException e) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about closing the stream on exception ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, why is is
not closed on normal exit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought ObjectReader.readValue will close the stream in normal flow
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ObjectReader.readValue(..)
doesn't mention closing the InputStream by itself, it might be safer to close it in finally block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
|
||
|
||
@Override | ||
public void start() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why we are not guarding this and check for start not set ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added guards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, when can it be called twice? It should be managed by the lifecycle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, it should be called through lifecycle, It seems other emitter has guards too so i added a it here also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LifecycleLock
utility is useful for this usecase
public class DropwizardConverter | ||
{ | ||
private static final Logger log = new Logger(DropwizardConverter.class); | ||
private Map<String, DropwizardMetricSpec> metricMap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
final?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
|
||
public class DropwizardEmitter implements Emitter | ||
{ | ||
private static Logger log = new Logger(DropwizardEmitter.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private static final
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
|
||
@Override | ||
public void close() | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as for start, shouldn't we check for start is true ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added guard
public void flush() | ||
{ | ||
for (DropwizardReporter reporter : reporters) { | ||
reporter.flush(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we make sure this is not blocking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it is non-blocking, basically a no-op for the current two implementations of reporters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could mention that it must be non blocking in the javadoc of DropwizardReporter.flush()
, so that if someone implements another then they know about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
final Injector injector | ||
) | ||
{ | ||
List<Emitter> alertEmitters = Lists.transform( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use native JDK functions/streams and move away from guava.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
@Override | ||
public void close() | ||
{ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nothing to close ? this reporter has not state to clean ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
~ KIND, either express or implied. See the License for the | ||
~ specific language governing permissions and limitations | ||
~ under the License. | ||
~ Licensed to Metamarkets Group Inc. (Metamarkets) under one |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why this change is needed ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed.
@@ -126,6 +126,9 @@ | |||
* <p> | |||
* The RemoteTaskRunner uses ZK for job management and assignment and http for IPC messages. | |||
*/ | |||
|
|||
|
|||
//KubernetesTaskRunner |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed.
@nishantmonu51 am very ignorant of this DropWizard stuff, can you please explain where the metrics are buffered ? and how often cleaned ? how we can make sure it does not overflow ? and how the JMX one works do we need some Jvm flags to be set to turn it on can you please document how we can test that ? |
private final DropwizardEmitterConfig config; | ||
// Note: the gauges do not represent the actual instantaneous value for the metrics. | ||
// Instead they have the last known value for it. | ||
private final ConcurrentHashMap<String, Number> gagues = new ConcurrentHashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so if let say we have a metric where the set of the key grows over time eg queryId or something similar, this map will grow without any bound and that can lead to big troubles, how about making the keys weak refs or having an actual limit on the size of this map ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking into it, we can add a limit to the map size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a limit to the size of map.
This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the [email protected] list. Thank you for your contributions. |
This pull request/issue is no longer marked as stale. |
1 similar comment
This pull request/issue is no longer marked as stale. |
1c234e0
to
be2b9a0
Compare
@jihoonson @b-slim : Updated PR based on review comments, please check. |
Used to report druid metrics via JMX. | ||
``` | ||
|
||
druid.emitter.dropwizard.reporter={"type":"jmx"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
druid.emitter.dropwizard.reporter={"type":"jmx"} | |
druid.emitter.dropwizard.reporters=[{"type":"jmx"}] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don e.
|
||
``` | ||
|
||
druid.emitter.dropwizard.reporter={"type":"console","emitIntervalInSecs":30}"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
druid.emitter.dropwizard.reporter={"type":"console","emitIntervalInSecs":30}"} | |
druid.emitter.dropwizard.reporters=[{"type":"console","emitIntervalInSecs":30}"}] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
``` | ||
|
||
### Default Metrics Mapping | ||
```json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it is possible to point this to actual file in the source code because this has the potential to go out of sync
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a link too. Initially it was a link only but added the file as part of review comments.
nameBuilder.add(config.getPrefix()); | ||
} | ||
nameBuilder.add("metric=" + metric); | ||
nameBuilder.add("service=" + service); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should use string format instead of concatenations here and in all other places as this code is gonna be called many many times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
@JsonProperty("maxGaugeCount") Integer maxGaugeCount | ||
) | ||
{ | ||
this.reporters = reporters; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
null/empty check ? it would be useless to set it up with no reporters and probably a configuration error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
private final DropwizardEmitterConfig config; | ||
// Note: the gauges do not represent the actual instantaneous value for the metrics. | ||
// Instead they have the last known value for it. | ||
private final Map<String, Number> gagues; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private final Map<String, Number> gagues; | |
private final Map<String, Number> gauges; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
this.config = config; | ||
this.reporters = config.getReporters(); | ||
this.converter = new DropwizardConverter(mapper, config.getDimensionMapPath()); | ||
this.gagues = Collections.synchronizedMap(new GaugesCache<>(config.getMaxGaugeCount())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we use guava Cache instead of doing our own ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done. replaced it use caffeine cache instead.
making metric manager and alert emitters as optional
more improvements improve docs refactrings
review comments review comments
db901a6
to
8775f16
Compare
LGTM overall, can you please address |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please fix the build
This is an update for previous PR - #4196
Corresponding issues are
#3927
#1996
This extension integrates Dropwizard metrics library with druid so that dropwizard users can easily absorb druid into their monitoring ecosystem. It accumulates druid metrics as dropwizard metrics, and emits them to various sinks via dropwizard supported reporters.
Currently supported dropwizard metrics types counter, gauge, meter, timer and histogram.
These metrics can be emitted using either Console or JMX reporter.
Support for other reporters can be easily added in future PRs.
Note - I have cherry-picked the commit from previous author to provide him the credit of his work.