-
Notifications
You must be signed in to change notification settings - Fork 70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert to library #180
Convert to library #180
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mostly looks good to me, very non-invasive transition to a JAR usable by the APM tracer. Added a few small comments, but this looks real promising. Also, thanks for the cleanup on a few things! 🙇
@@ -24,8 +24,10 @@ public static void setup(Level level, String logLocation) { | |||
Logger.getRootLogger().addAppender(fa); | |||
LOGGER.info("File Handler set"); | |||
} else { | |||
System.out.println("Log location is not set, will output log to stdout."); | |||
ConsoleAppender consoleAppender = new ConsoleAppender(new PatternLayout(LOGGER_LAYOUT)); | |||
ConsoleAppender consoleAppender = new ConsoleAppender(new PatternLayout(LOGGER_LAYOUT), logLocation); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per the log4j
docs I believe logLocation
can be either "System.out" or "System.err", not sure if we're being defensive enough here, also unsure what the behavior is otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I should not have passed logLocation there, it is handled later. Is this what you meant the problem is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you revert this line, this looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, done. Sorry, I've reverted that a while back but apparently pushed into a wrong place. My bad, fixed now.
} | ||
|
||
@Override | ||
public void displayMatchingAttributeName(JMXAttribute jmxAttribute, int rank, int limit) { | ||
System.out.println(" Matching: " + rank + "/" + limit + ". " + jmxAttribute); | ||
LOGGER.debug(" Matching: " + rank + "/" + limit + ". " + jmxAttribute); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These methods have associated commands in datadog-agent
(stuff like datadog-agent jmxfetch list-matching
) that is expected to print to stdout always, as it's a debugging command. These LOGGER.debug
changes, unless I'm missing something, would make it subject to the log level, thus potentially breaking the agent commands. If we're going to keep this, it should probably be info
or warning
at the least - still maybe preferable to print to stdout
.... 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ConsoleReporter is accessible via tracing agent as well, so it would be nice to be consistent with it. I'll make this an info.
configurationList.add(new Configuration((LinkedHashMap<String, Object>) new YamlParser(this.getClass().getResourceAsStream("/jmx-2.yaml")).getParsedYaml())); | ||
loadMetricConfigFiles(appConfig, configurationList); | ||
|
||
ArrayList<LinkedHashMap<String, Object>> defaultConf = (ArrayList<LinkedHashMap<String, Object>>) new Yaml().load(this.getClass().getResourceAsStream("default-jmx-metrics.yaml")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess in your opinion we should always be submitting the resource configs with jmxfetch, even if those beans aren't configured. That might be a valid point, still I think the issue here might be the fact that jmxfetch metrics are only "free" for the first 350 metrics - after this limit they will be billed as custom metrics. Although this limit is normally more than enough, we should double-check the implications, maybe make it an opt-in feature when using jmxfetch on "standalone" mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change specifically just replaces jmx-1 and jmx-2 with single configuration file, and extends that configuration slightly. So this is sort of preexisting behavior.
Do you see a problem with this?
return configs; | ||
} | ||
|
||
private void loadFileConfigs(AppConfig config, ConcurrentHashMap<String, YamlParser> configs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you probably saw, currently on Agent6 the default way to provide configs to jmxfetch is via the datadog-agent
API. That's where we get the metrics, as a JSON payload from. Not sure if that use-case makes sense to you guys - maybe grabbing those from the agent itself, or from the APM agent. Just mentioning it in case that might apply, the more consistent we can be across use-cases (lib vs standalone, etc), the better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, 'load files' was a preexisting functionality - I've just refactored some code into this helper function.
The new thing is below - loading files via resources. And this is kind of useful because this allows APM agent to package resource with itself and pass it to jmxfetch without too much hassle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall. This is done in a way that does not affect the logic of Jmxfetch so I am not worried about that part.
The main thing we need to be careful with is keeping compatibility with preexisting setups.
- include: | ||
domain: java.lang | ||
type: GarbageCollector | ||
name: Copy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not seeing a mention to this filter in https://docs.datadoghq.com/integrations/java/#description-of-the-filters. What does it do ? And how do we differenciate jvm.gc.minor_collection_count
from Copy
vs ParNew
for ex ? Does it appends something to the metric name or add a tag ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thad documentation says On top of these parameters, the filters support “custom” keys which means that you can filter by bean parameters.
- which is what we are using here. We are just filtering by specific GC name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I understand now, seems like I missed that part 😅 . Now for the second part of my comment, I guess you can only have one Young Gen Collector
and one Old Gen Collector
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, JVM allows only one Old and one New gen collector - by there are a few different possibly types of those.
@@ -0,0 +1,145 @@ | |||
# Memory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see multiple differences between default-jmx-metrics.yaml
and the default metrics that we used to collect in jmx-1
and jmx-2
. It's very important that we don't remove/rename any metrics as it would break the graphs, monitors, etc built around them.
Here are the diffs that I found Before => Now :
jvm.thread_count
=>thread.count
jvm.gc.parnew.time
=>jvm.gc.minor_collection_time
?jvm.gc.cms.count
=>jvm.gc.minor_collection_count
?
There are also multiple new metrics introduced, do we need to add them and would it be possible to move them to another PR ? That way we can focus on keeping compatibility. Improving the testing here might make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Essentially the idea is that JVM may run a few different types of garbage collection - but it is always one for new gen and one for old gen. Current ('old') version of the config doesn't really take this into account and sort of lumps the together - and also names metrics with GC type name, making it further confusing since it is not what actually is being reported.
The change I'm proposing is using example from here: https://github.com/DataDog/integrations-core/blob/bf11a2812c56b129c56f5cc389e74ee469bbd414/cassandra/datadog_checks/cassandra/data/conf.yaml (and couple of other places) that covers all known GC types and put them into predefined metrics. This would mean clients get GC metrics properly separated by GC type which seems like a useful thing.
Would it be ok if I leave current change in place and put back old metrics I've removed/changed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've pushed that change - please let me know if this is an acceptable solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for that. This sounds acceptable and adding 4 new metrics should not be an issue. This would also have the nice side effect of allowing us to remove some complexity from the config files of other jmx integrations like cassandra.
We should remove jvm.gc.parnew.time
and jvm.gc.cms.count
from the docs and add the new ones after this is merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll discuss this changes with my team to see if there is anything I am missing on why this was not added previously. I'll then do some sanity check with A5 and A6 and if everything looks good I'll approve
@@ -0,0 +1,43 @@ | |||
# This is a reduced set of 'default' metrics to make tests more predictable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, let's keep compatibility with the metric names. You can use https://docs.datadoghq.com/integrations/java/#metrics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is actually use only by tests, so it is not terribly important what specific metrics are inhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I know but I was hoping that the tests would catch the metric name changes to avoid regressions. Did they not ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to make some changes to tests to make build pass... So I guess they did to some extent.
pom.xml
Outdated
<packaging>jar</packaging> | ||
<groupId>datadog</groupId> | ||
<artifactId>jmxfetch</artifactId> | ||
<version>0.20.3</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll bump that in another PR (and probably increase the minor since it's a pretty big change 😛 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be fixed now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me now, sorry that it was so long to merge this. Once https://github.com/DataDog/jmxfetch/pull/180/files#r222989521 is addressed, feel free to merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allow JMX fetch to be used as a library.
At this point this is more of an RFC and not intended to be merged directly. Main changes include:
System.exit()
is never called from main run function.