-
Notifications
You must be signed in to change notification settings - Fork 873
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Drop instrumentation-api-caching module and move weak cache implementation to instrumentation-api #4667
Conversation
…ation to instrumentation-api
The bigger problem is |
I think |
private final Cache<ClassLoader, Boolean> cache = | ||
Cache.builder().setWeakKeys().setMaximumSize(25).build(); | ||
Cache.builder().build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could just remove the max size here? This usage is very similar to the one in MuzzleMatcher
, and we don't have any limit there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for just weak keys (no max) here
* Keys are always referenced weakly and are compared using identity comparison, not | ||
* {@link Object#equals(Object)}. | ||
*/ | ||
public interface Cache<K, V> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to keep the name Cache
? It's now just a weak-keyed map, since we removed caffeine et al.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not want to rename in this PR, to avoid too many changes at once. We can rename later, if we want
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now there are two different implementations, so we better keep this generic name
Only 2 problems remain. One is simple: Another one is more complicated: |
6ec3420
to
830e1d2
Compare
This is an important cache for an important use case (jdbc). I'm not sure what alternative we have? I think LRU (CLHM) should be enough, not sure if we need more advanced LFU eviction (caffeine), which would avoid the complexity of caffeine2/3 for now at least (in both library and agent).
My guess based on prior work is that this is used to avoid re-analyzing classes during instrumentation (e.g. looking up structure/hierarchy of parent class/interfaces). This can help a lot with startup overhead. I recall the DataDog folks spent a good amount of time tuning this particular cache. I'd be hesitant to mess with it without doing a good amount of testing. I think CLHM should be good here also (no need for more advanced eviction). |
Cache.builder().setMaximumSize(32).build(); | ||
private static final Cache<String, AttributeKey<List<String>>> responseKeysCache = | ||
Cache.builder().setMaximumSize(32).build(); | ||
private static final ConcurrentHashMap<String, AttributeKey<List<String>>> requestKeysCache = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I replace this with bounded cache now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are naturally limited, I just pushed comment to this file
CLHM uses unsafe and thus fails our jpms tests. Should I switch to |
Cache.builder().setMaximumSize(32).build(); | ||
private static final Cache<String, AttributeKey<List<String>>> responseKeysCache = | ||
Cache.builder().setMaximumSize(32).build(); | ||
private static final ConcurrentHashMap<String, AttributeKey<List<String>>> requestKeysCache = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are naturally limited, I just pushed comment to this file
instrumentation-api/build.gradle.kts
Outdated
implementation("com.blogspot.mydailyjava:weak-lock-free") | ||
implementation("com.googlecode.concurrentlinkedhashmap:concurrentlinkedhashmap-lru:1.4.2") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's shade these in. less transitive dependencies will make instrumentation-api
more palatable for libraries to embed directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we copied in CLHM in order to update code to use default CHM
I copied in weak-lock-free also (instead of shading), just to avoid gradle shadow problems (only 4 files)
I keep forgetting all the branches of this problem 😞 This is what using Caffeine 3 on Java 11 would solve for us. Cache2k seems like a reasonable option to keep things simple for now. As long as it's all hidden / shaded dependency we can improve on it later. |
CLHM has a backport of CHM v8 which uses unsafe. Most just copy the code into their project and replace with the standard CHM. For example ms-jdbc, groovy, micronaut. This is pretty much equivalent to shading. |
thx @ben-manes! 👍 |
instrumentation-api/src/main/java/io/opentelemetry/instrumentation/api/cache/BoundedCache.java
Outdated
Show resolved
Hide resolved
} | ||
return new WeakReference<>(wrapper); | ||
}); | ||
return resultReference.get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we may need to be careful here, I think the cache can return a "cleared" weak value reference now (previously caffeine took care of not making "cleared" values visible).
Instead, I think we could use VirtualField instead of Cache, essentially reverting #3059 now that #4182 solves the underlying problem with VirtualField not being applied to lambdas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I sent two commits converting from Cache to VirtualField
The second commit was needed to workaround a problem caused by the VirtualField instrumenting some of our own helper class lambdas. Normally I don't think helper classes are instrumented since they are injected from inside of ClassFileTransformer.transform (which is not recursive). But lambdas are different since they don't go through the normal class definition / transformation process.
I was already having second thoughts about the first commit above after typing up the warning about the memory leak that will occur if this code is ever used inside of library instrumentation.
I did add a commit to handle cleared weak values.
instrumentation-api/src/main/java/io/opentelemetry/instrumentation/api/cache/BoundedCache.java
Outdated
Show resolved
Hide resolved
instrumentation-api/src/main/java/io/opentelemetry/instrumentation/api/cache/Cache.java
Outdated
Show resolved
Hide resolved
...api/src/main/java/io/opentelemetry/instrumentation/api/cache/concurrentlinkedhashmap/LICENSE
Outdated
Show resolved
Hide resolved
@@ -0,0 +1,7 @@ | |||
ConcurrentLinkedHashMap | |||
Copyright 2008, Ben Manes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add this line above the Copyright 2010 Google Inc. into the file license headers. We could probably remove the NOTICE then (otherwise I would put the NOTICE in resources, not java)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way I noticed micronaut doesn't reference the NOTICE, nor update license headers
IANAL, but I think the author
tag satisfies "readable copy of attribution notices contained within such NOTICE file...within the Source form or documentation"
https://github.com/ben-manes/concurrentlinkedhashmap/blob/master/LICENSE#L111
Mostly not a fan of having license type files within the java
source - converting to a docstring on a package-info.java
would also be better to me if needing to include it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
converting to a docstring on a
package-info.java
would also be better to me if needing to include it
went with this 👍
I also re-generated the license report, which pulls in NOTICE files if they're in the dependencies jar file, but in this case it's not.
.../java/io/opentelemetry/instrumentation/api/cache/weaklockfree/AbstractWeakConcurrentMap.java
Show resolved
Hide resolved
...c/main/java/io/opentelemetry/instrumentation/api/cache/weaklockfree/DetachedThreadLocal.java
Outdated
Show resolved
Hide resolved
...src/main/java/io/opentelemetry/instrumentation/api/cache/weaklockfree/WeakConcurrentSet.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just point about NOTICE but not a huge deal
// remove the wrapped listener from the netty future, and if the value is collected prior to the | ||
// key, that means it's no longer used (referenced) by the netty future anyways. | ||
// | ||
// also note: this is not using VirtualField in case this is ever converted to library |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively we could add an option to use weak values when VirtualField
fallback map is used. When we know that VirtualField
value references key we could just use VirtualField
with weak values and would not need to worry whether it is going to use fallback map because something wasn't instrumented or because it is used inside library instrumentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll open a issue for this 👍
import io.opentelemetry.javaagent.instrumentation.api.internal.ClassLoaderMatcherCacheHolder; | ||
import io.opentelemetry.javaagent.instrumentation.api.internal.InClassLoaderMatcher; | ||
import net.bytebuddy.matcher.ElementMatcher; | ||
|
||
class ClassLoaderHasClassesNamedMatcher extends ElementMatcher.Junction.AbstractBase<ClassLoader> { | ||
|
||
private final Cache<ClassLoader, Boolean> cache = | ||
Cache.builder().setWeakKeys().setMaximumSize(25).build(); | ||
private final Cache<ClassLoader, Boolean> cache = Cache.weak(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this, in my opinion 25 was way too low.
|
||
import io.opentelemetry.instrumentation.api.cache.concurrentlinkedhashmap.ConcurrentLinkedHashMap; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we put CLHM and WeakLockFree in internal
package(s)? They shouldn't be a part of our public API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 created #4735 to track
…ation to instrumentation-api (open-telemetry#4667) * Drop instrumentation-api-caching module and move weak cache implementation to instrumentation-api * Some test fixes * Some cleanup * Temporary workaround for using weak values in FutureListenerWrappers * Spotless * Update ClassNames and SpanNames * Compilation and comment * Add bounded cache and clean interface * Polish * Add comment * Vendor ConcurrentLinkedHashMap in * Let errorprone ignore vendored CLHM for now * Keep license in java files too * Convert Netty wrapper cache to VirtualField * Work around lambda instrumentation failure Ideally we would ignore instrumenting helper classes... * Revert "Work around lambda instrumentation failure" This reverts commit 6d63815. * Revert "Convert Netty wrapper cache to VirtualField" This reverts commit dac1522. * Handle cleared weak values * Fix comment * Delete instrumentation-api-caching * Copy in weak-lock-free * Remove caffeine remnants * Fix checkstyle * Rename BoundedCache to MapBackedCached * Remove duplicate LICENSE * Remove outdated comment * Sync with SDK copy of weaklockfree * Enable checkstyle:off comment * Re-generate license report * Move NOTICE file to package-info.java Co-authored-by: Trask Stalnaker <[email protected]>
This if first draft PR to verify my understand of what we agreed upon on Tuesday.
We have couple of usages that still don't work with
WeakLockFreeCache
:FutureListenerWrappers
wants weak valuesClassLoaderHasClassesNamedMatcher
andAgentCachingPoolStrategy
want to specify max capacityHow did we want to solve that?