Allow custom geoip databases to be updated at runtime. #68901
Conversation
Custom geoip databases can be provided via the config/ingest-geoip directory and are loaded at node startup time. This change adds the functionality to reload custom databases at runtime.
Pinging @elastic/es-core-features (Team:Core/Features)
I don't really understand the idea in LocalDatabases around configDatabases access; I left some comments.
String databasePath = databaseFile.toString();
int counter = 0;
for (CacheKey key : cache.keys()) {
    if (key.databasePath.equals(databasePath)) {
Is this synchronized in any way? Is it possible that we will add a new key during invalidation of the old ones?
Good catch. This does need some synchronization, otherwise there is a chance that keys are added for a database that is about to be deleted.
I added reference counting to DatabaseReaderLazyLoader, so that keys for the old databases are removed immediately after the update if there are no usages from geoip processor instances, or otherwise only when the last geoip processor has completed doing geo enrichment.
This also addresses another issue: when a geoip processor gets a DatabaseReaderLazyLoader instance and a database update then occurs for the same database, the maxmind db reader (which is wrapped by DatabaseReaderLazyLoader) may fail with a 'com.maxmind.db.ClosedDatabaseException: The MaxMind DB has been closed.' error. The reference counting avoids closing the reader while it is still being used.
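To illustrate the usage pattern, a minimal sketch (not the PR's actual code): the lookupCity helper and the getCity accessor are assumptions made for this example, the getDatabase signature is simplified, and getDatabase/postLookup correspond to the methods discussed further down in this review.

// Sketch: a geoip processor guards each lookup so a concurrent database reload
// cannot close the wrapped MaxMind reader while the lookup is in progress.
private CityResponse lookupCity(LocalDatabases localDatabases, String databaseFile, InetAddress ip) throws IOException {
    // getDatabase(...) returns a loader whose usage count has been incremented,
    // or null if the database is not (or no longer) available.
    DatabaseReaderLazyLoader loader = localDatabases.getDatabase(databaseFile);
    if (loader == null) {
        return null;
    }
    try {
        return loader.getCity(ip); // assumption: a city-lookup accessor on the loader
    } finally {
        // The caller that finishes last closes a loader that was replaced by a reload.
        loader.postLookup();
    }
}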
void updateDatabase(Path file, boolean update) {
    String databaseFileName = file.getFileName().toString();
    try {
        Map<String, DatabaseReaderLazyLoader> databases = new HashMap<>(this.configDatabases);
I think we should use ConcurrentHashMap for this.configDatabases, then we don't have to do the copying and it should be safer for multi-threading.
Currently there's no guarantee that this.configDatabases in line 109 is the same as in line 127 - get-and-update access should always involve a while loop and an atomic swap (or something like Atomic<Something>.updateAndGet()).
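For illustration, the "atomic swap" variant mentioned here could look roughly like this (a sketch only; the field and method names are made up for the example):

// Keep the map behind an AtomicReference and publish every change as a new
// immutable copy. The update function may be retried under contention, so it
// must be free of side effects.
private final AtomicReference<Map<String, DatabaseReaderLazyLoader>> configDatabases =
    new AtomicReference<>(Map.of());

void putDatabase(String databaseFileName, DatabaseReaderLazyLoader loader) {
    configDatabases.updateAndGet(current -> {
        Map<String, DatabaseReaderLazyLoader> copy = new HashMap<>(current);
        copy.put(databaseFileName, loader);
        return Collections.unmodifiableMap(copy);
    });
}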
So I picked this approach for the following reasons:
- I think this method will not be invoked concurrently when there are multiple file updates. At least that is how I understand the ResourceFileWatcher. The synchronization happens in ResourceWatcherService line 165: the run() method is synchronized. This then delegates to FileWatcher.FileObserver#checkAndNotify(), which checks whether the monitored files have changed and then invokes this GeoipDirectoryListener for each change.
- I opted for the copy-on-write update mechanism here because it is more efficient in cases where incidental changes are made and the resource is mainly read. This is a pattern we use elsewhere in the code base for resources that are mainly read and rarely updated.
I'm okay with changing the configDatabases field to use ConcurrentHashMap; then we don't need the update logic here anymore, which makes this class simpler.
I'd argue that ConcurrentHashMap will actually be faster here - get has no penalty at all, and put/remove has to do less work than copying the whole map (with no contention it's just: 1) find slot, 2) add to list, done). And it gives you safety if we ever refactor updateDatabase to be invoked concurrently.
👍 let's then go with ConcurrentHashMap
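A rough sketch of what the agreed-upon approach looks like (illustrative only; createLoader is a hypothetical factory standing in for however the PR builds a DatabaseReaderLazyLoader):

// The field itself is thread-safe, so updates mutate it in place and readers
// always see a fully applied put or remove; no copy-on-write needed.
private final ConcurrentMap<String, DatabaseReaderLazyLoader> configDatabases = new ConcurrentHashMap<>();

void updateDatabase(Path file, boolean update) {
    String databaseFileName = file.getFileName().toString();
    if (update) {
        configDatabases.put(databaseFileName, createLoader(file)); // hypothetical loader factory
    } else {
        DatabaseReaderLazyLoader existing = configDatabases.remove(databaseFileName);
        // 'existing' would then be closed, subject to the reference counting discussed above
    }
}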
databases.put(databaseFileName, loader);
}

return Map.copyOf(databases);
Any reason for creating a copy here? Collections.unmodifiableMap(databases) should suffice, I think.
It only needs to be immutable, so like you suggest Collections.unmodifiableMap(...) suffices.
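For reference, the practical difference between the two (a standalone illustration, not the PR's code): Map.copyOf takes an independent immutable snapshot, while Collections.unmodifiableMap is a read-only view over the original map. Since the map here is a local variable that never escapes after being wrapped, the cheaper view is enough.

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class CopyVsView {
    public static void main(String[] args) {
        Map<String, String> source = new HashMap<>();
        source.put("GeoLite2-City.mmdb", "loaded");

        Map<String, String> snapshot = Map.copyOf(source);              // independent immutable copy
        Map<String, String> view = Collections.unmodifiableMap(source); // read-only view of source

        source.put("GeoLite2-Country.mmdb", "loaded");

        System.out.println(snapshot.size()); // 1 - the copy does not see later changes
        System.out.println(view.size());     // 2 - the view reflects the underlying map
    }
}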
    }
}

return Map.copyOf(databases);
Any reason for creating a copy here? Collections.unmodifiableMap(databases) should suffice, I think.
I will change this, see the previous comment.
DatabaseReaderLazyLoader database = configDatabases.get(name);
if (database != null) {
    return database;
} else {
    return defaultDatabases.get(name);
Suggested change:
-DatabaseReaderLazyLoader database = configDatabases.get(name);
-if (database != null) {
-    return database;
-} else {
-    return defaultDatabases.get(name);
+return configDatabases.getOrDefault(name, defaultDatabases.get(name));
Added reference counting when getting a DatabaseReaderLazyLoader instance, this to avoid closing a maxmind db reader while it is still being used. There is a small window of time where this might happen during a database update. A DatabaseReaderLazyLoader instance (which wraps a Maxmind db reader) from the config database directory:
* Is closed immediately as part of the database reload if there are no usages of it by any geoip processor instance.
* When there are usages, it is not closed immediately after a database reload; it is closed by the caller that did the last geoip lookup using this DatabaseReaderLazyLoader instance.
One problem found + 2 small nits
@@ -175,8 +208,34 @@ private DatabaseReader get() throws IOException {
     }

     @Override
-    public synchronized void close() throws IOException {
+    public void close() throws IOException {
+        closed = true;
I think there's still a race here that can lead to the loader not being closed, consider this:
- t1 calls close() and line 212 is executed
- t2 calls preLookup()
- t1 checks line 213 and doesn't close the loader (currentUsages == 1)
- t2 checks line 95 in LocalDatabases and sees the loader "closed"
- t2 never calls postLookup(), and the loader is never really closed
We should either rely on a single atomic variable for both checks (Integer.MIN_VALUE as the close state, for example) or call postLookup() in the else clause in LocalDatabases.
Something like that should work but it's not tested (it uses negative values as an indicator of the close state; you have to check the result of preLookup instead of isClosed):
// currentUsages >= 0 while the loader is open; a negative value means close was
// requested, with (-1 - value) lookups still in flight.
boolean preLookup() {
    return currentUsages.updateAndGet(current -> current < 0 ? current : current + 1) > 0;
}

void postLookup() throws IOException {
    // decrement towards 0 while open, increment towards -1 once close was requested
    if (currentUsages.updateAndGet(current -> current > 0 ? current - 1 : current + 1) == -1) {
        doClose(); // the last in-flight lookup closes the reader
    }
}

@Override
public void close() throws IOException {
    // flip to the negative encoding; -1 means no in-flight usages, so close now
    if (currentUsages.updateAndGet(u -> -1 - u) == -1) {
        doClose();
    }
}
Good point, I forgot to add an else clause to the LocalDatabases#getDatabase(...) method.
I prefer your approach (a single atomic integer that does all the accounting) over this approach.
I also stress tested it, and your approach works correctly.
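A small standalone demonstration of the agreed scheme's state transitions (a sketch written for this write-up, not the PR's test; doClose is replaced by a print statement):

import java.util.concurrent.atomic.AtomicInteger;

public class RefCountDemo {
    // >= 0: open, value = in-flight lookups; < 0: close requested, (-1 - value) lookups still in flight
    private final AtomicInteger currentUsages = new AtomicInteger();

    boolean preLookup() {
        return currentUsages.updateAndGet(c -> c < 0 ? c : c + 1) > 0;
    }

    void postLookup() {
        if (currentUsages.updateAndGet(c -> c > 0 ? c - 1 : c + 1) == -1) {
            doClose();
        }
    }

    void close() {
        if (currentUsages.updateAndGet(u -> -1 - u) == -1) {
            doClose();
        }
    }

    private void doClose() {
        System.out.println("reader really closed");
    }

    public static void main(String[] args) {
        RefCountDemo loader = new RefCountDemo();
        loader.preLookup();                     // a processor starts a lookup, count = 1
        loader.close();                         // a reload replaces the database, count = -2, not closed yet
        System.out.println(loader.preLookup()); // false - new lookups are rejected
        loader.postLookup();                    // the in-flight lookup finishes, count = -1, prints "reader really closed"
    }
}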
@@ -49,6 +57,13 @@
     // cache the database type so that we do not re-read it on every pipeline execution
     final SetOnce<String> databaseType;

+    private volatile boolean closed = false;
no need for explicit initialization I think
while (true) {
    DatabaseReaderLazyLoader instance = configDatabases.getOrDefault(name, defaultDatabases.get(name));
    if (instance == null) {
        return instance;
maybe return null here to make it explicit
LGTM, thanks @martijnvg !
Backport of #68901 to the 7.x branch. Custom geoip databases can be provided via the config/ingest-geoip directory and are loaded at node startup time. This change adds the functionality that reloads custom databases at runtime. Added reference counting when getting a DatabaseReaderLazyLoader instance, this to avoid closing a maxmind db reader while it is still being used. There is a small window of time where this might happen during a database update. A DatabaseReaderLazyLoader instance (which wraps a Maxmind db reader) from the config database directory:
* Is closed immediately as part of the database reload if there are no usages of it by any geoip processor instance.
* When there are usages, it is not closed immediately after a database reload; it is closed by the caller that did the last geoip lookup using this DatabaseReaderLazyLoader instance.