Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix delete_expired_data/nightly maintenance when many model snapshots need deleting #57041

Merged
merged 3 commits into from
May 26, 2020

Conversation

davidkyle
Copy link
Member

@davidkyle davidkyle commented May 21, 2020

The queries performed by the expired data removers pull back entire documents where only a few fields are required. For ModelSnapshots in particular this is a problem as they contain quantiles which may be 100s of KB and the search size is set to 10,000.

If the user is suffering with many accumulated snapshots that were not cleaned up due to #47103 the size of this search response could be very large. This change makes the search more efficient by only requesting the fields needed to work out which expired data should be deleted.

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (:ml)

Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

All my comments on the code are minor nits.

My main comment is that we should consider this a bug rather than an enhancement and backport it to 7.8.0 and 7.7.1. From the perspective of an end user who is suffering from this the product doesn't work.

} else {
value[0] = 0L;
}
value[0] = TimeUtils.parseToEpochMs((String)value[0]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indenting is out by 1 space by the look of it

* @param epoch The
* @return The epoch value.
*/
public static Long parseToEpochMs(String epoch) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to return long? In the case where the caller wants Long it can be boxed just as efficiently on the caller's side. And if the caller wants long then there's no boxing and unboxing.

long timestampMs = modelSnapshot.getTimestamp().getTime();
String timestamp = stringFieldValueOrNull(hit, ModelSnapshot.TIMESTAMP.getPreferredName());
if (timestamp == null) {
continue;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should log an error here, as I don't think this should never happen unless a document is corrupt. So if it does happen it would be useful to know about when debugging why results haven't been deleted.

if (timestamp == null) {
continue;
}
Long timestampMs = TimeUtils.parseToEpochMs(timestamp);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use long instead of Long and avoid unboxing in the loop.

@droberts195 droberts195 changed the title Request required fields only in expired data deleters Fix delete_expired_data/nightly maintenance when many model snapshots need deleting May 21, 2020
@droberts195
Copy link
Contributor

I changed the PR title so affected users can find it more easily in the release notes.

*/
public static Long parseToEpochMs(String epoch) {
int dotPos = epoch.indexOf('.');
long epochMs = 0L;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, you could get rid of epochMs and use return statement in each of 3 if branches.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that does read better

* JSON. So String is really the only format that could be used, but the consumers of time
* are expecting a number.
*
* @param epoch The
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please write sth here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😄

@davidkyle davidkyle force-pushed the required-fields-only branch from 8a7366d to a69f90a Compare May 21, 2020 15:33
@@ -84,7 +78,11 @@ public void remove(float requestsPerSec, ActionListener<Boolean> listener, Suppl
.filter(QueryBuilders.termQuery(Result.RESULT_TYPE.getPreferredName(), ForecastRequestStats.RESULT_TYPE_VALUE))
.filter(QueryBuilders.existsQuery(ForecastRequestStats.EXPIRY_TIME.getPreferredName())));
source.size(MAX_FORECASTS);
source.trackTotalHits(true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will removing this make:

         if (hits.getTotalHits().value > MAX_FORECASTS) {
             LOGGER.info("More than [{}] forecasts were found. This run will only delete [{}] of them", MAX_FORECASTS, MAX_FORECASTS);
         }

less useful? (Around line 141/145.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot I couldn't see where it was checked so removed it.

The benefit of setting trackTotalHits(true) as the only purpose is to log this message saying there are more forecasts to be deleted later. However, in the interest of making this change as small as possible I've reverted this line

@davidkyle davidkyle merged commit cde2026 into elastic:master May 26, 2020
@davidkyle davidkyle deleted the required-fields-only branch May 26, 2020 08:25
davidkyle added a commit to davidkyle/elasticsearch that referenced this pull request May 26, 2020
… need deleting (elastic#57041)

The queries performed by the expired data removers pull back entire documents
when only a few fields are required. For ModelSnapshots in particular this is
a problem as they contain quantiles which may be 100s of KB and the search size
is set to 10,000.

This change makes the search more efficient by only requesting the fields 
needed to work out which expired data should be deleted.
davidkyle added a commit that referenced this pull request May 26, 2020
…pshots need deleting (#57041) (#57136)

Fix delete_expired_data/nightly maintenance when 
many model snapshots need deleting (#57041)

The queries performed by the expired data removers pull back entire 
documents when only a few fields are required. For ModelSnapshots in 
particular this is a problem as they contain quantiles which may be 
100s of KB and the search size is set to 10,000.

This change makes the search more efficient by only requesting the 
fields needed to work out which expired data should be deleted.
davidkyle added a commit to davidkyle/elasticsearch that referenced this pull request May 26, 2020
… need deleting (elastic#57041)

The queries performed by the expired data removers pull back entire documents
when only a few fields are required. For ModelSnapshots in particular this is
a problem as they contain quantiles which may be 100s of KB and the search size
is set to 10,000.

This change makes the search more efficient by only requesting the fields
needed to work out which expired data should be deleted.
# Conflicts:
#	x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/job/retention/ExpiredForecastsRemover.java
#	x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/job/retention/ExpiredModelSnapshotsRemover.java
#	x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/job/retention/MlDataRemover.java
#	x-pack/plugin/ml/src/test/java/org/elasticsearch/xpack/ml/job/retention/ExpiredModelSnapshotsRemoverTests.java
davidkyle added a commit to davidkyle/elasticsearch that referenced this pull request May 26, 2020
… need deleting (elastic#57041)

The queries performed by the expired data removers pull back entire documents
when only a few fields are required. For ModelSnapshots in particular this is
a problem as they contain quantiles which may be 100s of KB and the search size
is set to 10,000.

This change makes the search more efficient by only requesting the fields 
needed to work out which expired data should be deleted.
davidkyle added a commit that referenced this pull request May 27, 2020
… need deleting (#57041) (#57145)

The queries performed by the expired data removers pull back entire documents
when only a few fields are required. For ModelSnapshots in particular this is
a problem as they contain quantiles which may be 100s of KB and the search size
is set to 10,000.

This change makes the search more efficient by only requesting the fields
needed to work out which expired data should be deleted.
davidkyle added a commit that referenced this pull request May 27, 2020
…pshots need deleting (#57041) (#57146)

The queries performed by the expired data removers pull back entire documents
when only a few fields are required. For ModelSnapshots in particular this is
a problem as they contain quantiles which may be 100s of KB and the search size
is set to 10,000.

This change makes the search more efficient by only requesting the fields 
needed to work out which expired data should be deleted.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants