Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add translog files age to Translog Stats (#28613) #28613

Merged
merged 4 commits into from
Feb 16, 2018

Conversation

justinwyer
Copy link
Contributor

#28189 Implemented earliest_generation_last_modified_age as a translog stats item. This uses the last modified time of the earliest generation.

…translog stats item. This uses the last modified time of the earliest generation.
@elasticmachine
Copy link
Collaborator

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

1 similar comment
@elasticmachine
Copy link
Collaborator

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

Copy link
Contributor

@bleskes bleskes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx @justinwyer . Looking good. I left some comments.

@@ -444,6 +450,20 @@ public long sizeOfGensAboveSeqNoInBytes(long minSeqNo) {
}
}

private long earliestGenerationLastModifiedAge(long minGeneration) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can reduce complexity here as we don't need min generation control. Can we remove the parameter please?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe make this method static, give it a list/stream of readers and the current time as parameters? this would allow for proper testing with mocks and fixed time.

Copy link
Contributor Author

@justinwyer justinwyer Feb 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A static method for testing is a good idea. If we made it static we would still need to wrap it in a non static method to include:

try (ReleasableLock ignored = readLock.acquire()) {
    ensureOpen();
    ...
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure.

@@ -66,6 +72,9 @@ public void add(TranslogStats translogStats) {
this.translogSizeInBytes += translogStats.translogSizeInBytes;
this.uncommittedOperations += translogStats.uncommittedOperations;
this.uncommittedSizeInBytes += translogStats.uncommittedSizeInBytes;
this.earliestGenerationLastModifiedAge =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use Math.min()?

ensureOpen();
Optional<? extends BaseTranslogReader> earliestGeneration = Stream.concat(readers.stream(), Stream.of(current))
.filter(r -> r.getGeneration() >= minGeneration)
.min(Comparator.comparingLong(BaseTranslogReader::getGeneration));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we maybe map this to a stream of longs via getLastModifiedTime and collect using Collectors.minBy(Comparator.comparing(BaseTranslogReader:: getLastModifiedTime)) or something similar? also, can we also take the writer into account for completeness?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear - I think we want to know how the oldest file is, regardless of the generations. It will always be the oldest generation and the first in the reader list, but I don't think we want to rely on it. Part of the role of the stats is to validate things are correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure but we can't use BaseTranslogReader:: getLastModifiedTime as the method throws a checked exception.

Does Stream.concat(readers.stream(), Stream.of(current)) not include the writer?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure but we can't use BaseTranslogReader:: getLastModifiedTime as the method throws a checked exception.

Fair enough. No streams for us - we need to do it the old fashion way :D

Does Stream.concat(readers.stream(), Stream.of(current)) not include the writer?

Yes. Current is a TranslogWriter.

Copy link
Contributor Author

@justinwyer justinwyer Feb 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've settled on the following WDYT?

    long earliestLastModifiedAge() {
        try (ReleasableLock ignored = readLock.acquire()) {
            ensureOpen();
            return findEarliestLastModifiedAge(System.currentTimeMillis(), Stream.concat(readers.stream(), Stream.of(current)));
        } catch (ElasticsearchException e) {
            throw new TranslogException(shardId, e.getDetailedMessage(), e);
        }
    }

    /**
     * Returns the age of the oldest entry in the translog files in seconds
     */
    static long findEarliestLastModifiedAge(long currentTime, Stream<BaseTranslogReader> baseTranslogReaders) {
        return currentTime - baseTranslogReaders.mapToLong(Translog::uncheckedGetLastModifiedTime).min().orElse(currentTime);
    }

    private static long uncheckedGetLastModifiedTime(BaseTranslogReader r) {
        try {
            return r.getLastModifiedTime();
        } catch (IOException e) {
            throw new ElasticsearchException("Unable to get the earliest last modified time for the transaction log");
        }
    }

On the testing side I've added:

    public void testFindEarliestLastModifiedAge() throws IOException {
        long fixedTime = System.currentTimeMillis();
        long earlyTime = fixedTime - 10000;
        BaseTranslogReader early = mock(BaseTranslogReader.class);
        stub(early.getLastModifiedTime()).toReturn(earlyTime);
        long earlierTime = fixedTime - 20000;
        BaseTranslogReader earlier = mock(BaseTranslogReader.class);
        stub(earlier.getLastModifiedTime()).toReturn(earlierTime);
        long earliestTime = fixedTime - 30000;
        BaseTranslogReader earliest = mock(BaseTranslogReader.class);
        stub(earliest.getLastModifiedTime()).toReturn(earliestTime);
        Stream<BaseTranslogReader> readers = Stream.of(early, earlier, earliest);
        assertThat(Translog.findEarliestLastModifiedAge(fixedTime, readers), equalTo(30000L));
    }

./gradlew check running now, feel free to comment soonest :D

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather not have the uncheckedGetLastModifiedTime method. Instead just do old school iteration on the reader list, keeping the min of the last modified time. Then do a one off extra min with the writer's value.

Re the test - I would prefer if you make an array of random positive long values, map readers to them and make sure the method returns the right value when compared to a calculation that is made directly on the array.

.filter(r -> r.getGeneration() >= minGeneration)
.min(Comparator.comparingLong(BaseTranslogReader::getGeneration));
try {
return earliestGeneration.isPresent() ? earliestGeneration.get().getLastModifiedTime() : 0L;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the method indicates we return an age by this returns last modified time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Derp.

@bleskes
Copy link
Contributor

bleskes commented Feb 14, 2018

@justinwyer can you ping me when this is ready for another cycle? I saw you pushed something but I don't want to look until you feel it's ready.

@justinwyer
Copy link
Contributor Author

@bleskes I hope I covered all the concerns, you can take a look.

Copy link
Contributor

@bleskes bleskes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. This is getting into shape. I left some minor comments. it will be great to have a rest test too, to make share the xcontent rending includes this info

for (BaseTranslogReader r : readers) {
earliestTime = Math.min(r.getLastModifiedTime(), earliestTime);
}
return currentTime - Math.min(earliestTime, writer.getLastModifiedTime());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a Math.max(0, currentTime - Math.min()) ? we rely on this being non negative, but time may go back and the FS may have other quirks.

@@ -356,6 +358,25 @@ protected TranslogStats stats() throws IOException {
return stats;
}

public void testFindEarliestLastModifiedAge() throws IOException {
long fixedTime = System.currentTimeMillis();
long[] periods = new long[10];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we sometime test the case where have no readers?

@@ -356,6 +358,25 @@ protected TranslogStats stats() throws IOException {
return stats;
}

public void testFindEarliestLastModifiedAge() throws IOException {
long fixedTime = System.currentTimeMillis();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will be good to use randomized numbers here. It will be easier to reproduce failures and it will also test time go back.

@justinwyer
Copy link
Contributor Author

@bleskes just to confirm when you say a rest test is adding checks to rest-api-spec/test/indices.stats/20_translog.yml what you mean or are there other places as well?

@bleskes
Copy link
Contributor

bleskes commented Feb 16, 2018 via email

@justinwyer
Copy link
Contributor Author

@bleskes you can take another look 😄

Copy link
Contributor

@bleskes bleskes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I left one little ask. I'll push this once it's done.

Thanks @justinwyer !

long fixedTime = System.currentTimeMillis();
long[] periods = new long[10];
for (int i = 0; i < 9; i++) {
final int numberOfReaders = scaledRandomIntBetween(1, 10);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make this go down to 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explicitly test the 0 readers case in that test as well as a random number of readers between 1 and 10 I can do the change of you prefer however.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I see. I missed it.

@bleskes
Copy link
Contributor

bleskes commented Feb 16, 2018

@elasticmachine please test this.

@bleskes bleskes changed the title #28189 translog age in stats Add translog files age to Translog Stats (#28613) Feb 16, 2018
@bleskes bleskes merged commit 5aeb479 into elastic:master Feb 16, 2018
@bleskes bleskes added :Data Management/Stats Statistics tracking and retrieval APIs :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. >enhancement v7.0.0 v6.3.0 backport pending labels Feb 16, 2018
dnhatn pushed a commit that referenced this pull request Feb 16, 2018
Expose the age of translog files in the translog stats. This is useful to reason about your translog retention policy.

Closes #28189
dnhatn added a commit that referenced this pull request Feb 16, 2018
dnhatn added a commit that referenced this pull request Feb 16, 2018
dnhatn added a commit that referenced this pull request Feb 16, 2018
We added a rest test for the translog last modified age without a
version check. This causes BWC failed because the stats are not
available in the old versions.

Relates #28613
dnhatn added a commit that referenced this pull request Feb 16, 2018
We added a rest test for the translog last modified age without a
version check. This causes BWC failed because the stats are not
available in the old versions.

Relates #28613
Mpdreamz added a commit to elastic/elasticsearch-net that referenced this pull request Aug 12, 2018
russcam pushed a commit to elastic/elasticsearch-net that referenced this pull request Aug 20, 2018
…arch#28613… (#3358)

This commit adds additional translog properties as per elastic/elasticsearch#28613 and add properties we weren't mapping
Mpdreamz added a commit to elastic/elasticsearch-net that referenced this pull request Sep 3, 2018
…arch#28613… (#3358)

This commit adds additional translog properties as per elastic/elasticsearch#28613 and add properties we weren't mapping
Mpdreamz added a commit to elastic/elasticsearch-net that referenced this pull request Sep 3, 2018
…arch#28613… (#3358)

This commit adds additional translog properties as per elastic/elasticsearch#28613 and add properties we weren't mapping
Mpdreamz added a commit to elastic/elasticsearch-net that referenced this pull request Sep 6, 2018
…arch#28613… (#3358)

This commit adds additional translog properties as per elastic/elasticsearch#28613 and add properties we weren't mapping


(cherry picked from commit 7ff63ab)
@jpountz jpountz removed the :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. label Jan 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants