Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fix] Replace cached time with system clock in MasterService debug logs #7902

Merged
merged 7 commits into from
Jun 29, 2023

Conversation

sandeshkr419
Copy link
Contributor

@sandeshkr419 sandeshkr419 commented Jun 2, 2023

Description

Cached Time gives imprecise duration in computing multiple computation time since it caches every 200ms and multiple operation in MasterService regarding cluster state creation, update, publish require time frames less than 200ms during debugging. So changing the computation time to consume System.nanotime() to get clock time instead of cached time. This is essential when debug logs are enabled - else this is not logged by default.

Before:

[2023-05-31T11:38:34,142][DEBUG][o.o.c.s.MasterService    ] [master1] took [0s] to compute cluster state update for [index-aliases]
[2023-05-31T11:38:34,307][DEBUG][o.o.c.s.MasterService    ] [master1] took [0s] to notify listeners on successful publication of cluster state (version: 55109, uuid: Fc58ns6mRHCNuxcvADvfOg) for [index-aliases]
[2023-05-31T11:38:32,416][DEBUG][o.o.c.s.MasterService    ] [master1] took [0s] to compute cluster state update for [cluster_reroute(reroute after starting shards)]
[2023-05-31T11:38:32,416][DEBUG][o.o.c.s.MasterService    ] [master1] took [0s] to notify listeners on unchanged cluster state for [cluster_reroute(reroute after starting shards)]
[2023-05-31T11:38:34,307][DEBUG][o.o.c.s.MasterService    ] [master1] took [0s] to notify listeners on successful publication of cluster state (version: 55109, uuid: Fc58ns6mRHCNuxcvADvfOg) for [index-aliases]

After:

[2023-06-02T12:28:49,551][DEBUG][o.o.c.s.MasterService    ] [master1] took [2ms] to notify listeners on unchanged cluster state for [index-aliases]
[2023-06-02T13:10:17,855][DEBUG][o.o.c.s.MasterService    ] [master1] took [93ms] to compute cluster state update for [index-aliases]
[2023-06-02T13:10:47,875][DEBUG][o.o.c.s.MasterService    ] [master1] executing cluster state update for [index-aliases]
[2023-06-02T13:10:47,876][DEBUG][o.o.c.s.MasterService    ] [master1] took [0s] to compute cluster state update for [index-aliases]
[2023-06-02T13:10:47,877][DEBUG][o.o.c.s.MasterService    ] [master1] took [0s] to notify listeners on unchanged cluster state for [index-aliases]
[2023-06-02T13:34:24,247][DEBUG][o.o.c.s.MasterService    ] [master1] took [0s] to notify listeners on successful publication of cluster state (version: 60931, uuid: G8fs82mZROGv4C0ufnehKA) for [index-aliases]

Related Issues

Resolves #7849

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@sandeshkr419 sandeshkr419 changed the title Replace cached time with system clock in MasterService debug logs [Draft] Replace cached time with system clock in MasterService debug logs Jun 2, 2023
@sandeshkr419 sandeshkr419 marked this pull request as ready for review June 2, 2023 20:38
@github-actions
Copy link
Contributor

github-actions bot commented Jun 2, 2023

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

github-actions bot commented Jun 2, 2023

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

github-actions bot commented Jun 2, 2023

Gradle Check (Jenkins) Run Completed with:

@andrross
Copy link
Member

andrross commented Jun 5, 2023

Looks like some tests are failing?

CHANGELOG.md Outdated Show resolved Hide resolved
@shwetathareja
Copy link
Member

shwetathareja commented Jun 6, 2023

We should change ClusterApplierService class as well.

Copy link
Collaborator

@Bukhtawar Bukhtawar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would want us to double click on this to see if we really need this accuracy. Based on my understanding we are simply using this for debugging. What are the actionable items if these numbers stack around few seconds lets say.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      3 org.opensearch.remotestore.RemoteStoreRefreshListenerIT.testRemoteRefreshRetryOnFailure
      1 org.opensearch.search.backpressure.SearchBackpressureIT.testSearchShardTaskCancellationWithHighCpu

@sandeshkr419 sandeshkr419 changed the title [Draft] Replace cached time with system clock in MasterService debug logs [Fix] Replace cached time with system clock in MasterService debug logs Jun 19, 2023
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testNodeDropWithOngoingReplication
      1 org.opensearch.indices.replication.SegmentReplicationRelocationIT.testPrimaryRelocationWithSegRepFailure

@dblock
Copy link
Member

dblock commented Jun 28, 2023

@andrross want to take a look at the updated version? much appreciated

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.snapshots.RestoreSnapshotIT.testRestoreInSameRemoteStoreEnabledIndex
      1 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testNodeDropWithOngoingReplication
      1 org.opensearch.cluster.allocation.AwarenessAllocationIT.testThreeZoneOneReplicaWithForceZoneValueAndLoadAwareness

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      2 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testNodeDropWithOngoingReplication
      2 org.opensearch.remotestore.SegmentReplicationRemoteStoreIT.classMethod
      1 org.opensearch.remotestore.SegmentReplicationUsingRemoteStoreIT.testDropPrimaryDuringReplication
      1 org.opensearch.remotestore.SegmentReplicationRemoteStoreIT.testIndexReopenClose

@andrross andrross added the backport 2.x Backport to 2.x branch label Jun 29, 2023
@andrross andrross merged commit 73b7a85 into opensearch-project:main Jun 29, 2023
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-7902-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 73b7a85d0a41580870a392aac9d47192adefa37a
# Push it to GitHub
git push --set-upstream origin backport/backport-7902-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-7902-to-2.x.

andrross pushed a commit to andrross/OpenSearch that referenced this pull request Jun 29, 2023
…gs (opensearch-project#7902)

* Replace cached time with system clock in MasterService debug logs

Signed-off-by: Sandesh Kumar <[email protected]>

* Supply System.nanaoTime via TimeSupplier

Signed-off-by: Sandesh Kumar <[email protected]>

* Add absolute time fetch method in Threadpool

Signed-off-by: Sandesh Kumar <[email protected]>

* Rename absoluteTimeInNanos to preciseRelativeTimeInNanos

Signed-off-by: Sandesh Kumar <[email protected]>

---------

Signed-off-by: Sandesh Kumar <[email protected]>
(cherry picked from commit 73b7a85)
@sandeshkr419 sandeshkr419 deleted the debugTime branch June 30, 2023 02:36
andrross pushed a commit to andrross/OpenSearch that referenced this pull request Jul 1, 2023
…gs (opensearch-project#7902)

* Replace cached time with system clock in MasterService debug logs

Signed-off-by: Sandesh Kumar <[email protected]>

* Supply System.nanaoTime via TimeSupplier

Signed-off-by: Sandesh Kumar <[email protected]>

* Add absolute time fetch method in Threadpool

Signed-off-by: Sandesh Kumar <[email protected]>

* Rename absoluteTimeInNanos to preciseRelativeTimeInNanos

Signed-off-by: Sandesh Kumar <[email protected]>

---------

Signed-off-by: Sandesh Kumar <[email protected]>
(cherry picked from commit 73b7a85)
andrross pushed a commit to andrross/OpenSearch that referenced this pull request Jul 5, 2023
…gs (opensearch-project#7902)

* Replace cached time with system clock in MasterService debug logs

Signed-off-by: Sandesh Kumar <[email protected]>

* Supply System.nanaoTime via TimeSupplier

Signed-off-by: Sandesh Kumar <[email protected]>

* Add absolute time fetch method in Threadpool

Signed-off-by: Sandesh Kumar <[email protected]>

* Rename absoluteTimeInNanos to preciseRelativeTimeInNanos

Signed-off-by: Sandesh Kumar <[email protected]>

---------

Signed-off-by: Sandesh Kumar <[email protected]>
(cherry picked from commit 73b7a85)
baba-devv pushed a commit to baba-devv/OpenSearch that referenced this pull request Jul 29, 2023
…gs (opensearch-project#7902)

* Replace cached time with system clock in MasterService debug logs

Signed-off-by: Sandesh Kumar <[email protected]>

* Supply System.nanaoTime via TimeSupplier

Signed-off-by: Sandesh Kumar <[email protected]>

* Add absolute time fetch method in Threadpool

Signed-off-by: Sandesh Kumar <[email protected]>

* Rename absoluteTimeInNanos to preciseRelativeTimeInNanos

Signed-off-by: Sandesh Kumar <[email protected]>

---------

Signed-off-by: Sandesh Kumar <[email protected]>
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
…gs (opensearch-project#7902)

* Replace cached time with system clock in MasterService debug logs

Signed-off-by: Sandesh Kumar <[email protected]>

* Supply System.nanaoTime via TimeSupplier

Signed-off-by: Sandesh Kumar <[email protected]>

* Add absolute time fetch method in Threadpool

Signed-off-by: Sandesh Kumar <[email protected]>

* Rename absoluteTimeInNanos to preciseRelativeTimeInNanos

Signed-off-by: Sandesh Kumar <[email protected]>

---------

Signed-off-by: Sandesh Kumar <[email protected]>
Signed-off-by: Shivansh Arora <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Imprecise duration logged by MasterService
6 participants