This repository has been archived by the owner on Jan 13, 2025. It is now read-only.

DR6 performance issues #7753

Closed
xtrapower opened this issue Jan 10, 2020 · 3 comments

xtrapower commented Jan 10, 2020

Problem

Our performance in the recent DryRun6 was bad: we ended up unhealthy during most of the 'Ramp TPS' rounds. Currently, I'm at a loss as to what caused us (Staking Facilities) to miss so many slots.

I believe our machine isn't the problem (32 cores, 96 GB RAM, NVMe storage, 3x 2080 Ti). I ran benchmarks today (v0.21.5) and was able to squeeze out 150k max TPS with almost 90k sustained average TPS. No additional software was running on that machine, and I stopped making RPC requests to the node early on.

That leaves peering, latency, or other networking issues as the likely cause. Our machine is co-located in an Equinix DC with a 100 Mbit connection which, according to Equinix, can burst without limit.

Our log files: Google Drive

Let's look at epoch 82 (a Ramp TPS round). We were scheduled for 64 slots and missed 22 (see epoch82.log). I think 8 of the missed slots can be attributed to issue #7588.

But our validator also missed quite a lot of slots on its own and contributed to the above problem. A typical pattern seems to be that it produces 1-2 slots and then misses the other 3-4. This indicates that we were timed out by the next leader. Why? And what can we do about it?

Two examples from epoch82.log:

312848 55nmQ8gdWpNW5tLPoBPsqDkLm1W24cmY5DbMMXZKSP8U
312849 55nmQ8gdWpNW5tLPoBPsqDkLm1W24cmY5DbMMXZKSP8U SKIPPED
312850 55nmQ8gdWpNW5tLPoBPsqDkLm1W24cmY5DbMMXZKSP8U SKIPPED
312851 55nmQ8gdWpNW5tLPoBPsqDkLm1W24cmY5DbMMXZKSP8U SKIPPED

312516 55nmQ8gdWpNW5tLPoBPsqDkLm1W24cmY5DbMMXZKSP8U
312517 55nmQ8gdWpNW5tLPoBPsqDkLm1W24cmY5DbMMXZKSP8U
312518 55nmQ8gdWpNW5tLPoBPsqDkLm1W24cmY5DbMMXZKSP8U SKIPPED
312519 55nmQ8gdWpNW5tLPoBPsqDkLm1W24cmY5DbMMXZKSP8U SKIPPED

Looking at the validator log file, can you tell what went wrong in those two examples?
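
To make it easier to spot this pattern across the rest of epoch82.log, here is a minimal sketch that groups the slot listing into leader windows and counts produced versus skipped slots in each. It assumes the listing format shown above (`slot leader [SKIPPED]`) and that leaders are scheduled in windows of 4 consecutive slots; `parse_slots` and `summarize_windows` are hypothetical helper names, not part of any Solana tooling.

```python
from collections import defaultdict

# Assumption: each leader is scheduled for 4 consecutive slots.
SLOTS_PER_LEADER_WINDOW = 4

# The two examples quoted above from epoch82.log.
SAMPLE = """\
312848 55nmQ8gdWpNW5tLPoBPsqDkLm1W24cmY5DbMMXZKSP8U
312849 55nmQ8gdWpNW5tLPoBPsqDkLm1W24cmY5DbMMXZKSP8U SKIPPED
312850 55nmQ8gdWpNW5tLPoBPsqDkLm1W24cmY5DbMMXZKSP8U SKIPPED
312851 55nmQ8gdWpNW5tLPoBPsqDkLm1W24cmY5DbMMXZKSP8U SKIPPED

312516 55nmQ8gdWpNW5tLPoBPsqDkLm1W24cmY5DbMMXZKSP8U
312517 55nmQ8gdWpNW5tLPoBPsqDkLm1W24cmY5DbMMXZKSP8U
312518 55nmQ8gdWpNW5tLPoBPsqDkLm1W24cmY5DbMMXZKSP8U SKIPPED
312519 55nmQ8gdWpNW5tLPoBPsqDkLm1W24cmY5DbMMXZKSP8U SKIPPED
"""


def parse_slots(text):
    """Yield (slot, leader, skipped) for every non-empty line of the listing."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # blank or malformed line
        slot, leader = int(parts[0]), parts[1]
        skipped = len(parts) > 2 and parts[2] == "SKIPPED"
        yield slot, leader, skipped


def summarize_windows(text, window=SLOTS_PER_LEADER_WINDOW):
    """Group slots by (window start slot, leader) and split produced/skipped."""
    windows = defaultdict(lambda: {"produced": [], "skipped": []})
    for slot, leader, skipped in parse_slots(text):
        key = (slot // window * window, leader)
        windows[key]["skipped" if skipped else "produced"].append(slot)
    return dict(sorted(windows.items()))


if __name__ == "__main__":
    for (start, leader), w in summarize_windows(SAMPLE).items():
        print(f"window {start}: produced {len(w['produced'])}, "
              f"skipped {len(w['skipped'])}")
```

Windows where the first slot or two are produced and the rest are skipped match the pattern described above; a window that is skipped entirely would more likely point to a different cause.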

@mvines mvines added this to the Supertubes v0.22.3 milestone Jan 10, 2020
xtrapower (Author) commented

Apparently our node scored pretty well in the 'TdS DR6 winner-tool results'.

That seems to contradict what the logs suggest (being timed out by the next leader).

stale bot commented Jan 30, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale [bot only] Added to stale content; results in auto-close after a week. label Jan 30, 2021

stale bot commented Feb 7, 2021

This stale issue has been automatically closed. Thank you for your contributions.

@stale stale bot closed this as completed Feb 7, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 30, 2022