Measure memory and file descriptor usage of ya-relay server under "heavy load" #191

Closed
9 tasks done
Tracked by #2094 ...
mfranciszkiewicz opened this issue Jul 13, 2022 · 10 comments

mfranciszkiewicz commented Jul 13, 2022

Detect any potential memory and file-descriptor leaks in the current implementation.

  • investigate mem & file-descriptor leaks: shutdown tests and check relay
    • No memory or fd leaks.
  • get the proportion of get_node / get_neighbourhood calls from Grafana
    • neighbours - 6%, find_node - 7%, ping - 87%.
  • measure timeouts:
    • UDP drops
      • UDP drops aren't caused by buffer overflows: the average request in my test scenarios was 27.8 bytes, and even at the maximum of 50k concurrent connections it wasn't possible to reach the limit of my 32 MiB buffers.
      • See this comment for how packet drops scale with the number of connections. All noted drops are likely due to UDP.
      • At worst (50k connections), ~20% of requests go unanswered.
    • overloaded relay
      • The average response time is proportional to the number of concurrent connections.
      • The proportionality ratio is ~1 s per 43k connections.
      • The default timeout is 3.5 s, so 50% of requests will time out at ~150k nodes.
  • measure how many idle nodes can be concurrently connected
    • The target of 50k nodes can be reached and sustained.
    • Assuming only the automatic ping requests.
  • measure how many nodes can perform discovery calls (as measured in point 2) concurrently
    • Reached 50k nodes doing the discovery calls.
    • Packet drop rate at 50k nodes was 20%.
  • model memory usage of the relay
    • Mem usage is ~1MiB / 1k connections.
  • create a short summary of this investigation.
    • See the points above. :)
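
The latency figures in the list above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the delay really is linear in the number of connections and that "50% of requests time out" corresponds to the average delay reaching the timeout (that is, delays spread roughly symmetrically around the average). The function names are illustrative and not part of ya-relay.

```python
# Figures from the summary above; the linear model itself is a simplification.
SECONDS_PER_CONNECTION = 1.0 / 43_000  # ~1 s of average delay per 43k connections
DEFAULT_TIMEOUT_S = 3.5                # default request timeout


def average_delay_s(connections: int) -> float:
    """Average request-response delay predicted by the linear model."""
    return connections * SECONDS_PER_CONNECTION


def connections_at_timeout(timeout_s: float = DEFAULT_TIMEOUT_S) -> float:
    """Connection count at which the predicted average delay equals the timeout."""
    return timeout_s / SECONDS_PER_CONNECTION


print(f"predicted delay at 50k connections: {average_delay_s(50_000):.2f} s")
print(f"average delay reaches the timeout at ~{connections_at_timeout():,.0f} connections")
```

The second figure comes out at roughly 150k connections, which matches the estimate in the list.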

Relevant investigation of ya-relay scaling: #215

@mateuszsrebrny

M

kamirr commented Sep 8, 2022

The proportion of different requests from September 1 until noon today is as follows: neighbours - 6%, find_node - 7%, ping - 87%.
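
A load generator can reproduce this mix by drawing request kinds with weights equal to the measured shares. The sketch below is hypothetical and is not the load-test tool used in this issue; `OBSERVED_MIX` and `sample_requests` are names made up for illustration.

```python
import random
from collections import Counter

# Request shares reported above, in percent of all requests.
OBSERVED_MIX = {"neighbours": 6, "find_node": 7, "ping": 87}


def sample_requests(mix: dict[str, int], count: int, seed: int = 0) -> Counter:
    """Draw `count` request kinds with probability proportional to their weight."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    kinds = list(mix)
    weights = [mix[kind] for kind in kinds]
    return Counter(rng.choices(kinds, weights=weights, k=count))


counts = sample_requests(OBSERVED_MIX, 10_000)
for kind, n in counts.most_common():
    print(f"{kind}: {n / 10_000:.1%}")
```

With enough samples the printed shares should settle close to the 6 / 7 / 87 split.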

kamirr commented Sep 9, 2022

At an optimal number of connections (50), making discovery requests in the proportions described above, the relay reaches 170k responses/sec. This would be enough to sustain a network of over 3M nodes, assuming the same average request rate for connected nodes as now (approx. 2.6 requests per minute).
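
The capacity estimate follows from dividing the peak throughput by the per-node request rate. This is a back-of-the-envelope sketch that assumes throughput stays at its peak regardless of how many nodes are connected, which the later measurements in this thread suggest is optimistic.

```python
RESPONSES_PER_SEC = 170_000      # peak throughput reported above
REQUESTS_PER_NODE_PER_MIN = 2.6  # average per-node request rate reported above


def sustainable_nodes(responses_per_sec: float, requests_per_node_per_min: float) -> float:
    """Number of nodes whose combined request rate equals the relay's throughput."""
    requests_per_node_per_sec = requests_per_node_per_min / 60.0
    return responses_per_sec / requests_per_node_per_sec


nodes = sustainable_nodes(RESPONSES_PER_SEC, REQUESTS_PER_NODE_PER_MIN)
print(f"~{nodes / 1e6:.1f}M nodes")
```

The result is about 3.9M nodes, consistent with the "over 3M" figure.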

kamirr commented Sep 14, 2022

Blocked on #209.

kamirr commented Sep 22, 2022

  1. Descriptor usage stays constant at 10 -> no leak
  2. Memory usage grows proportionally to the number of connections and is never freed -> no leak
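
One way to watch descriptor usage on Linux is to count the entries in `/proc/<pid>/fd` at intervals during the test. The sketch below is an assumption about how such a check could be done, not a description of how the numbers above were collected; both helper names are hypothetical.

```python
import os


def open_fd_count(pid: int) -> int:
    """Number of file descriptors currently open in process `pid` (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def looks_like_fd_leak(samples: list[int], tolerance: int = 0) -> bool:
    """True if the count grew by more than `tolerance` between the first and last sample."""
    return len(samples) >= 2 and samples[-1] - samples[0] > tolerance


# Demo on the current process; a real check would pass the relay's PID
# and take the samples before, during, and after the load test.
if os.path.isdir("/proc/self/fd"):
    samples = [open_fd_count(os.getpid()) for _ in range(3)]
    print(samples, "leak suspected:", looks_like_fd_leak(samples))
```

A count that stays flat across the whole run, like the constant 10 reported above, is what the absence of a leak looks like.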

kamirr commented Sep 23, 2022

Reached 50k concurrently connected idle nodes.

kamirr commented Sep 23, 2022

Memory usage:

Image

It's clear that the memory usage of ya-relay grows roughly linearly, at a rate of ~1 MiB per 1000 connections.
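
As a rough planning aid, the growth rate can be turned into a per-connection estimate. The sketch covers only the part of memory that scales with connections. The relay's baseline footprint is not stated in this thread, so it is left out.

```python
MIB_PER_1K_CONNECTIONS = 1.0  # growth rate read off the plot above


def estimated_growth_mib(connections: int) -> float:
    """Memory attributable to connections, excluding the relay's baseline footprint."""
    return connections / 1000 * MIB_PER_1K_CONNECTIONS


for n in (1_000, 10_000, 50_000):
    print(f"{n:>6} connections -> ~{estimated_growth_mib(n):.0f} MiB")
```

At the 50k-connection target this comes to roughly 50 MiB on top of the baseline.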

kamirr commented Sep 26, 2022

Load test run for up to 50k connections, with weights find=1, neighbours=1, ping=8, 10 requests per connection, and a 25k req/s rate limit:

Established 10 connections
| Failed connections: 0
| Total requests / sec: 23659.729.
| Average request-response delay: 0.00019259228s
| Dropped packets: 0
| Connections after filtering dropped: 10
Established 20 connections
| Failed connections: 0
| Total requests / sec: 20564.63.
| Average request-response delay: 0.00043507927s
| Dropped packets: 0
| Connections after filtering dropped: 20
Established 100 connections
| Failed connections: 0
| Total requests / sec: 23721.914.
| Average request-response delay: 0.0016799923s
| Dropped packets: 0
| Connections after filtering dropped: 100
Established 200 connections
| Failed connections: 0
| Total requests / sec: 23617.314.
| Average request-response delay: 0.0022316263s
| Dropped packets: 0
| Connections after filtering dropped: 200
Established 500 connections
| Failed connections: 0
| Total requests / sec: 15066.296.
| Average request-response delay: 0.058720216s
| Dropped packets: 71
| Connections after filtering dropped: 500
Established 1000 connections
| Failed connections: 0
| Total requests / sec: 10210.948.
| Average request-response delay: 0.09038644s
| Dropped packets: 350
| Connections after filtering dropped: 1000
Established 2000 connections
| Failed connections: 0
| Total requests / sec: 11623.276.
| Average request-response delay: 0.12302283s
| Dropped packets: 1024
| Connections after filtering dropped: 2000
Established 3000 connections
| Failed connections: 0
| Total requests / sec: 11637.868.
| Average request-response delay: 0.16153425s
| Dropped packets: 1856
| Connections after filtering dropped: 3000
Established 4000 connections
| Failed connections: 0
| Total requests / sec: 11398.173.
| Average request-response delay: 0.2528429s
| Dropped packets: 2358
| Connections after filtering dropped: 4000
Established 5000 connections
| Failed connections: 0
| Total requests / sec: 12747.932.
| Average request-response delay: 0.24257876s
| Dropped packets: 3163
| Connections after filtering dropped: 5000
Established 6000 connections
| Failed connections: 0
| Total requests / sec: 13088.596.
| Average request-response delay: 0.3376408s
| Dropped packets: 3642
| Connections after filtering dropped: 6000
Established 7000 connections
| Failed connections: 0
| Total requests / sec: 12947.235.
| Average request-response delay: 0.338364s
| Dropped packets: 4893
| Connections after filtering dropped: 7000
Established 8000 connections
| Failed connections: 0
| Total requests / sec: 13722.738.
| Average request-response delay: 0.29519024s
| Dropped packets: 7544
| Connections after filtering dropped: 8000
Established 9000 connections
| Failed connections: 0
| Total requests / sec: 12439.43.
| Average request-response delay: 0.36848822s
| Dropped packets: 9750
| Connections after filtering dropped: 9000
Established 10000 connections
| Failed connections: 0
| Total requests / sec: 12355.304.
| Average request-response delay: 0.44554806s
| Dropped packets: 12360
| Connections after filtering dropped: 10000
Established 11000 connections
| Failed connections: 0
| Total requests / sec: 11258.634.
| Average request-response delay: 0.64638424s
| Dropped packets: 14369
| Connections after filtering dropped: 11000
Established 12000 connections
| Failed connections: 0
| Total requests / sec: 12362.913.
| Average request-response delay: 0.56893057s
| Dropped packets: 16773
| Connections after filtering dropped: 12000
Established 13000 connections
| Failed connections: 0
| Total requests / sec: 11547.944.
| Average request-response delay: 0.7455934s
| Dropped packets: 18520
| Connections after filtering dropped: 13000
Established 14000 connections
| Failed connections: 0
| Total requests / sec: 12860.999.
| Average request-response delay: 0.6902675s
| Dropped packets: 20323
| Connections after filtering dropped: 14000
Established 15000 connections
| Failed connections: 0
| Total requests / sec: 11692.787.
| Average request-response delay: 0.7964773s
| Dropped packets: 22295
| Connections after filtering dropped: 15000
Established 16000 connections
| Failed connections: 0
| Total requests / sec: 12184.959.
| Average request-response delay: 0.8926443s
| Dropped packets: 23737
| Connections after filtering dropped: 16000
Established 17000 connections
| Failed connections: 0
| Total requests / sec: 13131.121.
| Average request-response delay: 0.77501076s
| Dropped packets: 26371
| Connections after filtering dropped: 17000
Established 18000 connections
| Failed connections: 0
| Total requests / sec: 13860.973.
| Average request-response delay: 0.7339702s
| Dropped packets: 28806
| Connections after filtering dropped: 18000
Established 19000 connections
| Failed connections: 0
| Total requests / sec: 13957.561.
| Average request-response delay: 0.80036736s
| Dropped packets: 31071
| Connections after filtering dropped: 19000
Established 20000 connections
| Failed connections: 0
| Total requests / sec: 14254.854.
| Average request-response delay: 0.8201631s
| Dropped packets: 33250
| Connections after filtering dropped: 20000
Established 21000 connections
| Failed connections: 0
| Total requests / sec: 14948.654.
| Average request-response delay: 0.7807064s
| Dropped packets: 35907
| Connections after filtering dropped: 21000
Established 22000 connections
| Failed connections: 0
| Total requests / sec: 15930.579.
| Average request-response delay: 0.704516s
| Dropped packets: 38551
| Connections after filtering dropped: 22000
Established 23000 connections
| Failed connections: 0
| Total requests / sec: 16182.949.
| Average request-response delay: 0.7205822s
| Dropped packets: 41036
| Connections after filtering dropped: 23000
Established 24000 connections
| Failed connections: 0
| Total requests / sec: 16423.83.
| Average request-response delay: 0.74673784s
| Dropped packets: 43194
| Connections after filtering dropped: 24000
Established 25000 connections
| Failed connections: 0
| Total requests / sec: 17314.04.
| Average request-response delay: 0.70349336s
| Dropped packets: 45518
| Connections after filtering dropped: 25000
Established 26000 connections
| Failed connections: 0
| Total requests / sec: 16848.113.
| Average request-response delay: 0.79856133s
| Dropped packets: 47419
| Connections after filtering dropped: 26000
Established 27000 connections
| Failed connections: 0
| Total requests / sec: 16796.406.
| Average request-response delay: 0.9134657s
| Dropped packets: 49567
| Connections after filtering dropped: 27000
Established 28000 connections
| Failed connections: 0
| Total requests / sec: 17471.947.
| Average request-response delay: 0.83176434s
| Dropped packets: 51898
| Connections after filtering dropped: 28000
Established 29000 connections
| Failed connections: 1
| Total requests / sec: 18106.266.
| Average request-response delay: 0.80756134s
| Dropped packets: 54261
| Connections after filtering dropped: 29000
Established 30000 connections
| Failed connections: 0
| Total requests / sec: 18410.33.
| Average request-response delay: 0.85279614s
| Dropped packets: 56192
| Connections after filtering dropped: 30000
Established 31000 connections
| Failed connections: 0
| Total requests / sec: 18515.232.
| Average request-response delay: 0.8146225s
| Dropped packets: 58206
| Connections after filtering dropped: 31000
Established 32000 connections
| Failed connections: 0
| Total requests / sec: 18516.807.
| Average request-response delay: 0.898815s
| Dropped packets: 60351
| Connections after filtering dropped: 32000
Established 33000 connections
| Failed connections: 0
| Total requests / sec: 19322.541.
| Average request-response delay: 0.8019248s
| Dropped packets: 62883
| Connections after filtering dropped: 33000
Established 34000 connections
| Failed connections: 0
| Total requests / sec: 19502.52.
| Average request-response delay: 0.91279507s
| Dropped packets: 64816
| Connections after filtering dropped: 34000
Established 35000 connections
| Failed connections: 0
| Total requests / sec: 19692.396.
| Average request-response delay: 0.90973985s
| Dropped packets: 66842
| Connections after filtering dropped: 35000
Established 36000 connections
| Failed connections: 0
| Total requests / sec: 19899.275.
| Average request-response delay: 0.9510804s
| Dropped packets: 68960
| Connections after filtering dropped: 36000
Established 37000 connections
| Failed connections: 1
| Total requests / sec: 19952.955.
| Average request-response delay: 1.0106871s
| Dropped packets: 70984
| Connections after filtering dropped: 37000
Established 38000 connections
| Failed connections: 0
| Total requests / sec: 20160.33.
| Average request-response delay: 0.9222769s
| Dropped packets: 73252
| Connections after filtering dropped: 38000
Established 39000 connections
| Failed connections: 0
| Total requests / sec: 20328.514.
| Average request-response delay: 0.99669087s
| Dropped packets: 75145
| Connections after filtering dropped: 39000
Established 40000 connections
| Failed connections: 0
| Total requests / sec: 20493.295.
| Average request-response delay: 0.9984271s
| Dropped packets: 77251
| Connections after filtering dropped: 40000
Established 41000 connections
| Failed connections: 0
| Total requests / sec: 20613.367.
| Average request-response delay: 1.0681931s
| Dropped packets: 79457
| Connections after filtering dropped: 41000
Established 42000 connections
| Failed connections: 0
| Total requests / sec: 20693.314.
| Average request-response delay: 1.0825506s
| Dropped packets: 81437
| Connections after filtering dropped: 42000
Established 43000 connections
| Failed connections: 0
| Total requests / sec: 20833.22.
| Average request-response delay: 1.0702146s
| Dropped packets: 83615
| Connections after filtering dropped: 43000
Established 44000 connections
| Failed connections: 1
| Total requests / sec: 20924.31.
| Average request-response delay: 1.2260469s
| Dropped packets: 85622
| Connections after filtering dropped: 44000
Established 45000 connections
| Failed connections: 0
| Total requests / sec: 21045.178.
| Average request-response delay: 1.020397s
| Dropped packets: 87822
| Connections after filtering dropped: 45000
Established 46000 connections
| Failed connections: 0
| Total requests / sec: 21140.227.
| Average request-response delay: 1.0487715s
| Dropped packets: 90113
| Connections after filtering dropped: 46000
Established 47000 connections
| Failed connections: 1
| Total requests / sec: 21232.172.
| Average request-response delay: 1.1847394s
| Dropped packets: 92165
| Connections after filtering dropped: 47000
Established 48000 connections
| Failed connections: 1
| Total requests / sec: 21315.678.
| Average request-response delay: 1.2217032s
| Dropped packets: 94217
| Connections after filtering dropped: 48000
Established 49000 connections
| Failed connections: 1
| Total requests / sec: 21417.95.
| Average request-response delay: 1.2247626s
| Dropped packets: 96517
| Connections after filtering dropped: 49000
Established 50000 connections
| Failed connections: 0
| Total requests / sec: 21517.559.
| Average request-response delay: 1.2195059s
| Dropped packets: 98538
| Connections after filtering dropped: 50000
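
The output above is regular enough to parse, which makes it easy to turn the dropped-packet counts into drop rates. This is a minimal sketch: it assumes that "Dropped packets" counts unanswered requests and that every connection sent all 10 of its requests, so the denominator is connections times 10. Only the last block of the log is embedded as sample input.

```python
import re

REQUESTS_PER_CONNECTION = 10  # from the test setup described above

SAMPLE = """\
Established 50000 connections
| Failed connections: 0
| Total requests / sec: 21517.559.
| Average request-response delay: 1.2195059s
| Dropped packets: 98538
| Connections after filtering dropped: 50000
"""


def parse_runs(log: str) -> list[dict]:
    """Build one record per 'Established N connections' block."""
    runs = []
    for line in log.splitlines():
        if m := re.match(r"Established (\d+) connections", line):
            runs.append({"connections": int(m.group(1))})
        elif m := re.match(r"\| Average request-response delay: ([\d.]+)s", line):
            runs[-1]["delay_s"] = float(m.group(1))
        elif m := re.match(r"\| Dropped packets: (\d+)", line):
            runs[-1]["dropped"] = int(m.group(1))
    return runs


def drop_rate(run: dict) -> float:
    """Dropped packets as a fraction of all requests sent in the run."""
    return run["dropped"] / (run["connections"] * REQUESTS_PER_CONNECTION)


for run in parse_runs(SAMPLE):
    print(f"{run['connections']} connections: "
          f"{drop_rate(run):.1%} dropped, {run['delay_s']:.2f} s average delay")
```

For the 50k-connection run this gives a drop rate of about 19.7%, in line with the ~20% quoted in the issue summary.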

kamirr commented Sep 26, 2022

The average packet size for this setup during load testing is 27.8 bytes, which shouldn't overflow the 32 MiB buffers I've set via sysctl -w net.core.{r,w}mem_{max,default}=33554432.
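
The headroom can be estimated by comparing the buffer size with the payload of an entire test run. This sketch counts payload bytes only. It assumes the worst case of every request in the run being queued at once, and it does not model any per-packet bookkeeping overhead that the kernel may charge against the buffer, so the real capacity is likely lower than the payload-only figure.

```python
BUFFER_BYTES = 33_554_432     # value passed to sysctl above
AVG_PACKET_BYTES = 27.8       # average packet size reported above
CONNECTIONS = 50_000          # largest run in the load test
REQUESTS_PER_CONNECTION = 10  # from the load-test setup

assert BUFFER_BYTES == 32 * 1024 * 1024  # the sysctl value is exactly 32 MiB

# Worst case: every request of the whole run sits in the buffer at once.
worst_case_bytes = CONNECTIONS * REQUESTS_PER_CONNECTION * AVG_PACKET_BYTES

print(f"worst-case payload: {worst_case_bytes / 2**20:.1f} MiB "
      f"of {BUFFER_BYTES / 2**20:.0f} MiB")
print(f"payload-only capacity: ~{BUFFER_BYTES / AVG_PACKET_BYTES:,.0f} packets")
```

Even under that pessimistic assumption, the payload comes to less than half of the configured buffer.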

kamirr commented Sep 26, 2022

The typical response time grows ~linearly, and it appears that 50% of requests will time out at ~150k connections, assuming current load and no retransmissions.
