Benchmarking automation extension #3350

Merged

@axsaucedo (Contributor) commented Jun 29, 2021

Now that the benchmark tests run in an independent node group that is created automatically on demand (with node taints and node selectors), we have a standardised environment where benchmarks can be carried out reproducibly. This PR extends the benchmarking work so that a set of tests can be run, the documentation can be updated, and the benchmarks become part of the release. Current work includes:

In this PR

  • Adding protocol support to automated benchmarking workflow
  • Benchmarks comparing Python V1 vs V2 for the Python protocol
  • Service orchestrator benchmarks for different protocols: Triton and TFServing (TODO)

Still to be done in a separate PR

  • Benchmarking for varied data sizes
  • Adding benchmarking into documentation
  • Ensuring these benchmarking tests are part of release

@axsaucedo
Contributor Author

/test benchmark

@axsaucedo
Contributor Author

/test benchmark

@seldondev
Collaborator

Benchmark results

  • All mean performance latency under 5ms: True
  • All 99th performance latency under 10ms: True
  • REST throughput above 200rps: True
  • GRPC throughput above 250rps: True
  • Orch added mean latency under 2ms: True
  • Orch added 99th latency under 2ms: True

Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
0 4.24579 4.10363 5.05953 5.472 6.95265 235.444 7065 0 seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
1 4.22866 4.08512 5.06194 5.46561 6.78393 236.394 7093 0 seldon-benchmark-sdep-1 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true
2 3.09144 2.96612 3.78295 4.14485 5.38389 311.463 9346 1 seldon-benchmark-sdep-2 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
3 3.08594 2.97064 3.75724 4.15926 5.15946 311.994 9362 1 seldon-benchmark-sdep-3 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true
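
The pass/fail lines above can be reproduced from the table with a few threshold comparisons. The following is a minimal sketch only: the helper name `run_checks` and the tuple layout are assumptions made for illustration, not the bot's actual implementation, and the orchestrator checks are left out because they need paired rows.

```python
# Rows copied from the results table above:
# (apiType, mean latency in ms, 99th percentile latency in ms, throughputAchieved in rps).
rows = [
    ("rest", 4.24579, 6.95265, 235.444),
    ("rest", 4.22866, 6.78393, 236.394),
    ("grpc", 3.09144, 5.38389, 311.463),
    ("grpc", 3.08594, 5.15946, 311.994),
]


def run_checks(rows):
    """Hypothetical helper: evaluate the latency and throughput thresholds."""
    return {
        "All mean performance latency under 5ms": all(
            mean < 5 for _, mean, _, _ in rows
        ),
        "All 99th performance latency under 10ms": all(
            p99 < 10 for _, _, p99, _ in rows
        ),
        "REST throughput above 200rps": all(
            rps > 200 for api, _, _, rps in rows if api == "rest"
        ),
        "GRPC throughput above 250rps": all(
            rps > 250 for api, _, _, rps in rows if api == "grpc"
        ),
    }


checks = run_checks(rows)
for name, passed in checks.items():
    print(f"{name}: {passed}")
```

Keeping each threshold as a named entry makes it easy to see which row group caused a `False` when a later run regresses.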

@seldondev
Collaborator

Benchmark Python Wrapper V1 vs V2

Python V1 Wrapper Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
0 4.44942 4.33682 5.25495 5.66135 6.94333 224.663 6741 0 seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
1 3.18332 3.04489 3.92962 4.33793 5.39277 302.205 9071 1 seldon-benchmark-sdep-1 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false

Python V2 MLServer Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
0 2.37641 2.26739 2.81202 3.07082 4.17083 0 0 12616 seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false

@axsaucedo
Contributor Author

/test benchmark

@axsaucedo
Contributor Author

/test benchmark

@seldondev
Collaborator

Benchmark results

  • All mean performance latency under 5ms: False
  • All 99th performance latency under 10ms: True
  • REST throughput above 200rps: False
  • GRPC throughput above 250rps: False
  • Orch added mean latency under 2ms: True
  • Orch added 99th latency under 2ms: True

Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
0 5.15296 4.97755 6.16447 6.64955 8.33006 193.967 5822 0 seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
1 5.17847 4.99935 6.19308 6.71758 8.56459 193.033 5793 0 seldon-benchmark-sdep-1 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true
2 3.90665 3.76633 4.71451 5.15116 6.4635 253.329 7601 1 seldon-benchmark-sdep-2 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
3 3.96593 3.82184 4.81238 5.28647 6.73393 249.623 7492 0 seldon-benchmark-sdep-3 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true

@seldondev
Collaborator

Benchmark Python Wrapper V1 vs V2

Python V1 Wrapper Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
0 3.57514 3.42844 4.29059 4.70679 6.01861 276.721 8304 1 seldon-benchmark-sdep-1 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
1 30000.7 30000.7 30000.9 30000.9 30000.9 0 0 2 seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false

Python V2 MLServer Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
0 3.66264 3.54139 4.51547 4.94492 6.24725 272.828 8186 0 seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
1 2.84756 2.68863 3.61763 4.0318 5.22273 345.934 10381 1 seldon-benchmark-sdep-1 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
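
In the V1 table above, the REST iteration reports 0 successes, 2 errors and a mean latency of about 30000 ms, which matches the 30s run duration, so its latency figures do not describe a working deployment. A small sketch of how such iterations could be flagged before any latency check is applied; the function names and the 0.99 cut-off are assumptions for illustration, not part of the PR.

```python
# (iteration_name, apiType, success, errors) copied from the V1 table above.
v1_rows = [
    ("seldon-benchmark-sdep-1", "grpc", 8304, 1),
    ("seldon-benchmark-sdep-0", "rest", 0, 2),
]


def success_rate(success, errors):
    """Fraction of requests that succeeded; 0.0 when nothing was sent."""
    total = success + errors
    return success / total if total else 0.0


def is_failed(success, errors, min_success_rate=0.99):
    """Hypothetical rule: treat an iteration as failed below the cut-off."""
    return success_rate(success, errors) < min_success_rate


failed = [name for name, _, ok, err in v1_rows if is_failed(ok, err)]
for name, api, ok, err in v1_rows:
    print(f"{name} ({api}): success rate {success_rate(ok, err):.4f}")
print("failed iterations:", failed)
```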

@axsaucedo
Contributor Author

/test benchmark

@seldondev
Collaborator

Benchmark results

  • All mean performance latency under 5ms: True
  • All 99th performance latency under 10ms: True
  • REST throughput above 200rps: True
  • GRPC throughput above 250rps: True
  • Orch added mean latency under 2ms: True
  • Orch added 99th latency under 2ms: True

Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
0 4.93739 4.76997 5.89323 6.39761 8.02119 202.454 6075 0 seldon-benchmark-sdep-1 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true
1 3.74998 3.59055 4.57984 5.0708 6.38041 264.013 7922 1 seldon-benchmark-sdep-3 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true
2 3.72525 3.57722 4.52464 4.96769 6.53629 265.746 7975 1 seldon-benchmark-sdep-2 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
3 4.99223 4.83279 5.94388 6.48668 8.15842 200.238 6008 0 seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false

@seldondev
Collaborator

Benchmark Python Wrapper V1 vs V2

Python V1 Wrapper Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
0 4.99073 4.8272 5.955 6.49627 8.10774 200.301 6010 0 seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
1 3.77308 3.62463 4.58957 5.05169 6.46848 262.319 7872 1 seldon-benchmark-sdep-1 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false

Python V2 MLServer Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
0 3.23305 3.10672 3.91435 4.31273 5.43829 309.121 9275 0 seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
1 2.56709 2.40994 3.20814 3.58957 4.91539 383.923 11523 1 seldon-benchmark-sdep-1 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false

@axsaucedo
Contributor Author

/test benchmark

@seldondev
Collaborator

Benchmark results

  • All mean performance latency under 5ms: True
  • All 99th performance latency under 10ms: True
  • REST throughput above 200rps: True
  • GRPC throughput above 250rps: True
  • Orch added mean latency under 2ms: True
  • Orch added 99th latency under 2ms: True

Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
0 4.70667 4.5411 5.63263 6.08851 7.78293 212.383 6373 0 seldon-benchmark-sdep-1 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true
1 3.46746 3.33207 4.25709 4.70014 6.04747 285.452 8566 1 seldon-benchmark-sdep-2 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
2 3.4812 3.35163 4.24358 4.65636 6.06432 284.322 8534 0 seldon-benchmark-sdep-3 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true
3 4.71839 4.56243 5.61017 6.07867 7.50235 211.84 6356 0 seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false

@seldondev
Collaborator

Benchmark Python Wrapper V1 vs V2

  • Mean latency MLServer lower than V1 Wrapper: True
  • 99th latency MLServer lower than V1 Wrapper: True
  • Throughput MLServer larger than V1 Wrapper: True

Python V1 Wrapper Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
0 4.95829 4.83411 5.85367 6.29807 7.8838 201.569 6048 0 seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
1 3.55863 3.41793 4.41678 4.8505 6.10991 277.809 8336 1 seldon-benchmark-sdep-1 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false

Python V2 MLServer Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
0 3.06731 2.94289 3.74137 4.08847 5.16266 325.6 9769 0 seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
1 2.4228 2.27005 3.06195 3.44944 4.66497 406.227 12190 1 seldon-benchmark-sdep-1 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
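
The three V1 vs V2 comparison lines in this comment can be checked against the two tables with a per-protocol comparison. This is a sketch under the assumption that each check compares like with like (REST against REST, gRPC against gRPC); the dictionary layout and the `compare` helper are illustrative, not taken from the benchmark code.

```python
# Values copied from the V1 Wrapper and V2 MLServer tables above.
v1 = {
    "rest": {"mean": 4.95829, "p99": 7.8838, "rps": 201.569},
    "grpc": {"mean": 3.55863, "p99": 6.10991, "rps": 277.809},
}
v2 = {
    "rest": {"mean": 3.06731, "p99": 5.16266, "rps": 325.6},
    "grpc": {"mean": 2.4228, "p99": 4.66497, "rps": 406.227},
}


def compare(v1, v2):
    """Hypothetical helper: compare MLServer (V2) with the V1 wrapper per protocol."""
    apis = sorted(v1.keys() & v2.keys())
    return {
        "Mean latency MLServer lower than V1 Wrapper": all(
            v2[a]["mean"] < v1[a]["mean"] for a in apis
        ),
        "99th latency MLServer lower than V1 Wrapper": all(
            v2[a]["p99"] < v1[a]["p99"] for a in apis
        ),
        "Throughput MLServer larger than V1 Wrapper": all(
            v2[a]["rps"] > v1[a]["rps"] for a in apis
        ),
    }


comparison = compare(v1, v2)
speedup = {api: v2[api]["rps"] / v1[api]["rps"] for api in v1}

for name, passed in comparison.items():
    print(f"{name}: {passed}")
for api, ratio in speedup.items():
    print(f"{api} throughput ratio V2/V1: {ratio:.2f}")
```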

@axsaucedo
Contributor Author

/test benchmark

@seldondev
Collaborator

Benchmark results

  • Orch added mean latency under 4ms: True
  • Orch added 99th latency under 10ms: True

Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
2 6.8184 5.6304 8.8922 14.4083 29.4334 145.288 4363 1 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
3 4.4155 4.05728 6.11208 7.03026 10.6065 223.504 6708 1 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true
1 9.13061 8.73492 11.416 12.3218 14.8557 109.484 3286 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
0 7.81662 7.44449 9.9064 10.8297 13.3229 127.872 3838 0 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true
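
The header names 25 columns, but each row in these tables carries a leading index plus only 23 or 24 values, because `modelUri`, `image` and `server` are left empty when they do not apply. The sketch below shows one way to map a flattened row back onto the header. The classification rules (a `gs://` prefix means `modelUri`, a `_SERVER` suffix means `server`, anything else in that position is `image`) are assumptions inferred from the rows in this thread, and `parse_row` is a hypothetical helper.

```python
# Columns that are always present, before and after the optional middle group.
HEAD = ["mean", "50th", "90th", "95th", "99th", "throughputAchieved", "success",
        "errors", "iteration_name", "replicas", "serverWorkers", "serverThreads"]
TAIL = ["apiType", "requestsCpu", "requestsMemory", "limitsCpu", "limitsMemory",
        "benchmarkCpu", "concurrency", "duration", "rate", "disableOrchestrator"]


def parse_row(line):
    """Map one whitespace-separated results row onto the header columns."""
    tokens = line.split()
    index, values = tokens[0], tokens[1:]
    row = {"index": int(index), "modelUri": "", "image": "", "server": ""}
    row.update(zip(HEAD, values[:len(HEAD)]))
    row.update(zip(TAIL, values[-len(TAIL):]))
    # Whatever sits between HEAD and TAIL is modelUri, image and/or server.
    for token in values[len(HEAD):-len(TAIL)]:
        if token.startswith("gs://"):
            row["modelUri"] = token
        elif token.endswith("_SERVER"):
            row["server"] = token
        else:
            row["image"] = token
    return row


image_row = parse_row(
    "2 6.8184 5.6304 8.8922 14.4083 29.4334 145.288 4363 1 "
    "seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev "
    "grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false"
)
model_row = parse_row(
    "0 4.24579 4.10363 5.05953 5.472 6.95265 235.444 7065 0 "
    "seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER "
    "rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false"
)
print(image_row["image"], image_row["apiType"], image_row["disableOrchestrator"])
print(model_row["modelUri"], model_row["server"], model_row["apiType"])
```

Values are kept as strings so that nothing is lost in conversion; numeric columns can be cast with `float` or `int` where needed.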

@seldondev
Collaborator

Benchmark results

Results for NDArray

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
3 71.6766 67.7274 97.2662 102.912 111.543 13.9239 417 1 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
0 11.793 10.3577 17.3481 21.2033 30.1872 84.7655 2544 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
4 6442.46 6882.19 10368.8 12347.3 15481.9 23.1996 549 149 seldon-benchmark-sdep-5 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
2 555.351 514.757 827.09 926.523 1230.01 265.919 8061 0 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
5 2299.66 2347.33 2762.06 3042.91 4250.53 21.7195 602 50 seldon-benchmark-sdep-4 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
1 203.484 196.907 296.572 304.395 369.658 240.152 7231 0 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false

Results for Tensor

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
2 7.64145 5.9514 13.0974 17.8981 28.6452 129.515 3888 1 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
0 11.8302 10.5622 17.3944 20.331 28.8974 84.501 2537 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
4 483.302 467.135 692.284 800.851 1385.1 308.217 9120 150 seldon-benchmark-sdep-5 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
5 542.918 504.791 808.941 950.558 1189.69 272.554 8228 0 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
3 187.483 191.684 276.385 300.152 331.906 264.588 7900 47 seldon-benchmark-sdep-4 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
1 202.719 197.587 293.14 304.35 376.536 239.021 7189 0 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false

Results for TFTensor

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
2 7.97326 6.74053 12.6986 15.7614 22.5158 123.836 3717 0 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
0 12.0013 10.6623 17.6986 21.0736 28.5857 83.2669 2499 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
3 515.605 497.936 659.918 690.523 790.84 290.054 8562 150 seldon-benchmark-sdep-5 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
4 605.227 590.341 893.809 997.878 1225.2 244.042 7398 0 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
5 173.517 181.859 225.376 248.024 296.755 287.146 8575 49 seldon-benchmark-sdep-4 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
1 218.365 202.102 304.228 331.509 402.912 222.896 6705 0 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
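
One way to sanity-check these tables is Little's law: if the load generator keeps a fixed number of requests in flight (an assumption here, suggested by the `concurrency` column together with `rate` 0), throughput should be close to concurrency divided by mean latency. The sketch below applies this to the NDArray rows above; the 5% tolerance is an arbitrary choice for illustration.

```python
# (apiType, concurrency, mean latency in ms, throughputAchieved in rps),
# copied from the "Results for NDArray" table above.
ndarray_rows = [
    ("rest", 1, 11.793, 84.7655),
    ("rest", 50, 203.484, 240.152),
    ("rest", 150, 555.351, 265.919),
    ("grpc", 1, 71.6766, 13.9239),
    ("grpc", 50, 2299.66, 21.7195),
    ("grpc", 150, 6442.46, 23.1996),
]


def predicted_rps(concurrency, mean_ms):
    """Little's law for a closed loop: throughput = in-flight requests / latency."""
    return concurrency / (mean_ms / 1000.0)


relative_errors = []
for api, concurrency, mean_ms, observed in ndarray_rows:
    predicted = predicted_rps(concurrency, mean_ms)
    error = abs(predicted - observed) / observed
    relative_errors.append(error)
    print(f"{api} c={concurrency}: predicted {predicted:.1f} rps, "
          f"observed {observed:.1f} rps, relative error {error:.2%}")

consistent = all(error < 0.05 for error in relative_errors)
print("all rows within 5%:", consistent)
```

Read this way, the flat throughput between concurrency 50 and 150 alongside a roughly threefold rise in mean latency points to requests queueing rather than the server doing more work per second.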

@axsaucedo
Contributor Author

/test benchmark

@seldondev
Collaborator

Benchmark results - Testing Service Orchestrator

  • Orch added mean latency under 4ms: True
  • Orch added 99th latency under 10ms: True

Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
2 5.15403 3.89944 6.02333 11.3927 31.4884 192.641 5782 1 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
3 3.22051 2.98861 4.17858 4.81459 7.79193 307.02 9215 1 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true
0 11.0216 10.2335 14.6559 16.5246 21.0272 90.6968 2723 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
1 9.36717 8.69793 12.2781 14.2591 18.8423 106.689 3202 0 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true

@seldondev
Collaborator

Benchmark Results - Python Wrapper V1 vs V2

  • V1 base mean performance latency under 10ms: True
  • V1 base 99th performance latency under 10ms: True
  • V1 base throughput above 180rps: True
  • V1 base throughput above 250rps: True
  • V2 mean performance latency under 5ms: True
  • V2 99th performance latency under 10ms: True
  • V2 REST throughput above 250rps: True
  • V2 throughput above 300rps: True
  • Mean latency MLServer lower than V1 Wrapper: False
  • Throughput MLServer larger than V1 Wrapper: False

Python V1 Wrapper Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
4 4.08986 3.66395 5.60625 6.83834 10.541 241.878 7259 1 seldon-benchmark-sdep-3 1 1 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
0 5.16367 4.54065 7.28882 9.71349 13.9216 193.578 5809 0 seldon-benchmark-sdep-0 1 1 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
3 400.934 401.084 428.741 439.069 458.643 373.382 11070 150 seldon-benchmark-sdep-5 1 1 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
1 415.53 400.694 491.901 497.442 518.089 356.74 10880 0 seldon-benchmark-sdep-2 1 1 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
5 133.86 132.434 144.927 150.397 162.953 373.022 11141 50 seldon-benchmark-sdep-4 1 1 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
2 137.107 106.477 198.766 201.292 205.485 361.454 10886 0 seldon-benchmark-sdep-1 1 1 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false

Python V2 MLServer Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
2 4.12082 3.43671 6.00235 8.30387 16.4189 238.548 7160 0 seldon-benchmark-sdep-3 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
0 3.65189 3.26385 4.98153 6.14327 9.53051 273.537 8207 0 seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
5 187.676 192.684 229.117 281.205 326.426 791.143 23650 149 seldon-benchmark-sdep-5 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
3 373.978 387.042 508.624 601.274 757.677 396.372 12016 0 seldon-benchmark-sdep-2 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
4 77.6699 85.1183 110.742 118.833 169.029 638.356 19121 48 seldon-benchmark-sdep-4 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
1 126.207 102.975 197.376 200.925 212.091 391.329 11763 0 seldon-benchmark-sdep-1 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false

@seldondev
Collaborator

Benchmark results - Testing Seldon V1 Data Types

Results for NDArray

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
3 68.1328 64.9898 90.0858 93.9834 109.741 14.6491 439 1 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
0 9.72465 8.25725 14.6195 17.3003 22.5055 102.808 3085 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
2 6308.58 6564.85 14578.5 15749.3 17138.2 23.6806 562 150 seldon-benchmark-sdep-5 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
4 510.149 491.97 778.677 854.659 1094.9 289.217 8746 0 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
1 2143.12 2178.94 2337.28 3044.03 4200.76 23.2947 651 49 seldon-benchmark-sdep-4 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
5 181.188 189.69 274.966 289.593 305.456 268.48 8091 0 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false

Results for Tensor

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
1 5.81137 4.3411 9.90641 13.5387 22.7451 170.702 5125 1 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
5 9.18743 7.61367 14.4127 16.8671 23.6237 108.808 3265 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
3 416.115 393.207 638.766 758.437 1015.7 356.528 10607 135 seldon-benchmark-sdep-5 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
2 502.524 488.258 786.331 881.115 1063.06 294.078 8897 0 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
4 162.537 176.269 213.678 232.851 304.843 304.117 9099 42 seldon-benchmark-sdep-4 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
0 178.064 189.311 269.392 288.554 306.565 273.839 8229 0 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false

Results for TFTensor

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
3 8.21936 6.81466 13.0282 16.296 22.6892 120.23 3608 1 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
5 11.8728 10.5678 17.489 20.49 27.0432 84.0393 2522 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
2 549.301 538.618 654.32 697.077 757.088 272.486 8049 138 seldon-benchmark-sdep-5 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
4 603.912 585.876 911.368 1057.95 1335.62 244.451 7407 0 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
1 180.782 186.568 221.317 242.708 292.174 275.74 8257 26 seldon-benchmark-sdep-4 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
0 212.247 202.386 295.785 306.535 375.944 227.099 6865 0 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false

@axsaucedo
Contributor Author

/test benchmark

@seldondev
Collaborator

Benchmark results - Testing Service Orchestrator

  • Orch added mean latency under 4ms: True
  • Orch added 99th latency under 10ms: True
  • Orch added 99th latency under 20ms: False

Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
2 4.90751 3.75487 5.86977 10.0065 29.9701 202.198 6067 1 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
3 2.85435 2.62116 3.76296 4.40825 6.93841 346.293 10391 1 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true
1 7.05389 6.75762 8.57473 9.22972 11.0321 141.73 4253 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
0 5.87834 5.60745 7.37147 8.03516 9.63059 170.055 5103 0 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true
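
The latency added by the service orchestrator can be estimated by pairing each protocol's run with the orchestrator enabled (`disableOrchestrator` false) against the run with it disabled (true) and subtracting. The sketch below does this for the table above. It is an illustration only: how the bot maps each threshold to a protocol is not shown in this thread, so the code reports the differences without asserting which check they feed.

```python
# (apiType, disableOrchestrator, mean latency in ms, 99th percentile latency in ms),
# copied from the results table above.
orch_rows = [
    ("grpc", False, 4.90751, 29.9701),
    ("grpc", True, 2.85435, 6.93841),
    ("rest", False, 7.05389, 11.0321),
    ("rest", True, 5.87834, 9.63059),
]


def added_latency(rows):
    """Per protocol: latency with the orchestrator minus latency without it."""
    with_orch = {api: (mean, p99) for api, disabled, mean, p99 in rows if not disabled}
    without_orch = {api: (mean, p99) for api, disabled, mean, p99 in rows if disabled}
    return {
        api: {
            "mean": with_orch[api][0] - without_orch[api][0],
            "p99": with_orch[api][1] - without_orch[api][1],
        }
        for api in sorted(with_orch.keys() & without_orch.keys())
    }


added = added_latency(orch_rows)
for api, delta in added.items():
    print(f"{api}: added mean {delta['mean']:.2f} ms, added 99th {delta['p99']:.2f} ms")
```

On these numbers the REST pair adds under 2 ms at both the mean and the 99th percentile, while the gRPC pair adds about 2 ms at the mean and more than 20 ms at the 99th percentile.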

@seldondev
Collaborator

Benchmark Results - Python Wrapper V1 vs V2

  • V1 base mean performance latency under 10ms: True
  • V1 base 99th performance latency under 10ms: True
  • V1 base throughput above 180rps: True
  • V1 base throughput above 250rps: True
  • V2 mean performance latency under 5ms: True
  • V2 99th performance latency under 10ms: True
  • V2 REST throughput above 250rps: True
  • V2 throughput above 300rps: True

Python V1 Wrapper Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
1 3.8921 3.35946 5.78556 7.17891 11.2162 253.771 7615 1 seldon-benchmark-sdep-3 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
2 5.42425 4.52794 8.09048 10.783 16.9794 184.285 5531 0 seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
5 341.154 340.141 368.707 380.665 398.234 438.532 13045 150 seldon-benchmark-sdep-5 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
3 308.099 298.686 415.63 492.34 599.008 482.092 14521 0 seldon-benchmark-sdep-2 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
4 113.071 112.006 123.806 128.895 146.884 441.699 13203 49 seldon-benchmark-sdep-4 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
0 113.988 100.627 193.683 198.203 205.545 432.718 13031 0 seldon-benchmark-sdep-1 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false

Python V2 MLServer Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
5 2.76987 2.34974 3.62858 4.85175 11.2628 355.559 10670 1 seldon-benchmark-sdep-3 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
0 3.33193 2.94124 4.5088 6.01422 9.62315 299.872 8997 0 seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
4 160.767 179.65 209.068 218.003 277.699 923.943 27643 136 seldon-benchmark-sdep-5 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
3 295.65 295.411 405.972 497.889 598.887 502.885 15199 0 seldon-benchmark-sdep-2 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
2 64.047 73.8454 103.148 108.785 120.907 774.006 23183 50 seldon-benchmark-sdep-4 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
1 108.627 100.064 186.074 195.36 210.477 454.092 13664 0 seldon-benchmark-sdep-1 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false

@seldondev
Collaborator

Benchmark results - Testing Seldon V1 Data Types

Results for NDArray

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
4 62.3106 58.548 81.963 88.4372 105.088 16.0162 480 1 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
3 8.78941 7.18349 14.1115 16.9139 22.9719 113.738 3413 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
2 5611.92 6147.92 9916.78 10549 12911 26.6725 652 150 seldon-benchmark-sdep-5 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
0 447.102 411.507 688.712 769.403 906.463 330.109 9965 0 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
1 2103.61 2089.39 2210.46 2999.03 4114.11 23.7446 663 50 seldon-benchmark-sdep-4 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
5 163.738 186.891 204.186 213.329 296.271 299.343 9002 0 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false

Results for Tensor

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
2 5.76382 4.26853 10.0626 13.7692 23.1642 171.822 5156 1 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
0 8.63553 7.44773 12.7511 14.8581 19.1817 115.766 3475 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
5 374.875 362.651 599.177 669.478 830.728 396.033 11772 150 seldon-benchmark-sdep-5 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
3 450.015 415.873 687.875 756.224 946.016 327.919 9939 0 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
4 144.923 130.06 203.124 213.819 285.176 341.404 10211 43 seldon-benchmark-sdep-4 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
1 162.385 185.024 204.034 212.197 295.821 301.782 9084 0 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false

Results for TFTensor

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
1 4.82383 3.74498 8.15413 10.7113 17.6007 205.211 6158 1 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
5 7.83063 6.7784 11.6627 13.7981 18.858 127.668 3831 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
4 325.775 309.89 399.872 417.328 590.189 458.817 13626 150 seldon-benchmark-sdep-5 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
2 464.534 445.39 694.765 795.962 992.598 317.225 9585 0 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
3 113.284 105.379 173.399 187.005 206.623 438.165 13101 50 seldon-benchmark-sdep-4 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
0 166.388 185.56 206.135 260.192 293.772 293.385 8831 0 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
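
To compare the three data types, the gRPC rows at concurrency 1 (`seldon-benchmark-sdep-3`) from the tables above can be put side by side. A small sketch, taking NDArray as the baseline; the latency values are assumed to be in milliseconds, as the check labels elsewhere in this thread suggest:

```python
# (mean latency, throughputAchieved) for gRPC at concurrency 1,
# copied from the NDArray, Tensor and TFTensor tables above.
grpc_c1 = {
    "NDArray": (62.3106, 16.0162),
    "Tensor": (5.76382, 171.822),
    "TFTensor": (4.82383, 205.211),
}

base_mean, base_rps = grpc_c1["NDArray"]
relative = {
    name: {
        "latency_ratio": base_mean / mean,     # >1 means lower latency than NDArray
        "throughput_ratio": rps / base_rps,    # >1 means more requests per second
    }
    for name, (mean, rps) in grpc_c1.items()
}
fastest = min(grpc_c1, key=lambda name: grpc_c1[name][0])

for name, ratios in relative.items():
    print(name, {k: round(v, 2) for k, v in ratios.items()})
print("lowest mean latency:", fastest)
```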

@axsaucedo
Contributor Author

/test benchmark

@seldondev
Collaborator

Benchmark results - Testing Service Orchestrator

  • Orch added mean latency under 4ms: True
  • Orch added 99th latency under 10ms: False
  • Orch added 99th latency under 20ms: False

Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
3 4.24752 3.37341 5.07965 7.65699 26.0987 233.872 7018 1 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
2 0.883817 nan nan nan nan 1093.57 0 32818 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true
1 6.82495 6.48407 8.33517 8.98417 10.8026 146.471 4395 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
0 5.70936 5.40244 7.23149 7.83358 9.61033 175.091 5254 0 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true
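
In the table above, the gRPC run with `disableOrchestrator` set to `true` reports 0 successes, 32818 errors and `nan` percentiles, so any difference taken against it is undefined. The sketch below shows how a NaN propagates through a subtraction and makes every threshold comparison false. That would be consistent with both 99th-percentile checks reporting False, but whether the benchmark script computes its checks this way is an assumption; `added_latency` and `run_is_usable` are hypothetical helpers.

```python
import math


def added_latency(with_orch: float, without_orch: float) -> float:
    """Latency attributed to the orchestrator: difference between paired runs."""
    return with_orch - without_orch


def run_is_usable(success: int, p99: float) -> bool:
    """A run with no successful requests or a NaN percentile cannot be compared."""
    return success > 0 and not math.isnan(p99)


# 99th percentile values from the table above.
grpc_added_p99 = added_latency(26.0987, float("nan"))  # baseline run failed
rest_added_p99 = added_latency(10.8026, 9.61033)

checks = {
    "grpc under 10ms": grpc_added_p99 < 10,  # NaN compares False
    "grpc under 20ms": grpc_added_p99 < 20,  # NaN compares False
    "rest under 10ms": rest_added_p99 < 10,
}
print(checks)
print("gRPC baseline usable:", run_is_usable(0, float("nan")))
```

Filtering out unusable runs before evaluating the thresholds would separate a failed run from a genuine latency regression.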

@axsaucedo
Contributor Author

/test benchmark

@seldondev
Collaborator

Benchmark results - Testing Service Orchestrator

  • Orch added mean latency under 4ms: True
  • Orch added 99th latency under 10ms: True
  • Orch added 99th latency under 20ms: False

Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
2 4.52744 3.4123 5.23325 8.93742 30.4707 219.406 6583 1 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
3 2.65708 2.44403 3.52362 4.13786 6.50324 372.25 11172 1 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true
1 6.1767 5.89703 7.50543 8.09237 9.63994 161.861 4857 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
0 5.15675 4.91453 6.45019 7.05105 8.4752 193.866 5817 0 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true

@axsaucedo
Contributor Author

/test benchmark

@seldondev
Collaborator

Benchmark results - Testing Service Orchestrator

  • Orch added mean latency under 4ms: True
  • Orch added 95th latency under 5ms: True
  • Orch added 99th latency under 10ms: True

Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
2 3.05604 2.91424 3.74038 4.19462 5.76141 323.732 9715 1 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
3 2.5212 2.38453 3.23671 3.62385 4.67847 391.351 11745 1 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true
0 4.81854 4.71247 5.64963 6.05518 7.42155 207.448 6225 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
1 3.79547 3.64428 4.69612 5.17197 6.4236 0 0 7901 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 true
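
One way to read the "Orch added" checks is as the difference between each run with the orchestrator (`disableOrchestrator` false) and its counterpart without it (`disableOrchestrator` true). The sketch below applies that reading to the table above; the definition is an assumption, not taken from the benchmark script. Note also that the REST run without the orchestrator reports 0 successes and 7901 errors, so the REST differences should be treated with caution.

```python
# (mean, 95th, 99th) latencies copied from the table above.
with_orch = {
    "grpc": (3.05604, 4.19462, 5.76141),
    "rest": (4.81854, 6.05518, 7.42155),
}
without_orch = {
    "grpc": (2.5212, 3.62385, 4.67847),
    "rest": (3.79547, 5.17197, 6.4236),
}

# Added latency per API type, assuming "added" means with minus without.
added = {
    api: tuple(w - wo for w, wo in zip(with_orch[api], without_orch[api]))
    for api in with_orch
}

# Thresholds (ms) taken from the three check labels in this comment.
checks = {
    "mean under 4ms": all(a[0] < 4 for a in added.values()),
    "95th under 5ms": all(a[1] < 5 for a in added.values()),
    "99th under 10ms": all(a[2] < 10 for a in added.values()),
}

for api, (mean, p95, p99) in added.items():
    print(f"{api}: mean +{mean:.3f} 95th +{p95:.3f} 99th +{p99:.3f}")
print(checks)
```

Under this reading all three thresholds hold for both API types, matching the reported results.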

@seldondev
Collaborator

Benchmark Results - Python Wrapper V1 vs V2

  • V1 base mean performance latency under 10ms: True
  • V1 base 99th performance latency under 10ms: True
  • V1 base throughput above 180rps: True
  • V1 base throughput above 250rps: True
  • V2 mean performance latency under 5ms: True
  • V2 99th performance latency under 10ms: True
  • V2 REST throughput above 250rps: True
  • V2 throughput above 300rps: True

Python V1 Wrapper Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
3 4.51205 3.7245 6.52654 9.10316 18.0315 219.049 6573 1 seldon-benchmark-sdep-3 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
2 5.48869 4.60204 8.11257 11.0101 16.5157 182.116 5465 0 seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
4 349.612 346.366 375.072 388.289 448.731 428.257 12725 143 seldon-benchmark-sdep-5 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
1 312.32 298.626 430.871 499.202 603.78 475.345 14338 0 seldon-benchmark-sdep-2 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
5 116.666 115.514 128.848 134.224 144.511 427.985 12792 49 seldon-benchmark-sdep-4 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
0 115.213 101.021 194.914 199.564 204.956 428.455 12878 0 seldon-benchmark-sdep-1 1 5 1 gs://seldon-models/sklearn/iris SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false

Python V2 MLServer Results table

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
2 2.88296 2.32552 3.48678 5.5434 15.5918 342.066 10265 1 seldon-benchmark-sdep-3 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
0 3.59347 3.04333 4.77537 7.24763 13.1018 278.139 8345 0 seldon-benchmark-sdep-0 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
5 159.303 181.222 207.448 214.475 233.631 932.234 27839 149 seldon-benchmark-sdep-5 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
3 295.708 296.204 406.256 498.435 600.956 503.623 15263 0 seldon-benchmark-sdep-2 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
4 65.7001 74.7523 105.956 112.661 123.774 754.839 22624 48 seldon-benchmark-sdep-4 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
1 108.801 100.454 186.211 195.435 207.463 454.31 13667 0 seldon-benchmark-sdep-1 1 5 1 gs://seldon-models/sklearn/iris-0.23.2/lr_model SKLEARN_SERVER rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
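
The two tables above can be compared row by row by matching on `apiType` and `concurrency`. A short sketch that computes the V2-to-V1 throughput ratio for each pairing, with values copied from the `throughputAchieved` column:

```python
# throughputAchieved keyed by (apiType, concurrency), from the two tables above.
v1 = {
    ("grpc", 1): 219.049, ("rest", 1): 182.116,
    ("grpc", 50): 427.985, ("rest", 50): 428.455,
    ("grpc", 150): 428.257, ("rest", 150): 475.345,
}
v2 = {
    ("grpc", 1): 342.066, ("rest", 1): 278.139,
    ("grpc", 50): 754.839, ("rest", 50): 454.31,
    ("grpc", 150): 932.234, ("rest", 150): 503.623,
}

# Ratio > 1 means the V2 MLServer run achieved higher throughput.
ratio = {key: v2[key] / v1[key] for key in v1}
largest_gain = max(ratio, key=ratio.get)

for (api, concurrency), r in sorted(ratio.items()):
    print(f"{api:4s} concurrency={concurrency:<3d} v2/v1 = {r:.2f}")
print("largest gain:", largest_gain)
```

In this single run the V2 throughput is higher for every pairing, with the largest ratios on gRPC at higher concurrency and the smallest on REST at concurrency 50 and 150.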

@seldondev
Collaborator

Benchmark results - Testing Seldon V1 Data Types

Results for NDArray

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
0 66.8707 63.6629 87.6114 93.7489 105.32 14.9208 447 1 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
2 9.66414 8.593 14.4534 17.1444 24.1056 103.442 3104 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
4 5506.95 6228.21 9192.02 10075.6 11816.3 27.1919 667 150 seldon-benchmark-sdep-5 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
1 444.925 416.15 679.579 723.618 887.075 331.812 10051 0 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
3 2022.59 2099.33 2250.41 2435.8 3196.99 24.6837 691 50 seldon-benchmark-sdep-4 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
5 161.282 181.485 206.454 220.586 292.397 301.254 9081 0 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false

Results for Tensor

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
3 5.97613 4.51508 10.3704 13.8279 23.2594 165.744 4976 0 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
0 8.85852 7.65901 12.8637 15.0374 20.2541 112.848 3388 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
5 369.078 361.582 583.212 621.36 839.455 402.308 11938 150 seldon-benchmark-sdep-5 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
2 447.854 413.191 689.183 790.88 994.9 329.975 10019 0 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
4 143.891 131.75 204.622 215.355 278.894 344.132 10281 50 seldon-benchmark-sdep-4 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
1 162.292 185.225 204.248 212.38 298.306 301.405 9073 0 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false

Results for TFTensor

mean 50th 90th 95th 99th throughputAchieved success errors iteration_name replicas serverWorkers serverThreads modelUri image server apiType requestsCpu requestsMemory limitsCpu limitsMemory benchmarkCpu concurrency duration rate disableOrchestrator
4 5.09014 4.02705 8.40656 10.9267 17.6478 194.502 5836 1 seldon-benchmark-sdep-3 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
0 8.32181 7.14604 12.5144 15.0654 20.4664 120.13 3606 0 seldon-benchmark-sdep-0 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 1 30s 0 false
3 338.354 320.587 404.778 429.318 493.222 441.213 13139 138 seldon-benchmark-sdep-5 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
5 467.002 465.676 704.293 800.162 992.245 315.821 9555 0 seldon-benchmark-sdep-2 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 150 30s 0 false
2 114.977 106.906 171.188 185.381 207.109 431.468 12905 49 seldon-benchmark-sdep-4 1 5 1 seldonio/seldontest_predict:1.10.0-dev grpc 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
1 166.962 186.964 205.154 261.972 296.961 292.298 8787 0 seldon-benchmark-sdep-1 1 5 1 seldonio/seldontest_predict:1.10.0-dev rest 2000Mi 500Mi 2000Mi 500Mi 1 50 30s 0 false
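
Because the data-type benchmark was posted twice in this thread, the two runs give a rough sense of run-to-run variation. A sketch comparing the REST mean latency at concurrency 1 (`seldon-benchmark-sdep-0`) between the earlier comment and the tables above; two runs are far too few for any statistical claim:

```python
# REST mean latency at concurrency 1, per data type.
first_run = {"NDArray": 8.78941, "Tensor": 8.63553, "TFTensor": 7.83063}
second_run = {"NDArray": 9.66414, "Tensor": 8.85852, "TFTensor": 8.32181}

# Relative change of the second run with respect to the first.
rel_change = {
    name: (second_run[name] - first_run[name]) / first_run[name]
    for name in first_run
}

for name, change in rel_change.items():
    print(f"{name}: {change:+.1%}")
```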

@axsaucedo axsaucedo requested a review from ukclivecox July 2, 2021 09:03
@axsaucedo
Contributor Author

Ready to merge now

@ukclivecox
Contributor

/approve

@seldondev
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cliveseldon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ukclivecox ukclivecox merged commit 7673af6 into SeldonIO:master Jul 2, 2021