bentoml/llm-bench

BentoCloud Benchmark Client

Usage

1. Set up environment variables

Make sure you are logged in to Hugging Face:

huggingface-cli login

Set environment variables for benchmarking

export BASE_URL=<BentoCloud Service URL>
export SYSTEM_PROMPT=1      # 1 or 0
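
Before starting a long session, it can help to confirm that both variables are set to usable values. The following is a minimal sketch of such a check, not code taken from benchmark.py: the function name load_benchmark_env is hypothetical, and how the benchmark script itself reads these variables is an assumption.

```python
import os


def load_benchmark_env(environ=None):
    """Return (base_url, use_system_prompt) or raise ValueError.

    Hypothetical helper: the README only states that BASE_URL holds the
    BentoCloud Service URL and that SYSTEM_PROMPT is 1 or 0.
    """
    env = os.environ if environ is None else environ

    base_url = env.get("BASE_URL", "").strip()
    if not base_url:
        raise ValueError("BASE_URL is not set")

    flag = env.get("SYSTEM_PROMPT", "").strip()
    if flag not in ("0", "1"):
        raise ValueError("SYSTEM_PROMPT must be 1 or 0")

    # Convert the "1"/"0" string into a boolean for the caller.
    return base_url, flag == "1"
```

Calling load_benchmark_env() with no arguments reads the real process environment; passing a dict makes the check easy to try without exporting anything.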

2. Run benchmark

python benchmark.py --max_users 10 --session_time 300 --ping_correction
  • --max_users is the maximum number of concurrent users to spawn
  • --session_time is the duration of the benchmark session, in seconds
  • --ping_correction is a flag; when set, ping latency is deducted from the reported metrics
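
To illustrate what deducting ping latency from a metric can look like, here is a minimal sketch. It is an assumption about the idea, not the logic in benchmark.py: the function name correct_latency, the use of seconds, and the clamping at zero are all hypothetical choices made for this example.

```python
def correct_latency(measured_s, ping_s, enabled=True):
    """Subtract network ping time from a measured latency, in seconds.

    Hypothetical helper: the result is clamped at zero so that a noisy
    ping estimate can never produce a negative latency.
    """
    if not enabled:
        return measured_s
    return max(0.0, measured_s - ping_s)


# Example values only; these are not benchmark results.
samples = [0.5, 0.75, 1.0]
ping = 0.25
corrected = [correct_latency(s, ping) for s in samples]
print(corrected)
```

With correction disabled (enabled=False), the measured value is returned unchanged, which corresponds to running without the flag.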
