MLPerf Client is a benchmark for Windows and macOS focusing on client form factors in ML inference scenarios such as AI chatbots, image classification, and more. The benchmark evaluates performance across different hardware and software configurations and provides a command line interface.
- Features
- Running the Application
- Command Line Arguments
- Configuration File Format
- Supported Execution Providers and Configurations
- Supported Platforms by Execution Provider
## Features

The MLPerf Client benchmark measures the performance of inference tasks on personal computers. It supports the following features:
- The application provides reference models for inference, including support for models such as Llama 2.
- Independent Hardware Vendors (IHVs) offer paths as Execution Providers, with their own custom model formats and stack dependencies, which adhere to IHV APIs.
- The client bundles all enabled/active paths from the EP list (see below).
- Configure hardware and inference parameters using a JSON configuration file.
- The final build product is a single binary for each platform, with model and data files downloaded as needed.
## Running the Application

You can run the application using:

```
.\mlperf-windows.exe [options]
```
## Command Line Arguments

Below are the available command line arguments that control the tool's behavior:
| Argument | Description | Required | Default Value | Available Values |
|---|---|---|---|---|
| `-h, --help` | Show help message and exit | No | | |
| `-v, --version` | Show the version of the tool | No | | |
| `-c, --config` | Path to the configuration file | Yes | | Any valid path |
| `-o, --output-dir` | Specify the output directory. If not provided, the default output directory is used. | No | | |
| `-d, --data-dir` | Specify the directory where all required data files will be downloaded | No | `data` | |
| `-p, --pause` | Flag to allow pausing the program at the end of execution | No | `true` | `true`, `false` |
| `-l, --logger` | Path to the log4cxx configuration file | No | `log4cxx.xml` | Any valid path |
| `-m, --list-models` | List all the available models supported by the tool | No | | |
| `-b, --download_behaviour` | Controls the file download behaviour | No | `normal` | `forced`, `prompt`, `skip_all`, `deps_only`, `normal` |
Typical usage of the tool looks like this:

```
.\mlperf-windows.exe -c NVIDIA_ORTGenAI-DML_GPU.json
```

This command runs the benchmark using `NVIDIA_ORTGenAI-DML_GPU.json` as the configuration file.
The `-b` option controls the behavior of the application's file downloading process:

- `forced`: The application will download the necessary files even if they are already present.
- `prompt`: The application will check for the existence of required files. If any are missing, the user will be prompted to decide whether to download the missing files or abort the operation.
- `skip_all`: The application will skip downloading any files. If a required file is missing, an error will be displayed, and the operation will be aborted.
- `deps_only`: The application will download only the required dependencies and exit.
- `normal` (default): The application will check the local cache first. If files are not cached or their URIs have changed, it will download the missing/updated files and cache them for future use.
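For example, to run the benchmark while skipping all downloads (assuming the required files are already in place; the configuration file name is just the one from the earlier example):

```
.\mlperf-windows.exe -c NVIDIA_ORTGenAI-DML_GPU.json -b skip_all
```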
## Configuration File Format

Now, let's see how to create a configuration file for the tool. Before reading this section, please check the repository's data folder for sample configuration files. The configuration file is a JSON file that stores the tool's settings. This is an example of a configuration file:
```json
{
  "SystemConfig": {
    "Comment": "Default config",
    "TempPath": ""
  },
  "Scenarios": [
    list of scenarios here
  ]
}
```
Where:

- `SystemConfig`: Contains the system configuration settings for the tool.
  - `Comment`: A comment for the configuration file that describes the configuration.
  - `TempPath`: The path to the temporary directory where the tool will store the downloaded files. If this is not provided, the tool will use the system's temporary directory.
  - `BaseDir`: An optional field that can be used to specify a base directory that will be prepended to all relative paths defined in the configuration file. When a relative path is detected in the configuration, the tool automatically prepends the `BaseDir` value to it.
  - `DownloadBehavior`: An optional field that does exactly what the `-b` command line argument does: it controls the behavior of the application's file downloading process. If `DownloadBehavior` is provided in both the configuration file and the command line, the command line argument takes precedence.
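  For instance, a `SystemConfig` block that uses both optional fields might look like this (a sketch; the directory paths are hypothetical, and the `DownloadBehavior` value is one of the `-b` options described above):

  ```json
  {
    "SystemConfig": {
      "Comment": "Config with optional fields",
      "TempPath": "C:/mlperf/temp",
      "BaseDir": "C:/mlperf",
      "DownloadBehavior": "deps_only"
    }
  }
  ```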
- `Scenarios`: A list of scenarios to run using the tool. Each scenario has the following format:

  ```json
  {
    "Name": "Llama2",
    "Models": [list of models here],
    "InputFilePath": ["input file path1", "input file path2", ...],
    "AssetsPath": ["assets path1", "assets path2", ...],
    "ResultsVerificationFile": "path to results verification file",
    "DataVerificationFile": "path to data verification file",
    "Iterations": 1000,
    "WarmUp": 1,
    "Delay": 0,
    "ExecutionProviders": [list of execution providers here]
  }
  ```

Where:
- `Name`: the name of the scenario.
- `Models`: a list of models to use for the scenario; the tool will run the scenario using each of the models in the list. Each model has the following format:

  ```json
  {
    "ModelName": "Llama2 llama-2-7b-chat-dml",
    "FilePath": "https://client.mlcommons-storage.org/deps/0.5/scenario_files/llm/llama2/models/OrtGenAI/llama-2-7b-chat-dml/model.onnx",
    "DataFilePath": "https://client.mlcommons-storage.org/deps/0.5/scenario_files/llm/llama2/models/OrtGenAI/llama-2-7b-chat-dml/model.onnx.data.zip",
    "TokenizerPath": "https://client.mlcommons-storage.org/deps/0.5/scenario_files/llm/llama2/models/OrtGenAI/llama-2-7b-chat-dml/tokenizer.zip"
  }
  ```

  Where:

  - `ModelName`: Specifies the model name.
  - `FilePath`: Specifies the path to the main model file (e.g., llama2-cpu-int4/model.onnx).
  - `DataFilePath`: Specifies the path to the data file associated with the model (e.g., llama-2-7b-chat-dml/model.onnx.data). This file is required for executing the Llama2 scenario. Some models may have this file split into multiple files; to avoid listing all of them manually in this dictionary, they can be zipped into a single file, and the app will take care of the unzipping.
  - `TokenizerPath`: Specifies the path to the tokenizer file (e.g., llama-2-7b-chat/tokenizer.zip). This file is required for executing the Llama2 scenario and contains the necessary tokenizer configuration and data. (The file does not need to be in ZIP format; it can be any supported format compatible with the model.)
- `InputFilePath`: a list of paths to the data files used in the scenario (input data); this can be a list of JSON files or even ZIP files.
- `AssetsPath`: a list of paths to the asset files used in the scenario.
- `ResultsVerificationFile`: the path to the file that contains the expected results of the scenario, i.e., the expected output for each input.
- `DataVerificationFile`: a path to the file that contains the SHA256 hashes of all the files used in the scenario; this is used to verify the integrity of the data files. Please note that this file is optional, but if it is provided, it should contain the hashes of all the files used in the scenario. The format of the `DataVerificationFile` should be:

  ```json
  {
    "FileHashes": [
      "8b13ac3ccb407ecbbcae0fe427f065c44dd96e259935de35699e607ec5ba12bc",
      ...
    ]
  }
  ```

  The order of the hashes does not matter; the tool verifies the integrity of the files by comparing the hashes of the files with the hashes in the `DataVerificationFile`.
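  Hashes in this format can be produced with standard tools; for example, in PowerShell (the file name below is hypothetical, and `.ToLower()` matches the lowercase hex used in the example above):

  ```
  (Get-FileHash -Algorithm SHA256 .\model.onnx).Hash.ToLower()
  ```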
- `Iterations`: the number of iterations to run the scenario for.
- `WarmUp`: the number of warmup iterations to run the scenario for.
- `Delay`: the delay in seconds between each scenario iteration. This is an optional argument, and the default value varies by model.
. -
ExecutionProviders
: a list of execution providers to use for the scenario, the tool will run the scenario using each of the execution providers in the list. each execution provider has the following format:{ "Name": "OrtGenAI", "Config": { "device_type": "GPU", "device_id": 0 }, "Models": [ overriden models here ] }
Where:
Name
: the name of the execution provider.Config
: the configuration for the execution provider, this is specific to each execution provider, for example, the configuration for OrtGenAI execution provider may look like this:
{ "device_type": "GPU", "device_id": 0 }
  - `Models`: an optional section that can be used to override specific models for the current execution provider. It has the same format as the `Models` section of the scenario. You need to provide the `ModelName` and override values for `FilePath`, `DataFilePath`, and `TokenizerPath` if necessary, as in the sketch below.
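    For example, an execution provider entry that overrides the tokenizer of one model might look like this (a sketch; the override URL is hypothetical, and `ModelName` must match a model declared in the scenario's `Models` list):

    ```json
    {
      "Name": "OrtGenAI",
      "Config": { "device_type": "GPU", "device_id": 0 },
      "Models": [
        {
          "ModelName": "Llama2 llama-2-7b-chat-dml",
          "TokenizerPath": "https://example.com/custom/tokenizer.zip"
        }
      ]
    }
    ```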
This tool accepts URIs in the following formats:

- Local files: `file://<path>` (must be a valid absolute local path, or a path relative to `BaseDir` if provided; please note that these files are stored in the cache later).
- Remote resources: `https://<url>` (must be a valid URL).
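For instance, an `InputFilePath` list could mix both forms (a sketch; both paths are hypothetical, and the relative local path would be resolved against `BaseDir` if one is set):

```json
"InputFilePath": [
  "file://data/inputs/input1.json",
  "https://client.mlcommons-storage.org/deps/0.5/..."
]
```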
You can find specific tested workflows for different vendors in `data/configs/vendors_default/`.
## Supported Execution Providers and Configurations

The following table contains a list of the supported execution providers (EPs) and the possible configurations we support for each of them.
| Execution Provider | Configuration Options | Configuration Type | Required | Default Value | Available Values | Description |
|---|---|---|---|---|---|---|
| NativeOpenVINO (✅ Active) | device_type | string | True | | ["GPU"] | |
| | num_of_threads | integer | False | 0 | >= -1 | |
| | num_streams | integer | False | 0 | >= 1 | |
| OrtGenAI (✅ Active) | device_type | string | True | | ["GPU"] | |
| | device_id | integer | False | 0 | | |
| | device_vendor | string | False | | | |
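As a concrete illustration, a `Config` block for the NativeOpenVINO EP using these options might look like this (a sketch; the thread and stream counts are arbitrary values within the documented ranges):

```json
{
  "device_type": "GPU",
  "num_of_threads": 4,
  "num_streams": 1
}
```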
- ✅ Active: Enabled for production and development builds.
- OrtGenAI:
  - `device_id` is not required. If not present, the app will use any adapters it finds, unless they are software/default render devices.
  - `device_type` is optional; "GPU" is a valid option. If it is used:
    - If `device_id` is present, it will determine whether the device is a GPU. If it does not match, it won't run.
    - If `device_id` is not provided, `device_type` will function as a filter, only running on devices that are "GPU" or "NPU".
  - `device_vendor`: can be a vendor name, such as "Nvidia", "AMD", or "Intel". Devices will be filtered to match the vendor, case insensitively:
    - If `device_id` is present, the app will check whether the device vendor matches. If it does not match, it will not run.
    - If `device_id` is not specified, `device_vendor` will act as a filter, only running on devices that match the value.
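For example, an OrtGenAI configuration that pins no specific adapter but restricts execution to one vendor's GPUs might look like this (a sketch; the vendor value is illustrative):

```json
{
  "device_type": "GPU",
  "device_vendor": "AMD"
}
```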
## Supported Platforms by Execution Provider

| Execution Provider | Windows x64 | Windows ARM | macOS | Comments |
|---|---|---|---|---|
| NativeOpenVINO | ✅ | ❌ | ❌ | |
| OrtGenAI | ✅ | ❌ | ❌ | |