1.1 Install Python environment
conda create -n <env name> python=3.7
conda activate <env name>
cd <neural_compressor_folder>/examples/baremetal/nlp/stsb/paraphrase_xlm_r_multilingual_v1
pip install -r requirements.txt
Preloading libiomp5.so can improve performance when the batch size is 1.
export LD_PRELOAD=<path_to_libiomp5.so>
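For example, if the Intel OpenMP runtime is installed in the active conda environment, libiomp5.so typically sits under that environment's lib directory (an assumption; adjust the path to wherever the library lives on your system):
export LD_PRELOAD=${CONDA_PREFIX}/lib/libiomp5.so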
Preloading libjemalloc.so can also improve performance. A prebuilt copy is provided in third_party/jemalloc/lib.
export LD_PRELOAD=<path_to_libjemalloc.so>
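To preload both libraries at once, note that LD_PRELOAD accepts a colon-separated list (the paths below are placeholders, as above):
export LD_PRELOAD=<path_to_libiomp5.so>:<path_to_libjemalloc.so>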
Prepare the dataset and model:
python prepare_dataset.py --output_dir=./data
bash prepare_model.sh
Run tuning with Python:
GLOG_minloglevel=2 python run_engine.py --tune
Or run the shell script:
bash run_tuning.sh --config=bert_static.yaml --input_model=paraphrase_xlm_r_multilingual_v1_stsb.onnx --output_model=ir --dataset_location=data
2.1 Accuracy
Run with Python:
GLOG_minloglevel=2 python run_engine.py --input_model=./ir --benchmark --mode=accuracy --batch_size=4
Or run the shell script:
bash run_benchmark.sh --config=bert_static.yaml --input_model=ir --dataset_location=data --batch_size=4 --mode=accuracy
2.2 Performance
Run with Python:
GLOG_minloglevel=2 python run_engine.py --input_model=./ir --benchmark --mode=performance --batch_size=4
Or run the shell script:
bash run_benchmark.sh --config=bert_static.yaml --input_model=ir --dataset_location=data --batch_size=4 --mode=performance
Or run the C++ inferencer. The warmup count is recommended to be 1/10 of the iteration count and no less than 3; see the sketch after the commands below.
export GLOG_minloglevel=2
export OMP_NUM_THREADS=<cpu_cores>
export DNNL_MAX_CPU_ISA=AVX512_CORE_AMX
export UNIFIED_BUFFER=1
numactl -C 0-<cpu_cores-1> <neural_compressor_folder>/engine/bin/inferencer --batch_size=<batch_size> --iterations=<iterations> --w=<warmup> --seq_len=128 --config=./ir/conf.yaml --weight=./ir/model.bin
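As a worked example of the warmup rule above, this minimal shell sketch derives the warmup count from the iteration count (the 56-core machine, batch size, and iteration value are hypothetical; substitute your own):

ITERATIONS=100
WARMUP=$(( ITERATIONS / 10 ))          # 1/10 of iterations -> 10
[ "$WARMUP" -lt 3 ] && WARMUP=3        # enforce the floor of 3
export GLOG_minloglevel=2
export OMP_NUM_THREADS=56
export DNNL_MAX_CPU_ISA=AVX512_CORE_AMX
export UNIFIED_BUFFER=1
numactl -C 0-55 <neural_compressor_folder>/engine/bin/inferencer --batch_size=8 --iterations=${ITERATIONS} --w=${WARMUP} --seq_len=128 --config=./ir/conf.yaml --weight=./ir/model.bin

With 100 iterations the warmup is 10; with 20 iterations the floor applies and the warmup stays at 3.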