Commit

Use existing importance matrix files for all quant formats
countzero committed Jun 20, 2024
1 parent fd1785e commit 98a1e1c
Showing 3 changed files with 13 additions and 5 deletions.
5 changes: 4 additions & 1 deletion README.md
@@ -53,6 +53,9 @@ TARGET_DIRECTORY=.\gguf
# physical drive to improve the quantization speed.
CACHE_DIRECTORY=.\cache
+# Path to the directory for importance matrix files.
+IMPORTANCE_MATRIX_DIRECTORY=.\imatrix
+#
# Comma separated list of quantization types.
#
@@ -106,7 +109,7 @@ QUANTIZATION_TYPES=Q5_K_M,Q3_K_S
Clone a Git repository containing an LLM into the `SOURCE_DIRECTORY` without checking out any files and downloading any large files (lfs).

```PowerShell
-git -C "./source" clone --no-checkout https://huggingface.co/openchat/openchat-3.5-0106
+git -C "./source" clone --no-checkout https://huggingface.co/openchat/openchat-3.6-8b-20240522
```

### 2. Download model sources
3 changes: 3 additions & 0 deletions imatrix/.gitignore
@@ -0,0 +1,3 @@
+# Ignore everything in this directory except this file.
+*
+!.gitignore
10 changes: 6 additions & 4 deletions quantize_weights_for_llama.cpp.ps1
@@ -14,6 +14,7 @@ Get-Content "./.env" | ForEach {
$llamaCppDirectory = Resolve-Path -Path $env:LLAMA_CPP_DIRECTORY
$sourceDirectory = Resolve-Path -Path $env:SOURCE_DIRECTORY
$targetDirectory = Resolve-Path -Path $env:TARGET_DIRECTORY
+$importanceMatrixDirectory = Resolve-Path -Path $env:IMPORTANCE_MATRIX_DIRECTORY
$cacheDirectory = Resolve-Path -Path $env:CACHE_DIRECTORY
$trainingDataPath = Resolve-Path -Path $env:TRAINING_DATA
$cleanCache = [System.Convert]::ToBoolean($env:CLEAN_CACHE)
@@ -42,7 +43,7 @@ ForEach ($repositoryName in $repositoryDirectories) {

# Note that we are not removing *.importance-matrix.dat files because
# they are relatively small but take a _very_ long time to compute.
-$importanceMatrixPath = Join-Path -Path $targetDirectoryPath -ChildPath "${repositoryName}.importance-matrix.dat"
+$importanceMatrixPath = Join-Path -Path $importanceMatrixDirectory -ChildPath "${repositoryName}.importance-matrix.dat"

# If a repository already contains an unquantized GGUF file we are using it directly.
$unquantizedModelPathFromSource = Join-Path -Path $sourceDirectory -ChildPath $repositoryName | Join-Path -ChildPath "${repositoryName}.gguf"
@@ -64,8 +65,8 @@ ForEach ($repositoryName in $repositoryDirectories) {
Invoke-Expression "$convertCommand --outfile `"${unquantizedModelPath}`" `"${sourceDirectoryPath}`""
}

-# We need to compute an importance matrix for all i-quants and
-# small k-quants to enhance the quality of the quantum models.
+# We need to compute an importance matrix for all i-quants
+# and small k-quants to enhance the quality of the models.
# https://github.com/ggerganov/llama.cpp/tree/master/examples/imatrix
$requiresImportanceMatrix = $type.Contains("IQ") -or "Q2_K Q2_K_S".Contains($type)

@@ -84,7 +85,8 @@ ForEach ($repositoryName in $repositoryDirectories) {

$quantizeCommand = "${llamaCppDirectory}\build\bin\Release\llama-quantize.exe"

-if ($requiresImportanceMatrix) {
+# If an importance matrix file is available we are using it.
+if (Test-Path -Path $importanceMatrixPath) {
$quantizeCommand = "${quantizeCommand} --imatrix `"${importanceMatrixPath}`""
}
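
The last hunk makes the `--imatrix` option depend on whether the importance matrix file exists, not on the quantization type. A minimal sketch of that decision, isolated into a helper for illustration: `Get-QuantizeArguments` and its parameter names are hypothetical and not part of this commit, and the positional argument order (input model, output model, type) is assumed from the `llama-quantize` usage text, so it should be checked against the local build.

```PowerShell
# Hypothetical helper, not part of the commit: builds the argument list
# for llama-quantize and adds --imatrix only when the file exists.
function Get-QuantizeArguments {
    param(
        [Parameter(Mandatory)] [string] $ImportanceMatrixPath,
        [Parameter(Mandatory)] [string] $UnquantizedModelPath,
        [Parameter(Mandatory)] [string] $QuantizedModelPath,
        [Parameter(Mandatory)] [string] $Type
    )

    $arguments = @()

    # An existing importance matrix is used for every quantization type,
    # not only for the types that strictly require one.
    if (Test-Path -Path $ImportanceMatrixPath -PathType Leaf) {
        $arguments += @("--imatrix", $ImportanceMatrixPath)
    }

    # Assumed positional order: input model, output model, quantization type.
    $arguments += @($UnquantizedModelPath, $QuantizedModelPath, $Type)

    return $arguments
}
```

The resulting array could be passed to the executable with the call operator and splatting, for example `& $quantizeExecutable @arguments`, which avoids assembling a quoted command string by hand. The script in this commit builds a string instead, so this is an alternative shape and not a description of its implementation.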

