Support for Loading a Subset of Tensors for LoRA Models #399

Closed
skeskinen opened this issue Mar 22, 2023 · 6 comments · Fixed by #820
Labels
enhancement New feature or request model Model specific 🦙. llama

Comments

@skeskinen

Firstly, thank you for the awesome project. I'm new to LLMs so I hope this suggestion makes sense.

LoRA is a technique that reduces the number of trainable parameters during fine-tuning, and it is really taking off with the recent Alpaca work. In LoRA models, typically only the weight matrices Wq and Wv are fine-tuned.
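To make the parameter savings concrete, here is a minimal sketch of the usual LoRA formulation, in which a frozen weight matrix W receives a low-rank update B·A that can later be merged back into W. The matrix size (4096 x 4096) and the rank (8) are illustrative assumptions, not values taken from this thread, and the helper names are hypothetical.

```python
# Illustrative LoRA arithmetic. The dimensions and rank used below are
# assumptions chosen for the example, not values from this issue.

def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in            # fine-tuning W directly
    lora = rank * (d_out + d_in)   # training B (d_out x rank) and A (rank x d_in)
    return full, lora


def merge_lora(W, B, A, scale):
    """Return W + scale * (B @ A), using plain nested lists."""
    rows, cols, rank = len(W), len(W[0]), len(A)
    return [
        [W[i][j] + scale * sum(B[i][k] * A[k][j] for k in range(rank))
         for j in range(cols)]
        for i in range(rows)
    ]


full, lora = lora_param_counts(4096, 4096, 8)
print(f"full: {full} params, LoRA: {lora} params, ratio: {full // lora}x")
```

After merging, the result has the same shape as the original W, which is why a merged Wq or Wv can be shipped as an ordinary tensor.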

For projects shipping multiple LoRA fine-tuned models, most of the tensors remain unchanged by fine-tuning. Storing all weights once per fine-tune wastes a significant amount of storage (e.g., ~3.5 GB per fine-tune for a 7B model, multiplied by the number of tasks or personalities you want to ship). Supporting the loading of a subset of tensors for LoRA models would let llama.cpp store and load these models efficiently, reducing storage requirements and possibly the memory footprint if you want to keep multiple models in memory at the same time.
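A rough back-of-the-envelope estimate of the savings might look like the following. Only the ~3.5 GB figure comes from the text above; the layer count, hidden size, and 0.5 bytes per weight (approximately 4-bit quantization, ignoring per-block scale overhead) are assumptions for a 7B-class model, and the function names are hypothetical.

```python
# Rough storage estimate. Everything except FULL_MODEL_GB is an assumption.
N_LAYERS = 32            # assumed transformer layer count
D_MODEL = 4096           # assumed hidden size (Wq and Wv are D_MODEL x D_MODEL)
BYTES_PER_WEIGHT = 0.5   # ~4-bit quantization, ignoring scale overhead
FULL_MODEL_GB = 3.5      # figure quoted in the issue


def subset_size_gb(n_layers, d_model, tensors_per_layer, bytes_per_weight):
    """Size of the changed tensors only (e.g. Wq and Wv in every layer)."""
    n_weights = n_layers * tensors_per_layer * d_model * d_model
    return n_weights * bytes_per_weight / 1e9


def total_storage_gb(n_finetunes, per_finetune_gb, shared_gb=0.0):
    """Total disk usage: one optional shared base plus a cost per fine-tune."""
    return shared_gb + n_finetunes * per_finetune_gb


subset_gb = subset_size_gb(N_LAYERS, D_MODEL, 2, BYTES_PER_WEIGHT)
full_copies = total_storage_gb(10, FULL_MODEL_GB)
base_plus_subsets = total_storage_gb(10, subset_gb, shared_gb=FULL_MODEL_GB)
print(f"subset per fine-tune: {subset_gb:.2f} GB")
print(f"10 full copies: {full_copies:.1f} GB, base + 10 subsets: {base_plus_subsets:.1f} GB")
```

Under these assumptions each fine-tune costs roughly half a gigabyte instead of a full model copy.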

I propose extending llama.cpp by adding support for loading a subset of tensors from separate .bin files. This way, all the work of merging the LoRA weights would still be done in Python, and the model-subset .bin files could be quantized as usual.
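As a sketch of the idea only (this is not llama.cpp's actual loader or file format), the overlay could be modelled as a name-to-tensor mapping in which entries from a subset file replace the matching entries of the base model. The tensor names, the dict-based tensor representation, and the `overlay_tensors` helper are all hypothetical.

```python
def overlay_tensors(base, subset):
    """Return a new name -> tensor mapping where `subset` entries replace `base` ones.

    Tensors are modelled as dicts with "shape" and "data" keys purely for
    illustration. A subset tensor must already exist in the base model and
    must have the same shape.
    """
    merged = dict(base)  # shallow copy so the base mapping is left untouched
    for name, tensor in subset.items():
        if name not in base:
            raise KeyError(f"tensor {name!r} is not present in the base model")
        if tensor["shape"] != base[name]["shape"]:
            raise ValueError(
                f"shape mismatch for {name!r}: "
                f"{tensor['shape']} vs {base[name]['shape']}"
            )
        merged[name] = tensor
    return merged


# Hypothetical tensor names and tiny placeholder data.
base_model = {
    "layers.0.attention.wq.weight": {"shape": (2, 2), "data": [1, 2, 3, 4]},
    "layers.0.attention.wk.weight": {"shape": (2, 2), "data": [5, 6, 7, 8]},
}
finetune_subset = {
    "layers.0.attention.wq.weight": {"shape": (2, 2), "data": [9, 9, 9, 9]},
}

model = overlay_tensors(base_model, finetune_subset)
print(model["layers.0.attention.wq.weight"]["data"])  # replaced by the subset
print(model["layers.0.attention.wk.weight"]["data"])  # unchanged from the base
```

Because the base mapping is not modified, several subsets can be overlaid on the same base in turn, which is what switching between personalities would need.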

An alternative could be to natively support LoRA in llama.cpp. However, this approach would likely not be compatible with pre-quantization of the weights (afaict).

@Green-Sky Green-Sky added enhancement New feature or request model Model specific labels Mar 22, 2023
@ggerganov ggerganov added the 🦙. llama label Mar 22, 2023
@ggerganov
Owner

Thank you for the useful summary of LoRA - I wasn't familiar and was wondering what it actually means.
The proposed functionality sounds like something that can be achieved relatively easily in the existing framework.

Just curious - is this functionality currently available in other frameworks?
Loading multiple personalities of the model in-memory with reduced storage and dynamically switching between them.

@BadisG

BadisG commented Mar 22, 2023

@ggerganov LoRAs are used a lot in Stable Diffusion and in the webui version of llama as well: oobabooga/text-generation-webui#332 (it doesn't work with 4-bit models for them at the moment, though)

@bakkot
Contributor

bakkot commented Mar 23, 2023

> Loading multiple personalities of the model in-memory with reduced storage and dynamically switching between them.

With Stable Diffusion, loading LoRAs separately from models is very popular - there's a whole ecosystem of LoRAs distributed on places like Civitai. Many people end up with dozens or hundreds of LoRAs, which is much more practical than keeping dozens of 4GB+ models. That will be even more so with LLaMA, given its larger size.

I expect this to be popular for LLaMA as well once the process for fine-tuning models gets to be more accessible.

@redthing1

See related technique: #528

@edwios

edwios commented Mar 29, 2023

There are already related discussions and attempts here: #172

and an implementation (using the original LLaMA checkpoints) here: https://github.com/tloen/alpaca-lora#inference-generatepy

If LoRA can be made to work with q4, it'd be an awesome feature for both text generation and chat, very much like LoRA for images with Stable Diffusion.

@skeskinen
Author

That discussion is kind of orthogonal to this feature request. alpaca-lora has a script for merging the LoRA weights and converting back to PyTorch format, and the result can be used with llama.cpp as usual. That already works today.

@ggerganov ggerganov changed the title Support for Loading a Subset of Tensors for LoRA Models Support for Loading a Subset of Tensors for LoRA Models Apr 14, 2023
@ggerganov ggerganov linked a pull request Apr 14, 2023 that will close this issue
@ggerganov ggerganov moved this to In Progress in llama : add LoRA support Apr 14, 2023
@github-project-automation github-project-automation bot moved this from In Progress to Done in llama : add LoRA support Apr 17, 2023
AAbushady pushed a commit to AAbushady/llama.cpp that referenced this issue Jan 27, 2024
* koboldcpp-ROCm Port

commit 3416c98
Merge: 5eb17f0 4c4e435
Author: YellowRoseCx <[email protected]>
Date:   Fri Aug 25 13:46:56 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 5eb17f0
Author: YellowRoseCx <[email protected]>
Date:   Fri Aug 25 13:38:21 2023 -0500

    ROCm Port update

    * use hipblas based on cublas
    * Update Makefile for the Cuda kernels
    * Expand arch list and make it overrideable
    * Fix multi GPU on multiple amd architectures with rocblas_initialize() (ggerganov#5)
    * add hipBLAS to README
    * new build arg LLAMA_CUDA_MMQ_Y
    * fix half2 decomposition
    * Add intrinsics polyfills for AMD
    * AMD assembly optimized __dp4a
    * Allow overriding CC_TURING
    * use "ROCm" instead of "CUDA"
    * ignore all build dirs
    * Add Dockerfiles
    * fix llama-bench
    * fix -nommq help for non CUDA/HIP

    ---------

    Co-Authored-By: YellowRoseCx <[email protected]>
    Co-Authored-By: ardfork <[email protected]>
    Co-Authored-By: funnbot <[email protected]>
    Co-Authored-By: Engininja2 <[email protected]>
    Co-Authored-By: Kerfuffle <[email protected]>
    Co-Authored-By: jammm <[email protected]>
    Co-Authored-By: jdecourval <[email protected]>

commit b34f4bd
Author: YellowRoseCx <[email protected]>
Date:   Sat Aug 19 17:12:52 2023 -0500

    Update README.md

commit 7d11961
Author: YellowRoseCx <[email protected]>
Date:   Mon Aug 14 23:03:12 2023 -0500

    remove force DMMV

commit cd61aa0
Author: YellowRoseCx <[email protected]>
Date:   Sat Aug 12 17:24:31 2023 -0500

    restore main_gpu parameter

commit 4a042f3
Author: Henri Vasserman <[email protected]>
Date:   Sat Aug 12 10:51:46 2023 +0300

    gfx1100 support

    ---------

    Co-authored-by: ardfork <[email protected]>
    Co-authored-by: jammm <[email protected]>
    Co-authored-by: jdecourval <[email protected]>

commit 8913bc6
Author: Henri Vasserman <[email protected]>
Date:   Fri Aug 11 10:16:02 2023 +0300

    Allow overriding CC_TURING

commit e77a4c3
Author: Henri Vasserman <[email protected]>
Date:   Fri Aug 11 10:00:07 2023 +0300

    Merge 'origin/master' into hipblas

commit cc4c4e3
Author: Engininja2 <[email protected]>
Date:   Fri Aug 11 09:43:14 2023 +0300

    New __dp4a assembly

    Now compatible with gfx900 and faster as well.

commit 1a03b70
Author: Henri Vasserman <[email protected]>
Date:   Fri Aug 11 09:30:28 2023 +0300

    Undo mess

    ---------

    Co-authored-by: ardfork <[email protected]>

commit 4366ff9
Author: DannyDaemonic <[email protected]>
Date:   Thu Aug 10 13:11:36 2023 -0700

    Handle `ENABLE_VIRTUAL_TERMINAL_PROCESSING` more gracefully on earlier versions of Windows.

commit 811ff85
Author: Christian Demsar <[email protected]>
Date:   Thu Aug 10 10:28:27 2023 -0400

    Add --n-predict -2 for stopping generation on full context (ggerganov#2565)

commit 37c9717
Author: Martin Krasser <[email protected]>
Date:   Thu Aug 10 12:16:38 2023 +0200

    Fix grammar-based sampling issue in server (ggerganov#2566)

commit d18ecd5
Author: YellowRoseCx <[email protected]>
Date:   Thu Aug 10 13:19:41 2023 -0500

    make mmq gen faster for amd

commit 243894a
Author: Henri Vasserman <[email protected]>
Date:   Thu Aug 10 12:14:40 2023 +0300

    ws fix

commit ac2f14d
Author: Engininja2 <[email protected]>
Date:   Thu Aug 10 12:11:27 2023 +0300

    AMD assembly optimized __dp4a

    Doesn't seem to work for gfx900, so commented out.

commit 9dba0c9
Author: Henri Vasserman <[email protected]>
Date:   Thu Aug 10 12:09:28 2023 +0300

    Fix merge

    ---------

    Co-authored-by: ardfork <[email protected]>
    Co-authored-by: Kerfuffle <[email protected]>

commit f570b5c
Author: YellowRoseCx <[email protected]>
Date:   Wed Aug 9 22:11:20 2023 -0500

    Revert "revert cuda changes as they are bugggy"

    This reverts commit 1541bf8.

commit 1541bf8
Author: Concedo <[email protected]>
Date:   Wed Aug 9 22:36:41 2023 +0800

    revert cuda changes as they are bugggy

commit bacc202
Author: YellowRoseCx <[email protected]>
Date:   Wed Aug 9 20:37:17 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit b7cb4cf
Author: YellowRoseCx <[email protected]>
Date:   Wed Aug 9 20:00:52 2023 -0500

    additional fixes

commit fadae72
Merge: 518eb2a 8f8ab6c
Author: YellowRoseCx <[email protected]>
Date:   Wed Aug 9 18:45:50 2023 -0500

    Merge branch 'hipblas' into develop4Main

commit 518eb2a
Merge: bda0215 cae6a84
Author: YellowRoseCx <[email protected]>
Date:   Wed Aug 9 18:32:10 2023 -0500

    Merge remote-tracking branch 'upstream/concedo' into develop2Main

commit bda0215
Author: YellowRoseCx <[email protected]>
Date:   Wed Aug 9 18:17:54 2023 -0500

    update makefile to multisystem path

commit 8f8ab6c
Author: YellowRoseCx <[email protected]>
Date:   Wed Aug 9 18:05:03 2023 -0500

    hipLDFLAG Path change Unix to multisystem in Makefile

    changed the hardcoded linux distro hipblas LD path from -L/opt/rocm/lib to use the defined ROCM_PATH variable to be flexible with ROCm on non-Linux OS

commit 610ba4c
Merge: 4024f91 25d43e0
Author: Henri Vasserman <[email protected]>
Date:   Wed Aug 9 23:54:58 2023 +0300

    Merge 'origin/master' into hipblas

commit 4024f91
Author: Henri Vasserman <[email protected]>
Date:   Wed Aug 9 01:56:44 2023 +0300

    Add intrinsics polyfills for AMD

    ---------

    Co-authored-by: ardfork <[email protected]>
    Co-authored-by: funnbot <[email protected]>
    Co-authored-by: Engininja2 <[email protected]>

commit ab62128
Merge: d91456a f5bfea0
Author: Henri Vasserman <[email protected]>
Date:   Wed Aug 9 00:37:01 2023 +0300

    Merge 'origin/master' into hipblas

commit ee9fa2a
Author: YellowRoseCx <[email protected]>
Date:   Wed Aug 2 01:53:58 2023 -0500

    Update Makefile

commit d91456a
Author: ardfork <[email protected]>
Date:   Mon Jul 31 20:35:00 2023 +0300

    fix half2 decomposition

commit c1cb70d
Author: Henri Vasserman <[email protected]>
Date:   Mon Jul 31 19:56:44 2023 +0300

    new build arg LLAMA_CUDA_MMQ_Y

commit c1664a0
Merge: 4336231 0728c5a
Author: Henri Vasserman <[email protected]>
Date:   Mon Jul 31 19:32:27 2023 +0300

    Merge 'origin/master' into hipblas

commit 848558d
Author: YellowRoseCx <[email protected]>
Date:   Sun Jul 30 20:02:52 2023 -0500

    import vars logic fix

commit b650b84
Author: YellowRoseCx <[email protected]>
Date:   Sun Jul 30 00:21:36 2023 -0500

    Update easy_KCPP-ROCm_install.sh

commit 8573a67
Author: YellowRoseCx <[email protected]>
Date:   Sat Jul 29 21:31:12 2023 -0500

    remove duplicate code and fix typo

    remove duplicate tooltip

commit 430986e
Author: YellowRoseCx <[email protected]>
Date:   Sat Jul 29 21:07:34 2023 -0500

    hide "missing" if all are built

    move tooltip functions to helper functions section. hides the string "Missing: ..." from showing if all backends are available
    " if len(runopts)==6 else + "

commit dd0db72
Author: YellowRoseCx <[email protected]>
Date:   Sat Jul 29 20:52:31 2023 -0500

    hide "missing" if all are built

    move tooltip functions to helper functions section. hides the string "Missing: ..." from showing if all backends are available

commit 43fffb6
Merge: 0ed65a4 b40550c
Author: YellowRoseCx <[email protected]>
Date:   Sat Jul 29 19:13:15 2023 -0500

    Merge branch 'concedo'

commit 0ed65a4
Author: YellowRoseCx <[email protected]>
Date:   Sat Jul 29 18:34:21 2023 -0500

    Hide unavailable backends & Add tooltip over backend count

    Hides unavailable backends from the user and if the program is launched without any backends made, it shows an error message to them stating no backends were found and to make them using the 'make' command

    Add tooltip when hovering over backend count label

    hovering over the new label that shows the backend count will explain what the numbers are, and show the users which backends are not available or built

commit 2a26398
Merge: cee2e9d 31486eb
Author: YellowRoseCx <[email protected]>
Date:   Sat Jul 29 15:16:33 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 4336231
Author: Henri Vasserman <[email protected]>
Date:   Sat Jul 29 18:35:56 2023 +0300

    add hipBLAS to README

    ---------

    Co-authored-by: ardfork <[email protected]>

commit f8e3fc6
Author: Henri Vasserman <[email protected]>
Date:   Sat Jul 29 14:16:46 2023 +0300

    rocblas init stuff

commit d2ade63
Merge: cde52d6 8a88e58
Author: Henri Vasserman <[email protected]>
Date:   Sat Jul 29 12:59:48 2023 +0300

    Merge 'origin/master' into hipblas

commit cee2e9d
Author: YellowRoseCx <[email protected]>
Date:   Wed Jul 26 23:36:55 2023 -0500

    Only Show Available Backends in GUI

    Hides unavailable backends from the user and if the program is launched without any backends made, it shows an error message to them stating no backends were found and to make them using the 'make' command

commit 7863610
Author: YellowRoseCx <[email protected]>
Date:   Wed Jul 26 13:27:22 2023 -0500

    Update easy_KCPP-ROCm_install.sh

commit 731cd6e
Author: YellowRoseCx <[email protected]>
Date:   Tue Jul 25 22:39:50 2023 -0500

    Create easy_rocm_install.sh

commit f154685
Merge: cbdc1f3 94e0a06
Author: YellowRoseCx <[email protected]>
Date:   Tue Jul 25 22:25:10 2023 -0500

    Merge branch 'concedo_experimentalMAIN'

commit cbdc1f3
Merge: 5b838d4 9731682
Author: YellowRoseCx <[email protected]>
Date:   Mon Jul 24 16:53:21 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit cde52d6
Merge: 8e8054a 84e09a7
Author: Henri Vasserman <[email protected]>
Date:   Mon Jul 24 12:22:58 2023 +0300

    Merge 'origin/master' into hipblas

commit 8e8054a
Author: Henri Vasserman <[email protected]>
Date:   Mon Jul 24 12:20:49 2023 +0300

    Add rocblas to build files

commit 1f6294d
Author: YellowRoseCx <[email protected]>
Date:   Mon Jul 24 03:52:01 2023 -0500

    Fix multi GPU on multiple amd architectures with rocblas_initialize() (ggerganov#5)

    * initialize rocblas

commit 5b838d4
Author: YellowRoseCx <[email protected]>
Date:   Mon Jul 24 03:10:35 2023 -0500

    amd multigpu full layer offload w/o vram scratch

commit 9bfb2fd
Merge: b379f9d 66328fc
Author: YellowRoseCx <[email protected]>
Date:   Mon Jul 24 03:07:44 2023 -0500

    Merge branch 'concedo_experimental'

commit b379f9d
Author: YellowRoseCx <[email protected]>
Date:   Mon Jul 24 03:07:00 2023 -0500

    Revert "amd multigpu full layer offload w/o vram scratch"

    This reverts commit 9adfc8e.

commit 9adfc8e
Author: YellowRoseCx <[email protected]>
Date:   Mon Jul 24 02:56:40 2023 -0500

    amd multigpu full layer offload w/o vram scratch

commit 05c792e
Author: YellowRoseCx <[email protected]>
Date:   Mon Jul 24 00:18:48 2023 -0500

    initialize rocblas

commit ade68d0
Merge: 521ad6b 56995ca
Author: YellowRoseCx <[email protected]>
Date:   Sun Jul 23 20:25:05 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 521ad6b
Author: YellowRoseCx <[email protected]>
Date:   Thu Jul 20 21:42:33 2023 -0500

    lazy import_var error handling for saves

commit 9553e52
Merge: cac6650 f036109
Author: YellowRoseCx <[email protected]>
Date:   Thu Jul 20 19:59:41 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit cac6650
Author: YellowRoseCx <[email protected]>
Date:   Mon Jul 17 23:05:02 2023 -0500

    Makefile fix! Allows hip/clblast build together

commit 3db70b5
Merge: 2ec4466 7568d1a
Author: Henri Vasserman <[email protected]>
Date:   Tue Jul 18 01:54:17 2023 +0300

    Merge 'origin/master' into hipblas

commit f208670
Author: YellowRoseCx <[email protected]>
Date:   Fri Jul 14 02:56:03 2023 -0500

    improve error handling with gpu names

commit 860e738
Author: YellowRoseCx <[email protected]>
Date:   Fri Jul 14 00:33:03 2023 -0500

    Show GPU names in GUI, Only show GPUs that exist

    changed the pre-set 1,2,3 and 1,2,3,all settings that the GPU selector had and replaced them with a function that grabs the GPU names and sets the names as the values for the selector boxes.

commit 2ec4466
Author: Henri Vasserman <[email protected]>
Date:   Thu Jul 13 13:44:02 2023 +0300

    Update build flags.

    GGML_CUDA_DMMV_Y is now GGML_CUDA_MMV_Y
    so update your build instructions.

    GGML_CUDA_FORCE_DMMV is always enabled.

    ---------

    Co-authored-by: YellowRoseCx <[email protected]>

commit cd36b18
Merge: afcb8fe 1cbf561
Author: Henri Vasserman <[email protected]>
Date:   Thu Jul 13 13:03:01 2023 +0300

    Merge 'origin/master' into hipblas

commit ac7ebc3
Author: YellowRoseCx <[email protected]>
Date:   Wed Jul 12 18:32:18 2023 -0500

    add hipBLAS name scheme to GUI and update README

commit 7f85cc5
Author: YellowRoseCx <[email protected]>
Date:   Wed Jul 12 17:35:54 2023 -0500

    update makefile and ggml.c

commit 6ca3499
Author: YellowRoseCx <[email protected]>
Date:   Wed Jul 12 15:43:45 2023 -0500

    ggml.c fix

commit 770e674
Merge: 2b289cd 5941514
Author: YellowRoseCx <[email protected]>
Date:   Wed Jul 12 15:24:36 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 2b289cd
Author: YellowRoseCx <[email protected]>
Date:   Wed Jul 12 14:30:00 2023 -0500

    Update c-cpp.yml

commit 5dae95a
Author: YellowRoseCx <[email protected]>
Date:   Wed Jul 12 14:28:51 2023 -0500

    Update c-cpp.yml

commit b37cd73
Author: YellowRoseCx <[email protected]>
Date:   Wed Jul 12 14:27:04 2023 -0500

    Create c-cpp.yml to test Actions

commit afcb8fe
Author: Henri Vasserman <[email protected]>
Date:   Tue Jul 11 18:09:27 2023 +0300

    Add new config option

commit 8c2c497
Merge: e610466 2347463
Author: Henri Vasserman <[email protected]>
Date:   Tue Jul 11 17:53:54 2023 +0300

    Merge 'origin/master' into hipblas

commit e610466
Author: Henri Vasserman <[email protected]>
Date:   Tue Jul 11 17:53:14 2023 +0300

    Expand arch list and make it overrideable

commit 80e4e54
Merge: 7735c5a 1d16309
Author: Henri Vasserman <[email protected]>
Date:   Mon Jul 10 02:09:28 2023 +0300

    Merge 'origin/master' into hipblas

commit 8432e9d
Author: YellowRoseCx <[email protected]>
Date:   Sun Jul 9 16:55:30 2023 -0500

    Update Makefile

commit b58c189
Author: YellowRoseCx <[email protected]>
Date:   Sun Jul 9 16:20:00 2023 -0500

    Add multi-gpu CuBLAS support to new GUI

commit 0c1c71b
Author: YellowRoseCx <[email protected]>
Date:   Sat Jul 8 07:56:57 2023 -0500

    Update Makefile

commit f864f60
Author: Johannes Gäßler <[email protected]>
Date:   Sat Jul 8 00:25:15 2023 +0200

    CUDA: add __restrict__ to mul mat vec kernels (ggerganov#2140)

commit 4539bc2
Author: YellowRoseCx <[email protected]>
Date:   Sat Jul 8 01:36:14 2023 -0500

    update makefile for changes

commit 912e31e
Merge: 74e2703 ddaa4f2
Author: YellowRoseCx <[email protected]>
Date:   Fri Jul 7 23:15:37 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 74e2703
Merge: cf65429 f9108ba
Author: YellowRoseCx <[email protected]>
Date:   Wed Jul 5 15:16:49 2023 -0500

    Merge branch 'LostRuins:concedo' into main

commit 7735c5a
Merge: c3e3733 7ee76e4
Author: Henri Vasserman <[email protected]>
Date:   Tue Jul 4 17:09:16 2023 +0300

    Merge 'origin/master' into hipblas

commit cf65429
Author: YellowRoseCx <[email protected]>
Date:   Mon Jul 3 16:56:40 2023 -0500

    print cuda or opencl based on what's used

commit 72c16d2
Author: YellowRoseCx <[email protected]>
Date:   Mon Jul 3 16:45:39 2023 -0500

    Revert "fix my mistake that broke other arches"

    This reverts commit 777aed5.

commit 777aed5
Author: YellowRoseCx <[email protected]>
Date:   Mon Jul 3 15:53:32 2023 -0500

    fix my mistake that broke other arches

commit 27780a9
Author: YellowRoseCx <[email protected]>
Date:   Sun Jul 2 16:03:27 2023 -0500

    rocm fixes

commit f52c7d4
Author: YellowRoseCx <[email protected]>
Date:   Sun Jul 2 16:02:58 2023 -0500

    Revert "rocm fixes"

    This reverts commit 2fe9927.

commit 2fe9927
Author: YellowRoseCx <[email protected]>
Date:   Sun Jul 2 15:58:21 2023 -0500

    rocm fixes

commit efe7560
Author: YellowRoseCx <[email protected]>
Date:   Sun Jul 2 15:55:43 2023 -0500

    Revert "move HIPBLAS definitions into ggml-cuda.h"

    This reverts commit bf49a93.

commit 4fc0181
Author: YellowRoseCx <[email protected]>
Date:   Sun Jul 2 15:55:36 2023 -0500

    Revert "move hipblas definitions to header files"

    This reverts commit 2741ffb.

commit 89eb576
Merge: 2741ffb 3d2907d
Author: YellowRoseCx <[email protected]>
Date:   Sun Jul 2 14:44:13 2023 -0500

    Merge branch 'LostRuins:concedo' into main

commit c3e3733
Author: Henri Vasserman <[email protected]>
Date:   Sun Jul 2 15:51:31 2023 +0300

    ROCm fixes

commit 15db19a
Merge: 04419f1 46088f7
Author: Henri Vasserman <[email protected]>
Date:   Sun Jul 2 15:39:57 2023 +0300

    Merge 'origin/master' into hipblas

commit 2741ffb
Author: YellowRoseCx <[email protected]>
Date:   Sat Jul 1 17:07:42 2023 -0500

    move hipblas definitions to header files

commit bf49a93
Author: YellowRoseCx <[email protected]>
Date:   Sat Jul 1 16:38:50 2023 -0500

    move HIPBLAS definitions into ggml-cuda.h

commit 540f4e0
Merge: 2c3b46f eda663f
Author: YellowRoseCx <[email protected]>
Date:   Sat Jul 1 14:58:32 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 2c3b46f
Author: YellowRoseCx <[email protected]>
Date:   Thu Jun 29 18:43:43 2023 -0500

    changes to fix build

commit c9e1103
Author: YellowRoseCx <[email protected]>
Date:   Thu Jun 29 18:20:07 2023 -0500

    Update ggml_v2-cuda-legacy.cu for ROCM

commit b858fc5
Author: YellowRoseCx <[email protected]>
Date:   Thu Jun 29 17:49:39 2023 -0500

    changes to work with upstream

commit 69a0c25
Merge: 096f0b0 1347d3a
Author: YellowRoseCx <[email protected]>
Date:   Thu Jun 29 16:59:06 2023 -0500

    Merge remote-tracking branch 'upstream/concedo'

commit 04419f1
Merge: bb16eff d3494bb
Author: Henri Vasserman <[email protected]>
Date:   Wed Jun 28 23:30:10 2023 +0300

    Merge 'origin/master' into hipblas

commit bb16eff
Author: YellowRoseCx <[email protected]>
Date:   Wed Jun 28 15:27:10 2023 -0500

    headers fix; add kquants_iter for hipblas and add gfx803 (ggerganov#1)

    * kquants_iter for hipblas and add gfx803
    * Update CMakeLists.txt with hipblas kquants_iter and DMMV_F16
    * remove dmmv_f16 for now

commit 096f0b0
Author: YellowRoseCx <[email protected]>
Date:   Wed Jun 28 15:27:02 2023 -0500

    revert unnecessary hipblas conditionals

commit d81e81a
Author: YellowRoseCx <[email protected]>
Date:   Wed Jun 28 14:48:23 2023 -0500

    Update Makefile hipblas nvcc correction

commit c8ae945
Merge: c1e5c83 0be54f7
Author: Henri Vasserman <[email protected]>
Date:   Tue Jun 27 10:50:37 2023 +0300

    Merge 'origin/master' into hipblas

commit 2579ecf
Merge: abed427 d2034ce
Author: YellowRoseCx <[email protected]>
Date:   Sun Jun 25 17:50:04 2023 -0500

    Merge branch 'LostRuins:concedo' into main

commit c1e5c83
Merge: 35a6031 447ccbe
Author: Henri Vasserman <[email protected]>
Date:   Sun Jun 25 21:40:05 2023 +0300

    Merge 'origin/master' into hipblas

commit 35a6031
Merge: df7346c 66a2555
Author: Henri Vasserman <[email protected]>
Date:   Sun Jun 25 10:57:48 2023 +0300

    Merge 'origin/master' into hipblas

commit abed427
Author: YellowRoseCx <[email protected]>
Date:   Sat Jun 24 19:16:30 2023 -0500

    reorganize If statements to include proper headers

commit 06c3bf0
Merge: ea6d320 8342fe8
Author: YellowRoseCx <[email protected]>
Date:   Sat Jun 24 16:57:20 2023 -0500

    Merge branch 'LostRuins:concedo' into main

commit ea6d320
Author: YellowRoseCx <[email protected]>
Date:   Fri Jun 23 01:53:28 2023 -0500

    Update README.md

commit 4d56ad8
Author: YellowRoseCx <[email protected]>
Date:   Thu Jun 22 16:19:43 2023 -0500

    Update README.md

commit 21f9308
Author: YellowRoseCx <[email protected]>
Date:   Thu Jun 22 15:42:05 2023 -0500

    kquants_iter for hipblas and add gfx803

commit df7346c
Merge: 5dd2fbe 7487137
Author: Henri Vasserman <[email protected]>
Date:   Thu Jun 22 20:51:09 2023 +0300

    Merge 'origin/master' into hipblas

commit b6ff890
Merge: eb094f0 e6ddb15
Author: YellowRoseCx <[email protected]>
Date:   Thu Jun 22 12:42:09 2023 -0500

    Merge branch 'LostRuins:concedo' into main

commit eb094f0
Author: YellowRoseCx <[email protected]>
Date:   Wed Jun 21 23:59:18 2023 -0500

    lowvram parameter description

commit 3a5dfeb
Merge: 665cc11 b1f00fa
Author: YellowRoseCx <[email protected]>
Date:   Wed Jun 21 16:53:03 2023 -0500

    Merge branch 'LostRuins:concedo' into koboldcpp-rocm

commit 665cc11
Author: YellowRoseCx <[email protected]>
Date:   Wed Jun 21 01:13:19 2023 -0500

    add lowvram parameter

commit 222cbbb
Author: YellowRoseCx <[email protected]>
Date:   Tue Jun 20 19:03:28 2023 -0500

    add additional hipblas conditions for cublas

commit e1f9581
Author: YellowRoseCx <[email protected]>
Date:   Tue Jun 20 16:51:59 2023 -0500

    Add hip def for cuda v2

commit 3bff5c0
Merge: a7e74b3 266d47a
Author: YellowRoseCx <[email protected]>
Date:   Tue Jun 20 13:38:06 2023 -0500

    Merge branch 'LostRuins:concedo' into koboldcpp-rocm

commit a7e74b3
Author: YellowRoseCx <[email protected]>
Date:   Mon Jun 19 22:04:18 2023 -0500

    Update README.md

commit 5e99b3c
Author: YellowRoseCx <[email protected]>
Date:   Mon Jun 19 22:03:42 2023 -0500

    Update Makefile

commit 9190b17
Author: YellowRoseCx <[email protected]>
Date:   Mon Jun 19 21:47:10 2023 -0500

    Update README.md

commit 5dd2fbe
Merge: 67e229b 20568fe
Author: Henri Vasserman <[email protected]>
Date:   Tue Jun 20 01:23:12 2023 +0300

    Merge 'origin/master' into hipblas

commit 2780ea2
Author: YellowRoseCx <[email protected]>
Date:   Sun Jun 18 15:48:00 2023 -0500

    Update Makefile

commit 04a3e64
Author: YellowRoseCx <[email protected]>
Date:   Sun Jun 18 14:33:39 2023 -0500

    remove extra line

commit cccbca9
Author: YellowRoseCx <[email protected]>
Date:   Sun Jun 18 14:31:17 2023 -0500

    attempt adding ROCM hipblas

commit a44a1d4
Author: YellowRoseCx <[email protected]>
Date:   Sun Jun 18 14:31:01 2023 -0500

    attempt adding ROCM hipblas

commit b088184
Author: YellowRoseCx <[email protected]>
Date:   Sun Jun 18 14:30:54 2023 -0500

    attempt adding ROCM hipblas

commit 67e229b
Merge: 6f7c156 b241649
Author: Henri Vasserman <[email protected]>
Date:   Sun Jun 18 00:36:54 2023 +0300

    Merge 'origin/master' into hipblas

commit 6f7c156
Merge: 61df8e9 fc45a81
Author: Henri Vasserman <[email protected]>
Date:   Sat Jun 17 16:53:22 2023 +0300

    Merge 'origin/master' into hipblas

commit 61df8e9
Author: Henri Vasserman <[email protected]>
Date:   Wed Jun 14 22:46:10 2023 +0300

    add cudaMemset

commit a836529
Merge: 85f902d 254a7a7
Author: Henri Vasserman <[email protected]>
Date:   Wed Jun 14 22:41:55 2023 +0300

    Merge 'origin/master' into hipblas

commit 85f902d
Merge: 4362e80 b50b570
Author: Henri Vasserman <[email protected]>
Date:   Thu Jun 8 10:50:28 2023 +0300

    Merge 'origin/master' into hipblas

commit 4362e80
Merge: fa5b3d7 17366df
Author: Henri Vasserman <[email protected]>
Date:   Tue Jun 6 23:14:40 2023 +0300

    Merge 'origin/master' into hipblas

commit fa5b3d7
Author: Henri Vasserman <[email protected]>
Date:   Tue Jun 6 18:47:00 2023 +0300

    fix makefile.

commit 1ba4ce4
Author: Henri Vasserman <[email protected]>
Date:   Tue Jun 6 18:41:08 2023 +0300

    Revert "warp size fixes"

    It seems like 32 is faster for me, at least and it won't cause so many conflicts.

    This reverts commit 5d6eb72.

commit 5d6eb72
Author: Henri Vasserman <[email protected]>
Date:   Tue Jun 6 18:32:41 2023 +0300

    warp size fixes

commit 33091a9
Merge: 9fdaa1d 2d43387
Author: Henri Vasserman <[email protected]>
Date:   Tue Jun 6 16:19:23 2023 +0300

    Merge  'origin/master' into hipblas

commit 9fdaa1d
Author: Henri Vasserman <[email protected]>
Date:   Sat May 27 19:17:53 2023 +0300

    Add more defs

    For forward compatibility ggerganov#1607

commit a4648c1
Merge: 4c8b3fb 0ecb1bb
Author: Henri Vasserman <[email protected]>
Date:   Sat May 27 18:22:39 2023 +0300

    Merge 'origin/master' into hipblas

commit 4c8b3fb
Author: Henri Vasserman <[email protected]>
Date:   Fri May 26 01:08:53 2023 +0300

    add configurable vars

commit 30d921a
Author: Henri Vasserman <[email protected]>
Date:   Fri May 26 01:03:56 2023 +0300

    and makefile

commit a593a4f
Author: Henri Vasserman <[email protected]>
Date:   Fri May 26 00:55:28 2023 +0300

    Add missing parameters

commit 174bf6a
Merge: f80ce7a 1fcdcc2
Author: Henri Vasserman <[email protected]>
Date:   Fri May 26 00:44:23 2023 +0300

    Merge 'origin/master' into hipblas

commit f80ce7a
Merge: 600ace3 ac7876a
Author: Henri Vasserman <[email protected]>
Date:   Thu May 25 00:02:50 2023 +0300

    Merge branch 'origin/master' into hipblas

commit 600ace3
Author: Henri Vasserman <[email protected]>
Date:   Sat May 20 23:42:20 2023 +0300

    update warp size

commit b19fefe
Author: Henri Vasserman <[email protected]>
Date:   Sat May 20 23:28:08 2023 +0300

    Forwardcompat

commit c66115b
Merge: a0b2d5f b8ee340
Author: Henri Vasserman <[email protected]>
Date:   Sat May 20 18:29:31 2023 +0300

    Merge 'origin/master' into hipblas

commit a0b2d5f
Merge: 8bab456 2a5ee02
Author: Henri Vasserman <[email protected]>
Date:   Tue May 16 17:08:29 2023 +0300

    Merge 'origin/master' into hipblas

commit 8bab456
Merge: 2956630 b5c9295
Author: Henri Vasserman <[email protected]>
Date:   Mon May 15 00:01:12 2023 +0300

    Merge 'origin/master' into hipblas

commit 2956630
Merge: 0fe6384 f048af0
Author: Henri Vasserman <[email protected]>
Date:   Sat May 13 13:12:52 2023 +0300

    Merge 'origin/master' into hipblas

commit 0fe6384
Author: Henri Vasserman <[email protected]>
Date:   Fri May 12 17:22:11 2023 +0300

    fix makefile

commit 605560d
Merge: 127f68e 089b1c9
Author: Henri Vasserman <[email protected]>
Date:   Fri May 12 16:12:53 2023 +0300

    Merge 'origin/master' into hipblas

commit 127f68e
Merge: 070cbcc b608b55
Author: Henri Vasserman <[email protected]>
Date:   Thu May 11 20:21:27 2023 +0300

    Merge 'origin/master' into hipblas

commit 070cbcc
Author: Henri Vasserman <[email protected]>
Date:   Sun May 7 18:10:56 2023 +0300

    occupanct function

commit a3296d5
Merge: 0aefa6a e129551
Author: Henri Vasserman <[email protected]>
Date:   Sun May 7 18:06:04 2023 +0300

    Merge 'origin/master' into hipblas

commit 0aefa6a
Merge: baeb482 1b0fd45
Author: Henri Vasserman <[email protected]>
Date:   Sun May 7 12:24:41 2023 +0300

    Merge 'origin/master' into hipblas

commit baeb482
Author: Henri Vasserman <[email protected]>
Date:   Sun May 7 12:24:12 2023 +0300

    Revert to default copy

commit 289073a
Merge: 1107194 173d0e6
Author: Henri Vasserman <[email protected]>
Date:   Sat May 6 19:59:41 2023 +0300

    Merge 'origin/master' into hipblas

commit 1107194
Merge: 04c0d48 a3b85b2
Author: Henri Vasserman <[email protected]>
Date:   Sat May 6 00:38:20 2023 +0300

    Merge 'origin/master' into hipblas

commit 04c0d48
Author: Henri Vasserman <[email protected]>
Date:   Thu May 4 12:31:16 2023 +0300

    Move all HIP stuff to ggml-cuda.cu

commit d83cfba
Merge: b67cc50 799fdc1
Author: Henri Vasserman <[email protected]>
Date:   Thu May 4 11:31:16 2023 +0300

    Merge 'origin/master' into hipblas

commit b67cc50
Merge: fcbc262 e216aa0
Author: Henri Vasserman <[email protected]>
Date:   Wed May 3 15:04:51 2023 +0300

    Merge 'origin/master' into hipblas

commit fcbc262
Merge: c73def1 f4cef87
Author: Henri Vasserman <[email protected]>
Date:   Mon May 1 22:45:29 2023 +0300

    Merge 'origin/master' into hipblas

commit c73def1
Merge: d8ea75e f0d70f1
Author: Henri Vasserman <[email protected]>
Date:   Sun Apr 30 18:40:42 2023 +0300

    Merge 'origin/master' into hipblas

commit d8ea75e
Merge: d194586 334637e
Author: Henri Vasserman <[email protected]>
Date:   Sat Apr 29 11:25:51 2023 +0300

    Merge 'origin/master' into hipblas

commit d194586
Merge: 2ab9d11 7f15c5c
Author: Henri Vasserman <[email protected]>
Date:   Fri Apr 28 23:03:52 2023 +0300

    Merge 'origin/master' into hipblas

commit 2ab9d11
Merge: 3b4a531 04aaae1
Author: Henri Vasserman <[email protected]>
Date:   Fri Apr 28 16:30:05 2023 +0300

    Merge 'origin/master' into hipblas

commit 3b4a531
Merge: a1caa48 0b2da20
Author: Henri Vasserman <[email protected]>
Date:   Fri Apr 28 10:08:41 2023 +0300

    Merge 'origin/master' into hipblas

commit a1caa48
Author: Henri Vasserman <[email protected]>
Date:   Fri Apr 28 10:08:21 2023 +0300

    add more cuda defines

    This is so 'slaren/cuda-f16f32' would merge.

commit ecc0565
Author: Henri Vasserman <[email protected]>
Date:   Fri Apr 28 01:58:27 2023 +0300

    only .cu file needs to be compiled as device

commit ef51e9e
Merge: d571d16 4afcc37
Author: Henri Vasserman <[email protected]>
Date:   Wed Apr 26 12:46:26 2023 +0300

    Merge branch 'ggerganov:master' into hipblas

commit d571d16
Merge: 608aa33 dd0eabc
Author: Henri Vasserman <[email protected]>
Date:   Tue Apr 25 21:15:33 2023 +0300

    Merge 'origin/master' into hipblas

commit 608aa33
Author: Henri Vasserman <[email protected]>
Date:   Tue Apr 25 21:15:04 2023 +0300

    change default GPU arch to match CMake

commit 3a004b2
Author: Henri Vasserman <[email protected]>
Date:   Mon Apr 24 02:24:54 2023 +0300

    add rpath

commit db7a012
Merge: 3677235 284685f
Author: Henri Vasserman <[email protected]>
Date:   Sun Apr 23 21:49:28 2023 +0300

    Merge 'origin/master' into hipblas

commit 3677235
Author: Henri Vasserman <[email protected]>
Date:   Sat Apr 22 23:28:00 2023 +0300

    More build file changes

commit d3e1984
Author: Henri Vasserman <[email protected]>
Date:   Fri Apr 21 03:32:06 2023 +0300

    add rpath

commit 0e005f7
Author: Henri Vasserman <[email protected]>
Date:   Fri Apr 21 02:13:00 2023 +0300

    Build file changes

    HIP Clang is no longer required: the CMake scripts configure the
    needed compiler, which can be the system clang++. Other code can
    still use GCC, but CMake forces clang to do the linking.

commit 54a63c1
Author: Henri Vasserman <[email protected]>
Date:   Thu Apr 20 22:19:22 2023 +0300

    Update Makefile for the Cuda kernels

commit 0fd8363
Author: Henri Vasserman <[email protected]>
Date:   Thu Apr 20 02:04:00 2023 +0300

    use hipblas based on cublas

* Merge Fixes

* readme merge fix

* remove old ggmlv2 changes

* bring ggml v2_cuda up to date with AMD changes

* Revert ggml v2_cuda changes because they weren't needed

This reverts commit 3385dd4.

* avoid launching subprocesses to get device names for now, but other than that seems to be working

---------

Co-authored-by: Concedo <[email protected]>