Add support for quantized OPT models and refactor #295

Merged: 13 commits, Mar 14, 2023

Zerogoki00 (Contributor) commented Mar 13, 2023

Added ability to use quantized OPT models
Added argument to specify quantized model type (LLaMA by default)
Removed --load-in-4bit because we already have --gptq-bits

Tested with OPT-30B-Erebus on an RTX 4090. It runs slower than LLaMA, but it works.

Zerogoki00 changed the title from "Add supporting for quantized OTP models and refactor" to "Add supporting for quantized OPT models and refactor" Mar 13, 2023
Zerogoki00 changed the title from "Add supporting for quantized OPT models and refactor" to "Add support for quantized OPT models and refactor" Mar 13, 2023
Ph0rk0z (Contributor) commented Mar 13, 2023

How does performance compare with FlexGen?


path_to_model = Path(f'models/{model_name}')
pt_model = ''
if path_to_model.name.lower().startswith('llama-7b'):
oobabooga (Owner):

Removing this block breaks compatibility with folder names like llama-7b-hf (the decapoda-research naming).

Zerogoki00 (Contributor, Author):

I have restored that code.
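
The compatibility concern in this thread can be illustrated with a small sketch. The helper name `guess_pt_name`, the prefix list, and the `<prefix>-<bits>bit.pt` file-naming scheme are assumptions made for illustration; they are not taken from the code in this PR.

```python
from pathlib import Path

# Hypothetical prefixes; the real loader may recognise a different set.
KNOWN_PREFIXES = ('llama-7b', 'llama-13b', 'llama-30b', 'llama-65b')


def guess_pt_name(model_name: str, bits: int) -> str:
    """Return the checkpoint file name expected for a model folder (assumed scheme)."""
    folder = Path(f'models/{model_name}').name.lower()
    for prefix in KNOWN_PREFIXES:
        if folder.startswith(prefix):
            # 'llama-7b' and 'llama-7b-hf' both resolve to the same checkpoint name.
            return f'{prefix}-{bits}bit.pt'
    # No known prefix: derive the file name from the folder name itself.
    return f'{model_name}-{bits}bit.pt'


print(guess_pt_name('llama-7b-hf', 4))
print(guess_pt_name('opt-13b', 4))
```

With a prefix check like this, a folder named llama-7b-hf keeps working without being renamed, which is the behaviour the owner asked to preserve.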

def load_model(model_name):
    print(f"Loading {model_name}...")
    t0 = time.time()

    shared.is_RWKV = model_name.lower().startswith('rwkv-')

    # Default settings
    if not any([shared.args.cpu, shared.args.load_in_8bit, shared.args.load_in_4bit, shared.args.gptq_bits > 0, shared.args.auto_devices, shared.args.disk, shared.args.gpu_memory is not None, shared.args.cpu_memory is not None, shared.args.deepspeed, shared.args.flexgen, shared.is_RWKV]):
oobabooga (Owner):

I'm reluctant to remove --load-in-4bit because that will certainly cause confusion, but I guess we can do it and move on.

Zerogoki00 (Contributor, Author):

It's up to you, but I think it's bad practice to keep multiple arguments that do the same thing.

Zerogoki00 (Contributor, Author):

Also, the --load-in-4bit argument was confusing because it works differently from --load-in-8bit.

oobabooga (Owner):

You have convinced me, let's ditch --load-in-4bit.
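
One way to retire a flag without breaking existing command lines is to keep accepting it and print a warning. The commit log for this merge mentions a deprecation warning for --load-in-4bit; the sketch below is only an illustration of that idea. Mapping the old flag to `--gptq-bits 4` and the function name `parse_args` are assumptions, not the project's actual code.

```python
import argparse
import sys


def parse_args(argv):
    parser = argparse.ArgumentParser()
    # Still accepted so that old command lines do not fail outright (assumed behaviour).
    parser.add_argument('--load-in-4bit', action='store_true',
                        help='DEPRECATED: use --gptq-bits 4 instead.')
    parser.add_argument('--gptq-bits', type=int, default=0,
                        help='Load a pre-quantized model with the given precision.')
    args = parser.parse_args(argv)

    if args.load_in_4bit:
        print('Warning: --load-in-4bit is deprecated; use --gptq-bits 4 instead.',
              file=sys.stderr)
        # Only fill in the value if the user did not pass --gptq-bits explicitly.
        if args.gptq_bits == 0:
            args.gptq_bits = 4
    return args


print(parse_args(['--load-in-4bit']).gptq_bits)
```

An explicit `--gptq-bits` value takes precedence here, so combining the old and new flags does not silently change the requested precision.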

- parser.add_argument('--load-in-4bit', action='store_true', help='Load the model with 4-bit precision. Currently only works with LLaMA.')
- parser.add_argument('--gptq-bits', type=int, default=0, help='Load a pre-quantized model with specified precision. 2, 3, 4 and 8bit are supported. Currently only works with LLaMA.')
+ parser.add_argument('--gptq-bits', type=int, default=0, help='Load a pre-quantized model with specified precision. 2, 3, 4 and 8bit are supported. Currently only works with LLaMA and OPT.')
+ parser.add_argument('--gptq-model-type', type=str, default='llama', help='Model type of pre-quantized model. Currently only LLaMa and OPT are supported.')
oobabooga (Owner), Mar 13, 2023:

Maybe we could infer --gptq-model-type from the model name?

Zerogoki00 (Contributor, Author):

@oobabooga
Then the user would always have to give the model folder a special name with a model-type prefix (like llama-xxx, opt-xxxxx).

Maybe do as you described, but keep this argument as a fallback if the folder name has no prefix? What do you think?

oobabooga (Owner):

That sounds like a good idea (making this argument optional).

Zerogoki00 (Contributor, Author):

I implemented this. I tested it with different folder names and it works (but I recommend that you check it too).

Zerogoki00 (Contributor, Author):

So here is the logic:
If --gptq-model-type isn't present, try to infer the model type from the model name. Print an error and exit if that fails.
If --gptq-model-type is set, don't try to infer the model type from the name; use the argument's value instead.
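
The logic described above can be sketched roughly as follows. The function name `resolve_model_type`, the error wording, and the prefix-based match are assumptions for illustration; the merged code may differ in its details.

```python
import sys

SUPPORTED_TYPES = ('llama', 'opt')


def resolve_model_type(model_name, gptq_model_type=None):
    """Pick the quantized model type: explicit argument first, folder name second."""
    if gptq_model_type:
        # The argument always wins; compare case-insensitively.
        model_type = gptq_model_type.lower()
    else:
        name = model_name.lower()
        # Assumed rule: the folder name must start with the model type.
        model_type = next((t for t in SUPPORTED_TYPES if name.startswith(t)), None)

    if model_type not in SUPPORTED_TYPES:
        print("Can't determine the model type. "
              "Please specify it with --gptq-model-type.", file=sys.stderr)
        sys.exit(1)
    return model_type


print(resolve_model_type('llama-13b-hf'))
print(resolve_model_type('KoboldAI_OPT-13B-Erebus', 'opt'))
```

A folder such as KoboldAI_OPT-13B-Erebus does not start with a type prefix, so in this sketch it only loads when the argument is passed explicitly.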

LoopControl commented Mar 13, 2023

Could you briefly describe how to convert a Hugging Face OPT model to .pt (or provide a link to a pregenerated .pt)?

Would it be similar to this command documented in the GPTQ-for-LLaMa repo:
python llama.py decapoda-research/llama-7b-hf c4 --wbits 4 --save llama7b-4bit.pt

Edit: Looks like the command is:
python opt.py KoboldAI/OPT-13B-Erebus c4 --wbits 4 --save opt-13b-4bit.pt

Does the dataset parameter (the "c4" above) make a difference at inference time? If so, which would you recommend?

LoopControl commented Mar 14, 2023

Good news is, after quantizing a 13B Erebus pt, the model loads in around 8GB of VRAM and seems to generate text.

The problem is that generation is more than 5x slower with very short contexts in 4-bit mode, compared to 13B LLaMA in 4-bit:

> python server.py --gptq-bits 4 --gptq-model-type opt --no-stream --model KoboldAI_OPT-13B-Erebus
Output generated in 49.05 seconds (1.06 tokens/s, 52 tokens)
Output generated in 95.38 seconds (0.84 tokens/s, 80 tokens)

For comparison, llama 13B:

> python server.py --gptq-bits 4 --no-stream --model llama-13b-hf
Output generated in 9.24 seconds (8.66 tokens/s, 80 tokens)
Output generated in 8.68 seconds (9.22 tokens/s, 80 tokens)

# 30B llama model, 200 token generation:
Output generated in 36.11 seconds (5.54 tokens/s, 200 tokens)
Output generated in 42.36 seconds (4.72 tokens/s, 200 tokens)

With larger contexts (800+ tokens), the LLaMA model continues to work fine, but the OPT model seems to just hang (I gave it several minutes before killing the process).
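
To put a number on the gap, the timing lines quoted in this comment can be parsed and compared. This is a minimal sketch: the regular expression is written for the log format shown above, and the helper names are made up for illustration.

```python
import re

# Matches lines such as:
# "Output generated in 49.05 seconds (1.06 tokens/s, 52 tokens)"
LOG_LINE = re.compile(
    r'Output generated in ([\d.]+) seconds \(([\d.]+) tokens/s, (\d+) tokens\)'
)


def tokens_per_second(line):
    """Recompute the generation rate from the seconds and token counts in a log line."""
    match = LOG_LINE.search(line)
    if match is None:
        raise ValueError(f'Unrecognised log line: {line!r}')
    seconds, _reported_rate, tokens = match.groups()
    return int(tokens) / float(seconds)


def mean_rate(lines):
    """Average the per-run rates (a simple mean, not weighted by run length)."""
    rates = [tokens_per_second(line) for line in lines]
    return sum(rates) / len(rates)


opt_13b = [
    'Output generated in 49.05 seconds (1.06 tokens/s, 52 tokens)',
    'Output generated in 95.38 seconds (0.84 tokens/s, 80 tokens)',
]
llama_13b = [
    'Output generated in 9.24 seconds (8.66 tokens/s, 80 tokens)',
    'Output generated in 8.68 seconds (9.22 tokens/s, 80 tokens)',
]

slowdown = mean_rate(llama_13b) / mean_rate(opt_13b)
print(f'OPT 13B is about {slowdown:.1f}x slower than LLaMA 13B in these runs')
```

Two runs per model is a very small sample, so the ratio is only a rough indication of the difference.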

oobabooga (Owner) commented:

@LoopControl can you report that on https://github.com/qwopqwop200/GPTQ-for-LLaMa?

Pinging @qwopqwop200

qwopqwop200 commented:

I benchmarked on OPT-2.7B, and it was not as slow as this.

@oobabooga oobabooga merged commit 5c05223 into oobabooga:main Mar 14, 2023
TheTerrasque pushed a commit to TheTerrasque/text-generation-webui that referenced this pull request Mar 15, 2023
Squashed commit of the following:

commit 6a1787a
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 16:55:40 2023 -0300

    CSS fixes

commit 3047ed8
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 16:41:38 2023 -0300

    CSS fix

commit 87b84d2
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 16:39:59 2023 -0300

    CSS fix

commit c1959c2
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 16:34:31 2023 -0300

    Show/hide the extensions block using javascript

commit 348596f
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 15:11:16 2023 -0300

    Fix broken extensions

commit c5f14fb
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 14:19:28 2023 -0300

    Optimize the HTML generation speed

commit bf812c4
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 14:05:35 2023 -0300

    Minor fix

commit 658849d
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 13:29:00 2023 -0300

    Move a checkbutton

commit 05ee323
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 13:26:32 2023 -0300

    Rename a file

commit 40c9e46
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 13:25:28 2023 -0300

    Add file

commit d30a140
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 13:24:54 2023 -0300

    Further reorganize the UI

commit ffc6cb3
Merge: cf2da86 3b62bd1
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 12:56:21 2023 -0300

    Merge pull request oobabooga#325 from Ph0rk0z/fix-RWKV-Names

    Fix rwkv names

commit cf2da86
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 12:51:13 2023 -0300

    Prevent *Is typing* from disappearing instantly while streaming

commit 4146ac4
Merge: 1413931 29b7c5a
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 12:47:41 2023 -0300

    Merge pull request oobabooga#266 from HideLord/main

    Adding markdown support and slight refactoring.

commit 29b7c5a
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 12:40:03 2023 -0300

    Sort the requirements

commit ec972b8
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 12:33:26 2023 -0300

    Move all css/js into separate files

commit 693b53d
Merge: 63c5a13 1413931
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 12:08:56 2023 -0300

    Merge branch 'main' into HideLord-main

commit 1413931
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 12:01:32 2023 -0300

    Add a header bar and redesign the interface (oobabooga#293)

commit 9d6a625
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 11:04:30 2023 -0300

    Add 'hallucinations' filter oobabooga#326

    This breaks the API since a new parameter has been added.
    It should be a one-line fix. See api-example.py.

commit 3b62bd1
Author: Forkoz <[email protected]>
Date:   Tue Mar 14 21:23:39 2023 +0000

    Remove PTH extension from RWKV

    When loading the current model was blank unless you typed it out.

commit f0f325e
Author: Forkoz <[email protected]>
Date:   Tue Mar 14 21:21:47 2023 +0000

    Remove Json from loading

    no more 20b tokenizer

commit 128d18e
Author: oobabooga <[email protected]>
Date:   Tue Mar 14 17:57:25 2023 -0300

    Update README.md

commit 1236c7f
Author: oobabooga <[email protected]>
Date:   Tue Mar 14 17:56:15 2023 -0300

    Update README.md

commit b419dff
Author: oobabooga <[email protected]>
Date:   Tue Mar 14 17:55:35 2023 -0300

    Update README.md

commit 72d207c
Author: oobabooga <[email protected]>
Date:   Tue Mar 14 16:31:27 2023 -0300

    Remove the chat API

    It is not implemented, has not been tested, and this is causing confusion.

commit afc5339
Author: oobabooga <[email protected]>
Date:   Tue Mar 14 16:04:17 2023 -0300

    Remove "eval" statements from text generation functions

commit 5c05223
Merge: b327554 87192e2
Author: oobabooga <[email protected]>
Date:   Tue Mar 14 08:05:24 2023 -0300

    Merge pull request oobabooga#295 from Zerogoki00/opt4-bit

    Add support for quantized OPT models

commit 87192e2
Author: oobabooga <[email protected]>
Date:   Tue Mar 14 08:02:21 2023 -0300

    Update README

commit 265ba38
Author: oobabooga <[email protected]>
Date:   Tue Mar 14 07:56:31 2023 -0300

    Rename a file, add deprecation warning for --load-in-4bit

commit 3da73e4
Merge: 518e5c4 b327554
Author: oobabooga <[email protected]>
Date:   Tue Mar 14 07:50:36 2023 -0300

    Merge branch 'main' into Zerogoki00-opt4-bit

commit b327554
Author: oobabooga <[email protected]>
Date:   Tue Mar 14 00:18:13 2023 -0300

    Update bug_report_template.yml

commit 33b9a15
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 23:03:16 2023 -0300

    Delete config.yml

commit b5e0d3c
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 23:02:25 2023 -0300

    Create config.yml

commit 7f301fd
Merge: d685332 02d4075
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 22:41:21 2023 -0300

    Merge pull request oobabooga#305 from oobabooga/dependabot/pip/accelerate-0.17.1

    Bump accelerate from 0.17.0 to 0.17.1

commit 02d4075
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Tue Mar 14 01:40:42 2023 +0000

    Bump accelerate from 0.17.0 to 0.17.1

    Bumps [accelerate](https://github.com/huggingface/accelerate) from 0.17.0 to 0.17.1.
    - [Release notes](https://github.com/huggingface/accelerate/releases)
    - [Commits](huggingface/accelerate@v0.17.0...v0.17.1)

    ---
    updated-dependencies:
    - dependency-name: accelerate
      dependency-type: direct:production
      update-type: version-update:semver-patch
    ...

    Signed-off-by: dependabot[bot] <[email protected]>

commit d685332
Merge: 481ef3c df83088
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 22:39:59 2023 -0300

    Merge pull request oobabooga#307 from oobabooga/dependabot/pip/bitsandbytes-0.37.1

    Bump bitsandbytes from 0.37.0 to 0.37.1

commit 481ef3c
Merge: a0ef82c 715c3ec
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 22:39:22 2023 -0300

    Merge pull request oobabooga#304 from oobabooga/dependabot/pip/rwkv-0.4.2

    Bump rwkv from 0.3.1 to 0.4.2

commit df83088
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Tue Mar 14 01:36:18 2023 +0000

    Bump bitsandbytes from 0.37.0 to 0.37.1

    Bumps [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) from 0.37.0 to 0.37.1.
    - [Release notes](https://github.com/TimDettmers/bitsandbytes/releases)
    - [Changelog](https://github.com/TimDettmers/bitsandbytes/blob/main/CHANGELOG.md)
    - [Commits](https://github.com/TimDettmers/bitsandbytes/commits)

    ---
    updated-dependencies:
    - dependency-name: bitsandbytes
      dependency-type: direct:production
      update-type: version-update:semver-patch
    ...

    Signed-off-by: dependabot[bot] <[email protected]>

commit 715c3ec
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Tue Mar 14 01:36:02 2023 +0000

    Bump rwkv from 0.3.1 to 0.4.2

    Bumps [rwkv](https://github.com/BlinkDL/ChatRWKV) from 0.3.1 to 0.4.2.
    - [Release notes](https://github.com/BlinkDL/ChatRWKV/releases)
    - [Commits](https://github.com/BlinkDL/ChatRWKV/commits)

    ---
    updated-dependencies:
    - dependency-name: rwkv
      dependency-type: direct:production
      update-type: version-update:semver-minor
    ...

    Signed-off-by: dependabot[bot] <[email protected]>

commit a0ef82c
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 22:35:28 2023 -0300

    Activate dependabot

commit 3fb8196
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 22:28:00 2023 -0300

    Implement "*Is recording a voice message...*" for TTS oobabooga#303

commit 0dab2c5
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 22:18:03 2023 -0300

    Update feature_request.md

commit 79e519c
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 20:03:08 2023 -0300

    Update stale.yml

commit 1571458
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 19:39:21 2023 -0300

    Update stale.yml

commit bad0b0a
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 19:20:18 2023 -0300

    Update stale.yml

commit c805843
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 19:09:06 2023 -0300

    Update stale.yml

commit 60cc7d3
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:53:11 2023 -0300

    Update stale.yml

commit 7c17613
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:47:31 2023 -0300

    Update and rename .github/workflow/stale.yml to .github/workflows/stale.yml

commit 47c941c
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:37:35 2023 -0300

    Create stale.yml

commit 511b136
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:29:38 2023 -0300

    Update bug_report_template.yml

commit d6763a6
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:27:24 2023 -0300

    Update feature_request.md

commit c6ecb35
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:26:28 2023 -0300

    Update feature_request.md

commit 6846427
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:19:07 2023 -0300

    Update feature_request.md

commit bcfb7d7
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:16:18 2023 -0300

    Update bug_report_template.yml

commit ed30bd3
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:14:54 2023 -0300

    Update bug_report_template.yml

commit aee3b53
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:14:31 2023 -0300

    Update bug_report_template.yml

commit 7dbc071
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:09:58 2023 -0300

    Delete bug_report.md

commit 69d4b81
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:09:37 2023 -0300

    Create bug_report_template.yml

commit 0a75584
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:07:08 2023 -0300

    Create issue templates

commit 518e5c4
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 16:45:08 2023 -0300

    Some minor fixes to the GPTQ loader

commit 8778b75
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 22:11:40 2023 +0300

    use updated load_quantized

commit a6a6522
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 22:11:32 2023 +0300

    determine model type from model name

commit b6c5c57
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 22:11:08 2023 +0300

    remove default value from argument

commit 63c5a13
Merge: 683556f 7ab45fb
Author: Alexander Hristov Hristov <[email protected]>
Date:   Mon Mar 13 19:50:08 2023 +0200

    Merge branch 'main' into main

commit e1c952c
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 20:22:38 2023 +0300

    make argument non case-sensitive

commit b746250
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 20:18:56 2023 +0300

    Update README

commit 3c9afd5
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 20:14:40 2023 +0300

    rename method

commit 1b99ed6
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 20:01:34 2023 +0300

    add argument --gptq-model-type and remove duplicate arguments

commit edbc611
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 20:00:38 2023 +0300

    use new quant loader

commit 345b6de
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 19:59:57 2023 +0300

    refactor quant models loader and add support of OPT

commit 683556f
Author: HideLord <[email protected]>
Date:   Sun Mar 12 21:34:09 2023 +0200

    Adding markdown support and slight refactoring.
TheTerrasque pushed a commit to TheTerrasque/text-generation-webui that referenced this pull request Mar 19, 2023
commit 0cbe2dd
Author: oobabooga <[email protected]>
Date:   Sat Mar 18 12:24:54 2023 -0300

    Update README.md

commit 36ac7be
Merge: d2a7fac 705f513
Author: oobabooga <[email protected]>
Date:   Sat Mar 18 11:57:10 2023 -0300

    Merge pull request oobabooga#407 from ThisIsPIRI/gitignore

    Add loras to .gitignore

commit d2a7fac
Author: oobabooga <[email protected]>
Date:   Sat Mar 18 11:56:04 2023 -0300

    Use pip instead of conda for pytorch

commit 705f513
Author: ThisIsPIRI <[email protected]>
Date:   Sat Mar 18 23:33:24 2023 +0900

    Add loras to .gitignore

commit a0b1a30
Author: oobabooga <[email protected]>
Date:   Sat Mar 18 11:23:56 2023 -0300

    Specify torchvision/torchaudio versions

commit c753261
Author: oobabooga <[email protected]>
Date:   Sat Mar 18 10:55:57 2023 -0300

    Disable stop_at_newline by default

commit 7c945cf
Author: oobabooga <[email protected]>
Date:   Sat Mar 18 10:55:24 2023 -0300

    Don't include PeftModel every time

commit 86b9900
Author: oobabooga <[email protected]>
Date:   Sat Mar 18 10:27:52 2023 -0300

    Remove rwkv dependency

commit a163807
Author: oobabooga <[email protected]>
Date:   Sat Mar 18 03:07:27 2023 -0300

    Update README.md

commit a7acfa4
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 22:57:46 2023 -0300

    Update README.md

commit bcd8afd
Merge: dc35861 e26763a
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 22:57:28 2023 -0300

    Merge pull request oobabooga#393 from WojtekKowaluk/mps_support

    Fix for MPS support on Apple Silicon

commit e26763a
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 22:56:46 2023 -0300

    Minor changes

commit 7994b58
Author: Wojtek Kowaluk <[email protected]>
Date:   Sat Mar 18 02:27:26 2023 +0100

    clean up duplicated code

commit dc35861
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 21:05:17 2023 -0300

    Update README.md

commit 30939e2
Author: Wojtek Kowaluk <[email protected]>
Date:   Sat Mar 18 00:56:23 2023 +0100

    add mps support on apple silicon

commit 7d97da1
Author: Wojtek Kowaluk <[email protected]>
Date:   Sat Mar 18 00:17:05 2023 +0100

    add venv paths to gitignore

commit f2a5ca7
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 20:50:27 2023 -0300

    Update README.md

commit 8c8286b
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 20:49:40 2023 -0300

    Update README.md

commit 0c05e65
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 20:25:42 2023 -0300

    Update README.md

commit adc2003
Merge: 20f5b45 66e8d12
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 20:19:33 2023 -0300

    Merge branch 'main' of github.com:oobabooga/text-generation-webui

commit 20f5b45
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 20:19:04 2023 -0300

    Add parameters reference oobabooga#386 oobabooga#331

commit 66e8d12
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 19:59:37 2023 -0300

    Update README.md

commit 9a87111
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 19:52:22 2023 -0300

    Update README.md

commit d4f38b6
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 18:57:48 2023 -0300

    Update README.md

commit ad7c829
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 18:55:01 2023 -0300

    Update README.md

commit 4426f94
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 18:51:07 2023 -0300

    Update the installation instructions. Tldr use WSL

commit 9256e93
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 17:45:28 2023 -0300

    Add some LoRA params

commit 9ed2c45
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 16:06:11 2023 -0300

    Use markdown in the "HTML" tab

commit f0b2645
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 13:07:17 2023 -0300

    Add a comment

commit 7da742e
Merge: ebef4a5 02e1113
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 12:37:23 2023 -0300

    Merge pull request oobabooga#207 from EliasVincent/stt-extension

    Extension: Whisper Speech-To-Text Input

commit ebef4a5
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 11:58:45 2023 -0300

    Update README

commit cdfa787
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 11:53:28 2023 -0300

    Update README

commit 3bda907
Merge: 4c13067 614dad0
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 11:48:48 2023 -0300

    Merge pull request oobabooga#366 from oobabooga/lora

    Add LoRA support

commit 614dad0
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 11:43:11 2023 -0300

    Remove unused import

commit a717fd7
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 11:42:25 2023 -0300

    Sort the imports

commit 7d97287
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 11:41:12 2023 -0300

    Update settings-template.json

commit 29fe7b1
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 11:39:48 2023 -0300

    Remove LoRA tab, move it into the Parameters menu

commit 214dc68
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 11:24:52 2023 -0300

    Several QoL changes related to LoRA

commit 4c13067
Merge: ee164d1 53b6a66
Author: oobabooga <[email protected]>
Date:   Fri Mar 17 09:47:57 2023 -0300

    Merge pull request oobabooga#377 from askmyteapot/Fix-Multi-gpu-GPTQ-Llama-no-tokens

    Update GPTQ_Loader.py

commit 53b6a66
Author: askmyteapot <[email protected]>
Date:   Fri Mar 17 18:34:13 2023 +1000

    Update GPTQ_Loader.py

    Correcting decoder layer for renamed class.

commit 0cecfc6
Author: oobabooga <[email protected]>
Date:   Thu Mar 16 21:35:53 2023 -0300

    Add files

commit 104293f
Author: oobabooga <[email protected]>
Date:   Thu Mar 16 21:31:39 2023 -0300

    Add LoRA support

commit ee164d1
Author: oobabooga <[email protected]>
Date:   Thu Mar 16 18:22:16 2023 -0300

    Don't split the layers in 8-bit mode by default

commit 0a2aa79
Merge: dd1c596 e085cb4
Author: oobabooga <[email protected]>
Date:   Thu Mar 16 17:27:03 2023 -0300

    Merge pull request oobabooga#358 from mayaeary/8bit-offload

    Add support for memory maps with --load-in-8bit

commit e085cb4
Author: oobabooga <[email protected]>
Date:   Thu Mar 16 13:34:23 2023 -0300

    Small changes

commit dd1c596
Author: oobabooga <[email protected]>
Date:   Thu Mar 16 12:45:27 2023 -0300

    Update README

commit 38d7017
Author: oobabooga <[email protected]>
Date:   Thu Mar 16 12:44:03 2023 -0300

    Add all command-line flags to "Interface mode"

commit 83cb20a
Author: awoo <awoo@awoo>
Date:   Thu Mar 16 18:42:53 2023 +0300

    Add support for --gpu-memory witn --load-in-8bit

commit 23a5e88
Author: oobabooga <[email protected]>
Date:   Thu Mar 16 11:16:17 2023 -0300

    The LLaMA PR has been merged into transformers

    huggingface/transformers#21955

    The tokenizer class has been changed from

    "LLaMATokenizer"

    to

    "LlamaTokenizer"

    It is necessary to edit this change in every tokenizer_config.json
    that you had for LLaMA so far.

commit d54f3f4
Author: oobabooga <[email protected]>
Date:   Thu Mar 16 10:19:00 2023 -0300

    Add no-stream checkbox to the interface

commit 1c37896
Author: oobabooga <[email protected]>
Date:   Thu Mar 16 10:18:34 2023 -0300

    Remove unused imports

commit a577fb1
Author: oobabooga <[email protected]>
Date:   Thu Mar 16 00:46:59 2023 -0300

    Keep GALACTICA special tokens (oobabooga#300)

commit 25a00ea
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 23:43:35 2023 -0300

    Add "Experimental" warning

commit 599d313
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 23:34:08 2023 -0300

    Increase the reload timeout a bit

commit 4d64a57
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 23:29:56 2023 -0300

    Add Interface mode tab

commit b501722
Merge: ffb8986 d3a280e
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 20:46:04 2023 -0300

    Merge branch 'main' of github.com:oobabooga/text-generation-webui

commit ffb8986
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 20:44:34 2023 -0300

    Mini refactor

commit d3a280e
Merge: 445ebf0 0552ab2
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 20:22:08 2023 -0300

    Merge pull request oobabooga#348 from mayaeary/feature/koboldai-api-share

    flask_cloudflared for shared tunnels

commit 445ebf0
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 20:06:46 2023 -0300

    Update README.md

commit 0552ab2
Author: awoo <awoo@awoo>
Date:   Thu Mar 16 02:00:16 2023 +0300

    flask_cloudflared for shared tunnels

commit e9e76bb
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 19:42:29 2023 -0300

    Delete WSL.md

commit 09045e4
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 19:42:06 2023 -0300

    Add WSL guide

commit 9ff5033
Merge: 66256ac 055edc7
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 19:37:26 2023 -0300

    Merge pull request oobabooga#345 from jfryton/main

    Guide for Windows Subsystem for Linux

commit 66256ac
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 19:31:27 2023 -0300

    Make the "no GPU has been detected" message more descriptive

commit 055edc7
Author: jfryton <[email protected]>
Date:   Wed Mar 15 18:21:14 2023 -0400

    Update WSL.md

commit 89883a3
Author: jfryton <[email protected]>
Date:   Wed Mar 15 18:20:21 2023 -0400

    Create WSL.md guide for setting up WSL Ubuntu

    Quick start guide for Windows Subsystem for Linux (Ubuntu), including port forwarding to enable local network webui access.

commit 67d6247
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 18:56:26 2023 -0300

    Further reorganize chat UI

commit ab12a17
Merge: 6a1787a 3028112
Author: oobabooga <[email protected]>
Date:   Wed Mar 15 18:31:39 2023 -0300

    Merge pull request oobabooga#342 from mayaeary/koboldai-api

    Extension: KoboldAI api

commit 3028112
Author: awoo <awoo@awoo>
Date:   Wed Mar 15 23:52:46 2023 +0300

    KoboldAI api

Date:   Tue Mar 14 01:40:42 2023 +0000

    Bump accelerate from 0.17.0 to 0.17.1

    Bumps [accelerate](https://github.com/huggingface/accelerate) from 0.17.0 to 0.17.1.
    - [Release notes](https://github.com/huggingface/accelerate/releases)
    - [Commits](huggingface/accelerate@v0.17.0...v0.17.1)

    ---
    updated-dependencies:
    - dependency-name: accelerate
      dependency-type: direct:production
      update-type: version-update:semver-patch
    ...

    Signed-off-by: dependabot[bot] <[email protected]>

commit d685332
Merge: 481ef3c df83088
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 22:39:59 2023 -0300

    Merge pull request oobabooga#307 from oobabooga/dependabot/pip/bitsandbytes-0.37.1

    Bump bitsandbytes from 0.37.0 to 0.37.1

commit 481ef3c
Merge: a0ef82c 715c3ec
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 22:39:22 2023 -0300

    Merge pull request oobabooga#304 from oobabooga/dependabot/pip/rwkv-0.4.2

    Bump rwkv from 0.3.1 to 0.4.2

commit df83088
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Tue Mar 14 01:36:18 2023 +0000

    Bump bitsandbytes from 0.37.0 to 0.37.1

    Bumps [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) from 0.37.0 to 0.37.1.
    - [Release notes](https://github.com/TimDettmers/bitsandbytes/releases)
    - [Changelog](https://github.com/TimDettmers/bitsandbytes/blob/main/CHANGELOG.md)
    - [Commits](https://github.com/TimDettmers/bitsandbytes/commits)

    ---
    updated-dependencies:
    - dependency-name: bitsandbytes
      dependency-type: direct:production
      update-type: version-update:semver-patch
    ...

    Signed-off-by: dependabot[bot] <[email protected]>

commit 715c3ec
Author: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Date:   Tue Mar 14 01:36:02 2023 +0000

    Bump rwkv from 0.3.1 to 0.4.2

    Bumps [rwkv](https://github.com/BlinkDL/ChatRWKV) from 0.3.1 to 0.4.2.
    - [Release notes](https://github.com/BlinkDL/ChatRWKV/releases)
    - [Commits](https://github.com/BlinkDL/ChatRWKV/commits)

    ---
    updated-dependencies:
    - dependency-name: rwkv
      dependency-type: direct:production
      update-type: version-update:semver-minor
    ...

    Signed-off-by: dependabot[bot] <[email protected]>

commit a0ef82c
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 22:35:28 2023 -0300

    Activate dependabot

commit 3fb8196
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 22:28:00 2023 -0300

    Implement "*Is recording a voice message...*" for TTS oobabooga#303

commit 0dab2c5
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 22:18:03 2023 -0300

    Update feature_request.md

commit 79e519c
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 20:03:08 2023 -0300

    Update stale.yml

commit 1571458
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 19:39:21 2023 -0300

    Update stale.yml

commit bad0b0a
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 19:20:18 2023 -0300

    Update stale.yml

commit c805843
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 19:09:06 2023 -0300

    Update stale.yml

commit 60cc7d3
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:53:11 2023 -0300

    Update stale.yml

commit 7c17613
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:47:31 2023 -0300

    Update and rename .github/workflow/stale.yml to .github/workflows/stale.yml

commit 47c941c
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:37:35 2023 -0300

    Create stale.yml

commit 511b136
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:29:38 2023 -0300

    Update bug_report_template.yml

commit d6763a6
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:27:24 2023 -0300

    Update feature_request.md

commit c6ecb35
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:26:28 2023 -0300

    Update feature_request.md

commit 6846427
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:19:07 2023 -0300

    Update feature_request.md

commit bcfb7d7
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:16:18 2023 -0300

    Update bug_report_template.yml

commit ed30bd3
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:14:54 2023 -0300

    Update bug_report_template.yml

commit aee3b53
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:14:31 2023 -0300

    Update bug_report_template.yml

commit 7dbc071
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:09:58 2023 -0300

    Delete bug_report.md

commit 69d4b81
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:09:37 2023 -0300

    Create bug_report_template.yml

commit 0a75584
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 18:07:08 2023 -0300

    Create issue templates

commit 02e1113
Author: EliasVincent <[email protected]>
Date:   Mon Mar 13 21:41:19 2023 +0100

    add auto-transcribe option

commit 518e5c4
Author: oobabooga <[email protected]>
Date:   Mon Mar 13 16:45:08 2023 -0300

    Some minor fixes to the GPTQ loader

commit 8778b75
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 22:11:40 2023 +0300

    use updated load_quantized

commit a6a6522
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 22:11:32 2023 +0300

    determine model type from model name

commit b6c5c57
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 22:11:08 2023 +0300

    remove default value from argument

commit 63c5a13
Merge: 683556f 7ab45fb
Author: Alexander Hristov Hristov <[email protected]>
Date:   Mon Mar 13 19:50:08 2023 +0200

    Merge branch 'main' into main

commit e1c952c
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 20:22:38 2023 +0300

    make argument non case-sensitive

commit b746250
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 20:18:56 2023 +0300

    Update README

commit 3c9afd5
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 20:14:40 2023 +0300

    rename method

commit 1b99ed6
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 20:01:34 2023 +0300

    add argument --gptq-model-type and remove duplicate arguments

commit edbc611
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 20:00:38 2023 +0300

    use new quant loader

commit 345b6de
Author: Ayanami Rei <[email protected]>
Date:   Mon Mar 13 19:59:57 2023 +0300

    refactor quant models loader and add support of OPT

commit 48aa528
Author: EliasVincent <[email protected]>
Date:   Sun Mar 12 21:03:07 2023 +0100

    use Gradio microphone input instead

commit 683556f
Author: HideLord <[email protected]>
Date:   Sun Mar 12 21:34:09 2023 +0200

    Adding markdown support and slight refactoring.

commit 3b41459
Merge: 1c0bda3 3375eae
Author: Elias Vincent Simon <[email protected]>
Date:   Sun Mar 12 19:19:43 2023 +0100

    Merge branch 'oobabooga:main' into stt-extension

commit 1c0bda3
Author: EliasVincent <[email protected]>
Date:   Fri Mar 10 11:47:16 2023 +0100

    added installation instructions

commit a24fa78
Author: EliasVincent <[email protected]>
Date:   Thu Mar 9 21:18:46 2023 +0100

    tweaked Whisper parameters

commit d5efc06
Merge: 00359ba 3341447
Author: Elias Vincent Simon <[email protected]>
Date:   Thu Mar 9 21:05:34 2023 +0100

    Merge branch 'oobabooga:main' into stt-extension

commit 00359ba
Author: EliasVincent <[email protected]>
Date:   Thu Mar 9 21:03:49 2023 +0100

    interactive preview window

commit 7a03d0b
Author: EliasVincent <[email protected]>
Date:   Thu Mar 9 20:33:00 2023 +0100

    cleanup

commit 4c72e43
Author: EliasVincent <[email protected]>
Date:   Thu Mar 9 12:46:50 2023 +0100

    first implementation
Ph0rk0z pushed a commit to Ph0rk0z/text-generation-webui-testing that referenced this pull request Apr 17, 2023
Add support for quantized OPT models
5 participants