
rebase #6

Merged: 30 commits into zamba2 on Nov 19, 2024

Conversation

pglorio
Collaborator

@pglorio pglorio commented Nov 19, 2024

No description provided.

faaany and others added 30 commits November 11, 2024 07:09
…ic (huggingface#33079)

* Add docs/source/ar/torchscript.md to Add_docs_source_ar_torchscript.md

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Update docs/source/ar/torchscript.md

Co-authored-by: Abdullah Mohammed <[email protected]>

* Merge troubleshooting.md with this branch

* Update _toctree.yml

* Update torchscript.md

* Update troubleshooting.md

---------

Co-authored-by: Abdullah Mohammed <[email protected]>
…4549)

* Better support transformers.agents in gradio: small fixes and additional tests
* initial translation

* removed English

* Fixed trivial typos, updated _toctree.yml
* add XPU path

* use accelerate API

* Update docs/source/en/tasks/semantic_segmentation.md

Co-authored-by: Steven Liu <[email protected]>

* update more places with accelerate API

---------

Co-authored-by: Steven Liu <[email protected]>
…uggingface#34253)

* Retain newlines in chat template when

* Add try/except

* Add regression test

* Simplify test

* Apply suggestions from code review

Co-authored-by: Matt <[email protected]>

---------

Co-authored-by: Matt <[email protected]>
* add xpu path for awq

* update readme
* add gradient accumulation steps tests for fsdp

* invert no_sync context to fix training for fsdp (see the sketch after this list)
* Remove FSDP wrapping from sub-models.

* solve conflict trainer.py

* make fixup

* add unit test for fsdp_auto_wrap_policy when using auto_find_batch_size

* put back extract_model_from_parallel

* use transformers unwrap_model
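
The no_sync inversion above is subtle: under FSDP, gradient synchronization must be suppressed on every micro-step except the last one of an accumulation window, and flipping that condition breaks training. A minimal sketch of the intended pattern, assuming a generic FSDP-wrapped `model` (names are illustrative, not the Trainer's actual code):

```python
import contextlib

def train_with_accumulation(model, batches, optimizer, accumulation_steps):
    """Gradient accumulation under FSDP: sync only on the last micro-step."""
    for step, batch in enumerate(batches):
        is_sync_step = (step + 1) % accumulation_steps == 0
        # model.no_sync() defers the gradient reduce-scatter; it must wrap
        # the NON-final micro-steps only.
        ctx = contextlib.nullcontext() if is_sync_step else model.no_sync()
        with ctx:
            loss = model(**batch).loss / accumulation_steps
            loss.backward()
        if is_sync_step:
            optimizer.step()
            optimizer.zero_grad()
```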
* remove v4.44 deprecations

* PR comments

* deprecations scheduled for v4.50

* hub version update

* make fixup

---------

Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Arthur <[email protected]>
* Add model skeleton with transformers-cli add-new-model-like

* Convert config to modular, add rms_norm_eps, delete clip_qkv

* Convert model to modular, add RMSNorm

* Add flash attention with qk norm and no qkv clipping

* Add decoder layer with RMSNorm after attention/feedforward layers (see the sketch after this list)

* Add base and causal model

* Add converter improvements from OLMo repo

* Update weight loading in OLMo to HF converter

* Set correct default for rms_norm_eps

* Set correct pipeline_model_mapping in test

* Run make fixup

* Fix model type

* Re-run modular conversion

* Manually set config docs to fix build errors

* Convert olmo-1124 to olmo_1124 to fix flash attention docs errors

* Start updating tests

* Update tests

* Copy upstream test_eager_matches_sdpa_inference_1_bfloat16 changes to olmo_1124

* Rename input_layernorm and post_attention_layernorm to reflect their ops better

* Use correct tokenizer

* Remove test unsupported by GPT2 tokenizer

* Create GenerationConfig outside of from_pretrained call

* Use simpler init file structure

* Add explicit __all__ to support simplified init

* Make safetensor serialization the default

* Update OLMo November 2024 docs
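
For context on the RMSNorm placement mentioned above, here is a minimal RMSNorm sketch under the usual formulation; it is not the exact OLMo implementation:

```python
import torch
from torch import nn

class RMSNorm(nn.Module):
    """Root-mean-square norm: scale x by 1/RMS(x), then a learned weight."""
    def __init__(self, hidden_size: int, eps: float = 1e-5):  # eps ~ rms_norm_eps
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        variance = x.pow(2).mean(-1, keepdim=True)
        return self.weight * x * torch.rsqrt(variance + self.eps)
```

Applying this norm to the output of the attention/feedforward blocks, rather than to their input, is what motivates the later rename of input_layernorm and post_attention_layernorm in this list.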
…3424)

* use num additional tokens

* fix copies + docs

* another fix copies :)

* add docs

* move order for BC
…when reading config (huggingface#34637)

fix a bug where 'id2label' was incorrectly written as 'i2label' when reading the config from a pretrained config
19d58d3 introduced a context manager to manage subtests of
test_training_gradient_checkpointing. However, the test body was not
moved under the "with" statement, so while the tests were correctly
marked as skipped, their bodies were still executed. In some cases,
as with llama, this caused attribute errors.

Fixes: huggingface#34722
Fixes: 19d58d3 ("Add MLLama (huggingface#33703)")

Signed-off-by: Dmitry Rogozhkin <[email protected]>
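
The pitfall is easy to reproduce. A hypothetical sketch (not the actual transformers test): a skipTest raised inside a subTest context only skips the code indented under the with statement, so a body left outside it still runs:

```python
import unittest

def heavy_test_body():
    # Stand-in for the real test body; in the transformers tests this is
    # where the llama AttributeError surfaced.
    print("test body executed despite the skip")

class ExampleTest(unittest.TestCase):
    def test_training_gradient_checkpointing(self):
        with self.subTest("unsupported configuration"):
            self.skipTest("gradient checkpointing not supported here")
        # BUG: this call sits outside the `with` block, so it still executes
        # even though the subtest above is reported as skipped. The fix is
        # to indent the body under the `with` statement.
        heavy_test_body()

if __name__ == "__main__":
    unittest.main()
```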
add XPU part to testing

Signed-off-by: Lin, Fanli <[email protected]>
…34184)

* Simplify Tensor Parallel implementation with PyTorch TP

* Move tp_plan to config

* Lint

* Format and warning

* Disable copy-from check

* Conditionally get attr from config

* make fix-copies

* Move base_model_tp_plan to PretrainedConfig

* Move TP into from_pretrained (see the usage sketch after this list)

* Add device context for load

* Do not serialize

* Move _tp_plan setting to post_init

* Add has_tp_plan

* Add test_tp

* Add 'Multi-gpu inference' doc

* Add backward support for device type identification

* Auto-detect accelerator

* supports_tp_plan

* copyright year

* Fix copy
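
Taken together, the commits above expose tensor parallelism through from_pretrained. A usage sketch, assuming a checkpoint that defines a base_model_tp_plan and a launch via torchrun (the model id is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Run with: torchrun --nproc-per-node 4 demo.py
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # illustrative checkpoint
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    tp_plan="auto",  # shard weights per the model's base_model_tp_plan
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```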
…uggingface#34687)

* Allow handling files as args for a tool created with `Tool.from_space`
* Revert "Revert "Fix Whisper CI" (huggingface#34605)"

This reverts commit 74d3824.

* update

---------

Co-authored-by: ydshieh <[email protected]>
Co-authored-by: Arthur <[email protected]>
@pglorio pglorio merged commit 4725983 into zamba2 Nov 19, 2024
24 of 42 checks passed
pglorio pushed a commit that referenced this pull request Jan 16, 2025
* gptqmodel

Signed-off-by: jiqing-feng <[email protected]>

* fix format

Signed-off-by: jiqing-feng <[email protected]>

* update readme

Signed-off-by: jiqing-feng <[email protected]>

* gptqmodel needs to use checkpoint_format (#1)

* gptqmodel needs to use checkpoint_format

* fix quantize

* Update quantization_config.py

* Update quantization_config.py

* Update quantization_config.py

---------

Co-authored-by: ZX-ModelCloud <[email protected]>
Co-authored-by: Qubitium-ModelCloud <[email protected]>

* Revert quantizer_gptq.py (#2)

* revert quantizer_gptq.py change

* pass **kwargs

* limit gptqmodel and optimum version

Signed-off-by: jiqing-feng <[email protected]>

* fix format

Signed-off-by: jiqing-feng <[email protected]>

* fix warning

Signed-off-by: jiqing-feng <[email protected]>

* fix version check

Signed-off-by: jiqing-feng <[email protected]>

* revert unrelated changes

Signed-off-by: jiqing-feng <[email protected]>

* enable gptqmodel tests

Signed-off-by: jiqing-feng <[email protected]>

* fix requires gptq

Signed-off-by: jiqing-feng <[email protected]>

* Fix Transformer compat (#3)

* revert quantizer_gptq.py change

* pass **kwargs

* add meta info

* cleanup

* cleanup

* Update quantization_config.py

* hf_select_quant_linear pass checkpoint_format and meta

* fix GPTQTestCUDA

* Update test_gptq.py

* gptqmodel.hf_select_quant_linear() no longer selects ExllamaV2

* cleanup

* add backend

* cleanup

* cleanup

* no need to check exllama version

* Update quantization_config.py

* lower checkpoint_format and backend

* check none

* cleanup

* Update quantization_config.py

* fix self.use_exllama == False

* spell

* fix unittest

* fix unittest

---------

Co-authored-by: LRL <[email protected]>
Co-authored-by: Qubitium-ModelCloud <[email protected]>

* fix format

Signed-off-by: jiqing-feng <[email protected]>

* fix format again

Signed-off-by: jiqing-feng <[email protected]>

* update gptqmodel version (#6)

* update gptqmodel version

* update gptqmodel version

* fix unit test (#5)

* update gptqmodel version

* update gptqmodel version

* "not self.use_exllama" is not equivalent to "self.use_exllama==False"

* fix unittest

* update gptqmodel version

* backend is loading_attributes (huggingface#7)

* fix format and tests

Signed-off-by: jiqing-feng <[email protected]>

* fix memory check

Signed-off-by: jiqing-feng <[email protected]>

* fix device mismatch

Signed-off-by: jiqing-feng <[email protected]>

* fix result check

Signed-off-by: jiqing-feng <[email protected]>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <[email protected]>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <[email protected]>

* Update src/transformers/quantizers/quantizer_gptq.py

Co-authored-by: Marc Sun <[email protected]>

* update tests

Signed-off-by: jiqing-feng <[email protected]>

* review: update docs (huggingface#10)

* review: update docs (huggingface#12)

* review: update docs

* fix typo

* update tests for gptqmodel

Signed-off-by: jiqing-feng <[email protected]>

* update document (huggingface#9)

* update overview.md

* cleanup

* Update overview.md

* Update overview.md

* Update overview.md

* update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

* Update gptq.md

---------

Co-authored-by: Qubitium-ModelCloud <[email protected]>

* typo

* doc note for asymmetric quant

* typo with apple silicon(e)

* typo for marlin

* column name revert: review

* doc rocm support

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/quantization/gptq.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <[email protected]>

* Update docs/source/en/quantization/overview.md

Co-authored-by: Steven Liu <[email protected]>

---------

Signed-off-by: jiqing-feng <[email protected]>
Co-authored-by: LRL-ModelCloud <[email protected]>
Co-authored-by: ZX-ModelCloud <[email protected]>
Co-authored-by: Qubitium-ModelCloud <[email protected]>
Co-authored-by: ZX-ModelCloud <[email protected]>
Co-authored-by: LRL <[email protected]>
Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Mohamed Mekkouri <[email protected]>
Co-authored-by: Steven Liu <[email protected]>
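
As flagged in the unit-test fix above (the use_exllama bullet), the two spellings diverge because the flag is tri-state. A short demonstration:

```python
# use_exllama can be True, False, or None ("unset"). `not None` is truthy,
# while `None == False` is not, so the two checks disagree exactly when
# the flag is unset.
for use_exllama in (True, False, None):
    print(use_exllama, "-> not:", not use_exllama, "| == False:", use_exllama == False)
```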