forked from huggingface/transformers
rebase #6
Conversation
update revision
…ic (huggingface#33079)
* Add docs/source/ar/torchscript.md to Add_docs_source_ar_torchscript.md
* Update docs/source/ar/torchscript.md (repeated review passes, each Co-authored-by: Abdullah Mohammed <[email protected]>)
* Merge troubleshooting.md with this branch
* Update _toctree.yml
* Update torchscript.md
* Update troubleshooting.md
---------
Co-authored-by: Abdullah Mohammed <[email protected]>
…4549)
* Better support transformers.agents in gradio: small fixes and additional tests
* initial translation
* removed english
* Fixed Trivial Typos, updated _toctree.yml
[docs] Broken link
* add XPU path
* use accelerate API
* Update docs/source/en/tasks/semantic_segmentation.md (Co-authored-by: Steven Liu <[email protected]>)
* update more places with accelerate API
---------
Co-authored-by: Steven Liu <[email protected]>
…uggingface#34253)
* Retain newlines in chat template when
* Add try/except
* Add regression test
* Simplify test
* Apply suggestions from code review (Co-authored-by: Matt <[email protected]>)
---------
Co-authored-by: Matt <[email protected]>
LLava -> Llava
* add xpu path for awq
* update readme
* add gradient accumulation steps tests for fsdp
* invert no_sync context to fix training for fsdp
* Remove FSDP wrapping from sub-models.
* solve conflict trainer.py
* make fixup
* add unit test for fsdp_auto_wrap_policy when using auto_find_batch_size
* put back extract_model_from_parallel
* use transformers unwrap_model
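The `no_sync` inversion mentioned in these FSDP commits follows the standard gradient-accumulation pattern: wrap every micro-step except the update step in the model's `no_sync()` context, so gradients are all-reduced only once per accumulation window. A minimal, framework-free sketch of that control flow (the `DummyModel` class below is a stand-in for an FSDP-wrapped model, not the real API):

```python
from contextlib import contextmanager, nullcontext

class DummyModel:
    """Stand-in for an FSDP-wrapped model; records whether each backward synced."""
    def __init__(self):
        self.sync_enabled = True
        self.sync_log = []

    @contextmanager
    def no_sync(self):
        # FSDP/DDP expose no_sync() to skip the gradient all-reduce
        # for backward passes executed inside the block.
        self.sync_enabled = False
        try:
            yield
        finally:
            self.sync_enabled = True

    def backward(self):
        self.sync_log.append(self.sync_enabled)

def train(model, num_batches, accum_steps):
    for step in range(num_batches):
        is_update_step = (step + 1) % accum_steps == 0
        # Sync only on update steps; wrap all *other* steps in no_sync().
        ctx = nullcontext() if is_update_step else model.no_sync()
        with ctx:
            model.backward()

model = DummyModel()
train(model, num_batches=4, accum_steps=4)
print(model.sync_log)  # → [False, False, False, True]
```

Getting the inversion wrong (syncing on every step except the last) costs one redundant all-reduce per micro-batch, which is why the fix matters for FSDP training throughput.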
* remove v4.44 deprecations
* PR comments
* deprecations scheduled for v4.50
* hub version update
* make fixup
---------
Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Arthur <[email protected]>
* Add model skeleton with transformers-cli add-new-model-like
* Convert config to modular, add rms_norm_eps, delete clip_qkv
* Convert model to modular, add RMSNorm
* Add flash attention with qk norm and no qkv clipping
* Add decoder layer with RMSNorm after attention/feedforward layers
* Add base and causal model
* Add converter improvements from OLMo repo
* Update weight loading in OLMo to HF converter
* Set correct default for rms_norm_eps
* Set correct pipeline_model_mapping in test
* Run make fixup
* Fix model type
* Re-run modular conversion
* Manually set config docs to fix build errors
* Convert olmo-1124 to olmo_1124 to fix flash attention docs errors
* Start updating tests
* Update tests
* Copy upstream test_eager_matches_sdpa_inference_1_bfloat16 changes to olmo_1124
* Rename input_layernorm and post_attention_layernorm to reflect their ops better
* Use correct tokenizer
* Remove test unsupported by GPT2 tokenizer
* Create GenerationConfig outside of from_pretrained call
* Use simpler init file structure
* Add explicit __all__ to support simplified init
* Make safetensor serialization the default
* Update OLMo November 2024 docs
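Several of these OLMo commits center on RMSNorm and its `rms_norm_eps` config value. As a reference point, here is a dependency-free sketch of the standard RMSNorm computation (pure Python on lists for clarity; real implementations operate on tensors):

```python
import math

def rms_norm(x, weight, eps=1e-5):
    """RMSNorm: scale each element by the reciprocal root-mean-square of x.

    Unlike LayerNorm, RMSNorm subtracts no mean and adds no bias; eps
    (cf. rms_norm_eps in the config) guards against division by zero.
    """
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for w, v in zip(weight, x)]

# With unit weights and eps=0, a vector of +/-1s has RMS 1 and is unchanged.
print(rms_norm([1.0, -1.0, 1.0, -1.0], [1.0] * 4, eps=0.0))  # → [1.0, -1.0, 1.0, -1.0]
```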
…3424)
* use num additional tokens
* fix copies + docs
* another fix copies :)
* add docs
* move order for BC
…when reading config (huggingface#34637)
Fix a bug where 'id2label' was incorrectly written as 'i2label' when reading the config from a pretrained config.
19d58d3 introduced a context manager to manage subtests of test_training_gradient_checkpointing. However, the test body was not moved under the "with" statement. Thus, while tests were correctly marked as skipped, the test bodies were still executed. In some cases, as with llama, this caused attribute errors.
Fixes: huggingface#34722
Fixes: 19d58d3 ("Add MLLama (huggingface#33703)")
Signed-off-by: Dmitry Rogozhkin <[email protected]>
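The failure mode described in that fix is easy to reproduce in miniature: a context manager can suppress a skip exception raised inside its block, but it cannot stop code placed after (rather than under) the `with` statement from running. A toy sketch of the bug and the fix, not the actual transformers test code:

```python
from contextlib import contextmanager

class SkipTest(Exception):
    pass

@contextmanager
def subtest():
    """Toy subtest context: marks the block skipped by swallowing SkipTest."""
    try:
        yield
    except SkipTest:
        pass  # skipped, not failed

log = []

def buggy_test():
    # Bug: only the skip check sits inside the `with`; the body does not.
    with subtest():
        raise SkipTest("unsupported config")
    log.append("buggy body executed")  # still runs despite the skip

def fixed_test():
    # Fix: the body is moved under the `with`, so the skip stops it.
    with subtest():
        raise SkipTest("unsupported config")
        log.append("fixed body executed")  # unreachable once skipped

buggy_test()
fixed_test()
print(log)  # → ['buggy body executed']
```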
make device-agnostic
add XPU part to testing Signed-off-by: Lin, Fanli <[email protected]>
Fixes typo.
…34184)
* Simplify Tensor Parallel implementation with PyTorch TP
* Move tp_plan to config
* Lint
* Format and warning
* Disable copy-from check
* Conditionally get attr from config
* make fix-copies
* Move base_model_tp_plan to PretrainedConfig
* Move TP into from_pretrained
* Add device context for load
* Do not serialize
* Move _tp_plan setting to post_init
* Add has_tp_plan
* Add test_tp
* Add 'Multi-gpu inference' doc
* Add backward support for device type identification
* Auto-detect accelerator
* supports_tp_plan
* copyright year
* Fix copy
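The `base_model_tp_plan` these commits move into `PretrainedConfig` is, in essence, a mapping from submodule name patterns to partition styles such as "colwise" and "rowwise". A sketch of what such a plan and a pattern lookup might look like (the plan entries and `lookup_style` helper below are illustrative, not copied from any real model config):

```python
from fnmatch import fnmatch

# Illustrative plan: glob-style submodule patterns mapped to partition
# styles ("colwise" shards the output dim across ranks, "rowwise" the input dim).
base_model_tp_plan = {
    "layers.*.self_attn.q_proj": "colwise",
    "layers.*.self_attn.o_proj": "rowwise",
    "layers.*.mlp.up_proj": "colwise",
    "layers.*.mlp.down_proj": "rowwise",
}

def lookup_style(module_name, plan):
    """Return the partition style for a module name, or None if unplanned."""
    for pattern, style in plan.items():
        if fnmatch(module_name, pattern):
            return style
    return None

print(lookup_style("layers.7.self_attn.q_proj", base_model_tp_plan))  # → colwise
```

Keeping the plan as plain pattern/style pairs is what makes it safe to store on the config: it describes *how* to shard without referencing any live modules or devices.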
…uggingface#34687)
* Allow handling files as args for a tool created with `Tool.from_space`
* Revert "Revert "Fix Whisper CI" (huggingface#34605)": this reverts commit 74d3824.
* update
---------
Co-authored-by: ydshieh <[email protected]>
Co-authored-by: Arthur <[email protected]>
pglorio pushed a commit that referenced this pull request on Jan 16, 2025:
* gptqmodel (Signed-off-by: jiqing-feng <[email protected]>)
* fix format
* update readme
* gptqmodel need use checkpoint_format (#1): fix quantize; update quantization_config.py (Co-authored-by: ZX-ModelCloud <[email protected]>, Qubitium-ModelCloud <[email protected]>)
* Revert quantizer_gptq.py (#2): revert quantizer_gptq.py change; pass **kwargs
* limit gptqmodel and optimum version
* fix format; fix warning; fix version check; revert unrelated changes
* enable gptqmodel tests; fix requires gptq
* Fix Transformer compat (#3): add meta info; hf_select_quant_linear passes checkpoint_format and meta; fix GPTQTestCUDA; gptqmodel.hf_select_quant_linear() now does not select ExllamaV2; add backend; no need to check exllama version; lower checkpoint_format and backend; check none; fix self.use_exllama == False; fix unittest (Co-authored-by: LRL <[email protected]>, Qubitium-ModelCloud <[email protected]>)
* fix format again
* update gptqmodel version (#6)
* fix unit test (#5): "not self.use_exllama" is not equivalent to "self.use_exllama == False"; fix unittest; update gptqmodel version
* backend is loading_attributes (huggingface#7)
* fix format and tests; fix memory check; fix device mismatch; fix result check
* Update src/transformers/quantizers/quantizer_gptq.py (Co-authored-by: Marc Sun <[email protected]>)
* update tests
* review: update docs (huggingface#10, huggingface#12); fix typo
* update tests for gptqmodel
* update document (huggingface#9): update overview.md and gptq.md (Co-authored-by: Qubitium-ModelCloud <[email protected]>)
* doc note for asymmetric quant; typo with apple silicon(e); typo for marlin; column name revert (review); doc rocm support
* Update docs/source/en/quantization/gptq.md and overview.md (Co-authored-by: Steven Liu <[email protected]>)
---------
Signed-off-by: jiqing-feng <[email protected]>
Co-authored-by: LRL-ModelCloud <[email protected]>
Co-authored-by: ZX-ModelCloud <[email protected]>
Co-authored-by: Qubitium-ModelCloud <[email protected]>
Co-authored-by: ZX-ModelCloud <[email protected]>
Co-authored-by: LRL <[email protected]>
Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Mohamed Mekkouri <[email protected]>
Co-authored-by: Steven Liu <[email protected]>
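One of the unit-test fixes in that commit hinges on a genuine Python pitfall: `not self.use_exllama` and `self.use_exllama == False` disagree when the attribute is `None` (i.e. unset). A quick illustration (the function names are hypothetical, not taken from the gptqmodel code):

```python
def check_not(use_exllama):
    # Truthiness test: treats None (unset) the same as an explicit False.
    return not use_exllama

def check_eq(use_exllama):
    # Equality test: only an explicit False matches; None does not.
    return use_exllama == False  # noqa: E712 (deliberate, to show the difference)

for value in (True, False, None):
    print(value, check_not(value), check_eq(value))
# → True False False
# → False True True
# → None True False
```

The divergence on `None` is exactly why swapping one form for the other can silently change which quantization backend gets selected.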
No description provided.