Automated PR: Downstream develop rebase new changes #71

Merged: 1,841 commits, Nov 14, 2024
This pull request is big! We’re only showing the most recent 250 commits.

Commits on Jul 23, 2024

  1. Added mamba.py backend (huggingface#30139)

    * Update README.md
    
    * tests: forward ok
    
    * backward test done
    
    * done testing
    
    * removed check. scripts
    
    * Update README.md
    
    * added use_mambapy arg
    
    * fixed typo in warning
    
    * protected imports w/ mambapy package
    
    * delete pscan.py + raise rather than assert
    
    * Update import_utils.py
    
    * fix whitespaces and unused import
    
    * trailing whitespace + import block unformatted
    
    * Update modeling_mamba.py
    
    * transpose before pscan
    
    * shape comment
    
    * ran make style
    
    * use_mambapy=False by default
    
    Co-authored-by: Arthur <[email protected]>
    
    * ran make fix-copies
    
    ---------
    
    Co-authored-by: Arthur <[email protected]>
    alxndrTL and ArthurZucker authored Jul 23, 2024
    bab32d6
  2. Rename Phi-3 rope scaling type (huggingface#31436)

    * renamed phi3 rope_scaling type
    
    * fixed trailing whitespaces
    
    * fixed test
    
    * added warning
    
    * fixed format
    garg-amit authored Jul 23, 2024
    034b477
  3. Revert "Incorrect Whisper long-form decoding timestamps " (huggingface#32148)
    
    Revert "Incorrect Whisper long-form decoding timestamps  (huggingface#32003)"
    
    This reverts commit cd48553.
    sanchit-gandhi authored Jul 23, 2024
    3263b34
  4. a009fbd
  5. feat(cache): StaticCache uses index_copy_ to avoid useless copy (huggingface#31857)
    
    * feat(cache): StaticCache uses index_copy_ to avoid useless copy
    
    Using index_copy_ allows for explicit in-place change of the tensor.
    Some backends (XLA) will otherwise copy the tensor, making the code
    slower and using more memory.
    
    Proposed implementation will end up using less memory and on XLA will
    result in less compilation, but the change is also quite generic, making
    no change whatsoever on CUDA or CPU backend.
    
    * feat(cache): SlidingWindowCache uses index_copy_ to avoid useless copy
    
    Applying the same change done in StaticCache.
    
    * fix(cache): fallback of index_copy_ when not implemented
    
    * fix(cache): in index_copy_ ensure tensors are on same device
    
    * [run slow] llama
    
    * fix(cache): add move of cache_position to same device in SlidingWindowCache
    
    * Revert "[run slow] llama"
    
    This reverts commit 02608dd.
    tengomucho authored Jul 23, 2024
    6370062
  6. Added additional kwarg for successful running of optuna hyperparameter search (huggingface#31924)
    
    Update integration_utils.py
    
    Added additional kwarg
    DeF0017 authored Jul 23, 2024
    7d92009
  7. Enhancing SFT Training Efficiency Using Packing and FlashAttention2 with Position IDs (huggingface#31629)
    
    * add DataCollatorBatchFlattening
    
    * Update data_collator.py
    
    * change name
    
    * new FA2 flow if position_ids is provided
    
    * add comments
    
    * minor fix
    
    * minor fix data collator
    
    * add test cases for models
    
    * add test case for data collator
    
    * remove extra code
    
    * formatting for ruff check and check_repo.py
    
    * ruff format
    
    ruff format tests src utils
    
    * custom_init_isort.py
    RhuiDih authored Jul 23, 2024
    9cf4f2a
  8. Updated ruff to the latest version (huggingface#31926)

    * Updated ruff version and fixed the required code according to the latest version.
    
    * Updated ruff version and fixed the required code according to the latest version.
    
    * Added noqa directive to ignore 1 error shown by ruff
    Sai-Suraj-27 authored Jul 23, 2024
    d2c687b
  9. Dev version: v4.44.0.dev0

    LysandreJik committed Jul 23, 2024
    ff0d708
  10. Llama 3.1 conversion

    Co-authored-by: Arthur Zucker <[email protected]>
    LysandreJik and ArthurZucker committed Jul 23, 2024
    d5a99df
  11. fix (huggingface#32162)

    gante authored Jul 23, 2024
    23f6a43
  12. fix: Fixed an if condition that is always evaluating to true (huggingface#32160)
    
    Fixed an if condition always evaluating to true.
    Sai-Suraj-27 authored Jul 23, 2024
    bc2adb0
  13. c85510f

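Item 7 of the Jul 23 list (huggingface#31629) packs several training examples into one sequence and passes position IDs alongside them. A minimal sketch of that idea in plain Python follows. The helper `flatten_batch` is hypothetical and is not the library's `DataCollatorBatchFlattening`, whose actual fields and options are not shown in this log.

```python
from typing import Dict, List


def flatten_batch(examples: List[List[int]]) -> Dict[str, List[int]]:
    """Concatenate token-id lists and restart position ids per example.

    Hypothetical helper, for illustration only.
    """
    input_ids: List[int] = []
    position_ids: List[int] = []
    for tokens in examples:
        input_ids.extend(tokens)
        # Positions restart at 0 for every example, so a position id of 0
        # marks an example boundary inside the packed sequence.
        position_ids.extend(range(len(tokens)))
    return {"input_ids": input_ids, "position_ids": position_ids}


packed = flatten_batch([[11, 12, 13], [21, 22]])
print(packed["input_ids"])     # [11, 12, 13, 21, 22]
print(packed["position_ids"])  # [0, 1, 2, 0, 1]
```

No padding tokens are added, so the packed sequence holds only real tokens. An attention implementation that is aware of the restarts in `position_ids` can use them to keep packed examples from attending to each other.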
Commits on Jul 24, 2024

  1. adds: extra_repr() to MambaRMSNorm to include hidden size / size of weights in the layer (huggingface#32171)
    
    * adds: extra_repr() to MambaRMSNorm to include the hidden size of the layer
    
    * style fix with ruff:
    rohitdwivedula authored Jul 24, 2024
    01be5b4
  2. fix: default value reflects the runtime environment variables rather than the ones present at import time. (huggingface#32153)
    
    * fix: default value reflects the runtime environment variables rather than the ones present at import time.
    
    * Fix: Change `deterministic` to None by default; use env var if None
    junrae6454 authored Jul 24, 2024
    8678879
  3. Update qwen2.md (huggingface#32108)

    * Update qwen2.md
    
    outdated description
    
    * Update qwen2.md
    
    amended
    
    * Update qwen2.md
    
    Update
    
    * Update qwen2.md
    
    fix wrong version code, now good to go
    ArtificialZeng authored Jul 24, 2024
    5f4ee98
  4. Remove conversational pipeline tests (huggingface#32099)

    Remove conversation pipeline tests
    amyeroberts authored Jul 24, 2024
    165116b
  5. RoPE: relaxed rope validation (huggingface#32182)

    * relaxed rope check
    
    * lets also accept rope_type=None, defaulting to the original implementation
    
    * type and rope_type can coexist
    gante authored Jul 24, 2024
    e0182f3
  6. let's not warn when someone is running a forward (huggingface#32176)

    * let's not warn when someone is running a forward without cache + self.training
    
    * more models
    
    * fixup
    ArthurZucker authored Jul 24, 2024
    8d2534c
  7. Fix resize embedding with Deepspeed (huggingface#32192)

    fix resize when deepspeed
    zucchini-nlp authored Jul 24, 2024
    1392a68
  8. Fix float8_e4m3fn in modeling_utils (huggingface#32193)

    * Fix float8_e4m3fn in modeling_utils
    
    * style
    
    * fix
    
    * comment
    SunMarc authored Jul 24, 2024
    af0e4b7
  9. Support dequantizing GGUF FP16 format (huggingface#31783)

    * support gguf fp16
    
    * support gguf bf16 with pytorch
    
    * add gguf f16 test
    
    * remove bf16
    PenutChen authored Jul 24, 2024
    1c122a4
  10. 🚨 No more default chat templates (huggingface#31733)

    * No more default chat templates
    
    * Add the template to the GPT-SW3 tests since it's not available by default now
    
    * Fix GPT2 test
    
    * Fix Bloom test
    
    * Fix Bloom test
    
    * Remove default templates again
    Rocketknight1 authored Jul 24, 2024
    edd68f4
  11. fix: Replaced deprecated unittest method with the correct one (huggingface#32198)
    
    Replaced deprecated unittest method with the correct one.
    Sai-Suraj-27 authored Jul 24, 2024
    85a1269

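Item 2 of the Jul 24 list (huggingface#32153) makes a default follow the environment at call time instead of at import time, by defaulting `deterministic` to `None` and reading the variable only when it is `None`. A sketch of that pattern, assuming a hypothetical variable name `EXAMPLE_DETERMINISTIC` and hypothetical function names:

```python
import os

os.environ["EXAMPLE_DETERMINISTIC"] = "0"


# Anti-pattern: the default is computed once, when the function is defined.
def run_frozen(
    deterministic: bool = os.environ.get("EXAMPLE_DETERMINISTIC", "0") == "1",
) -> bool:
    return deterministic


# Pattern from the commit message: default to None, read the variable per call.
def run_lazy(deterministic=None) -> bool:
    if deterministic is None:
        deterministic = os.environ.get("EXAMPLE_DETERMINISTIC", "0") == "1"
    return deterministic


os.environ["EXAMPLE_DETERMINISTIC"] = "1"  # changed after the definitions above

print(run_frozen())                   # False: value seen at definition time
print(run_lazy())                     # True: picks up the current environment
print(run_lazy(deterministic=False))  # False: an explicit argument wins
```

Using `None` as the sentinel keeps an explicit `True` or `False` from the caller distinguishable from "not specified".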
Commits on Jul 25, 2024

  1. [whisper] fix short-form output type (huggingface#32178)

    * [whisper] fix short-form output type
    
    * add test
    
    * make style
    
    * update long-form tests
    
    * fixes
    
    * last fix
    
    * finalise test
    sanchit-gandhi authored Jul 25, 2024
    5658e74
  2. remove unnecessary guard code related with pytorch versions 1.4.2 ~ 1.7.0 (huggingface#32210)
    
    remove unnecessary guard code related with pytorch versions 1.4.2 ~
    1.7.0
    statelesshz authored Jul 25, 2024
    f53a5de
  3. 1ecedf1
  4. [BigBird Pegasus] set _supports_param_buffer_assignment to False (huggingface#32222)
    
    set _supports_param_buffer_assignment to False
    kashif authored Jul 25, 2024
    9b9a54e
  5. [warnings] fix E721 warnings (huggingface#32223)

    fix E721 warnings
    kashif authored Jul 25, 2024
    de23188
  6. Follow up for huggingface#31973 (huggingface#32025)

    * fix
    
    * [test_all] trigger full CI
    
    ---------
    
    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jul 25, 2024
    df6eee9
  7. translate philosophy.md to chinese (huggingface#32177)

    * translate philosophy.md to chinese
    
    * add the missing link
    statelesshz authored Jul 25, 2024
    6ed0bf1
  8. Allow a specific microphone to be used by the ffmpeg audio pipeline utility functions. Default to using the currently active microphone on Mac (huggingface#31846)
    
    * use currently active microphone on mac for ffmpeg_microphone
    
    * Allow ffmpeg_microphone device to be specified
    
    Co-authored-by: amyeroberts <[email protected]>
    
    ---------
    
    Co-authored-by: amyeroberts <[email protected]>
    jrhe and amyeroberts authored Jul 25, 2024
    3a83ec4
  9. Fix code snippet for Grounding DINO (huggingface#32229)

    Fix code snippet for grounding-dino
    qubvel authored Jul 25, 2024
    9d6c064

Commits on Jul 26, 2024

  1. Generation: stop at eos for assisted decoding (huggingface#31301)

    * fix
    
    * move changes to prompt lookup
    
    * add test
    
    * set eos in assistant model
    
    * style
    
    * fix flakiness
    
    * changes for new `main`
    
    * Update tests/generation/test_utils.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update tests/generation/test_utils.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * add comment to explain
    
    ---------
    
    Co-authored-by: amyeroberts <[email protected]>
    zucchini-nlp and amyeroberts authored Jul 26, 2024
    4ab33c2
  2. Llava: generate without images (huggingface#32183)

    * llava w/o images
    
    * tests
    zucchini-nlp authored Jul 26, 2024
    fad15fb
  3. Resize embeds with DeepSpeed (huggingface#32214)

    * fix resize when deepspeed
    
    * deepspeed uses new embeds
    
    * we needed this
    zucchini-nlp authored Jul 26, 2024
    c46edfb
  4. don't log base model architecture in wandb if log model is false (huggingface#32143)
    
    * don't log base model architecture in wandb if log model is false
    
    * Update src/transformers/integrations/integration_utils.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * convert log model setting into an enum
    
    * fix formatting
    
    ---------
    
    Co-authored-by: amyeroberts <[email protected]>
    joaonadkarni and amyeroberts authored Jul 26, 2024
    1c7ebf1
  5. Refactor: Removed un-necessary object base class (huggingface#32230)

    * Refactored to remove un-necessary object base class.
    
    * small fix.
    Sai-Suraj-27 authored Jul 26, 2024
    b8e5cd5
  6. Adds: extra_repr for RMSNorm layers in most models (huggingface#32204)

    * adds: extra_repr() to RMSNorm layers in multiple models
    
    * adds: extra_repr for deprecated models as well
    
    * formatting as per style guide
    rohitdwivedula authored Jul 26, 2024
    f9756d9
  7. Add check for target_sizes is None in `post_process_image_guided_detection` for owlv2 (huggingface#31934)
    
    * Add check for target_sizes is None in post_process_image_guided_detection
    
    * Make sure Owlvit and Owlv2 in sync
    
    * Fix incorrect indentation; add check for correct size of target_sizes
    catalys1 authored Jul 26, 2024
    5f841c7
  8. [tests] fix static cache implementation is not compatible with `attn_implementation==flash_attention_2` (huggingface#32039)
    
    * add flash attention check
    
    * fix
    
    * fix
    faaany authored Jul 26, 2024
    27c7f97
  9. Flash-Attn: fix generation when no attention mask or no padding (huggingface#32241)
    
    * fix
    
    * fix prev test (half of failures)
    
    * [run-slow] llama, gemma2
    
    * [run-slow] llama, gemma2
    zucchini-nlp authored Jul 26, 2024
    81233c0
  10. More flexible trigger condition (huggingface#32251)

    update
    
    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jul 26, 2024
    8da9068

Commits on Jul 27, 2024

  1. Llama 3.1: replace for loop by tensor ops at inv_freq initialization (huggingface#32244)
    
    * replace for loop by tensor ops
    
    * rm assert; readability
    gante authored Jul 27, 2024
    44f6fdd

Commits on Jul 29, 2024

  1. 🚨 Bloom support for cache class (huggingface#31445)

    * bloom dynamic cache
    
    * bloom follows standard cache format
    
    * no skips for bloom anymore
    
    * use cache position when possible
    
    * clean up
    
    * codestyle
    
    * Update src/transformers/models/bloom/modeling_bloom.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/bloom/modeling_bloom.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/bloom/modeling_bloom.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * pr comments
    
    * isinstance fix
    
    * address comments
    
    * make musicgen test happy
    
    * [run-slow] bloom
    
    ---------
    
    Co-authored-by: amyeroberts <[email protected]>
    zucchini-nlp and amyeroberts authored Jul 29, 2024
    f739687
  2. Upload new model failure report to Hub (huggingface#32264)

    upload
    
    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jul 29, 2024
    f2122cc
  3. Optimize t5 tokenize logic to avoid redundant calls (huggingface#32270)

    * Optimize t5 tokenize logic to avoid redundant calls
    
    * fix and overwrite copies
    leejet authored Jul 29, 2024
    5019aab
  4. fix: Fixed wrong argument passed to convert_blip_checkpoint function call (huggingface#32262)
    
    Removed one wrong argument passed to convert_blip_checkpoint function call.
    Sai-Suraj-27 authored Jul 29, 2024
    a2ad9d5
  5. Repo: remove exceptions in check_docstrings (huggingface#32259)

    remove exceptions
    gante authored Jul 29, 2024
    535fe78
  6. make p_mask a numpy array before passing to select_starts_ends (huggingface#32076)
    
    * fix
    
    * bug fix
    
    * refine
    
    * fix
    faaany authored Jul 29, 2024
    6494479
  7. fix(docs): Fixed a link in docs (huggingface#32274)

    Fixed a link in docs.
    Sai-Suraj-27 authored Jul 29, 2024
    4992889
  8. Generate: end-to-end compilation (huggingface#30788)

    * mvp
    
    * added test (a few models need fixes)
    
    * fix a few test cases
    
    * test nits
    
    * harder test 😈
    
    * revert changes in stablelm
    
    * test with improved condition
    
    * add todo
    
    * tmp commit
    
    * merged with main
    
    * nits
    
    * add todo
    
    * final corrections
    
    * add docs for generation compilation
    
    * docs nits
    
    * add  tip
    
    * PR suggestions
    
    * add more details to the compilation docs
    
    * fix cache positions
    
    * cache is now init in generate; update docs
    
    * tag test as flaky
    
    * docs
    
    * post rebase make fixup and other nits
    
    * remove unintended changes
    
    * whisper (encoder-decoder) not supported
    
    * move token default updates to ; add tests for token defaults
    
    * push changes
    
    * manual rebase
    
    * chameleon doesn't support this
    
    * fix test_static_cache_mha_mqa_gqa (broken in another PR)
    
    * docs: dynamic is better with end-to-end compilation
    gante authored Jul 29, 2024
    7ffe25f
  9. Whisper tokenizer word level timestamps (huggingface#32197)

    * fix _fix_key in PreTrainedModel
    
    * fix _find_longest_common_sequence
    
    * add test
    
    * remove result.json
    
    * nit
    
    * update test
    kamilakesbi authored Jul 29, 2024
    3fbaaaa
  10. [pipeline] fix padding for 1-d tensors (huggingface#31776)

    * [pipeline] fix padding for 1-d tensors
    
    * add test
    
    * make style
    
    * Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
    
    Co-authored-by: Kamil Akesbi <[email protected]>
    
    * Update tests/pipelines/test_pipelines_automatic_speech_recognition.py
    
    ---------
    
    Co-authored-by: Kamil Akesbi <[email protected]>
    sanchit-gandhi and kamilakesbi authored Jul 29, 2024
    7f5d644
  11. 811a9ca
  12. Add stream messages from agent run for gradio chatbot (huggingface#32142)
    
    * Add stream_to_gradio method for running agent in gradio demo
    aymeric-roucher authored Jul 29, 2024
    a24a9a6
  13. use torch 2.4 in 2 CI jobs (huggingface#32302)

    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jul 29, 2024
    f0bc49e

Commits on Jul 30, 2024

  1. Docs: fix GaLore optimizer code example (huggingface#32249)

    Docs: fix GaLore optimizer example
    
    Fix incorrect usage of GaLore optimizer in Transformers trainer code example.
    
    The GaLore optimizer uses low-rank gradient updates to reduce memory usage. GaLore is quite popular and is implemented by the authors in [https://github.com/jiaweizzhao/GaLore](https://github.com/jiaweizzhao/GaLore). A few months ago GaLore was added to the HuggingFace Transformers library in huggingface#29588.
    
    Documentation of the Trainer module includes a few code examples of how to use GaLore. However, the `optim_target_modules` argument to the `TrainingArguments` function is incorrect, as discussed in huggingface#29588 (comment). This pull request fixes this issue.
    gil2rok authored Jul 30, 2024
    3e8106d
  2. Fix GGUF dequantize for gguf==0.9.1 (huggingface#32298)

    * fix gguf dequantize for gguf==0.9.1
    
    * fix old version
    
    * make style
    Isotr0py authored Jul 30, 2024
    934fe15
  3. Cast epochs_trained to int when resuming training (huggingface#32286)

    * fix epochs_trained as int when resuming training
    
    * refactor
    
    ---------
    
    Co-authored-by: teddyferdinan <[email protected]>
    teddy-f-47 and teddyferdinan authored Jul 30, 2024
    20528f0
  4. 084b509
  5. Fix M4T for ASR pipeline (huggingface#32296)

    * tentative fix
    
    * do the same for M4T
    ylacombe authored Jul 30, 2024
    2fbbcf5
  6. Docs: formatting nits (huggingface#32247)

    * doc formatting nits
    
    * ignore non-autodocs
    
    * Apply suggestions from code review
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/esm/modeling_esm.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/esm/modeling_esm.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * make fixup
    
    ---------
    
    Co-authored-by: amyeroberts <[email protected]>
    gante and amyeroberts authored Jul 30, 2024
    e68ec18
  7. Alternative agent plan (huggingface#32295)

    * new agent plan
    
    * plan type assertion
    
    * style corrections
    
    * better prompt naming
    
    * make fixup
    plaggy authored Jul 30, 2024
    bd54ed2
  8. fix: Added missing raise keyword for few exceptions (huggingface#32333)

    Fixed raising of few exceptions.
    Sai-Suraj-27 authored Jul 30, 2024
    1627108
  9. 62c60a3
  10. fixes huggingface#32329 : The Torch code is correct - to get an average of 10% o… (huggingface#32335)
    
    fixes huggingface#32329 : The Torch code is correct - to get an average of 10% of the total, we want to take 50% of the remainder after we've already masked 80% with [MASK] in the previous step.
    fkrasnov2 authored Jul 30, 2024
    516af4b
  11. Repo checks: skip docstring checks if not in the diff (huggingface#32328)
    
    * tmp
    
    * skip files not in the diff
    
    * use git.Repo instead of an external subprocess
    
    * add tiny change to confirm that the diff is working on pushed changes
    
    * add make quality task
    
    * more profesh main commit reference
    gante authored Jul 30, 2024
    026a173
  12. Fix slow GemmaTokenizer and improve SPM slow -> fast conversion process (huggingface#32191)
    
    * Remove user-defined tokens which can be obtained through merges
    
    * Remove debug line
    
    * formatting
    
    * Refactor spm slow -> fast converter
    
    * revert unnecessary refactor
    
    * set comprehension
    
    * remove test files
    
    * Use `vocab_scores`
    
    * Always replace spiece underline with space in decode
    
    * we no longer need token filtering
    
    * Add save fast load slow unit test
    
    * Remove tokenizers version check
    
    * Remove duplicate code
    
    * Make `<start_of_turn>` and `<end_of_turn>` special tokens
    
    * Bias merge priority with length if score is the same
    
    * Add unit test for merge priority
    
    * CI
    xenova authored Jul 30, 2024
    6e2d04e

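Item 10 of the Jul 30 list (huggingface#32335) argues that taking 50% of what remains after the 80% `[MASK]` step yields 10% of the total. The arithmetic can be checked directly. The variable names below are illustrative, and the "random token" and "unchanged" labels assume the usual 80/10/10 masked-language-modeling split rather than anything stated in this log.

```python
from fractions import Fraction

masked = Fraction(8, 10)                  # share replaced with [MASK] in the first step
remainder = 1 - masked                    # share still untouched after that step
randomized = remainder * Fraction(1, 2)   # half of the remainder (assumed: random token)
unchanged = remainder - randomized        # the rest (assumed: left as-is)

print(masked, randomized, unchanged)      # 4/5 1/10 1/10
```

Sampling the second step with probability 0.1 instead of 0.5 would touch only 0.2 * 0.1 = 2% of the total, which is why the conditional probability has to be 0.5.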
Commits on Jul 31, 2024

  1. a326433
  2. Gemma2 and flash-attention (huggingface#32188)

    * enable flash-attn & static cache
    
    * this works, not the prev
    
    * fix for sliding window layers
    
    * not needed anymore
    zucchini-nlp authored Jul 31, 2024
    7f552e2
  3. b75ad56
  4. [Idefics2] - Fix FA2 call for Perceiver layer (huggingface#32275)

    * Fix FA2 call for Perceiver layer
    
    * [run_slow] idefics2
    
    * [run_slow] idefics2
    
    * [run_slow] idefics2
    
    * Fix up
    
    * [run_slow] idefics2
    
    * [run_slow] idefics2
    
    * [run_slow] idefics2
    amyeroberts authored Jul 31, 2024
    5f1fcc2
  5. ef177a5
  6. Fix error when streaming to gradio with non-string tool arguments (huggingface#32360)
    
    Fix error when streaming agent run to gradio with non-string tool arguments
    aymeric-roucher authored Jul 31, 2024
    b46bd8b
  7. >3-5x faster torch.compile forward compilation for autoregressive decoder models (huggingface#32227)
    
    * draft
    
    * apply changes to all relevant archs
    
    * rerun ci - check_docstrings.py failing?
    
    * fix docstring
    
    * move 2D->4D mask creation to modeling file
    
    * repo consistency
    
    * fix the batch size = 1 case - calling contiguous is not enough
    
    * nit
    
    * style
    
    * propagate to gemma/gemma-2
    
    * prepare inputs for gemma generation
    
    * implement test and tiny fix in gemma2
    
    * Update src/transformers/models/bloom/modeling_bloom.py
    
    Co-authored-by: Arthur <[email protected]>
    
    * fix copies
    
    * ci pass
    
    * fix gemma's test_compile_static_cache tests
    
    * flaky
    
    * retrigger ci
    
    ---------
    
    Co-authored-by: sanchit-gandhi <[email protected]>
    Co-authored-by: Arthur <[email protected]>
    3 people authored Jul 31, 2024
    92abe60
  8. fix: Removed unnecessary @staticmethod decorator (huggingface#32361)

    * Fixed staticmethods with self as first argument.
    
    * Fixed staticmethods with self as first argument.
    
    * Fixed staticmethods with self as first argument.
    
    * Fixed staticmethods with self as first argument.
    Sai-Suraj-27 authored Jul 31, 2024
    53f0c9c
  9. 14ee232
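
For context on the `@staticmethod` fix listed above (huggingface#32361): a method decorated with `@staticmethod` receives no implicit instance, so a `self` parameter on it is just an ordinary positional argument. The sketch below uses a hypothetical `Greeter` class, not code from the transformers repository, to show the failure and the fix of dropping the decorator.

```python
class Greeter:
    """Hypothetical class for illustration; not taken from transformers."""

    def __init__(self, name):
        self.name = name

    # Buggy: @staticmethod disables automatic binding, so `self` is an
    # ordinary positional parameter that the caller must supply by hand.
    @staticmethod
    def greet_buggy(self):
        return f"hello {self.name}"

    # Fixed: without the decorator, the instance is bound as usual.
    def greet(self):
        return f"hello {self.name}"


g = Greeter("world")

try:
    g.greet_buggy()  # no instance is passed implicitly, so this raises
except TypeError as err:
    error_message = str(err)

fixed_result = g.greet()
# Passing the instance explicitly still works, which is how such bugs hide.
explicit_result = Greeter.greet_buggy(g)
```

Calling the decorated method on an instance fails with a `TypeError` about the missing `self` argument, while the undecorated version behaves like a normal instance method.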

Commits on Aug 1, 2024

  1. LLaVa: add cache class attribute (huggingface#32278)

    cache class flag
    zucchini-nlp authored Aug 1, 2024
    453e748
  2. 9451a38
  3. [whisper] compile compatibility with long-form decoding (huggingface#31772)
    
    * [whisper] compile compatibility with long-form decoding
    
    * clarify comment
    
    * fix after rebase
    
    * finalise
    
    * fix bsz
    
    * fix cache split
    
    * remove contiguous
    
    * style
    
    * finish
    
    * update doc
    
    * prevent cuda graph trace
    sanchit-gandhi authored Aug 1, 2024
    e234061
  4. Remove size check between attn_weights and kv_seq_len for phi3 (huggingface#32339)
    
    * Remove size check between attn_weights and kv_seq_len
    
    * add unit tests
    helunwencser authored Aug 1, 2024
    48ed24c
  5. 9e28284
  6. Check device map for saving tokenizer config on TPU (fix for issue huggingface#31971) (huggingface#32043)
    
    * Remove TPU device map for saving tokenizer config
    
    * Update tokenization_utils_base.py
    
    * Fix error msg when passing non-string device into tokenizer
    
    * Fix error message for non-string tokenizer device
    
    * Print out tokenizer device type in error msg
    
    * Update tokenization_utils_base.py
    ayukh authored Aug 1, 2024
    05c1f9a
  7. 2229ebe
  8. db8c7ca
  9. Fix conflicting key in init kwargs in PreTrainedTokenizerBase (huggingface#31233)
    
    * Fix conflicting key in init kwargs in PreTrainedTokenizerBase
    
    * Update code to check for callable key in save_pretrained
    
    * Apply PR suggestions
    
    * Invoke CI
    
    * Updates based on PR suggestion
    OmarManzoor authored Aug 1, 2024
    b4727a1
  10. Offloaded KV Cache (huggingface#31325)

    * Initial implementation of OffloadedCache
    
    * enable usage via cache_implementation
    
    * Address feedback, add tests, remove legacy methods.
    
    * Remove flash-attn, discover synchronization bugs, fix bugs
    
    * Prevent usage in CPU only mode
    
    * Add a section about offloaded KV cache to the docs
    
    * Fix typos in docs
    
    * Clarifications and better explanation of streams
    n17s authored Aug 1, 2024
    ca59d6f
  11. e3d8285
  12. Fixed Hybrid Cache Shape Initialization. (huggingface#32163)

    * fixed hybrid cache init, added test
    
    * Fix Test Typo
    
    ---------
    
    Co-authored-by: Aaron Haag <[email protected]>
    OsamaS99 and Aaron Haag authored Aug 1, 2024
    51ab25e
  13. Yell at the user if zero-3 init wasn't performed, but expected to have been done (huggingface#32299)
    
    * Test this zach
    
    * Test for improper init w/o zero3
    
    * Move back
    
    * Apply suggestions from code review
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Get rid of stars in warning
    
    * Make private
    
    * Make clear
    
    ---------
    
    Co-authored-by: amyeroberts <[email protected]>
    muellerzr and amyeroberts authored Aug 1, 2024
    82efc53

Commits on Aug 2, 2024

  1. Update docs (huggingface#32368)

    nits
    zucchini-nlp authored Aug 2, 2024
    2af199c
  2. RoPE: Add numerical tests ✨ (huggingface#32380)

    tests! :D
    gante authored Aug 2, 2024
    083e13b
  3. c1aa0ed

Commits on Aug 3, 2024

  1. fix: (issue huggingface#32124) Exception raised when running `transformers/examples/flax/language-modeling/t5_tokenizer_model.py`. (huggingface#32157)
    
    fix: Exception raised when running `transformers/examples/flax/language-modeling/t5_tokenizer_model.py`.
    fshp971 authored Aug 3, 2024
    7c31d05
  2. MixtralFlashAttention2: put "plus 1" inside parentheses when calculating rotary_seq_len, allowing None position_ids input. (huggingface#31500)
    
    * Mixtral: remove unnecessary plus 1 when calculating rotary_seq_len, allowing position_ids=None (no auto position_ids generation could be unsafe)
    
    * fix typo [:-1] to [:, -1]
    
    * to meet formatting requirement
    
    * to meet formatting requirement
    
    * remove white space
    
    * MixtralFlashAttention2: put "+ 1" inside parentheses when calculating rotary_seq_len, allowing None position_ids input. Fix format/style issue.
    
    * propagate to starcoder2, phi3, mixtral and qwen2
    
    * update qwen2_moe
    xenshinu authored Aug 3, 2024
    621fb3c

Commits on Aug 5, 2024

  1. Bump keras from 2.8.0 to 2.13.1 in /examples/research_projects/decision_transformer (huggingface#32393)
    
    Bump keras in /examples/research_projects/decision_transformer
    
    Bumps [keras](https://github.com/keras-team/keras) from 2.8.0 to 2.13.1.
    - [Release notes](https://github.com/keras-team/keras/releases)
    - [Commits](keras-team/keras@v2.8.0...v2.13.1)
    
    ---
    updated-dependencies:
    - dependency-name: keras
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Aug 5, 2024
    847bb85
  2. fix: SeamlessM4TFeatureExtractor stride remainder (huggingface#32088)

    * fix: SeamlessM4TFeatureExtractor stride remainder
    
    * Added attention mask size test
    
    * Reran ruff for style correction
    TechInterMezzo authored Aug 5, 2024
    05ae3a3
  3. 3bb646a
  4. huggingface#32184 save total_vocab_size (huggingface#32240)

    * save total_vocab_size = vocab_size + user added tokens to speed up operation
    
    * updating length when added_tokens_decoder is set
    
    * add test len(tokenizer)
    itazap authored Aug 5, 2024
    3d7c2f9
  5. add values for neftune (huggingface#32399)

    I always forget what typical values are, and I have to look at the paper every time. This will be a helpful reminder.
    nbroad1881 authored Aug 5, 2024
    ea5da52
  6. f5f1e52
  7. Persist embedding type of BART and mBART models after resize (huggingface#32242)
    
    * fix: persist embedding type of MBartForConditionalGeneration after resize
    
    * fix: persist embedding type of BartForConditionalGeneration after resize
    AbdiHaryadi authored Aug 5, 2024
    baf7e5c
  8. fix: Updated test_embeded_special_tokens for luke and mluke models (huggingface#32413)
    
    Fixed tokenizer tests for luke, mluke models.
    Sai-Suraj-27 authored Aug 5, 2024
    458b0cd
  9. Respect the config's attn_implementation if set (huggingface#32383)

    * Respect the config's attn if set
    
    * Update test - can override in from_config
    
    * Fix
    amyeroberts authored Aug 5, 2024
    7e5d46d
  10. 13dc6b0
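
The `total_vocab_size` change listed above (huggingface#32240) describes storing `vocab_size` plus the user-added tokens so that `len(tokenizer)` is cheap, and refreshing that value when `added_tokens_decoder` is set. A minimal sketch of the idea, assuming a hypothetical `ToyTokenizer` class (the names `_update_total_vocab_size` and `add_tokens` here are illustrative and may not match the real implementation):

```python
class ToyTokenizer:
    """Hypothetical stand-in for a tokenizer; only the length caching is shown."""

    def __init__(self, vocab):
        self.vocab = dict(vocab)         # base vocabulary: token -> id
        self._added_tokens_decoder = {}  # user-added tokens: id -> token
        self._update_total_vocab_size()

    def _update_total_vocab_size(self):
        # Recompute once, whenever the set of added tokens changes.
        added = [t for t in self._added_tokens_decoder.values() if t not in self.vocab]
        self.total_vocab_size = len(self.vocab) + len(added)

    @property
    def added_tokens_decoder(self):
        return self._added_tokens_decoder

    @added_tokens_decoder.setter
    def added_tokens_decoder(self, value):
        # Setting the mapping refreshes the cached length.
        self._added_tokens_decoder = dict(value)
        self._update_total_vocab_size()

    def add_tokens(self, tokens):
        next_id = len(self)
        for token in tokens:
            known = token in self.vocab or token in self._added_tokens_decoder.values()
            if not known:
                self._added_tokens_decoder[next_id] = token
                next_id += 1
        self._update_total_vocab_size()

    def __len__(self):
        # Return the cached value instead of rebuilding the full vocabulary.
        return self.total_vocab_size


tok = ToyTokenizer({"hello": 0, "world": 1})
tok.add_tokens(["<new>", "hello"])  # "hello" already exists and is skipped
size_after_add = len(tok)

tok.added_tokens_decoder = {2: "<a>", 3: "<b>"}
size_after_set = len(tok)
```

The design choice is to pay the recomputation cost only when tokens are added or the decoder mapping is replaced, rather than on every `len()` call.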

Commits on Aug 6, 2024

  1. Cache: create docs (huggingface#32150)

    * draft
    
    * updates
    
    * works?
    
    * try adding python example in hidden section
    
    * another try
    
    * how do I render python
    
    * format as html code?
    
    * Update docs/source/en/kv_cache.md
    
    Co-authored-by: Joao Gante <[email protected]>
    
    * Update docs/source/en/kv_cache.md
    
    Co-authored-by: Joao Gante <[email protected]>
    
    * Update docs/source/en/kv_cache.md
    
    Co-authored-by: Joao Gante <[email protected]>
    
    * Update docs/source/en/kv_cache.md
    
    Co-authored-by: Joao Gante <[email protected]>
    
    * Update docs/source/en/kv_cache.md
    
    Co-authored-by: Joao Gante <[email protected]>
    
    * one more small update
    
    * should render hidden section now
    
    * add outputs
    
    * fix links
    
    * check links
    
    * update all links
    
    * update with offloaded cache
    
    * all cache is importable, so they appear in docs
    
    * fix copies
    
    * docstring...
    
    ---------
    
    Co-authored-by: Joao Gante <[email protected]>
    zucchini-nlp and gante authored Aug 6, 2024
    37c5ca5
  2. Llava: fix checkpoint_doc (huggingface#32458)

    fix: add new llava like model bug
    RUFFY-369 authored Aug 6, 2024
    0aa8328
  3. add the missing flash attention test marker (huggingface#32419)

    * add flash attention check
    
    * fix
    
    * fix
    
    * add the missing marker
    
    * bug fix
    
    * add one more
    
    * remove order
    
    * add one more
    faaany authored Aug 6, 2024
    e85d863
  4. Update kwargs validation for preprocess with decorator (huggingface#32024)
    
    * BLIP preprocess
    
    * BIT preprocess
    
    * BRIDGETOWER preprocess
    
    * CHAMELEON preprocess
    
    * CHINESE_CLIP preprocess
    
    * CONVNEXT preprocess
    
    * DEIT preprocess
    
    * DONUT preprocess
    
    * DPT preprocess
    
    * FLAVA preprocess
    
    * EFFICIENTNET preprocess
    
    * FUYU preprocess
    
    * GLPN preprocess
    
    * IMAGEGPT preprocess
    
    * INSTRUCTBLIPVIDEO preprocess
    
    * VIVIT preprocess
    
    * ZOEDEPTH preprocess
    
    * VITMATTE preprocess
    
    * VIT preprocess
    
    * VILT preprocess
    
    * VIDEOMAE preprocess
    
    * VIDEOLLAVA
    
    * TVP processing
    
    * TVP fixup
    
    * SWIN2SR preprocess
    
    * SIGLIP preprocess
    
    * SAM preprocess
    
    * RT-DETR preprocess
    
    * PVT preprocess
    
    * POOLFORMER preprocess
    
    * PERCEIVER preprocess
    
    * OWLVIT preprocess
    
    * OWLV2 preprocess
    
    * NOUGAT preprocess
    
    * MOBILEVIT preprocess
    
    * MOBILENETV2 preprocess
    
    * MOBILENETV1 preprocess
    
    * LEVIT preprocess
    
    * LAYOUTLMV2 preprocess
    
    * LAYOUTLMV3 preprocess
    
    * Add test
    
    * Update tests
    qubvel authored Aug 6, 2024
    fb66ef8
  5. 438d06c
  6. Dependencies: fix typo (huggingface#32389)

    deps_2
    gante authored Aug 6, 2024
    36fd35e
  7. Add Nemotron HF Support (huggingface#31699)

    * Add nemotron support
    
    * fix inference
    
    * add unit test
    
    * add layernorm1p as a class to avoid meta device mismatch
    
    * test fixed
    
    * Add copied_from statements
    
    * remove pretraining_tp args
    
    * remove nemotronlayernorm
    
    * force LN computation done in FP32
    
    * remove nemotrontokenizer and use llamatokenizer
    
    * license update
    
    * add option for kv_channels for minitron8b
    
    * remove assert
    
    * o_proj fixed
    
    * o_proj reshape
    
    * add gated_proj option
    
    * typo
    
    * remove todos
    
    * fix broken test after merging latest main
    
    * remove nezha/nat after merging main
    
    * change default config to 15b model
    
    * add nemo conversion script
    
    * rename conversion script
    
    * remove gate_proj option
    
    * pr comment resolved
    
    * fix unit test
    
    * rename kv_channels to head_dim
    
    * resolve PR issue
    
    * add nemotron md
    
    * fix broken tests
    
    * refactor rope for nemotron
    
    * test fix
    
    * remove linearscaling
    
    * whitespace and import
    
    * fix some copied-from
    
    * code style fix
    
    * reformatted
    
    * add position_embedding to nemotronattention
    
    * rope refactor to only use config, copied-from fix
    
    * format
    
    * Run make fix-copies
    
    * nemotron md with autodoc
    
    * doc fix
    
    * fix order
    
    * pass check_config_docstrings.py
    
    * fix config_attributes
    
    * remove all llama BC related code
    
    * Use PreTrainedTokenizerFast
    
    * ruff check examples
    
    * conversion script update
    
    * add nemotron to toctree
    suiyoubi authored Aug 6, 2024
    6a03942
  8. 3d8bd11
  9. Add codestral mamba2 (huggingface#32080)

    * add new model like
    
    * draft cuda forward - mismatched keys (sharding on conv1)
    
    * match keys successfully
    
    * fix split
    
    * get generation/forward running (wrong gens, norm?)
    
    * update
    
    * some refactoring
    
    * fixes
    
    * works up until copy to cache
    
    * fix
    
    * update
    
    * NON WORKING VERSION
    
    * version that works?
    
    * nit
    
    * fix config
    
    * fix conversion script
    
    * working cuda forward
    
    * nit
    
    * update
    
    * simplification
    
    * make mamba slow simple work
    
    * no einops
    
    * todo
    
    * fix style
    
    * no einops
    
    * update fix no einsum
    
    * nit
    
    * remove einops
    
    * bug: scan_output differs strongly
    
    * add rms norm option
    
    * fix fast + slow generation with and w/o cache ✔️
    
    * draft integration tests
    
    * remove a big chunk of the einsum
    
    * fix slow, fast generations, without any einsum
    
    * fix copies
    
    * fix structure
    
    * fix up modeling and tests
    
    * fix tests
    
    * clamping is indeed worse
    
    * recover mamba2 cache test
    
    * fix copies
    
    * no cache position (yet)
    
    * fix tf tests
    
    * fix matmul for generate
    
    * fixup
    
    * skip cache tests for now
    
    * [run-slow]mamba2
    
    * tune out hidden states for padding
    
    * test batched generation
    
    * propagate attention mask changes
    
    * fix past length
    
    * fix integration test
    
    * style
    
    * address comments
    
    * update readme
    
    * add mamba2 version check
    
    * fix tests
    
    * [run-slow]mamba2
    
    * skip edge tests
    
    * [run-slow]mamba2
    
    * last fixup
    
    * [run-slow]mamba2
    
    * update README
    
    ---------
    
    Co-authored-by: Arthur Zucker <[email protected]>
    molbap and ArthurZucker authored Aug 6, 2024
    80b90e7
  10. Migrate import checks not need accelerate, and be more clear on min versions (huggingface#32292)
    
    * Migrate import checks to secondary accelerate calls
    
    * better errs too
    
    * Revert, just keep the import checks + remove accelerate-specific things
    
    * Rm extra'
    
    * Empty commit for ci
    
    * Small nits
    
    * Final
    muellerzr authored Aug 6, 2024
    194cf1f
  11. 50c3ba8
  12. dev version 4.45.0

    ArthurZucker committed Aug 6, 2024
    26a9443
  13. is_torchdynamo_compiling -- cast a wide exception net (huggingface#32476)
    
    * cast a wide net
    
    * make fix-copies with a few manual changes
    
    * add copied from
    gante authored Aug 6, 2024
    4fdc702
  14. Revert "fixes to properly shard FSDP across cpu and meta for cpu_efficient_loading for prequantized 4bit (huggingface#32276)" (huggingface#32477)
    
    * Revert "fixes to properly shard FSDP across cpu and meta for cpu_efficient_loading for prequantized 4bit (huggingface#32276)"
    
    This reverts commit 62c60a3.
    
    We uncovered an issue with this change that caused our training runs to hang.
    
    * `is_torchdynamo_compiling` -- cast a wide exception net (huggingface#32476)
    
    * cast a wide net
    
    * make fix-copies with a few manual changes
    
    * add copied from
    
    ---------
    
    Co-authored-by: Joao Gante <[email protected]>
    matthewdouglas and gante authored Aug 6, 2024
    ac2707e
  15. 🌐 [i18n-KO] Translated mask_generation.md to Korean (huggingface#32257)
    
    * docs: ko: tasks/mask_generation.md
    
    * feat: nmt draft
    
    * fix : toc local
    
    * fix : manual edits
    
    * fix : ko-toctree
    
    * fix: resolve suggestions
    
    Co-authored-by: boyunJang <[email protected]>
    Co-authored-by: Chaewon Song <[email protected]>
    
    * fix: resolve suggestions
    
    Co-authored-by: boyunJang <[email protected]>
    Co-authored-by: Chaewon Song <[email protected]>
    
    * fix: resolve suggestions
    
    * fix: resolve suggestions
    
    * fix: resolve suggestions
    
    ---------
    
    Co-authored-by: boyunJang <[email protected]>
    Co-authored-by: Chaewon Song <[email protected]>
    3 people authored Aug 6, 2024
    5301b98
  16. 🌐 [i18n-KO] Translated idefics.md to Korean (huggingface#32258)

    * docs: ko: tasks/idefics.md
    
    * feat: nmt draft
    
    * fix: manual edits
    
    * fix: resolve suggestions
    
    Co-authored-by: Chaewon Song <[email protected]>
    Co-authored-by: Harheem Kim <[email protected]>
    Co-authored-by: timdalxx <[email protected]>
    
    ---------
    
    Co-authored-by: Chaewon Song <[email protected]>
    Co-authored-by: Harheem Kim <[email protected]>
    Co-authored-by: timdalxx <[email protected]>
    4 people authored Aug 6, 2024
    3b193c7
  17. 🌐 [i18n-KO] Translated image_to_image.md to Korean (huggingface#32327)

    * docs: ko: tasks/image_to_image.md
    
    * feat: nmt draft
    
    * fix: manual edits
    
    * fix: resolve suggestions
    
    Co-authored-by: Jihun Lim <[email protected]>
    Co-authored-by: Jiwook Han <[email protected]>
    
    * fix: handle remaining suggestions
    
    Co-authored-by: Jiwook Han <[email protected]>
    
    ---------
    
    Co-authored-by: Jihun Lim <[email protected]>
    Co-authored-by: Jiwook Han <[email protected]>
    3 people authored Aug 6, 2024
    6af0854

Commits on Aug 7, 2024

  1. Cache: new Cache format in decoder-only models (huggingface#31421)

    * draft bart with new cache
    
    * add cache for decoder-only models
    
    * revert utils
    
    * modify docstring
    
    * revert bart
    
    * minor fixes
    
    * fix copies (not related)
    
    * revert tests
    
    * remove enc-dec related code
    
    * remove bloom
    
    * remove opt (enc-dec)
    
    * update docstring
    
    * git, codegen, gpt_neo, gpt_neox, gpj
    
    * clean up
    
    * copied from statements
    
    * revert
    
    * tmp
    
    * update warning msg
    
    * forgot git
    
    * add more flags
    
    * run-slow git,codegen,gpt_neo,gpt_neox,gpj
    
    * add cache flag to VLMs
    
    * remove files
    
    * style
    
    * video LLMs also need a flag
    
    * style
    
    * llava will go in another PR
    
    * style
    
    * [run-slow] codegen, falcon, git, gpt_neo, gpt_neox, gptj, idefics
    
    * Update src/transformers/models/gpt_neo/modeling_gpt_neo.py
    
    Co-authored-by: Arthur <[email protected]>
    
    * copy from
    
    * deprecate until v4.45 and warn if not training
    
    * nit
    
    * fix test
    
    * test static cache
    
    * add more tests and fix models
    
    * fix copies
    
    * return sliding window mask
    
    * run slow tests & fix + codestyle
    
    * one more falcon fix for alibi
    
    ---------
    
    Co-authored-by: Arthur <[email protected]>
    zucchini-nlp and ArthurZucker authored Aug 7, 2024
    a30c865
  2. Gemma2: add cache warning (huggingface#32279)

    * gemma2 fallback to dynamic cache
    
    * Update src/transformers/models/gemma2/modeling_gemma2.py
    
    Co-authored-by: Joao Gante <[email protected]>
    
    * Update src/transformers/models/gemma2/modeling_gemma2.py
    
    Co-authored-by: Arthur <[email protected]>
    
    * raise error and don't fall back to dynamic cache
    
    * prev will break most forward calls/tests
    
    * Update src/transformers/models/gemma2/modeling_gemma2.py
    
    Co-authored-by: Arthur <[email protected]>
    
    * update
    
    * fix copies
    
    ---------
    
    Co-authored-by: Joao Gante <[email protected]>
    Co-authored-by: Arthur <[email protected]>
    3 people authored Aug 7, 2024
    7ad784a
  3. enable xla fsdp (huggingface#32048)

    * enable xla fsdp
    
    * add acceleration version check for xla fsdp
    hanwen-sun authored Aug 7, 2024
    46d09af
  4. c54a6f9
  5. Agents use grammar (huggingface#31735)

    * Allow optional use of grammars to constrain generation
    aymeric-roucher authored Aug 7, 2024
    e0d8253
  6. fix broken link in docs (huggingface#32491)

    `https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.TextGenerationPipeline.__call__`
    
    `generate_kwargs (dict, optional) — Additional keyword arguments to pass along to the generate method of the model (see the generate method corresponding to your framework here).`
    
    link in "here" doesn't work
    jorahn authored Aug 7, 2024
    b640103
  7. b7fb393
  8. 🌐 [i18n-KO] Translated gptq.md to Korean (huggingface#32293)

    * fix: manual edits
    
    * fix: manual edits2
    
    * fix: delete files
    
    * fix: resolve suggestions
    
    Co-authored-by: Sungmin Oh <[email protected]>
    Co-authored-by: SeungYoun Lee <[email protected]>
    Co-authored-by: 김준재 <[email protected]>
    
    * fix: resolve suggestions
    
    Co-authored-by: Steven Liu <[email protected]>
    
    ---------
    
    Co-authored-by: Sungmin Oh <[email protected]>
    Co-authored-by: SeungYoun Lee <[email protected]>
    Co-authored-by: 김준재 <[email protected]>
    Co-authored-by: Steven Liu <[email protected]>
    5 people authored Aug 7, 2024
    1124d95
  9. 🌐 [i18n-KO] Translated prompting.md to Korean (huggingface#32294)

    * docs: ko: tasks/prompting.md
    
    * feat: nmt-draft
    
    * fix: update translation in prompting.md
    
    * fix: update toctree.yml
    
    * fix: manual edits
    
    * fix: toctree edits
    
    * fix: resolve suggestions
    
    Co-authored-by: boyunJang <[email protected]>
    Co-authored-by: Harheem Kim <[email protected]>
    Co-authored-by: timdalxx <[email protected]>
    
    ---------
    
    Co-authored-by: boyunJang <[email protected]>
    Co-authored-by: Harheem Kim <[email protected]>
    Co-authored-by: timdalxx <[email protected]>
    4 people authored Aug 7, 2024
    fcc4f2a
  10. 🌐 [i18n-KO] Translated quantization/quanto.md to Korean (huggingface#32281)
    
    * docs: ko: quantization/quanto.md
    
    * feat: nmt draft
    
    * fix: resolve suggestions
    
    Co-authored-by: SeungYoun Lee <[email protected]>
    Co-authored-by: Minki Kim <[email protected]>
    Co-authored-by: 김준재 <[email protected]>
    
    * fix: resolve suggestions
    
    Co-authored-by: SeungYoun Lee <[email protected]>
    
    ---------
    
    Co-authored-by: SeungYoun Lee <[email protected]>
    Co-authored-by: Minki Kim <[email protected]>
    Co-authored-by: 김준재 <[email protected]>
    4 people authored Aug 7, 2024
    fa59fd8
  11. 🌐 [i18n-KO] Translated image_feature_extraction.md to Korean (huggingface#32239)
    
    * docs: ko: tasks/images_feature_extraction.md
    
    * feat: nmt draft
    
    * fix: manual edits
    
    * fix: manual edits
    
    * fix: manual edits
    
    * fix: manual edits
    
    * feat: manual edits
    
    * Update docs/source/ko/tasks/image_feature_extraction.md
    
    Co-authored-by: Jihun Lim <[email protected]>
    
    * Update docs/source/ko/tasks/image_feature_extraction.md
    
    Co-authored-by: Jihun Lim <[email protected]>
    
    * fix: manual edits
    
    ---------
    
    Co-authored-by: Jihun Lim <[email protected]>
    mreraser and heuristicwave authored Aug 7, 2024
    cba7bcf
  12. 73a59a2
  13. Docs: Fixed WhisperModel.forward’s docstring link (huggingface#32498)

    Fixed WhisperModel.forward’s docstring link.
    Sai-Suraj-27 authored Aug 7, 2024
    543df48
  14. 🌐 [i18n-KO] Translated chat_templating.md to Korean (huggingface#32362)
    
    * docs: ko: chat_templating.md
    
    * feat: nmt draft
    
    * fix: manual edits
    
    * Update docs/source/ko/chat_templating.md
    
    Co-authored-by: Sungmin Oh <[email protected]>
    
    * Update docs/source/ko/chat_templating.md
    
    Co-authored-by: Sungmin Oh <[email protected]>
    
    * fix: apply suggestions from code review - anchor
    
    Co-authored-by: Sungmin Oh <[email protected]>
    
    * fix: manual edits
    
    Co-authored-by: SeungYoun Lee <[email protected]>
    Co-authored-by: Minki Kim <[email protected]>
    
    * fix: manual edits
    
    * fix: delete 'default template' section
    
    ---------
    
    Co-authored-by: Sungmin Oh <[email protected]>
    Co-authored-by: SeungYoun Lee <[email protected]>
    Co-authored-by: Minki Kim <[email protected]>
    4 people authored Aug 7, 2024
    78566db
  15. f5cdbf6

Commits on Aug 8, 2024

  1. Fix typo: depracted -> deprecated (huggingface#32489)

    Hello!
    
    ## Pull Request overview
    * Fix typo
    
    ## Details
    This should speak for itself.
    
    cc @itazap @ArthurZucker 
    
    - Tom Aarsen
    tomaarsen authored Aug 8, 2024
    aefd3e2
  2. Fix issue huggingface#32518: Update llm_tutorial.md (huggingface#32523)

    Update llm_tutorial.md
    
    remove comma re: issue 32518
    
    huggingface#32518
    doomdagadiggiedahdah authored Aug 8, 2024
    1c944ac
  3. Change Phi3 _supports_sdpa to True (huggingface#32457)

    * Change `_supports_sdpa` to True
    
    * add phi3 to sdpa support list
    pocca2048 authored Aug 8, 2024
    e28784f
  4. Uniformize kwargs for processors - GroundingDINO (huggingface#31964)

    * fix typo
    
    * uniform kwargs
    
    * make style
    
    * add comments
    
    * remove return_tensors
    
    * remove common_kwargs from processor since it propagates
    
    * make style
    
    * return_token_type_ids to True
    
    * revert the default imagekwargs since it does not accept any value in the image processor
    
    * revert processing_utils.py
    
    * make style
    
    * add molbap's commit
    
    * fix typo
    
    * fix common processor
    
    * remain
    
    * Revert "add molbap's commit"
    
    This reverts commit a476c6e.
    
    * add unsync PR
    
    * revert
    
    * make CI happy
    
    * nit
    
    * import annotationformat
    SangbumChoi authored Aug 8, 2024
    d3b3551
  5. Fix add-new-model-like (huggingface#31773)

    * handle (processor_class, None) returned by ModelPatterns
    
    * handle (slow, fast) image processors in add model
    
    * handle old image processor case
    molbap authored Aug 8, 2024
    b51d414
  6. Add Qwen2-Audio (huggingface#32137)

    * add qwen2audio
    
    * Update check_repo.py
    
    * fix style
    
    * fix test
    
    * fix style
    
    * add model size
    
    * Qwen2AudioEncoderModel->Qwen2AudioEncoder; add copy info
    
    * Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
    
    Co-authored-by: Yoach Lacombe <[email protected]>
    
    * Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
    
    Co-authored-by: Yoach Lacombe <[email protected]>
    
    * Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
    
    Co-authored-by: Yoach Lacombe <[email protected]>
    
    * switch the attention_mask and the feature_attention_mask
    
    * add to PRIVATE_MODELS in check_repo.py; add to MODEL_NAMES_TO_IGNORE in check_table.py
    
    * fix initialization
    
    * update chat_template
    
    * fix consistency issue after copy
    
    * add docstrings to _merge_input_ids_with_audio_features
    
    * add copied from to prepare_inputs_for_generation
    
    * add more details to docs
    
    * rm comment
    
    * add init_std
    
    * Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
    
    Co-authored-by: Yoach Lacombe <[email protected]>
    
    * Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
    
    Co-authored-by: Yoach Lacombe <[email protected]>
    
    * Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
    
    Co-authored-by: Yoach Lacombe <[email protected]>
    
    * Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
    
    Co-authored-by: Yoach Lacombe <[email protected]>
    
    * update
    
    * Update docs/source/en/model_doc/qwen2_audio.md
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * update tests
    
    * rm ignore_index
    
    * update processor
    
    * rm ffmpeg_read
    
    * Update tests/models/qwen2_audio/test_modeling_qwen2_audio.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update docs/source/en/model_doc/qwen2_audio.md
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update docs/source/en/model_doc/qwen2_audio.md
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update docs/source/en/model_doc/qwen2_audio.md
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * update
    
    * typo
    
    * [run_slow] qwen2_audio
    
    * [run_slow] qwen2_audio
    
    * [run_slow] qwen2_audio
    
    * fix quality
    
    * [run_slow] qwen2_audio
    
    * [run_slow] qwen2_audio
    
    * [run_slow] qwen2_audio
    
    * add official model
    
    ---------
    
    Co-authored-by: Yoach Lacombe <[email protected]>
    Co-authored-by: amyeroberts <[email protected]>
    3 people authored Aug 8, 2024
    16ed064
  7. filter flash_attn optional imports loading remote code (huggingface#30954)
    
    * filter flash_attn optional imports loading remote code
    
    * improve pattern
    
    * fix code style
    
    * Update src/transformers/dynamic_module_utils.py
    
    Co-authored-by: Matt <[email protected]>
    
    ---------
    
    Co-authored-by: Matt <[email protected]>
    eaidova and Rocketknight1 authored Aug 8, 2024
    cc832cb
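
The idea named in the commit above (skipping `flash_attn` imports that sit behind an availability guard when scanning a module's source for required packages) can be sketched roughly as follows. This is a minimal illustration, not the code from `dynamic_module_utils.py`: the function name `find_required_imports` and the regular expressions are assumptions made for this example, and only top-level imports are considered.

```python
import re
from typing import List


def find_required_imports(source: str) -> List[str]:
    """Return top-level package names imported unconditionally in `source`.

    Hypothetical helper: imports nested under a guard such as
    `if is_flash_attn_2_available():` are treated as optional and skipped.
    """
    # Remove the guard line together with its indented body.
    guarded_block = re.compile(
        r"^if is_flash_attn\w*available\(\):\n(?:[ \t]+.*\n?)+",
        flags=re.MULTILINE,
    )
    source = guarded_block.sub("", source)

    # Collect the first module component of each unindented import statement.
    names = re.findall(
        r"^(?:from|import)\s+([A-Za-z_]\w*)", source, flags=re.MULTILINE
    )
    return sorted(set(names))


example_source = (
    "import torch\n"
    "if is_flash_attn_2_available():\n"
    "    from flash_attn import flash_attn_func\n"
    "from einops import rearrange\n"
)
required = find_required_imports(example_source)
```

With this input, `required` contains `einops` and `torch` but not `flash_attn`, so a missing optional package would not block loading in this simplified model.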
  8. 🌐 [i18n-KO] Translated ko-llm_tutorial_optimization.md to Korean (huggingface#32372)
    
    * docs: ko: llm_tutorial_optimization.md
    
    * feat: nmt draft
    
    * fix: manual edits
    
    * Update docs/source/ko/llm_tutorial_optimization.md
    
    Co-authored-by: Chaewon Song <[email protected]>
    
    * Update docs/source/ko/llm_tutorial_optimization.md
    
    Co-authored-by: Chaewon Song <[email protected]>
    
    * fix: resolve suggestions - 1
    
    Co-authored-by: Chaewon Song <[email protected]>
    Co-authored-by: timdalxx <[email protected]>
    Co-authored-by: boyunJang <[email protected]>
    
    * fix: resolve suggestions - 2
    
    Co-authored-by: boyunJang <[email protected]>
    Co-authored-by: Chaewon Song <[email protected]>
    Co-authored-by: timdalxx <[email protected]>
    
    ---------
    
    Co-authored-by: Chaewon Song <[email protected]>
    Co-authored-by: timdalxx <[email protected]>
    Co-authored-by: boyunJang <[email protected]>
    4 people authored Aug 8, 2024
    43f3fe8
  9. 🌐 [i18n-KO] Translated trainer.md to Korean (huggingface#32260)

    * docs: ko: ko-trainer
    
    * feat: nmt draft
    
    * fix: manual edits
    
    * fix: manual edits
    
    * fix: glossary
    
    * fix: glossary
    
    * Apply suggestions from code review
    
    Co-authored-by: Jinuk <[email protected]>
    Co-authored-by: SeongWooChoi <[email protected]>
    
    ---------
    
    Co-authored-by: Jinuk <[email protected]>
    Co-authored-by: SeongWooChoi <[email protected]>
    3 people authored Aug 8, 2024
    96ba7f0
  10. 🌐 [i18n-KO] Translated eetq.md to Korean (huggingface#32352)

    * docs: ko: quantization/eetq.md
    
    * feat: nmt draft
    
    * fix docs: ko: quantization/eetq.md
    
    * fix docs: ko: quantization/eetq.md
    
    * fix: resolve suggestions
    
    Co-authored-by: Jiwook Han <[email protected]>
    
    * fix: resolve suggestions
    
    * fix: resolve suggestions
    
    ---------
    
    Co-authored-by: Jiwook Han <[email protected]>
    jun048098 and mreraser authored Aug 8, 2024
    e0396bd
  11. 🌐 [i18n-KO] Translated fsdp.md to Korean (huggingface#32261)

    * docs: ko: fsdp.md
    
    * feat: nmt draft
    
    * fix: manual edits
    
    * Apply suggestions from code review
    
    Co-authored-by: 김준재 <[email protected]>
    Co-authored-by: Minki Kim <[email protected]>
    
    * fix: resolve suggestions
    
    * Update docs/source/ko/fsdp.md
    
    Co-authored-by: 김준재 <[email protected]>
    
    * Update docs/source/ko/fsdp.md
    
    Co-authored-by: Steven Liu <[email protected]>
    
    ---------
    
    Co-authored-by: 김준재 <[email protected]>
    Co-authored-by: Minki Kim <[email protected]>
    Co-authored-by: Steven Liu <[email protected]>
    4 people authored Aug 8, 2024
    496207a
  12. 🌐 [i18n-KO] Translated bitsandbytes.md to Korean (huggingface#32408)

    * docs: ko: quantization/bitsandbytes.md
    
    * feat: nmt draft
    
    * fix: minor typos
    
    * fix: manual edits
    
    * fix: manual edits
    
    * fix: resolve suggestions
    
    Co-authored-by: wony617 <[email protected]>
    Co-authored-by: YONGSANG <[email protected]>
    Co-authored-by: Woojun Jung <[email protected]>
    
    * fix: resolve suggestions
    
    Co-authored-by: Steven Liu <[email protected]>
    
    * Apply suggestions from code review
    
    Co-authored-by: Steven Liu <[email protected]>
    
    * Apply suggestions from code review
    
    Co-authored-by: Steven Liu <[email protected]>
    
    ---------
    
    Co-authored-by: wony617 <[email protected]>
    Co-authored-by: YONGSANG <[email protected]>
    Co-authored-by: Woojun Jung <[email protected]>
    Co-authored-by: Steven Liu <[email protected]>
    5 people authored Aug 8, 2024
    b01f9c4
  13. Fix generate with inputs_embeds as input (huggingface#32493)

    * I think inputs_embeds has ndim == 3
    
    * fix sequence length catch
    
    * add generate test
    
    * [run-slow]olmo, persimmon, gemma, gemma2, qwen2, llama
    
    * skip whisper
    
    * fix bart test
    
    * more fixes
    molbap authored Aug 8, 2024
    0442816
  14. Fixed test test_static_cache_exportability with torch 2.4.0 (huggingface#32516)
    
    Work around the export issue in torch 2.4
    
    Co-authored-by: Guang Yang <[email protected]>
    guangy10 and Guang Yang authored Aug 8, 2024
    0164560
  15. 54ac39c
  16. 85817d9

Commits on Aug 9, 2024

  1. 838d141
  2. Fix a bug in Qwen2Audio (huggingface#32552)

    fix _update_model_kwargs_for_generation
    faychu authored Aug 9, 2024
    7728b78
  3. fix slow integration gemma2 test (huggingface#32534)

    no empty revision
    ArthurZucker authored Aug 9, 2024
    e4522fe
  4. fix non contiguous tensor value error in save_pretrained (huggingface#32422)
    
    Signed-off-by: duzhanwei <[email protected]>
    Co-authored-by: duzhanwei <[email protected]>
    congcongke and duzhanwei authored Aug 9, 2024
    e7f4ace
  5. 🌐 [i18n-KO] Translated agent.md to Korean (huggingface#32351)

    * docs: ko: main_classes/agent
    
    * feat: chatgpt draft
    
    * fix: manual edits
    
    * fix: resolve suggestions
    
    Co-authored-by: Woojun Jung <[email protected]>
    Co-authored-by: thsamaji <[email protected]>
    Co-authored-by: SeungAhSon <[email protected]>
    
    * fix: resolve suggestions
    
    * fix: resolve code line number
    
    ---------
    
    Co-authored-by: Woojun Jung <[email protected]>
    Co-authored-by: thsamaji <[email protected]>
    Co-authored-by: SeungAhSon <[email protected]>
    4 people authored Aug 9, 2024
    48101cf

Commits on Aug 12, 2024

  1. Add new model (huggingface#32615)

    * v1 - working version
    
    * fix
    
    * fix
    
    * fix
    
    * fix
    
    * rename to correct name
    
    * fix title
    
    * fixup
    
    * rename files
    
    * fix
    
    * add copied from on tests
    
    * rename to `FalconMamba` everywhere and fix bugs
    
    * fix quantization + accelerate
    
    * fix copies
    
    * add `torch.compile` support
    
    * fix tests
    
    * fix tests and add slow tests
    
    * copies on config
    
    * merge the latest changes
    
    * fix tests
    
    * add few lines about instruct
    
    * Apply suggestions from code review
    
    Co-authored-by: Arthur <[email protected]>
    
    * fix
    
    * fix tests
    
    ---------
    
    Co-authored-by: Arthur <[email protected]>
    younesbelkada and ArthurZucker authored Aug 12, 2024
    7c11491
  2. Fix: FA2 with packed training (huggingface#32487)

    * fix check
    
    * add tests
    
    * [run-slow] llama, gemma2
    
    * oops, whisper actually runs but needed some special treatment
    zucchini-nlp authored Aug 12, 2024
    8f2b6d5
  3. Fix sliding window attention used in Gemma2FlashAttention2 (huggingface#32522)
    
    * fix sliding window attention (flash2) in gemma2 model
    
    * [run-slow] gemma
    
    * fix slicing attention_mask for flash_attn2
    
    * fix slicing attention_mask when flash_attn is used
    
    * add missing comment
    
    * slice the last seq_len tokens in the key, value states
    
    * revert code of slicing key, value states
    brcps12 authored Aug 12, 2024
    342e3f9
  4. fix: Fixed conditional check for encodec model names (huggingface#32581)
    
    * Fixed conditional check for encodec model names.
    
    * Reformatted conditional check.
    Sai-Suraj-27 authored Aug 12, 2024
    bd251e4
  5. Fix .push_to_hub(..., create_pr=True, revision="my-branch") when creating PR on not-owned repo (huggingface#32094)
    
    Fix create_pr against existing revision
    Wauplin authored Aug 12, 2024
    e31a7a2
  6. Bump aiohttp from 3.9.4 to 3.10.2 in /examples/research_projects/decision_transformer (huggingface#32569)
    
    Bump aiohttp in /examples/research_projects/decision_transformer
    
    Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.9.4 to 3.10.2.
    - [Release notes](https://github.com/aio-libs/aiohttp/releases)
    - [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst)
    - [Commits](aio-libs/aiohttp@v3.9.4...v3.10.2)
    
    ---
    updated-dependencies:
    - dependency-name: aiohttp
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Aug 12, 2024
    50837f2
  7. Bump torch from 1.13.1 to 2.2.0 in /examples/research_projects/visual_bert (huggingface#32220)
    
    Bump torch in /examples/research_projects/visual_bert
    
    Bumps [torch](https://github.com/pytorch/pytorch) from 1.13.1 to 2.2.0.
    - [Release notes](https://github.com/pytorch/pytorch/releases)
    - [Changelog](https://github.com/pytorch/pytorch/blob/main/RELEASE.md)
    - [Commits](pytorch/pytorch@v1.13.1...v2.2.0)
    
    ---
    updated-dependencies:
    - dependency-name: torch
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Aug 12, 2024
    8a3c55e
  8. Cleanup tool calling documentation and rename doc (huggingface#32337)

    * Rename "Templates for Chat Models" doc to "Chat Templates"
    
    * Small formatting fix
    
    * Small formatting fix
    
    * Small formatting fix
    
    * Cleanup tool calling docs as well
    
    * Remove unneeded 'revision'
    
    * Move tip to below main code example
    
    * Little bonus section on template editing
    Rocketknight1 authored Aug 12, 2024
    b7ea171
  9. 🌐 [i18n-KO] Translated deepspeed.md to Korean (huggingface#32431)

    * Update _toctree.yml
    
    * docs: ko: deepspeed.md
    
    * Apply suggestions from code review
    
    Co-authored-by: wony617 <[email protected]>
    
    * Apply suggestions from code review
    
    Co-authored-by: wony617 <[email protected]>
    
    * Update docs/source/ko/_toctree.yml
    
    Co-authored-by: Steven Liu <[email protected]>
    
    * Update docs/source/ko/deepspeed.md
    
    * Update docs/source/ko/deepspeed.md
    
    Co-authored-by: SeungAhSon <[email protected]>
    
    * Apply suggestions from code review
    
    Co-authored-by: wony617 <[email protected]>
    
    * Update docs/source/ko/_toctree.yml
    
    ---------
    
    Co-authored-by: wony617 <[email protected]>
    Co-authored-by: Steven Liu <[email protected]>
    Co-authored-by: SeungAhSon <[email protected]>
    4 people authored Aug 12, 2024
    4996990
  10. 🌐 [i18n-KO] Translated awq.md to Korean (huggingface#32324)

    * fix: manual edits
    
    * Apply suggestions from code review
    
    Co-authored-by: SeongWooChoi <[email protected]>
    Co-authored-by: Chulhwa (Evan) Han <[email protected]>
    
    * fix:manual edits
    
    - Moved the translation file because it had been created in the wrong path
    
    * Delete docs/source/ko/tasks/awq.md
    
    * Update docs/source/ko/_toctree.yml
    
    Co-authored-by: Steven Liu <[email protected]>
    
    ---------
    
    Co-authored-by: SeongWooChoi <[email protected]>
    Co-authored-by: Chulhwa (Evan) Han <[email protected]>
    Co-authored-by: Steven Liu <[email protected]>
    4 people authored Aug 12, 2024
    7f777ab
  11. fix: Fixed failing test_find_base_model_checkpoint (huggingface#32638)

    Fixed failing test_find_base_model_checkpoint.
    Sai-Suraj-27 authored Aug 12, 2024
    ce4b288
  12. Bump tensorflow from 2.11.1 to 2.12.1 in /examples/research_projects/decision_transformer (huggingface#32341)
    
    Bump tensorflow in /examples/research_projects/decision_transformer
    
    Bumps [tensorflow](https://github.com/tensorflow/tensorflow) from 2.11.1 to 2.12.1.
    - [Release notes](https://github.com/tensorflow/tensorflow/releases)
    - [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
    - [Commits](tensorflow/tensorflow@v2.11.1...v2.12.1)
    
    ---
    updated-dependencies:
    - dependency-name: tensorflow
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Aug 12, 2024
    126cbdb
  13. "to be not" -> "not to be" (huggingface#32636)

    * "to be not" -> "not to be"
    
    * Update sam.md
    
    * Update trainer.py
    
    * Update modeling_utils.py
    
    * Update test_modeling_utils.py
    
    * Update test_modeling_utils.py
    qgallouedec authored Aug 12, 2024
    f1c8542
  14. fix: Updated the is_torch_mps_available() function to include `min_version` argument (huggingface#32545)
    
    * Fixed wrong argument in is_torch_mps_available() function call.
    
    * Fixed wrong argument in is_torch_mps_available() function call.
    
    * sorted the import.
    
    * Fixed wrong argument in is_torch_mps_available() function call.
    
    * Fixed wrong argument in is_torch_mps_available() function call.
    
    * Update src/transformers/utils/import_utils.py
    
    Co-authored-by: Arthur <[email protected]>
    
    * removed extra space.
    
    * Added type hint for the min_version parameter.
    
    * Added missing import.
    
    ---------
    
    Co-authored-by: Arthur <[email protected]>
    Sai-Suraj-27 and ArthurZucker authored Aug 12, 2024
    2a5a6ad

Commits on Aug 13, 2024

  1. Expand inputs in processors for VLMs (huggingface#30962)

    * let it be
    
    * draft
    
    * should not have changed
    
    * add warnings
    
    * fix & add tests
    
    * fix tests
    
    * inputs embeds cannot be passed with pixels
    
    * more updates
    
    * paligemma ready!
    
    * minor typos
    
    * update blip-2
    
    * fix tests & raise error
    
    * docstring
    
    * add blip2 test
    
    * tmp
    
    * add image seq length to config
    
    * update docstring
    
    * delete
    
    * fix tests
    
    * fix blip
    
    * fix paligemma
    
    * out-of-place scatter
    
    * add llava-next-video
    
    * Update src/transformers/models/blip_2/modeling_blip_2.py
    
    Co-authored-by: Pablo Montalvo <[email protected]>
    
    * remove tmp
    
    * codestyle
    
    * nits
    
    * more nits
    
    * remove overriding in tests
    
    * comprehension when merging video
    
    * fix-copies
    
    * revert changes for embeds test
    
    * fix tests after making comprehension
    
    * Update src/transformers/models/blip_2/processing_blip_2.py
    
    Co-authored-by: Pablo Montalvo <[email protected]>
    
    * Update src/transformers/models/blip_2/processing_blip_2.py
    
    Co-authored-by: Pablo Montalvo <[email protected]>
    
    * more updates
    
    * fix tests
    
    ---------
    
    Co-authored-by: Pablo Montalvo <[email protected]>
    zucchini-nlp and molbap authored Aug 13, 2024
    a29eabd
  2. Automatically add transformers tag to the modelcard (huggingface#32623)
    
    * Automatically add `transformers` tag to the modelcard
    
    * Specify library_name and test
    LysandreJik authored Aug 13, 2024
    29c3a0f
  3. Fix tests (huggingface#32649)

    * skip failing tests
    
    * [no-filter]
    
    * [no-filter]
    
    * fix wording catch in FA2 test
    
    * [no-filter]
    
    * trigger normal CI without filtering
    molbap authored Aug 13, 2024
    a5a8291
  4. fix tensors on different devices in WhisperGenerationMixin (huggingface#32316)
    
    * fix
    
    * enable on xpu
    
    * no manual remove
    
    * move to device
    
    * remove to
    
    * add move to
    faaany authored Aug 13, 2024
    b5016d5
  5. Add support for GrokAdamW optimizer (huggingface#32521)

    * add grokadamw
    
    * reformat
    
    * code review feedback, unit test
    
    * reformat
    
    * reformat
    ehartford authored Aug 13, 2024
    481e156
  6. Add Depth Anything V2 Metric models (huggingface#32126)

    * add checkpoint and repo names
    
    * adapt head to support metric depth estimation
    
    * add max_depth output scaling
    
    * add expected logits
    
    * improve docs
    
    * fix docstring
    
    * add checkpoint and repo names
    
    * adapt head to support metric depth estimation
    
    * add max_depth output scaling
    
    * add expected logits
    
    * improve docs
    
    * fix docstring
    
    * rename depth_estimation to depth_estimation_type
    
    * add integration test
    
    * Refactored tests to include metric depth model inference test
    * Integration test pass when the timm backbone lines are commented (L220-L227)
    
    * address feedback
    
    * replace model path to use organization path
    
    * formatting
    
    * delete deprecated TODO
    
    * address feedback
    
    * [run_slow] depth_anything
    bt2513 authored Aug 13, 2024
    cc25757
  7. Fix: Fixed directory path for utils folder in `test_tokenization_utils.py` (huggingface#32601)
    
    * Removed un-necessary expressions.
    
    * Fixed directory path for utils folder in test_tokenization_utils.py
    Sai-Suraj-27 authored Aug 13, 2024
    c3cd9d8
  8. Modify ProcessorTesterMixin for better generalization (huggingface#32637)
    
    * Add padding="max_length" to tokenizer kwargs and change crop_size to size for image_processor kwargs
    
    * remove crop_size argument in align processor tests to be coherent with base tests
    
    * Add pad_token when loading tokenizer if needed, change test override tokenizer kwargs, remove unnecessary test overwrites in grounding dino
    yonigozlan authored Aug 13, 2024
    5bcbdff
  9. TF_Deberta supporting mixed precision (huggingface#32618)

    * Update modeling_tf_deberta.py
    
    Corrected some code that did not support mixed precision
    
    * Update modeling_tf_deberta_v2.py
    
    Corrected some code that did not support mixed precision
    
    * Update modeling_tf_deberta_v2.py
    
    * Update modeling_tf_deberta.py
    
    * Add files via upload
    
    * Add files via upload
    pinesnow72 authored Aug 13, 2024
    9d2ab88
  10. Fix tests recurrent (huggingface#32651)

    * add fix for recurrentgemma
    
    * [no-filter]
    
    * trigger-ci
    
    * [no-filter]
    
    * [no-filter]
    
    * attempt to fix mysterious zip error
    
    * [no-filter]
    
    * fix lookup error
    
    * [no-filter]
    
    * remove summarization hack
    
    * [no-filter]
    molbap authored Aug 13, 2024
    c135783

Commits on Aug 14, 2024

  1. Support MUSA (Moore Threads GPU) backend in transformers (huggingface#31913)
    
    Add accelerate version check, needs accelerate>=0.33.0
    fmo-mt authored Aug 14, 2024
    a22ff36
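
The commit above mentions gating a backend on a minimum dependency version (`accelerate>=0.33.0`). A minimal sketch of such a gate is shown below, assuming plain dotted numeric versions. The helper names `parse_version` and `meets_minimum` are hypothetical and are not the functions used in transformers; pre-release ordering is deliberately ignored here, and production code would more likely rely on the `packaging` library.

```python
import re
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '0.33.0' into (0, 33, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # e.g. the 'dev0' suffix in '0.34.0.dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True when `installed` is at least `minimum` (numeric parts only)."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '0.33' and '0.33.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Illustrative check against the minimum quoted in the commit message.
backend_allowed = meets_minimum("0.33.0", "0.33.0")
```

Comparing integer tuples rather than raw strings avoids the lexicographic trap where `"0.4.0"` would sort above `"0.33.0"`.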
  2. fix: Fixed failing tests in tests/utils/test_add_new_model_like.py (huggingface#32678)
    
    * Fixed failing tests in tests/utils/test_add_new_model_like.py
    
    * Fixed formatting using ruff.
    
    * Small nit.
    Sai-Suraj-27 authored Aug 14, 2024
    df32347
  3. Update translation docs review (huggingface#32662)

    update list of people to tag
    stevhliu authored Aug 14, 2024
    9485289
  4. Add TorchAOHfQuantizer (huggingface#32306)

    * Add TorchAOHfQuantizer
    
    Summary:
    Enable loading torchao quantized model in huggingface.
    
    Test Plan:
    local test
    
    Reviewers:
    
    Subscribers:
    
    Tasks:
    
    Tags:
    
    * Fix a few issues
    
    * style
    
    * Added tests and addressed some comments about dtype conversion
    
    * fix torch_dtype warning message
    
    * fix tests
    
    * style
    
    * TorchAOConfig -> TorchAoConfig
    
    * enable offload + fix memory with multi-gpu
    
    * update torchao version requirement to 0.4.0
    
    * better comments
    
    * add torch.compile to torchao README, add perf number link
    
    ---------
    
    Co-authored-by: Marc Sun <[email protected]>
    jerryzh168 and SunMarc authored Aug 14, 2024
    78d78cd
  5. Fix JetMoeIntegrationTest (huggingface#32332)

    JetMoeIntegrationTest
    
    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Aug 14, 2024
    20a0449
  6. Update the distributed CPU training on Kubernetes documentation (huggingface#32669)
    
    * Update the Kubernetes CPU training example
    
    * Add namespace arg
    
    Signed-off-by: Dina Suehiro Jones <[email protected]>
    
    ---------
    
    Signed-off-by: Dina Suehiro Jones <[email protected]>
    dmsuehir authored Aug 14, 2024
    6577c77
  7. fix: Fixed unknown pytest config option doctest_glob (huggingface#32475)
    
    Fixed unknown config option doctest_glob.
    Sai-Suraj-27 authored Aug 14, 2024
    95a7781
  8. 0cea208
  9. Updated workflows to the latest versions (huggingface#32405)

    Updated few workflows to the latest versions.
    Sai-Suraj-27 authored Aug 14, 2024
    8820fe8

Commits on Aug 15, 2024

  1. e840127
  2. fix: Corrected falcon-mamba-7b model checkpoint name (huggingface#32837)
    
    Corrected the model checkpoint.
    Sai-Suraj-27 authored Aug 15, 2024
    ab7e893
  3. d6751d9

Commits on Aug 16, 2024

  1. VLMs: small clean-up for cache class (huggingface#32417)

    * fix beam search in video llava
    
    * [run-slow] video_llava
    zucchini-nlp authored Aug 16, 2024
    f3c8b18
  2. add back the position ids (huggingface#32554)

    * add back the position ids
    
    * fix failing test
    ArthurZucker authored Aug 16, 2024
    c215523
  3. Use head_dim if in config for RoPE (huggingface#32495)

    * use head_dim if in config for RoPE
    
    * typo
    
    * simplify with getattr
    suiyoubi authored Aug 16, 2024
    5fd7ca7
  4. 70d5df6
  5. [tests] make test_sdpa_equivalence device-agnostic (huggingface#32520)

    * fix on xpu
    
    * [run_all]
    faaany authored Aug 16, 2024
    8f9fa3b
  6. Cache: use batch_size instead of max_batch_size (huggingface#32657)

    * more precise name
    
    * better docstrings
    
    * Update src/transformers/cache_utils.py
    
    Co-authored-by: Arthur <[email protected]>
    
    ---------
    
    Co-authored-by: Arthur <[email protected]>
    gante and ArthurZucker authored Aug 16, 2024
    cf32ee1
  7. Fix AutoConfig and AutoModel support for Llava-Next-Video (huggingface#32844)
    
    * Fix: fix all model_type of Llava-Next-Video to llava_next_video
    
    * Fix doc for llava_next_video
    
    * * Fix formatting issues
    * Change llava-next-video.md file name into llava_next_video.md to make it compatible with implementation
    
    * Fix docs TOC for llava-next-video
    TKONIY authored Aug 16, 2024
    a27182b
  8. improve _get_is_as_tensor_fns (huggingface#32596)

    * improve _get_is_as_tensor_fns
    
    * format
    zrr1999 authored Aug 16, 2024
    f20d0e8
  9. 0b066be
  10. 1c36db6
  11. Reduce the error log when using core models that need their weights renamed, and provide a step forward (huggingface#32656)
    
    * Fin
    
    * Modify msg
    
    * Finish up nits
    muellerzr authored Aug 16, 2024
    8ec028a
  12. Make beam_constraints.Constraint.advance() docstring more accurate (huggingface#32674)
    
    * Fix beam_constraints.Constraint.advance() docstring
    
    * Update src/transformers/generation/beam_constraints.py
    
    Co-authored-by: Steven Liu <[email protected]>
    
    ---------
    
    Co-authored-by: Joao Gante <[email protected]>
    Co-authored-by: Steven Liu <[email protected]>
    3 people authored Aug 16, 2024
    6806d33

Commits on Aug 17, 2024

  1. 52cb403

Commits on Aug 19, 2024

  1. Add Flax Dinov2 (huggingface#31960)

    * tfmsenv restored in main
    
    * installed flax
    
    * forward pass done and all tests passed
    
    * make fix-copies and cleaning the scripts
    
    * fixup attempt 1
    
    * fixup attempt 2
    
    * fixup third attempt
    
    * fixup attempt 4
    
    * fixup attempt 5
    
    * dinov2 doc fixed
    
    * FlaxDinov2Model + ForImageClassification added to OBJECTS_TO_IGNORE
    
    * external pos_encoding layer removed
    
    * fixup attempt 6
    
    * fixed integration test values
    
    * fixup attempt 7
    
    * Update src/transformers/models/dinov2/modeling_flax_dinov2.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/dinov2/modeling_flax_dinov2.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/dinov2/modeling_flax_dinov2.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/dinov2/modeling_flax_dinov2.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/dinov2/modeling_flax_dinov2.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/dinov2/modeling_flax_dinov2.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/dinov2/modeling_flax_dinov2.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/dinov2/modeling_flax_dinov2.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/dinov2/modeling_flax_dinov2.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/dinov2/modeling_flax_dinov2.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/dinov2/modeling_flax_dinov2.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/dinov2/modeling_flax_dinov2.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/dinov2/modeling_flax_dinov2.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/dinov2/modeling_flax_dinov2.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/dinov2/modeling_flax_dinov2.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/dinov2/modeling_flax_dinov2.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * comments removed
    
    * comment removed from the test
    
    * fixup
    
    * Update src/transformers/models/dinov2/modeling_flax_dinov2.py
    
    Co-authored-by: Sanchit Gandhi <[email protected]>
    
    * new fixes 1
    
    * interpolate_pos_encoding function removed
    
    * droppath rng fixed, pretrained beit copied-from still not working
    
    * modeling_flax_dinov2.py reformatted
    
    * Update tests/models/dinov2/test_modeling_flax_dinov2.py
    
    Co-authored-by: Sanchit Gandhi <[email protected]>
    
    * added Copied from, to the tests
    
    * copied from statements removed from tests
    
    * fixed copied from statements in the tests
    
    * [run_slow] dinov2
    
    ---------
    
    Co-authored-by: amyeroberts <[email protected]>
    Co-authored-by: Sanchit Gandhi <[email protected]>
    3 people authored Aug 19, 2024
    843e5e2
  2. Add Descript-Audio-Codec model (huggingface#31494)

    * dac model
    
    * original dac works
    
    * add dac model
    
    * dac can be instatiated
    
    * add forward pass
    
    * load weights
    
    * all weights are used
    
    * convert checkpoint script ready
    
    * test
    
    * add feature extractor
    
    * up
    
    * make style
    
    * apply cookicutter
    
    * fix tests
    
    * iterate on FeatureExtractor
    
    * nit
    
    * update dac doc
    
    * replace nn.Sequential with nn.ModuleList
    
    * nit
    
    * apply review suggestions 1/2
    
    * Update src/transformers/models/dac/modeling_dac.py
    
    Co-authored-by: Sanchit Gandhi <[email protected]>
    
    * up
    
    * apply review suggestions 2/2
    
    * update padding in FeatureExtractor
    
    * apply review suggestions
    
    * iterate on design and tests
    
    * add integration tests
    
    * feature extractor tests
    
    * make style
    
    * all tests pass
    
    * make style
    
    * fixup
    
    * apply review suggestions
    
    * fix-copies
    
    * apply review suggestions
    
    * apply review suggestions
    
    * Update docs/source/en/model_doc/dac.md
    
    Co-authored-by: Yoach Lacombe <[email protected]>
    
    * Update docs/source/en/model_doc/dac.md
    
    Co-authored-by: Yoach Lacombe <[email protected]>
    
    * anticipate transfer weights to descript
    
    * up
    
    * make style
    
    * apply review suggestions
    
    * update slow test values
    
    * update slow tests
    
    * update test values
    
    * update with CI values
    
    * update with vorace values
    
    * update test with slice
    
    * make style
    
    ---------
    
    Co-authored-by: Sanchit Gandhi <[email protected]>
    Co-authored-by: Yoach Lacombe <[email protected]>
    3 people authored Aug 19, 2024
    8260cb3
  3. 54b7703
  4. e55b33c
  5. Add __repr__ for Conv1D (huggingface#32425)

    * Add representation for Conv1D, for better output info.
    
    * code format for Conv1D
    
    * We add a __repr__ func for Conv1D, this allows the print (or output) of the model's info has a better description for Conv1D.
    AaronZLT authored Aug 19, 2024
    f1b720e
  6. Support save/load ckpt for XLA FSDP (huggingface#32311)

    * Support save/load ckpt for XLA FSDP
    
    * Fix bug for save
    
    * Fix style
    
    * reserve sharded ckpt and better file naming
    
    * minor fix
    
    Co-authored-by: Zach Mueller <[email protected]>
    
    * add is_fsdp_xla_v1_enabled
    
    ---------
    
    Co-authored-by: Zach Mueller <[email protected]>
    yitongh and muellerzr authored Aug 19, 2024
    8a4857c
  7. RT-DETR parameterized batchnorm freezing (huggingface#32631)

    * fix: Parameterized norm freezing
    
    For the R18 model, the authors don't freeze norms in the backbone.
    
    * Update src/transformers/models/rt_detr/configuration_rt_detr.py
    
    Co-authored-by: Pavel Iakubovskii <[email protected]>
    
    ---------
    
    Co-authored-by: Pavel Iakubovskii <[email protected]>
    AlanBlanchet and qubvel authored Aug 19, 2024
    5f6c080
  8. Fix incorrect vocab size retrieval in GGUF config (huggingface#32551)

    * fix gguf config vocab size
    
    * minor fix
    
    * link issue
    Isotr0py authored Aug 19, 2024
    59e8f19
  9. Mamba / FalconMamba: Fix mamba left padding (huggingface#32677)

    * fix mamba left padding
    
    * Apply suggestions from code review
    
    Co-authored-by: Pablo Montalvo <[email protected]>
    
    * fix copies
    
    * test with `inputs_embeds`
    
    * Update src/transformers/models/falcon_mamba/modeling_falcon_mamba.py
    
    Co-authored-by: Arthur <[email protected]>
    
    * copies
    
    * clairfy
    
    * fix last comments
    
    * remove
    
    ---------
    
    Co-authored-by: Pablo Montalvo <[email protected]>
    Co-authored-by: Arthur <[email protected]>
    3 people authored Aug 19, 2024
    93e538a
  10. Fix: Mamba2 generation mismatch between input_ids and inputs_embeds (huggingface#32694)
    
    * fix cache when using input embeddings
    
    * simplify check, we can always add input ids seq len since its 0 in first pass
    vasqu authored Aug 19, 2024
    61d89c1
  11. Docs: Fixed whisper-large-v2 model link in docs (huggingface#32871)

    Fixed whisper-large-v2 model link in docs.
    Sai-Suraj-27 authored Aug 19, 2024
    3720484
  12. 85345bb

Commits on Aug 20, 2024

  1. Allow-head-dim (huggingface#32857)

    * support head dim
    
    * fix the doc
    
    * fixup
    
    * add oproj
    
    Co-authored-by: Suhara <[email protected]>
    
    * update
    
    Co-authored-by: bzantium <[email protected]>
    
    * Co-authored-by: suhara <[email protected]>
    
    * Update
    
    Co-authored-by: Yoshi Suhara <[email protected]>
    
    ---------
    
    Co-authored-by: bzantium <[email protected]>
    Co-authored-by: Yoshi Suhara <[email protected]>
    3 people authored Aug 20, 2024
    13e645b
  2. 🚨🚨🚨 Update min version of accelerate to 0.26.0 (huggingface#32627)

    * Update min version of accelerate to 0.26.0
    
    * dev-ci
    
    * update min version in import
    
    * remove useless check
    
    * dev-ci
    
    * style
    
    * dev-ci
    
    * dev-ci
    SunMarc authored Aug 20, 2024
    fd06ad5
  3. Fix repr for conv (huggingface#32897)

    add nx
    ArthurZucker authored Aug 20, 2024
    65f4bc9
  4. fix: jamba cache fails to use torch.nn.module (huggingface#32894)

    Co-authored-by: Gal Cohen <[email protected]>
    xgal and Gal Cohen authored Aug 20, 2024
    01c4fc4
  5. Fix: Mamba2 norm_before_gate usage (huggingface#32686)

    * mamba2 uses norm_before_gate=False
    
    * small nit
    
    * remove norm_before_gate flag and follow False path only
    vasqu authored Aug 20, 2024
    c63a3d0
  6. Bump nltk from 3.7 to 3.9 in /examples/research_projects/decision_transformer (huggingface#32903)
    
    Bump nltk in /examples/research_projects/decision_transformer
    
    Bumps [nltk](https://github.com/nltk/nltk) from 3.7 to 3.9.
    - [Changelog](https://github.com/nltk/nltk/blob/develop/ChangeLog)
    - [Commits](nltk/nltk@3.7...3.9)
    
    ---
    updated-dependencies:
    - dependency-name: nltk
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Aug 20, 2024
    9800e6d
  7. Replace tensor.norm() with decomposed version for CLIP executorch export (huggingface#32887)
    
    * Replace .norm() with decomposed version for executorch export
    
    * [run_slow] clip
    qubvel authored Aug 20, 2024
    078d5a8
  8. link for optimizer names (huggingface#32400)

    * link for optimizer names
    
    Add a note and link to where the user can find more optimizer names easily because there are many more optimizers than are mentioned in the docstring.
    
    * make fixup
    nbroad1881 authored Aug 20, 2024
    1dde50c
  9. [i18n-ar] add README_ar.md to README.md (huggingface#32583)

    * Update README.md
    
    * Update README.md
    
    * Add README_ar.md to i18n/README_de.md
    
    * Add README_ar.md to i18n/README_es.md
    
    * Add README_ar.md to i18n/README_fr.md
    
    * Add README_ar.md to i18n/README_hd.md
    
    * Add README_ar.md to i18n/README_ja.md
    
    * Add README_ar.md to i18n/README_ko.md
    
    * Add README_ar.md to i18n/README_pt-br.md
    
    * Add README_ar.md to i18n/README_ru.md
    
    * Add README_ar.md to i18n/README_te.md
    
    * Add README_ar.md to i18n/README_vi.md
    
    * Add README_ar.md to i18n/README_vi.md
    
    * Add README_ar.md to i18n/README_zh-hans.md
    
    * Add README_ar.md to i18n/README_zh-hant.md
    
    * Create README_ar.md
    AhmedAlmaghz authored Aug 20, 2024
    8713466

Commits on Aug 21, 2024

  1. fix: [whisper] don't overwrite GenerationConfig's return_timestamps when `return_timestamps` is not passed to `generate` function (huggingface#31296)
    
    [whisper] don't overwrite return_timestamps when not passed to generate
    hrl authored Aug 21, 2024
    c6d484e
  2. 3bb7b05

Commits on Aug 22, 2024

  1. Jamba: update integration tests (huggingface#32250)

    * try test updates
    
    * a few more changes
    
    * a few more changes
    
    * a few more changes
    
    * [run slow] jamba
    
    * skip logits checks on older gpus
    
    * [run slow] jamba
    
    * oops
    
    * [run slow] jamba
    
    * Update tests/models/jamba/test_modeling_jamba.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update tests/models/jamba/test_modeling_jamba.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    ---------
    
    Co-authored-by: amyeroberts <[email protected]>
    gante and amyeroberts authored Aug 22, 2024
    f6e2586
  2. fix: Added missing huggingface_hub installation to workflows (huggingface#32891)
    
    Added missing huggingface_hub installation to workflows.
    Sai-Suraj-27 authored Aug 22, 2024
    af638c4
  3. fix: no need to dtype A in jamba (huggingface#32924)

    Co-authored-by: Gal Cohen <[email protected]>
    xgal and Gal Cohen authored Aug 22, 2024
    6baa6f2
  4. FEAT / Trainer: Add adamw 4bit optimizer (huggingface#31865)

    * add 4bit optimizer
    
    * style
    
    * fix msg
    
    * style
    
    * add qgalore
    
    * Revert "add qgalore"
    
    This reverts commit 25278e8.
    
    * style
    
    * version check
    SunMarc authored Aug 22, 2024
    c42d264
  5. CI: separate step to download nltk files (huggingface#32935)

    * separate step to download nltk files
    
    * duplicated
    
    * rm comma
    gante authored Aug 22, 2024
    8b94d28
  6. FIX / Hub: Also catch for exceptions.ConnectionError (huggingface#31469)
    
    * Update hub.py
    
    * Update errors
    
    * Apply suggestions from code review
    
    Co-authored-by: Lucain <[email protected]>
    
    ---------
    
    Co-authored-by: Amy Roberts <[email protected]>
    Co-authored-by: Lucain <[email protected]>
    3 people authored Aug 22, 2024
    eeea712
  7. 9282413
  8. Fix benchmark script (huggingface#32635)

    * fix
    
    * >= 0.3.0
    
    ---------
    
    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Aug 22, 2024
    bf97d4a
  9. Improve greedy search memory usage (huggingface#32895)

    Do not call torch.repeat_interleave if expand_size is 1
    regisss authored Aug 22, 2024
    99d67f1
  10. Add chat_template for tokenizer extracted from GGUF model (huggingface#32908)
    
    * add chat_template to gguf tokenizer
    
    * add template through tokenizer config
    Isotr0py authored Aug 22, 2024
    ee8c01f
  11. fix: (issue huggingface#32689) AttributeError raised when using `Trainer` with `eval_on_start=True` in Jupyter Notebook. (huggingface#32849)
    
    fix: `AttributeError` raised when using `Trainer` with `eval_on_start=True` in Jupyter Notebook.
    fshp971 authored Aug 22, 2024
    f1d822b
  12. 975b988
  13. 18199b3
  14. 273c0af
  15. 🌐 [i18n-KO] Translated `knowledge_distillation_for_image_classification.md` to Korean (huggingface#32334)
    
    * docs: ko: tasks/knowledge_distillation_for_image_classification.md
    
    * feat: nmt draft
    
    * fix: manual edits
    
    * Apply suggestions from code review
    
    Co-authored-by: Chulhwa (Evan) Han <[email protected]>
    
    * Apply suggestions from code review
    
    Co-authored-by: Chulhwa (Evan) Han <[email protected]>
    
    * Apply suggestions from code review
    
    Co-authored-by: Ahnjj_DEV <[email protected]>
    
    * Apply suggestions from code review
    
    Co-authored-by: Ahnjj_DEV <[email protected]>
    
    * Apply suggestions from code review
    
    Co-authored-by: Ahnjj_DEV <[email protected]>
    
    * Apply suggestions from code review
    
    Co-authored-by: Chulhwa (Evan) Han <[email protected]>
    
    * Apply suggestions from code review
    
    Co-authored-by: Chulhwa (Evan) Han <[email protected]>
    
    * Apply suggestions from code review
    
    Co-authored-by: Chulhwa (Evan) Han <[email protected]>
    
    * Apply suggestions from code review
    
    * Apply suggestions from code review
    
    * Apply suggestions from code review
    
    * Apply suggestions from code review
    
    ---------
    
    Co-authored-by: Chulhwa (Evan) Han <[email protected]>
    Co-authored-by: Ahnjj_DEV <[email protected]>
    3 people authored Aug 22, 2024
    09e6579
  16. a26de15
  17. d806fa3

Commits on Nov 14, 2024

  1. cfd2e47
  2. conflict changes 2 11/14/24

    Cemberk committed Nov 14, 2024
    0c944eb