Layoutlmv2 onnx #2

Closed
wants to merge 1,841 commits into from

This pull request is big! We’re only showing the most recent 250 commits.

Commits on Jun 7, 2022

  1. Fix circular import in onnx.utils (huggingface#17577)

    * Fix circular import in onnx.utils
    
    * Add comment for test fetcher
    
    * Here too
    
    * Style
    sgugger authored Jun 7, 2022
    Commit b6a65ae
  2. Commit b118730
  3. Commit 9e72eb4
  4. Add examples telemetry (huggingface#17552)

    * Add examples telemetry
    
    * Alternative approach
    
    * Add to all other examples
    
    * Add to templates as well
    
    * Put framework separately
    
    * Same for TensorFlow
    sgugger authored Jun 7, 2022
    Commit 3cab902
  5. Fx support for Deberta-v[1-2], Hubert and LXMERT (huggingface#17539)

    * Support for deberta and deberta-v2
    
    * Support for LXMert
    
    * Support for Hubert
    
    * Fix for pt1.11
    
    * Trigger CI
    michaelbenayoun authored Jun 7, 2022
    Commit 5c8f601
  6. quicktour.mdx en -> pt translation (huggingface#17074)

    * Quicktour Portuguese Translation
    
    Translated quicktour.mdx until line 161
    
    * Finished translating quicktour.mdx
    
    Ready to upload; will adjust any remaining .mdx or translation mistakes.
    
    * Add _toctree.yml and fix nits
    
    * Fixed pt-br mdx syntax problem
    
    Closed <frameworkcontent> instance
    
    * Changed </frameworkcontent> line
    
    * Copied missing block from english version of quicktour.mdx
    
    * Reviewed the entire file once again. It should be working now.
    
    Co-authored-by: Omar U. Espejel <[email protected]>
    vitorfrois and omarespejel authored Jun 7, 2022
    Commit 706bb83
  7. M-CTC-T Model (huggingface#16402)

    * added cbs to notebooks, made copy-paste error fix in generation_utils
    
    * initial push for mctc model
    
    * mctc feature extractor done
    
    * added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly.
    
    * added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly.
    
    * passing attention, now struggling to figure out how attention masks make sense here
    
    * works when excluding attention masks; ask later how one would integrate attention masks here
    
    * bizarre configuration error (model prefix comes first in config dict json and messes up the order)
    
    * all passing, but bizarre config dict ordering issue when calling to_dict
    
    * passing all major tests
    
    * feature extraction, processor, tokenizer added & tests passing
    
    * style & consistency & other logistical fixes
    
    * copy paste fix
    
    * model after feature extraction working
    
    * committing final feature extraction results; need to fix normalization
    
    * feature extraction passing tests; probably should add tests on the specific flashlight-copied functions?
    
    * delete print ; format code a bit
    
    * fixing tests
    
    * passing major tests
    
    * fixing styles
    
    * completed tokenization test with real example; not sure if these values are entirely correct.
    
    * last test fixes from local
    
    * reverting accidentally included custom setup configs
    
    * remove load tf weights; fix config error
    
    * testing: couldn't import feature extractor
    
    * fix docs
    
    * fix docs
    
    * resolving comments
    
    * style fixes
    
    * style fixes
    
    * Update to MCTCConv1dSubSampler
    
    Co-authored-by: Patrick von Platen <[email protected]>
    
    * relposemb fixes
    
    * conv1d name issue; expecting config fail with parentheses
    
    * fix config issue
    
    * fix config issue
    
    * fix config issue
    
    * change everything to MCTCT
    
    * fixing naming change errors
    
    * archive list
    
    * copyrights and docs
    
    * copyrights and docs
    
    * copyrights and docs
    
    * merge resolution
    
    * move tests, fix to changed optionaldependency structure
    
    * test directories changed
    
    * fixing tests
    
    * how to avoid tf tests?
    
    * how to avoid tf tests?
    
    * tests passing locally
    
    * allow MCTCTProcessor to be imported in any env

    * allow MCTCTProcessor to be imported in any env
    
    * fixed second round of feedback, need to fix docs
    
    * doc changes not being applied
    
    * all fixed
    
    * style fix
    
    * feedback fixes
    
    * fix copies and feature extraction style fix
    
    * Update tests/models/visual_bert/test_modeling_visual_bert.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * copy paste huggingface:main visual bert
    
    * added eof newline to visual bert; all tests are passing otherwise
    
    * fix slow tests by adding attention mask
    
    * change model id to speechbrain
    
    * make fix-copies
    
    * fix readme unwanted deletes
    
    * fixing readmes, make fix-copies
    
    * consistent M-CTC-T naming
    
    * Update src/transformers/models/mctct/__init__.py
    
    Co-authored-by: Patrick von Platen <[email protected]>
    
    * all fixed but variable naming
    
    * adjust double quotes
    
    * fixed variable names
    
    * copyright and mr quilter
    
    * Apply suggestions from code review
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * correct slow tests
    
    * make fix-copies
    
    * Update src/transformers/models/mctct/configuration_mctct.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Update src/transformers/models/mctct/configuration_mctct.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * m-ctc-t not mctct
    
    Co-authored-by: Patrick von Platen <[email protected]>
    Co-authored-by: Sylvain Gugger <[email protected]>
    3 people authored Jun 7, 2022
    Commit 119e3c0
  8. fix (huggingface#17589)

    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 7, 2022
    Commit c6cea5a

Commits on Jun 8, 2022

  1. CLI: add stricter automatic checks to pt-to-tf (huggingface#17588)

    * Stricter pt-to-tf checks; Update docker image for related tests
    
    * check all attributes in the output
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    gante and sgugger authored Jun 8, 2022
    Commit 78c695e
  2. Add TFData2VecVision for semantic segmentation (huggingface#17271)

    * feat: initial implementation of data2vec segmentation model in TF.
    
    * chore: minor corrections to make the segmenter work.
    
    * chore: removed unnecessary files.
    
    * chore: add tests and other modifications.
    
    * fix: loss computation for segmentation.
    
    * chore: remove unused variable.
    
    * chore: formatting.
    
    * added a dummy adaptive pooling layer.
    
    * removed unnecessary file.
    
    * potentially add identifiers to layer names.
    
    * fix: layer naming.
    
    * chore: removed unnecessary print.
    
    * Skipping unneeded test
    
    * chore: add logging to debug tolerance.
    
    * fix: segmentation tests for tfdata2vecvision
    
    * chore: make style.
    
    * fix: layer names, assertion to be resolved.
    
    * Bumping test tolerance a bit
    
    * chore: bump the tol in PT test.
    
    Co-authored-by: matt <[email protected]>
    sayakpaul and Rocketknight1 authored Jun 8, 2022
    Commit 9d99489
  3. Explicit versions in docker files (huggingface#17586)

    * Update docker file
    
    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 8, 2022
    Commit 264128c
  4. Commit ae7bae8
  5. Extend Transformers Trainer Class to Enable CPU AMP and Integrate Int…

    …el Extension for PyTorch (huggingface#17138)
    
    * init PR
    
    * fix import ipex
    
    * minor fix on bf16
    
    * refine optimizer
    
    * refine args notes
    
    * refine code
    
    * refine ipex optimize args
    
    * refine half_precision_backend
    
    * black format
    
    * isort format
    
    * isort format files
    
    * flake8 format
    
    * doc builder format
    
    * refine codes
    
    * remove jit and optim bits
    
    * black preview format
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * refine code
    
    * refine notes
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * code refine
    
    * add ipex ut
    
    * add performance cpu doc
    
    * link to the cpu doc from main perf doc
    
    * install ipex into CI's docker
    
    * Update perf_train_cpu.mdx
    
    * Update docs/source/en/perf_train_cpu.mdx
    
    Co-authored-by: Stas Bekman <[email protected]>
    
    * Update perf_train_cpu.mdx
    
    * Update perf_train_cpu.mdx
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    Co-authored-by: Stas Bekman <[email protected]>
    Co-authored-by: Stas Bekman <[email protected]>
    4 people authored Jun 8, 2022
    Commit 34097b3
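    A minimal sketch of the CPU options this change adds to TrainingArguments (flag names follow the description above and are assumptions; intel_extension_for_pytorch and a PyTorch build with CPU bf16 support must be installed):

        from transformers import TrainingArguments

        # Hedged sketch: CPU training with IPEX optimizations and bf16 mixed precision.
        args = TrainingArguments(
            output_dir="out",
            no_cuda=True,   # stay on CPU
            use_ipex=True,  # enable Intel Extension for PyTorch optimizations
            bf16=True,      # CPU automatic mixed precision in bfloat16
        )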
  6. Fix link for community notebooks (huggingface#17602)

    * Fix link for community notebooks
    
    This fixes the link for community notebooks due to reorganization.
    
    * Replace old link with fully link to the doc page
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    ngoquanghuy99 and sgugger authored Jun 8, 2022
    Commit ee82c86
  7. Commit 7d0b6fc
  8. Commit e160a5d
  9. TF: Merge PT and TF behavior for Bart when no decoder_input_ids are p…

    …assed (huggingface#17593)
    
    * Merge PT and TF behavior
    gante authored Jun 8, 2022
    Commit e9d5138
  10. Commit 66e8656

Commits on Jun 9, 2022

  1. Commit dfc76b2
  2. BLOOM (huggingface#17474)

    * adding template
    
    * update model
    
    * model update
    
    * update conf for debug model
    
    * update conversion
    
    * update conversion script
    
    * update conversion script
    
    * fix missing keys check
    
    * add tests to test the tokenizer in the local machine
    
    * Change variable name
    
    * add tests on xnli dataset
    
    * add more description
    
    * add descriptions + clearer code
    
    * clearer code
    
    * adding new tests + skipping few tests because of env problems
    
    * change comment
    
    * add dtype on the configuration
    
    * add test embeddings
    
    * add hardcoded test
    
    * fix dtype issue
    
    * adding torch.float16 to config
    
    * adding more metrics (min, max, mean)
    
    * add sum
    
    * now the test passes with almost equal
    
    * add files for conversion - test passes on cpu & gpu
    
    * add final changes
    
    * cleaning code
    
    * add new args in the docstring
    
    * fix one liner function
    
    * remove macros
    
    * remove forward attention
    
    * clean up init function
    
    * add comments on the issue
    
    * rm scale mask softmax
    
    * do make style
    
    * fix dtype in init
    
    * fixing for loop on att probs
    
    * fix style with black
    
    * fix style + doc error
    
    * fix and debug CI errors (docs + style)
    
    * some updates
    
    - change new operations
    - finally add scaled softmax
    - added new args in the config
    
    * make use cache working
    
    * add changes
    
    - save sharded models
    - final changes on the modeling script
    
    * add changes
    
    - comment on alibi
    - add TODO on seq length
    
    * test commit
    
    - added a text to test the commit
    
    Co-authored-by: thomasw21 <[email protected]>
    
    * final changes
    
    - attention mask change
    - generation works on BS176b
    
    Co-authored-by: thomasw21 <[email protected]>
    
    * changes - model + conversion
    
    * move to correct dir
    
    * put ,
    
    * fex fixes
    
    * fix tokenizer autodoc
    
    * fix minor CI issues
    
    * fix minor CI issues
    
    * fix minor CI issues
    
    * fix style issue
    
    * fix minor import issues
    
    * fix few issues
    
    * remove def main on the test
    
    * add require torch
    
    * replace decorator with 'with'
    
    * fix style
    
    * change to bloom
    
    * add quick fix tokenizer
    
    * fix tokenizer file
    
    * fix tokenizer
    
    - merge tests
    - small fixes
    
    * fix import issue
    
    * add bloom to readme
    
    * fix consistency
    
    * Update docs/source/en/model_doc/bloom.mdx
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Apply suggestions from code review
    
    fix comment issues on file headers
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * fix doc issue
    
    * small fix - modeling test
    
    * some changes
    
    - refactor some code
    - taking into account reviews
    - more tests should pass
    - removed pruning tests
    
    * remove useless division
    
    * more tests should pass
    
    * more tests should pass
    
    * more tests should pass
    
    * let's try this one
    
    -add alibi offset
    - remove all permutes to make the grad operations work
    - finger crossed
    
    * refactor
    
    - refactor code
    - style changes
    - add new threshold for test
    
    * major changes
    
    - change BLOOM to Bloom
    - add quick doc on bloom.mdx
    - move embeddings test on modeling test
    
    * modify readme
    
    * small fixes
    
    * small fix
    
    - better threshold for a test
    
    * remove old test file from fetcher
    
    * fix small typo
    
    * major change
    
    - change BloomLMHead to BloomForCausalLM
    
    * remove onnx config
    
    * major changes
    
    - refactor the code
    - remove asserts
    - change tol for test
    
    * make style
    
    * small change
    
    * adding a slow test + commenting old ones for now
    
    * make style
    
    * Apply suggestions from code review
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * make style
    
    * fix duplicates
    
    * cleaning comments on config
    
    * clean a bit conversion file
    
    * refactor a bit the modeling file
    
    * refactor tokenizer file
    
    * fix tokenization test issue
    
    * fix tokenization issue #2
    
    * fix tokenization issue second try
    
    * fix test issue
    
    * make style + add suggestions
    
    * change test fetcher
    
    * try this one
    
    - slow tests should pass
    - finger crossed
    
    * possible final changes
    
    * make style
    
    * try fix padding side issue
    
    * fix side
    
    * fix padding issue
    
    * fix ko-readme
    
    * fix config auto
    
    * cleaning modeling file
    
    * keep bloom in caps in ko
    
    * update config docs
    
    * remove pretraining_pp
    
    * remove model parallel
    
    * update config
    
    - add correct config files
    
    * fix duplicates
    
    * fix fetcher
    
    * fix refactor issue
    
    - remove divide function
    
    * try to remove alibi
    
    * small fixes
    
    - fix alibi
    - remove seq length
    - refactor a bit the code
    
    * put correct values
    
    - fix bos and eos token ids
    
    * fix attention mask loop
    
    Co-authored-by: thomasw21 <[email protected]>
    
    * small fixes:
    
    - remove skip bias add
    
    * small fixes
    
    - fix typo in readme
    - fix typos in config
    
    * small changes
    
    - remove a test
    - add reconstruction test
    - change config
    
    * small changes
    
    - change Scaled Softmax to BloomScaledSoftmax
    
    * small fixes
    
    - fix alibi dtype
    
    * major changes
    
    - removing explicit dtype when loading modules
    - fixing test args (torch_dtype=auto)
    - add docstring
    
    * fix readmes
    
    * major changes
    
    - now bloom supports alibi shifting
    - refactor a bit the code
    - better test tolerance now
    
    * refactor a bit
    
    * refactor a bit
    
    * put correct name on test
    
    * change docstring
    
    * small changes
    
    - fix docstring modeling
    - fix test tolerance
    
    * fix small nit
    
    - take dtype from tensors in the conversion script
    
    * minor fix
    
    - fix mdx issue
    
    * minor fix
    
    - change config docstring
    
    * forward contrib credits from PR14084
    
    * Apply suggestions from code review
    
    Co-authored-by: Stas Bekman <[email protected]>
    
    * apply modifications
    
    Co-authored-by: Stas Bekman <[email protected]>
    
    * resolve softmax upcast
    
    * Apply suggestions from code review
    
    Co-authored-by: Stas Bekman <[email protected]>
    
    * Update src/transformers/models/bloom/modeling_bloom.py
    
    Co-authored-by: Niklas Muennighoff <[email protected]>
    
    * final changes modeling
    
    Co-authored-by: Stas Bekman <[email protected]>
    
    * Merge commit 'd156898f3b9b2c990e5963f5030a7143d57921a2'
    
    * merge commit
    
    * Apply suggestions from code review
    
    Co-authored-by: Stas Bekman <[email protected]>
    
    * apply suggestions
    
    Apply suggestions from Stas comments
    Co-authored-by: Stas Bekman <[email protected]>
    
    * Fix gradient checkpointing
    
    Co-authored-by: Stas Bekman <[email protected]>
    
    * add slow but exact
    
    * add accelerate compatibility
    
    Co-authored-by: Nicolas Patry <[email protected]>
    
    * forward contrib credits
    
    Co-authored-by: thomasw21 <[email protected]>
    Co-authored-by: sgugger <[email protected]>
    Co-authored-by: patrickvonplaten <[email protected]>
    Co-authored-by: Niklas Muennighoff <[email protected]>
    Co-authored-by: LysandreJik <[email protected]>
    
    * Apply suggestions from code review
    
    Co-authored-by: Patrick von Platen <[email protected]>
    
    * fix torch device on tests
    
    * make style
    
    * Apply suggestions from code review
    
    Co-authored-by: Patrick von Platen <[email protected]>
    
    * fix nits
    
    Co-authored-by: patrickvonplaten<[email protected]>
    
    * remove final nits
    
    * fix doc
    
    - add more details on the doc
    - add links to checkpoints
    
    * Update src/transformers/__init__.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Update src/transformers/models/bloom/modeling_bloom.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * apply suggestions
    
    Co-authored-by: sgugger <[email protected]>
    
    * put test torchscript to false
    
    * Update src/transformers/models/bloom/modeling_bloom.py
    
    Co-authored-by: justheuristic <[email protected]>
    
    * fix alibi
    
    - create alibi only once
    
    * add small doc
    
    * make quality
    
    * replace torch.nn
    
    * remove token type emb
    
    * fix fused op + output bias
    
    * add fused op
    
    - now can control fused operation from config
    
    * remove fused op
    
    * make quality
    
    * small changes
    
    - remove unused args on config
    - removed bias gelu file
    - make the model torchscriptable
    - add torchscript slow tests
    
    * Update src/transformers/models/bloom/modeling_bloom.py
    
    * fix slow
    
    * make style
    
    * add accelerate support
    
    * add bloom to deepspeed tests
    
    * minor changes
    
    * Apply suggestions from code review
    
    Co-authored-by: Patrick von Platen <[email protected]>
    
    * minor change
    
    * slow tests pass
    
    * Apply suggestions from code review
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Update docs/source/en/model_doc/bloom.mdx
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * minor changes:
    
    - change docstring
    - add link to paper
    
    Co-authored-by: Thomwolf <[email protected]>
    Co-authored-by: Thomas Wolf <[email protected]>
    Co-authored-by: thomasw21 <[email protected]>
    Co-authored-by: Sylvain Gugger <[email protected]>
    Co-authored-by: sIncerass <[email protected]>
    Co-authored-by: Stas Bekman <[email protected]>
    Co-authored-by: Niklas Muennighoff <[email protected]>
    Co-authored-by: Nicolas Patry <[email protected]>
    Co-authored-by: thomasw21 <[email protected]>
    Co-authored-by: sgugger <[email protected]>
    Co-authored-by: patrickvonplaten <[email protected]>
    Co-authored-by: LysandreJik <[email protected]>
    Co-authored-by: Patrick von Platen <[email protected]>
    Co-authored-by: justheuristic <[email protected]>
    Co-authored-by: Stas Bekman <[email protected]>
    16 people authored Jun 9, 2022
    Commit ca2a55e
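    A short usage sketch for the new model class; "bigscience/bloom-560m" is a small public checkpoint named here for illustration, not taken from the commit message:

        from transformers import AutoTokenizer, BloomForCausalLM

        tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
        model = BloomForCausalLM.from_pretrained("bigscience/bloom-560m")

        inputs = tokenizer("Hello, my name is", return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=20)
        print(tokenizer.decode(output_ids[0], skip_special_tokens=True))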
  3. Add ONNX support for ResNet (huggingface#17585)

    * Add ONNX support for ResNet
    
    * Add ONNX test
    
    * make fix-copies
    regisss authored Jun 9, 2022
    Commit 5323094
  4. Commit e0be053
  5. Use shape_list to safely get shapes for Swin (huggingface#17591)

    * Use shape_list to safely get shapes
    
    * Add relevant test
    
    * Tidy and add metrics
    
    * Resolve dynamic shaping issues and move test
    
    * Tidy up and all samples in batch
    
    * Formatting
    amyeroberts authored Jun 9, 2022
    Commit 9fc3423
  6. Commit 2908064
  7. Adding top_k argument to text-classification pipeline. (huggingfa…

    …ce#17606)
    
    * Adding `top_k` and `sort` arguments to `text-classification` pipeline.
    
    - Deprecate `return_all_scores` as `top_k` is more uniform with other
      pipelines, and a superset of what `return_all_scores` can do.
      BC is maintained though.
      `return_all_scores=True` -> `top_k=None`
      `return_all_scores=False` -> `top_k=1`
    
    - Using `top_k` will imply sorting the results, but using no argument
      will keep the results unsorted for backward compatibility.
    
    * Remove `sort`.
    
    * Fixing the test.
    
    * Remove bad doc.
    Narsil authored Jun 9, 2022
    Commit 2351729
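    A minimal sketch of the top_k behaviour described above (the checkpoint name is an illustrative assumption):

        from transformers import pipeline

        classifier = pipeline("text-classification",
                              model="distilbert-base-uncased-finetuned-sst-2-english")

        # old return_all_scores=True  -> top_k=None (all labels, sorted by score)
        print(classifier("This library is great!", top_k=None))

        # old return_all_scores=False -> top_k=1 (only the best label)
        print(classifier("This library is great!", top_k=1))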
  8. Fix very long job failure text in Slack report (huggingface#17630)

    * Fix very long job failure text in Slack report
    
    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 9, 2022
    Commit c70dacd
  9. Commit 90ed9ae
  10. Running a pipeline of float16. (huggingface#17637)

    When we're preparing the tensors for CPU for postprocessing, we need
    to upgrade the `float16` to `float32` since CPUs don't have instructions
    for `[b]float16`.
    Narsil authored Jun 9, 2022
    Commit c38f4e1
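    An illustrative sketch of the cast described above, not the pipeline's actual code path:

        import torch

        # Move half-precision tensors to float32 before CPU post-processing,
        # since most CPU ops lack (b)float16 kernels.
        logits = torch.randn(2, 3, dtype=torch.float16)
        if logits.dtype in (torch.float16, torch.bfloat16):
            logits = logits.to(torch.float32)
        scores = torch.softmax(logits, dim=-1).numpy()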
  11. [modeling_utils] torch_dtype/auto floating dtype fixes (huggingface#1…

    …7614)
    
    * [modeling_utils] torch_dtype/auto fixes
    
    * add test
    
    * apply suggestions
    
    * add missing fallback
    
    * Renaming things
    
    * Use for else
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    stas00 and sgugger authored Jun 9, 2022
    Commit 75343de
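    A small sketch of the option this fix touches (checkpoint name is illustrative):

        import torch
        from transformers import AutoModel

        # torch_dtype="auto" resolves the dtype from the checkpoint/config, falling
        # back to the framework default; an explicit torch.dtype also works.
        model_auto = AutoModel.from_pretrained("bert-base-uncased", torch_dtype="auto")
        model_fp16 = AutoModel.from_pretrained("bert-base-uncased", torch_dtype=torch.float16)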
  12. Pre-build DeepSpeed (huggingface#17607)

    * pre-build deepspeed
    
    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 9, 2022
    Commit da0bed5
  13. convert assertion to raised exception in debertav2 (huggingface#17619)

    * convert assertion to raised exception in debertav2
    
    * change assert to raise exception in deberta
    
    * fix messages
    sam-h-bean authored Jun 9, 2022
    Commit fba0b6a
  14. Commit df1ec6b

Commits on Jun 10, 2022

  1. Translation/autoclass (huggingface#17615)

    * Add Italian translation for autoclass_tutorial.mdx
    
    * Fix synthesis
    
    Co-authored-by: martina.fumanelli <[email protected]>
    mfumanelli and martina.fumanelli authored Jun 10, 2022
    Commit e0b58fb
  2. Commit af4a1ec
  3. Move Clip image utils to image_utils.py (huggingface#17628)

    * move clip image utils to image_utils.py
    
    * dont default to square images
    
    * fix typo, revert change to test file
    
    * edit convert_rgb comments
    alaradirik authored Jun 10, 2022
    Commit 6e93d94
  4. Enable crop_center method to handle (W, H, C) images (huggingface#17626)

    * enable crop_center method to handle (W, H, C) images
    
    * minor style and comment edits
    alaradirik authored Jun 10, 2022
    Commit 49becba
  5. Bump cookiecutter in /examples/research_projects/decision_transformer (

    …huggingface#17645)
    
    Bumps [cookiecutter](https://github.com/cookiecutter/cookiecutter) from 1.7.2 to 2.1.1.
    - [Release notes](https://github.com/cookiecutter/cookiecutter/releases)
    - [Changelog](https://github.com/cookiecutter/cookiecutter/blob/master/HISTORY.md)
    - [Commits](cookiecutter/cookiecutter@1.7.2...2.1.1)
    
    ---
    updated-dependencies:
    - dependency-name: cookiecutter
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Jun 10, 2022
    Commit 1d46330
  6. Fix style

    LysandreJik committed Jun 10, 2022
    Commit 2bc3051
  7. Fix style

    LysandreJik committed Jun 10, 2022
    Commit cdaed36
  8. Commit fd1e670
  9. Commit b880909
  10. Fixes huggingface#17128 . (huggingface#17356)

    VisibleDeprecationWarning is addressed by specifying dtype=object when creating numpy array.
    Update code based on review feedback.
    Undo whitespace changes to tokenization_utils_base.py.
    
    Co-authored-by: I like data <[email protected]>
    mygithubid1 and ilikedata2 authored Jun 10, 2022
    Commit 35b1603
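    A tiny illustration of the fix described above (the token ids are made up):

        import numpy as np

        # Ragged token lists force NumPy to guess the dtype and emit a
        # VisibleDeprecationWarning; dtype=object makes the intent explicit.
        batch = [[101, 2023, 102], [101, 2023, 2003, 1037, 2307, 102]]
        arr = np.array(batch, dtype=object)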
  11. 🐛 Properly raise RepoNotFoundError when not authenticated (huggingf…

    …ace#17651)
    
    * Raise RepoNotFoundError in case of 401
    
    * Include changes from revert-17646-skip_repo_not_found
    
    * Add a comment
    
    * 💄 Code quality
    
    * 💚 Update `get_from_cache` test
    
    * 💚 Code quality & skip failing test
    SBrandeis authored Jun 10, 2022
    Commit c99ddcc
  12. update README.md (huggingface#17657)

    - use CodeParrot scores of v1.1
    - change evaluation command to use accelerate
    loubnabnl authored Jun 10, 2022
    Commit 3114df4
  13. [BigBirdFlaxTests] Make tests slow (huggingface#17658)

    * [BigBirdFlaxTests] Make tests slow
    
    * up
    
    * correct black with new version
    patrickvonplaten authored Jun 10, 2022
    Commit 5e428b7
  14. Commit b4eef63
  15. Commit 13e875c
  16. Commit 39e1461
  17. Avoid GPU OOM for a TF Rag test (huggingface#17638)

    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 10, 2022
    Commit 224bde9

Commits on Jun 13, 2022

  1. Commit a5282ab
  2. Add Visual Question Answering (VQA) pipeline (huggingface#17286)

    * wip
    
    * rebase
    
    * all tests pass
    
    * rebase
    
    * ready for PR
    
    * address comments
    
    * fix styles
    
    * add require_torch to pipeline test
    
    * remove remote image to improve CI consistency
    
    * address comments; fix tf/flax tests
    
    * address comments; fix tf/flax tests
    
    * fix tests; add alias
    
    * repo consistency tests
    
    * Update src/transformers/pipelines/visual_question_answering.py
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * address comments
    
    * Update src/transformers/pipelines/visual_question_answering.py
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * merge
    
    * Update src/transformers/models/auto/modeling_auto.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * merge
    
    Co-authored-by: Sijun He <[email protected]>
    Co-authored-by: NielsRogge <[email protected]>
    Co-authored-by: Patrick von Platen <[email protected]>
    Co-authored-by: Sylvain Gugger <[email protected]>
    5 people authored Jun 13, 2022
    Commit 66336dc
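    A minimal usage sketch for the new pipeline; the checkpoint name and image path are illustrative assumptions:

        from transformers import pipeline

        vqa = pipeline("visual-question-answering", model="dandelin/vilt-b32-finetuned-vqa")
        answers = vqa(image="path/to/photo.png", question="What is on the table?")
        print(answers)  # e.g. [{"answer": ..., "score": ...}, ...]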
  3. Fixed documentation typo, parameter name is evaluation_strategy, not …

    …eval_strategy (huggingface#17669)
    
    Co-authored-by: Saint <[email protected]>
    sainttttt and Saint authored Jun 13, 2022
    Commit c1daf72
  4. explicitly set utf8 for Windows (huggingface#17664)

    Bram Vanroy authored Jun 13, 2022
    Commit 7308358
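    The fix amounts to never relying on the platform default encoding (often cp1252 on Windows); a hedged illustration with a made-up file name:

        from pathlib import Path

        Path("report.txt").write_text("résumé of the run", encoding="utf-8")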
  5. Fix dtype getter (huggingface#17668)

    * Fix dtype getters
    
    * Proper fix for dtype getter
    
    * Style and comment
    
    * Always use last for consistency
    
    * Quality
    sgugger authored Jun 13, 2022
    Commit a1344db
  6. Update modeling_gpt_neox.py (huggingface#17575)

    I'm guessing that the intention was to have the `_no_split_modules` class attribute for `GPTNeoXPreTrainedModel` to be set to `["GPTNeoXLayer"]`, akin to how its set as `["GPTJBlock"]` for `GPTJPreTrainedModel`.
    
    If this is incorrect, please feel free to just close the PR.
    
    Thanks!
    willfrey authored Jun 13, 2022
    Commit 5483388
  7. Add Ray's scope to training arguments (huggingface#17629)

    * allow scope from trainer arg
    
    * add ray_scope to training args
    
    * escape double quotes
    
    * make style && quality
    
    * attempt to solve doc style issues
    
    * splitting up URLs for style
    
    * make fixup
    
    * Update src/transformers/training_args.py
    
    Co-authored-by: Antoni Baum <[email protected]>
    
    * make style
    
    Co-authored-by: Antoni Baum <[email protected]>
    Bram Vanroy and Yard1 authored Jun 13, 2022
    Commit 457d4a3
  8. enable cpu distribution training using mpirun (huggingface#17570)

    * enable cpu distribution training using mpirun
    
    * command like:
    *     mpirun -n 2 python3 run_qa.py --no_cuda --xpu_backend ccl xxxx
    * MASTER_ADDR and MASTER_PORT should be set as env variables:
    *     export MASTER_ADDR=127.0.0.1
    *     export MASTER_PORT=29500
    
    Signed-off-by: Wang, Yi A <[email protected]>
    
    * fix according to the review comment
    
    Signed-off-by: Wang, Yi A <[email protected]>
    
    * use accelerate logic for cpu distribution training to set "RANK","LOCAL_RANK","WORLD_SIZE" environment
    
    Signed-off-by: Wang, Yi A <[email protected]>
    sywangyi authored Jun 13, 2022
    Commit 4aabf9b
  9. Add FP16 Support for SageMaker Model Parallel (huggingface#17386)

    * Add FP16 support for SageMaker model parallel
    
    * minor fix
    
    * fix indentation
    
    * handle mix precision exception for smmp
    
    * minor fix
    
    * remove amp implementation on SMMP
    
    * remove redundant stuff
    
    * reformat trainer
    
    * restyling
    
    * reformat
    haohanchen-aws authored Jun 13, 2022
    Commit 1690094
  10. Add LongT5 model (huggingface#16792)

    * Initial commit
    
    * Make some fixes
    
    * Make PT model full forward pass
    
    * Drop TF & Flax implementation, fix copies etc
    
    * Add Flax model and update some corresponding stuff
    
    * Drop some TF things
    
    * Update config and flax local attn
    
    * Add encoder_attention_type to config
    
    * .
    
    * Update docs
    
    * Do some cleansing
    
    * Fix some issues -> make style; add some docs
    
    * Fix position_bias + mask addition + Update tests
    
    * Fix repo consistency
    
    * Fix model consistency by removing flax operation over attn_mask
    
    * [WIP] Add PT TGlobal LongT5
    
    * .
    
    * [WIP] Add flax tglobal model
    
    * [WIP] Update flax model to use the right attention type in the encoder
    
    * Fix flax tglobal model forward pass
    
    * Make use of global_relative_attention_bias
    
    * Add test suites for TGlobal model
    
    * Fix minor bugs, clean code
    
    * Fix pt-flax equivalence though not convinced with correctness
    
    * Fix LocalAttn implementation to match the original impl. + update READMEs
    
    * Few updates
    
    * Update: [Flax] improve large model init and loading huggingface#16148
    
    * Add ckpt conversion script according to huggingface#16853 + handle torch device placement
    
    * Minor updates to conversion script.
    
    * Typo: AutoModelForSeq2SeqLM -> FlaxAutoModelForSeq2SeqLM
    
    * gpu support + dtype fix
    
    * Apply some suggestions from code review
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    Co-authored-by: Patrick von Platen <[email protected]>
    
    * * Remove (de)parallelize stuff
    * Edit shape comments
    * Update README.md
    * make fix-copies
    
    * Remove caching logic for local & tglobal attention
    
    * Apply another batch of suggestions from code review
    
    * Add missing checkpoints
    * Format converting scripts
    * Drop (de)parallelize links from longT5 mdx
    
    * Fix converting script + revert config file change
    
    * Revert "Remove caching logic for local & tglobal attention"
    
    This reverts commit 2a61982.
    
    * Stash caching logic in Flax model
    
    * Make side relative bias used always
    
    * Drop caching logic in PT model
    
    * Return side bias as it was
    
    * Drop all remaining model parallel logic
    
    * Remove clamp statements
    
    * Move test files to the proper place
    
    * Update docs with new version of hf-doc-builder
    
    * Fix test imports
    
    * Make some minor improvements
    
    * Add missing checkpoints to docs
    * Make TGlobal model compatible with torch.onnx.export
    * Replace some np.ndarray with jnp.ndarray
    
    * Fix TGlobal for ONNX conversion + update docs
    
    * fix _make_global_fixed_block_ids and masked neg value
    
    * update flax model
    
    * style and quality
    
    * fix imports
    
    * remove load_tf_weights_in_longt5 from init and fix copies
    
    * add slow test for TGlobal model
    
    * typo fix
    
    * Drop obsolete is_parallelizable and one warning
    
    * Update __init__ files to fix repo-consistency
    
    * fix pipeline test
    
    * Fix some device placements
    
    * [wip]: Update tests -- need to generate summaries to update expected_summary
    
    * Fix quality
    
    * Update LongT5 model card
    
    * Update (slow) summarization tests
    
    * make style
    
    * rename checkpoints
    
    * finish
    
    * fix flax tests
    
    Co-authored-by: phungvanduy <[email protected]>
    Co-authored-by: Sylvain Gugger <[email protected]>
    Co-authored-by: Patrick von Platen <[email protected]>
    Co-authored-by: patil-suraj <[email protected]>
    5 people authored Jun 13, 2022
    Commit a72f1c9
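    A short usage sketch; "google/long-t5-tglobal-base" is one of the released checkpoints, named here as an assumption rather than taken from the commit message:

        from transformers import AutoTokenizer, LongT5ForConditionalGeneration

        tokenizer = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")
        model = LongT5ForConditionalGeneration.from_pretrained("google/long-t5-tglobal-base")

        # LongT5 targets long inputs; the repeated filler stands in for a long document.
        inputs = tokenizer("summarize: " + "a very long report ... " * 200, return_tensors="pt")
        summary_ids = model.generate(**inputs, max_new_tokens=64)
        print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))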

Commits on Jun 14, 2022

  1. Fix doc builder Dockerfile (huggingface#17435)

    * Fix doc builder Dockerfile
    
    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 14, 2022
    Commit df15703
  2. Extend Transformers Trainer Class to Enable PyTorch Torchscript for I…

    …nference (huggingface#17153)
    
    * add jit mode option and model wrap
    
    * Update src/transformers/training_args.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Update src/transformers/training_args.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * refine code
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * add ut and refine code
    
    * code refine
    
    * refine code
    
    * add inference doc
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: Stas Bekman <[email protected]>
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: Stas Bekman <[email protected]>
    
    * add cpu inference performance doc
    
    * Update perf_infer_cpu.mdx
    
    * Update perf_infer_cpu.mdx
    
    * Update performance.mdx
    
    * Update _toctree.yml
    
    * refine jit func naming
    
    * Update _toctree.yml
    
    * Delete perf_infer_gpu_one.mdx
    
    * Update perf_infer_cpu.mdx
    
    * Update docs/source/en/perf_infer_cpu.mdx
    
    Co-authored-by: Stas Bekman <[email protected]>
    
    * add none check before jit
    
    * Update docs/source/en/perf_infer_cpu.mdx
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Update docs/source/en/perf_infer_cpu.mdx
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    Co-authored-by: Stas Bekman <[email protected]>
    Co-authored-by: Stas Bekman <[email protected]>
    4 people authored Jun 14, 2022
    Commit 3b29c9f
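    A minimal sketch of the new inference option; the flag name jit_mode_eval follows the description above and may differ across versions:

        from transformers import TrainingArguments

        # Wrap the model with TorchScript (jit tracing) during evaluation/prediction.
        args = TrainingArguments(output_dir="out", do_eval=True, jit_mode_eval=True)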
  3. Commit 53496ac
  4. Rag end2end new (huggingface#17650)

    * check
    
    * update the RAG-end2end with new PL and RAY
    
    * removed unwanted comments
    Shamane Siri authored Jun 14, 2022
    Commit 9068fa6
  5. Include a comment to reflect Amy's contributions (huggingface#17689)

    * Add note on amy's contribution.
    
    Co-authored-by: Amy Roberts <[email protected]>
    
    * remove non-tech comment.
    
    Co-authored by: Amy Roberts <[email protected]>
    
    Co-authored-by: Amy Roberts <[email protected]>
    sayakpaul and amyeroberts authored Jun 14, 2022
    Commit 3960ce9
  6. Swin main layer (huggingface#17693)

    * Swin models call TFSwinMainLayer
    
    * Tidy up
    amyeroberts authored Jun 14, 2022
    Commit bd43151
  7. Add BloomForSequenceClassification and `BloomForTokenClassification…

    …` classes (huggingface#17639)
    
    * add new bloom classes
    
    * (feat) add bloom classification tests; make style
    
    * style: change import in test
    
    * add some typehints to bloom classes
    
    * merge main into branch
    
    * fix: input checking in bloom seq classification
    
    * fix tests
    
    * change model class tests
    
    * fix few tests
    
    - more tests should pass
    - one test left
    
    * make token classifier return hidden states
    
    * style: make BLOOM typehints consistent
    
    Co-authored-by: Younes Belkada <[email protected]>
    
    Co-authored-by: younesbelkada <[email protected]>
    Co-authored-by: Younes Belkada <[email protected]>
    3 people authored Jun 14, 2022
    Commit edb672a
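    A usage sketch of the new sequence-classification head; the checkpoint name is an assumption and the head is randomly initialized until fine-tuned:

        import torch
        from transformers import AutoTokenizer, BloomForSequenceClassification

        tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
        model = BloomForSequenceClassification.from_pretrained(
            "bigscience/bloom-560m", num_labels=2
        )

        inputs = tokenizer("A short review to classify.", return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        print(logits.argmax(dim=-1))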
  8. FX function refactor (huggingface#17625)

    * Function refactor
    
    * Update src/transformers/utils/fx.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    michaelbenayoun and sgugger authored Jun 14, 2022
    Commit 7ec9128
  9. Commit 120649b
  10. Commit d453ea6

Commits on Jun 15, 2022

  1. Commit b76290f
  2. Documentation: RemBERT fixes (huggingface#17641)

    * rembert: fix python codeblock
    
    * rembert: use correct google/rembert checkpoint name in documentation
    
    * rembert: use correct google/rembert checkpoint name in TF documentation
    stefan-it authored Jun 15, 2022
    Commit 242cc6e
  3. [Wav2Vec2Conformer] Official release (huggingface#17709)

    * [Wav2Vec2Conformer] Official release
    
    * remove from not-in-readme
    patrickvonplaten authored Jun 15, 2022
    Commit 7f14839
  4. Commit 50415b8
  5. Commit 6ebeeee
  6. CLI: Add flag to push TF weights directly into main (huggingface#17720)

    * Add flag to push weights directly into main
    gante authored Jun 15, 2022
    Commit c3c62b5
  7. Commit 66f8933
  8. Commit 3981ee8

Commits on Jun 16, 2022

  1. Fix mask token in the example (huggingface#17725)

    VisualBert uses the bert-base-uncased tokenizer; therefore, instead of {mask}, the mask token should be [MASK] (see the sketch below)
    Jiayi-Pan authored Jun 16, 2022
    Commit 2eadb7e
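    A tiny sketch of the point above:

        from transformers import AutoTokenizer

        # VisualBert reuses the bert-base-uncased tokenizer, whose mask token is "[MASK]".
        tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
        prompt = f"The man is eating a {tokenizer.mask_token}."  # "... eating a [MASK]."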
  2. Fix tf shared embedding (huggingface#17730)

    * fix the naming
    
    * from pt in test for now
    
    * make style
    
    * slow test and removed from_pt
    ArthurZucker authored Jun 16, 2022
    Commit f44e2c2
  3. Refine Bf16 test for deepspeed (huggingface#17734)

    * Refine BF16 check in CPU/GPU
    
    * Fixes
    
    * Renames
    sgugger authored Jun 16, 2022
    Commit 36d4647
  4. v4.21.0.dev0

    sgugger committed Jun 16, 2022
    Commit 7c6ec19
  5. Remove needless file

    sgugger committed Jun 16, 2022
    Commit 3c7e56f

Commits on Jun 17, 2022

  1. Enable PyTorch nightly build CI (huggingface#17335)

    * nightly build pytorch CI
    
    * fix working dir
    
    * change time and event name
    
    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 17, 2022
    Commit ca169db
  2. Commit 2d7c1bb
  3. Bump notebook in /examples/research_projects/visual_bert (huggingface…

    …#17742)
    
    Bumps [notebook](http://jupyter.org) from 6.4.10 to 6.4.12.
    
    ---
    updated-dependencies:
    - dependency-name: notebook
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Jun 17, 2022
    Commit 5089a2d
  4. Bump notebook in /examples/research_projects/lxmert (huggingface#17743)

    Bumps [notebook](http://jupyter.org) from 6.4.10 to 6.4.12.
    
    ---
    updated-dependencies:
    - dependency-name: notebook
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Jun 17, 2022
    Commit e44a569
  5. Migrate HFDeepSpeedConfig from trfrs to accelerate (huggingface#17623)

    * Migrate HFDeepSpeedConfig from trfrs to accelerate
    
    * add `accelerate` to testing dep
    
    * addressing comments
    
    * addressing comments
    
    Using `_shared_state` and avoiding object creation. This is necessary as `notebook_launcher` in `launchers.py` checks `len(AcceleratorState._shared_state)>0` to throw an error.
    
    * resolving comments
    
    1. Use simple API from accelerate to manage the deepspeed config integration
    2. Update the related documentation
    
    * reverting changes and addressing comments
    
    * docstring correction
    
    * addressing nits
    
    * addressing nits
    
    * addressing nits 3
    
    * bumping up the accelerate version to 0.10.0
    
    * resolving import
    
    * update setup.py to include deepspeed dependencies
    
    * Update dependency_versions_table.py
    
    * fixing imports
    
    * reverting changes to CI dependencies for "run_tests_pipelines_tf*" tests
    
    These changes didn't help with resolving the failures and I believe this needs to be addressed in another PR.
    
    * removing `accelerate` as hard dependency
    
    Resolves issues related to CI Tests
    
    * adding `accelerate` as dependency for building docs
    
    resolves failure in Build PR Documentation test
    
    * adding `accelerate` as dependency in "dev" to resolve doc build issue
    
    * resolving comments
    
    1. adding `accelerate` to extras["all"]
    2. Including check for accelerate too before import HFDeepSpeedConfig from there
    
    Co-Authored-By: Sylvain Gugger <[email protected]>
    
    * resolving comments
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    pacman100 and sgugger authored Jun 17, 2022
    Commit 21a7724
  6. Save huggingface checkpoint as artifact in mlflow callback (huggingfa…

    …ce#17686)
    
    * Fix eval to compute rouge correctly for rouge_score
    
    * styling
    
    * moving sentence tokenization to utils from run_eval
    
    * saving ckpt in mlflow
    
    * use existing format of args
    
    * fix documentation
    
    Co-authored-by: Swetha Mandava <[email protected]>
    swethmandava and Swetha Mandava authored Jun 17, 2022
    Commit 522a9ec

Commits on Jun 18, 2022

  1. Added translation of index.mdx to Portuguese Issue huggingface#16824 (h…

    …uggingface#17565)
    
    * Added translation of installation.mdx to Portuguese, as well
    as default templates of _toctree.yml and _config.py
    
    * [ build_documentation.yml ] - Updated doc_builder to build
    documentation in Portuguese.
    [ pipeline_tutorial.mdx ] - Created translation for the pipeline_tutorial.mdx.
    
    * [ build_pr_documentation.yml ] - Added pt language to pr_documentation builder.
    
    [ pipeline_tutorial.mdx ] - Grammar changes.
    
    * [ accelerate.mdx ] - Translated to Portuguese the acceleration tutorial.
    
    * [ multilingual.mdx ] - Added portuguese translation for multilingual tutorial.
    
    [ training.mdx ] - Added portuguese translation for training tutorial.
    
    * [ preprocessing.mdx ] - WIP
    
    * Update _toctree.yml
    
    * Adding Pré-processamento to _toctree.yml
    
    * Update accelerate.mdx
    
    * Nits and eliminate preprocessing file while it is ready
    
    * [ index.mdx ] - Translated to Portuguese the index apresentation page.
    
    * [ docs/source/pt ] - Updated _toctree.yml to match newest translations.
    
    * Fix build_pr_documentation.yml
    
    * Fix index nits
    
    * nits in _toctree
    
    Co-authored-by: Omar U. Espejel <[email protected]>
    rzimmerdev and omarespejel authored Jun 18, 2022
    Commit 0d92798
  2. Attempt to change Push CI to workflow_run (huggingface#17753)

    * Use workflow_run event for push CI
    
    * change to workflow_run
    
    * Add comments
    
    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 18, 2022
    Commit 6589e51

Commits on Jun 20, 2022

  1. TF: BART compatible with XLA generation (huggingface#17479)

    * Also propagate changes to blenderbot, blenderbot_small, marian, mbart, and pegasus
    gante authored Jun 20, 2022
    Commit 132402d
  2. deprecate is_torch_bf16_available (huggingface#17738)

    * deprecate is_torch_bf16_available
    
    * address suggestions
    stas00 authored Jun 20, 2022
    Commit: a2d34b7
  3. Fix cache for GPT-Neo-X (huggingface#17764)

    * Fix cache for GPT-Neo-X
    
    * Add more tests
    sgugger authored Jun 20, 2022
    Commit: fdb1208
  4. Not use -1e4 as attn mask (huggingface#17306)

    * Use torch.finfo(self.dtype).min
    
    * for GPTNeoX
    
    * for Albert
    
    * For Splinter
    
    * Update src/transformers/models/data2vec/modeling_data2vec_audio.py
    
    Co-authored-by: Patrick von Platen <[email protected]>
    
    * fix -inf used in Bart-like models
    
    * Fix a few remaining -inf
    
    * more fix
    
    * clean up
    
    * For CLIP
    
    * For FSMT
    
    * clean up
    
    * fix test
    
    * Add dtype argument and use it for LayoutLMv3
    
    * update FlaxLongT5Attention
    
    Co-authored-by: ydshieh <[email protected]>
    Co-authored-by: Patrick von Platen <[email protected]>
    3 people authored Jun 20, 2022
    Commit: d3cb288
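The entry above swaps the hard-coded -1e4 masking constant for the smallest representable value of the compute dtype. A minimal sketch of that pattern, assuming a 1/0-style padding mask (the helper name is illustrative, not the actual modeling code):

```python
import torch

def to_additive_mask(attention_mask: torch.Tensor, dtype: torch.dtype) -> torch.Tensor:
    # 1 = keep token, 0 = padding; convert to an additive mask on the attention scores
    inverted = 1.0 - attention_mask.to(dtype)
    # torch.finfo(dtype).min instead of -1e4, so fp16/bf16 masking is not capped
    # at an arbitrary constant that can still leak attention mass
    return inverted * torch.finfo(dtype).min

print(to_additive_mask(torch.tensor([[1, 1, 0]]), torch.float16))
```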
  5. Update modeling_longt5.py (huggingface#17777)

    On line 180, `torch.tensor(-1.0, xxx)` gives the error "TypeError: 'float' object cannot be interpreted as an integer"
    because the dtype here is `int64`. For `dtype=int64`, this needs to simply be `-1` (a minimal sketch follows this entry).
    This impacts the long-t5-tglobal-x models. It does not impact the long-t5-local-x variants, which do not appear to call this line.
    bjascob authored Jun 20, 2022
    Commit: da27c4b
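A minimal sketch of the dtype-consistent fix described above; the tensor below is hypothetical and only illustrates using an integer fill value when the surrounding tensors are `int64`:

```python
import torch

relative_position = torch.arange(6)  # int64 by default
# an integer literal matches the int64 dtype; a float literal such as -1.0 can be
# rejected by integer-only code paths downstream
masked = torch.where(relative_position > 2, relative_position, torch.full_like(relative_position, -1))
print(masked.dtype)  # torch.int64
```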

Commits on Jun 21, 2022

  1. Add UL2 (just docs) (huggingface#17740)

    * Add UL2
    Co-authored-by: Daniel Hesslow <[email protected]>
    
    * Correct naming
    
    * sort better
    
    * up
    
    * apply sylvains suggestion
    patrickvonplaten authored Jun 21, 2022
    Commit: 8fcbe27
  2. add onnx support for deberta and debertav2 (huggingface#17617)

    * add onnx support for debertav2
    
    * debertav2 -> deberta-v2 in onnx features file
    
    * remove causal lm
    
    * add deberta-v2-xlarge to onnx tests
    
    * use self.type().dtype() in xsoftmax
    
    Co-authored-by: Jingya HUANG <[email protected]>
    
    * remove hack for deberta
    
    * remove unused imports
    
    * Update src/transformers/models/deberta_v2/configuration_deberta_v2.py
    
    Co-authored-by: Jingya HUANG <[email protected]>
    
    * use generate dummy inputs
    
    * linter
    
    * add imports
    
    * add support for deberta v1 as well
    
    * deberta does not support multiple choice
    
    * Update src/transformers/models/deberta/configuration_deberta.py
    
    Co-authored-by: Jingya HUANG <[email protected]>
    
    * Update src/transformers/models/deberta_v2/configuration_deberta_v2.py
    
    Co-authored-by: Jingya HUANG <[email protected]>
    
    * one line ordered dict
    
    * fire build
    
    Co-authored-by: Jingya HUANG <[email protected]>
    sam-h-bean and JingyaHuang authored Jun 21, 2022
    Commit: eb16be4
  3. [CodeParrot] Near-deduplication with jaccard similarity (huggingface#17054)
    
    * deduplication draft
    
    * update style
    
    * update style test
    
    * dummy test main
    
    * rename modules
    
    * rename functions
    
    * return extremes in deduplicate_clusters
    
    * update style
    
    * cast str for gzip
    
    * update doc string
    
    * time processing
    
    * use dataset map to compute minhash
    
    * fill value for short token
    
    * remove da map method
    
    * update style
    
    * use share object to multiprocess
    
    * update style
    
    * use f-string and minor fix
    
    Co-authored-by: Leandro von Werra <[email protected]>
    Co-authored-by: Loubna Ben Allal <[email protected]>
    
    * update style
    
    * use module parameters
    
    * change ds_dedup to ds_filter
    
    * save ds_dedup
    
    * mv test to script tests
    
    * make jaccard threshold a parameter of deduplicate_dataset
    
    * update style
    
    * add doc strings
    
    * update style
    
    * add doc string for DuplicationIndex
    
    * save files into data dir
    
    * update readme
    
    * Update examples/research_projects/codeparrot/README.md
    
    Co-authored-by: Loubna Ben Allal <[email protected]>
    
    * make near deduplication optional
    
    * move near deduplication in README
    
    * Update examples/research_projects/codeparrot/README.md
    
    Co-authored-by: Leandro von Werra <[email protected]>
    
    * use f string
    
    Co-authored-by: Leandro von Werra <[email protected]>
    Co-authored-by: Loubna Ben Allal <[email protected]>
    3 people authored Jun 21, 2022
    Commit: da2bd2a
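An illustrative sketch of the MinHash/Jaccard near-deduplication idea behind this commit, assuming the datasketch library used by the CodeParrot scripts; the 0.85 threshold is only an example value:

```python
from datasketch import MinHash

def minhash(tokens, num_perm=256):
    m = MinHash(num_perm=num_perm)
    for tok in set(tokens):
        m.update(tok.encode("utf-8"))
    return m

a = minhash("def add ( a , b ) : return a + b".split())
b = minhash("def add ( x , y ) : return x + y".split())
# pairs whose estimated Jaccard similarity exceeds the threshold are clustered,
# and only one representative per cluster is kept
print(a.jaccard(b) > 0.85)
```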
  4. Add link to notebook (huggingface#17791)

    Co-authored-by: Niels Rogge <[email protected]>
    NielsRogge and Niels Rogge authored Jun 21, 2022
    Commit: 3fab17f
  5. [ViTMAE] Fix docstrings and variable names (huggingface#17710)

    * Fix docstrings and variable names
    
    * Rename x to something better
    
    * Improve messages
    
    * Fix docstrings and add test for greyscale images
    
    Co-authored-by: Niels Rogge <[email protected]>
    NielsRogge and Niels Rogge authored Jun 21, 2022
    Commit: b681e12
  6. Fix Automatic Download of Pretrained Weights in DETR (huggingface#17712)

    * added use_backbone_pretrained
    
    * style fixes
    
    * update
    
    * Update detr.mdx
    
    * Update detr.mdx
    
    * Update detr.mdx
    
    * update using doc py
    
    * Update detr.mdx
    
    * Update src/transformers/models/detr/configuration_detr.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    AnugunjNaman and sgugger authored Jun 21, 2022
    Commit: 27e9073
  7. Commit: 7bc88c0
  8. Prepare transformers for v0.8.0 huggingface-hub release (huggingface#17716)
    
    * Prepare CI for v0.8.0
    
    * pin hfh (revert before merge)
    
    * Revert "pin hfh (revert before merge)"
    
    This reverts commit a010314.
    
    * Test rc3
    
    * Test latest rc
    
    * Unpin to the RC
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    LysandreJik and sgugger authored Jun 21, 2022
    Commit: 6a5272b
  9. Use 5e-5 For BigBird PT/Flax equivalence tests (huggingface#17780)

    * rename to check_pt_flax_outputs
    
    * update check_pt_flax_outputs
    
    * use 5e-5 for BigBird PT/Flax test
    
    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 21, 2022
    Commit: f47afef
  10. TF Sharded (huggingface#17713)

    * initial commit
    
    * update modeling tf utils
    
    * quality
    
    * clean and update args
    
    * update
    
    * remove potential bug
    
    * code quality
    
    * update
    
    * update max shard
    
    * update tests for sharding from pretrained
    
    * fix remaining test
    
    * make style
    
    * h5py if tf available
    
    * update and fix test
    
    * fix test
    
    * style
    
    * modified push to hub to support shard for TF
    
    * quick fix
    
    * update code
    
    * merge branch main and style
    
    * Apply suggestions from code review
    
    Co-authored-by: Joao Gante <[email protected]>
    Co-authored-by: Patrick von Platen <[email protected]>
    
    * update based on reviews
    
    * update doc
    
    * update and style
    
    * Apply suggestions from code review
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Update based on reviews
    
    * fix typo
    
    * style
    
    Co-authored-by: Joao Gante <[email protected]>
    Co-authored-by: Patrick von Platen <[email protected]>
    Co-authored-by: Sylvain Gugger <[email protected]>
    4 people authored Jun 21, 2022
    Commit: 7cced02
  11. Commit: ef23fae
  12. Commit: 52404cb
  13. Add final_layer_norm to OPT model (huggingface#17785)

    * Add final_layer_norm to OPT model
    
    * Add JAX and TF version
    
    * Fix Keras name
    
    * Woops
    
    * Allow for non breaking change
    
    * Apply suggestions from code review
    
    * add tests
    
    Co-authored-by: Patrick von Platen <[email protected]>
    thomasw21 and patrickvonplaten authored Jun 21, 2022
    Commit: abc400b
  14. Improve error message Union not allowed (huggingface#17769)

    * Improve error message Union not allowed
    
    * make style
    
    * Update src/transformers/hf_argparser.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    Bram Vanroy and sgugger authored Jun 21, 2022
    Commit: 26a6a42
  15. Commit: 3ccff0d
  16. Fix top_k_top_p_filtering having unexpected behavior (huggingface#17744)
    
    - Fix `top_k_top_p_filtering` not passing `filter_value` to
       `TopPLogitsWarper` causing any top-p filtered logits to be -inf
       instead of specified value
    
     - Add corresponding test
    unifyh authored Jun 21, 2022
    Commit: 3b00b62
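A short sketch of the corrected behavior, assuming the `top_k_top_p_filtering` helper exported by transformers at the time: after the fix, positions removed by the top-p pass also receive the user-supplied `filter_value` instead of always -inf:

```python
import torch
from transformers import top_k_top_p_filtering

logits = torch.randn(1, 100)
filtered = top_k_top_p_filtering(logits.clone(), top_k=10, top_p=0.9, filter_value=-1e4)
print((filtered == -1e4).sum())  # filtered-out positions now carry the custom value
```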

Commits on Jun 22, 2022

  1. Commit: 16c6eb7
  2. Add logits_processor parameter, used by generate, to `Seq2SeqTrainer` methods `evaluate` and `predict` (huggingface#17805)
    
    * Add logits_processor parameter, used by `generate`, to `Seq2SeqTrainer` methods `evaluate` and `predict`
    
    * Add all generate parameters to `Seq2SeqTrainer`, and also to `QuestionAnsweringSeq2SeqTrainer` which overrides it
    
    * Remove `self._num_beams` from trainer classes
    
    * - Run fixup
    - Fix "Constraint" not exposed
    - Fix synced_gpus to actually read from param
    
    * Use kwargs
    
    * Copy kwargs before making changes to it
    
    * Fix style issues unused imports
    eranhirs authored Jun 22, 2022
    Commit: 1357038
  3. Commit: 56b83cf
  4. Bump numpy in /examples/research_projects/visual_bert (huggingface#17816)
    
    Bumps [numpy](https://github.com/numpy/numpy) from 1.21.0 to 1.22.0.
    - [Release notes](https://github.com/numpy/numpy/releases)
    - [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst)
    - [Commits](numpy/numpy@v1.21.0...v1.22.0)
    
    ---
    updated-dependencies:
    - dependency-name: numpy
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Jun 22, 2022
    Commit: af0d21e
  5. Bump numpy from 1.21.0 to 1.22.0 in /examples/research_projects/lxmert (huggingface#17817)
    
    Bumps [numpy](https://github.com/numpy/numpy) from 1.21.0 to 1.22.0.
    - [Release notes](https://github.com/numpy/numpy/releases)
    - [Changelog](https://github.com/numpy/numpy/blob/main/doc/HOWTO_RELEASE.rst)
    - [Commits](numpy/numpy@v1.21.0...v1.22.0)
    
    ---
    updated-dependencies:
    - dependency-name: numpy
      dependency-type: direct:production
    ...
    
    Signed-off-by: dependabot[bot] <[email protected]>
    
    Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
    dependabot[bot] authored Jun 22, 2022
    Commit: c366ce1
  6. CLI: use hub's create_commit (huggingface#17755)

    * use create_commit
    
    * better commit message and description
    
    * touch setup.py to trigger cache update
    
    * add hub version gating
    gante authored Jun 22, 2022
    Commit: 0d0c392
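For context, a hedged sketch of the huggingface_hub `create_commit` API the CLI now builds on (the repo id, file names, and messages are placeholders):

```python
from huggingface_hub import CommitOperationAdd, HfApi

api = HfApi()
api.create_commit(
    repo_id="user/my-model",  # placeholder
    operations=[CommitOperationAdd(path_in_repo="tf_model.h5", path_or_fileobj="./tf_model.h5")],
    commit_message="Add TF weights",
    commit_description="Converted with the transformers CLI",
)
```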
  7. Offload fixes (huggingface#17810)

    * Offload fixes
    
    * Add a test
    sgugger authored Jun 22, 2022
    Commit: df8e680

Commits on Jun 23, 2022

  1. Fix push CI artifact path (huggingface#17788)

    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 23, 2022
    Commit: 8d634b7
  2. add doctests for DETR (huggingface#17786)

    * add: check labels for detr object detection doctests
    
    * add: check shapes
    
    * add: add detr to documentation_tests.py
    
    * fix: make fixup output
    
    * fix: add a comment
    qherreros authored Jun 23, 2022
    Commit: ab223fc
  3. Commit: 5cce307
  4. Update type hints modeling_yoso.py (huggingface#17827)

    * Update modeling_yoso.py
    
    * make fixup
    
    * Update modeling_yoso.py
    
    That should be it copied from previous PR
    F02934 authored Jun 23, 2022
    Commit: 4297f44
  5. Add missing type hints for QDQBertModel (huggingface#17783)

    * Feat: add missing type hints for QDQBertModel
    
    * fix: ran black and isort
    
    * feat: Add missing output type for QDQBertModel
    
    * feat: Add type hints for QDQBertLMHeadModel and models starting with QDQBertFor
    
    * fix: add missing return type for QDQBertModel
    
    * fix: remove wrong return type for QDQBertEmbeddings
    
    * fix: readded config argument to load_tf_weights_in_qdqbert
    
    * fix: add BertConfig type to BertEmbeddings config due to check error in CI
    
    * fix: removed config type hints to avoid copy checks
    willtai authored Jun 23, 2022
    Commit: d37a68e
  6. Commit: b2fdbac
  7. Commit: 3eed553
  8. Fix an error message in BigBird (huggingface#17840)

    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 23, 2022
    Commit: 5bc779a
  9. Improve performance docs (huggingface#17750)

    * add skeleton files
    
    * fix cpu inference link
    
    * add hint to make clear that single gpu section contains general info
    
    * add new files to ToC
    
    * update toctree to have subsection for performance
    
    * add "coming soon" to the still empty sections
    
    * fix missing title
    
    * fix typo
    
    * add reference to empty documents
    
    * Apply suggestions from code review
    
    Co-authored-by: Stas Bekman <[email protected]>
    
    * Apply suggestions from code review
    
    Co-authored-by: Stas Bekman <[email protected]>
    
    Co-authored-by: Stas Bekman <[email protected]>
    lvwerra and stas00 authored Jun 23, 2022
    Commit: 6f29029
  10. BLOOM minor changes on tokenizer (huggingface#17823)

    * few fixes:
    
    - hardcode tokenizer padding side
    - remove unused args
    
    * few fixes:
    
    - added new attribute on TokenizerTesterMixin
    - added new slow test
    - remove unused arg on tokenizer class
    
    * make style
    
    * Update src/transformers/models/bloom/tokenization_bloom_fast.py
    
    Co-authored-by: SaulLu <[email protected]>
    
    * make quality
    
    * apply changes
    
    - remove new attribute
    - redefine test on the class
    
    * add comments
    
    Co-authored-by: SaulLu <[email protected]>
    younesbelkada and SaulLu authored Jun 23, 2022
    Commit: 18c263c
  11. Fix broken test for models with batchnorm (huggingface#17841)

    * Fix tests that broke when models used batchnorm
    
    * Initializing the model twice does not actually...
    ...give you the same weights each time.
    I am good at machine learning.
    
    * Fix speed regression
    Rocketknight1 authored Jun 23, 2022
    Commit: 1a7ef33
  12. Update modeling_cvt.py (huggingface#17846)

    As shown in the Colab notebook, I added the missing type hints for CvtForImageClassification and CvtModel.
    F02934 authored Jun 23, 2022
    Commit: e70abda
  13. Change no trainer image_classification test (huggingface#17635)

    * Adjust test arguments and use a new example test
    muellerzr authored Jun 23, 2022
    Commit: acb709d
  14. Nezha Pytorch implementation (huggingface#17776)

    * wip
    
    * rebase
    
    * all tests pass
    
    * rebase
    
    * ready for PR
    
    * address comments
    
    * fix styles
    
    * add require_torch to pipeline test
    
    * remove remote image to improve CI consistency
    
    * address comments; fix tf/flax tests
    
    * address comments; fix tf/flax tests
    
    * fix tests; add alias
    
    * repo consistency tests
    
    * Update src/transformers/pipelines/visual_question_answering.py
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * address comments
    
    * Update src/transformers/pipelines/visual_question_answering.py
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * merge
    
    * wip
    
    * wip
    
    * wip
    
    * most basic tests passes
    
    * all tests pass now
    
    * relative embedding
    
    * wip
    
    * running make fixup
    
    * remove bert changes
    
    * fix doc
    
    * fix doc
    
    * fix issues
    
    * fix doc
    
    * address comments
    
    * fix CI
    
    * remove redundant copied from
    
    * address comments
    
    * fix broken test
    
    Co-authored-by: Sijun He <[email protected]>
    Co-authored-by: NielsRogge <[email protected]>
    3 people authored Jun 23, 2022
    Commit: 7cf52a4
  15. Commit: 7c1b912
  16. Commit: 75259b4
  17. Auto-build Docker images before on-merge if setup.py was changed (huggingface#17573)
    
    * Auto-build on setup modification
    
    * Modify push-caller
    
    * Make adjustments based on code review
    muellerzr authored Jun 23, 2022
    Commit: 893ab12

Commits on Jun 24, 2022

  1. Improve vision models (huggingface#17731)

    * Improve vision models
    
    * Add a lot of improvements
    
    * Remove to_2tuple from swin tests
    
    * Fix TF Swin
    
    * Fix more tests
    
    * Fix copies
    
    * Improve more models
    
    * Fix ViTMAE test
    
    * Add channel check for TF models
    
    * Add proper channel check for TF models
    
    * Apply suggestion from code review
    
    * Apply suggestions from code review
    
    * Add channel check for Flax models, apply suggestion
    
    * Fix bug
    
    * Add tests for greyscale images
    
    * Add test for interpolation of pos encodigns
    
    Co-authored-by: Niels Rogge <[email protected]>
    NielsRogge and Niels Rogge authored Jun 24, 2022
    Commit: 0917870
  2. Improve encoder decoder model docs (huggingface#17815)

    * Copied all the changes from the last PR
    
    * added in documentation_tests.txt
    
    * Update docs/source/en/model_doc/encoder-decoder.mdx
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * Update docs/source/en/model_doc/encoder-decoder.mdx
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * Update docs/source/en/model_doc/encoder-decoder.mdx
    
    Co-authored-by: Yih-Dar <[email protected]>
    
    * Update docs/source/en/model_doc/encoder-decoder.mdx
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * Update docs/source/en/model_doc/encoder-decoder.mdx
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * Update docs/source/en/model_doc/encoder-decoder.mdx
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * Update docs/source/en/model_doc/encoder-decoder.mdx
    
    Co-authored-by: NielsRogge <[email protected]>
    
    Co-authored-by: vishwaspai <[email protected]>
    Co-authored-by: NielsRogge <[email protected]>
    Co-authored-by: Yih-Dar <[email protected]>
    4 people authored Jun 24, 2022
    Commit: c2c0d9d
  3. Fix Constrained beam search duplication and weird output issue (huggingface#17814)
    
    * fix(ConstrainedBeamSearchScorer.step_sentence_constraint): avoid hypothesis duplication between topk and advance
    
    * fix(GenerationMixin.constrained_beam_search): appropriately assign beam scores instead of token scores
    boy2000-007man authored Jun 24, 2022
    Commit: bc7a6fd
  4. Commit: 73a0496
  5. Fix Splinter test (huggingface#17854)

    * fix
    
    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 24, 2022
    Commit: 4474900
  6. Add CodeGen model (huggingface#17443)

    * Add CodeGen model
    
    * Add missing key and switch order of super()
    
    * Fix torch.ones init with uint8 instead of bool
    
    * Address comments: copy statements and doc
    
    * update tests
    
    * remove old model parallel
    
    * fix batch gen tests
    
    * fix batch gen test
    
    * update test_gpt2_sample_max_time
    
    * fix codgen test and revert gpt2 test change
    
    * Fix incorrect tie_word_embedding value, typo, URL
    
    * Fix model order in README and styling
    
    * Reorder model list alphabetically
    
    * Set tie_word_embedding to False by default
    
    * Apply suggestions from code review
    
    * Better attn mask name & remove attn masked_bias
    
    * add tokenizer for codegen
    
    * quality
    
    * doc tokenizer
    
    * fix-copies
    
    * add CodeGenTokenizer in converter
    
    * make truncation optional
    
    * add test for truncation
    
    * add copyright
    
    * fix-copies
    
    * fix fast tokenizer decode
    
    * Update src/transformers/models/codegen/tokenization_codegen.py
    
    Co-authored-by: Patrick von Platen <[email protected]>
    
    * increase vocab_size in tests
    
    Co-authored-by: patil-suraj <[email protected]>
    Co-authored-by: Patrick von Platen <[email protected]>
    3 people authored Jun 24, 2022
    Commit: d6b6fb9
  7. Commit: 061a73d
  8. Add type hints for gptneox models (huggingface#17858)

    * feat: Add type hints for GPTNeoxForCausalLM and GPTNeoXModel
    
    * fix: removed imported Dict type
    
    * fix: Removed unused List import
    willtai authored Jun 24, 2022
    Commit: ef28a40
  9. Commit: 2ef94ee
  10. Use higher value for hidden_size in Flax BigBird test (huggingface#17822)
    
    * Use higher value for hidden_size in Flax BigBird test
    
    * remove 5e-5
    
    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 24, 2022
    Commit: 0e0f1f4
  11. Commit: 494aac6
  12. Commit: b03be78
  13. Properly get tests deps in test_fetcher (huggingface#17870)

    * Properly get tests deps in test_fetcher
    
    * Remove print
    sgugger authored Jun 24, 2022
    Commit: e8eb699

Commits on Jun 25, 2022

  1. Commit: cc5c061

Commits on Jun 27, 2022

  1. Commit: 401fcca
  2. Add a TF in-graph tokenizer for BERT (huggingface#17701)

    * Add a TF in-graph tokenizer for BERT
    
    * Add from_pretrained
    
    * Add proper truncation, option handling to match other tokenizers
    
    * Add proper imports and guards
    
    * Add test, fix all the bugs exposed by said test
    
    * Fix truncation of paired texts in graph mode, more test updates
    
    * Small fixes, add a (very careful) test for savedmodel
    
    * Add tensorflow-text dependency, make fixup
    
    * Update documentation
    
    * Update documentation
    
    * make fixup
    
    * Slight changes to tests
    
    * Add some docstring examples
    
    * Update tests
    
    * Update tests and add proper lowercasing/normalization
    
    * make fixup
    
    * Add docstring for padding!
    
    * Mark slow tests
    
    * make fixup
    
    * Fall back to BertTokenizerFast if BertTokenizer is unavailable
    
    * Fall back to BertTokenizerFast if BertTokenizer is unavailable
    
    * make fixup
    
    * Properly handle tensorflow-text dummies
    Rocketknight1 authored Jun 27, 2022
    Commit: ee0d001
  3. Commit: 3ec7d4c
  4. fix (huggingface#17890)

    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 27, 2022
    Commit: 9a34538
  5. Commit: afb71b6
  6. Fix add new model like frameworks (huggingface#17869)

    * Add new model like adds only the selected frameworks object in init
    
    * Small fix
    sgugger authored Jun 27, 2022
    Commit: 9874282
  7. bert: add conversion script for BERT Token Dropping TF2 checkpoints (huggingface#17142)
    
    * bert: add conversion script for BERT Token Dropping TF2 checkpoints
    
    * bert: rename conversion script for BERT Token Dropping checkpoints
    
    * bert: fix flake errors in BERT Token Dropping conversion script
    
    * bert: make doc-builder happy!!1!11
    
    * bert: fix pytorch_dump_path of BERT Token Dropping conversion script
    stefan-it authored Jun 27, 2022
    Commit: 71b2839
  8. Commit: 6dd00f6
  9. Fix bug in gpt2's (from-scratch) special scaled weight initialization (huggingface#17877)
    
    * only special scale init each gpt2 c_proj weight once, on exact match
    
    * fix double quotes
    
    Co-authored-by: leandro <[email protected]>
    karpathy and leandro authored Jun 27, 2022
    Commit: e02037b
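A rough sketch of the GPT-2 "special scaled init" the fix above touches: residual projection weights are re-initialized with std / sqrt(2 * n_layer), and matching the exact parameter name suffix keeps each weight from being scaled more than once (the helper below is hypothetical, not the actual `_init_weights` code):

```python
import math

import torch.nn as nn

n_layer, initializer_range = 12, 0.02

def scale_residual_projections(model: nn.Module) -> None:
    for name, param in model.named_parameters():
        # exact suffix match: only residual projections named "c_proj.weight"
        # get the scaled init, and each of them exactly once
        if name.endswith("c_proj.weight"):
            param.data.normal_(mean=0.0, std=initializer_range / math.sqrt(2 * n_layer))
```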

Commits on Jun 28, 2022

  1. Commit: 0b0dd97
  2. Commit: f717d47
  3. Fix PyTorch/TF Auto tests (huggingface#17895)

    * add loading_info
    
    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 28, 2022
    Commit: db2644b
  4. Commit: 9eec4e9
  5. Commit: 1dfa03f
  6. Commit: 0094565
  7. Move logic into pixelshuffle layer (huggingface#17899)

    * Move all pixelshuffle logic into layer
    
    * Rename layer
    
    * Use correct input to function
    amyeroberts authored Jun 28, 2022
    Commit: f71895a
  8. Commit: bfcd574
  9. Commit: 76d13de
  10. Commit: b424f0b
  11. Adding GroupViT Models (huggingface#17313)

    * add group vit and fixed test (except slow)
    
    * passing slow test
    
    * addressed some comments
    
    * fixed test
    
    * fixed style
    
    * fixed copy
    
    * fixed segmentation output
    
    * fixed test
    
    * fixed relative path
    
    * fixed copy
    
    * add ignore non auto configured
    
    * fixed docstring, add doc
    
    * fixed copies
    
    * Apply suggestions from code review
    
    merge suggestions
    
    Co-authored-by: NielsRogge <[email protected]>
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * resolve comment, renaming model
    
    * delete unused attr
    
    * use fix copies
    
    * resolve comments
    
    * fixed attn
    
    * remove unused vars
    
    * refactor tests
    
    * resolve final comments
    
    * add demo notebook
    
    * fixed inconsistent default
    
    * Apply suggestions from code review
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * Apply suggestions from code review
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * rename stage->stages
    
    * Create single GroupViTEncoderLayer class
    
    * Update conversion script
    
    * Simplify conversion script
    
    * Remove cross-attention class in favor of GroupViTAttention
    
    * Convert other model as well, add processor to conversion script
    
    * addressing final comment
    
    * fixed args
    
    * Update src/transformers/models/groupvit/modeling_groupvit.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    Co-authored-by: NielsRogge <[email protected]>
    Co-authored-by: Sylvain Gugger <[email protected]>
    Co-authored-by: Niels Rogge <[email protected]>
    4 people authored Jun 28, 2022
    Commit: 6c8f4c9
  12. Commit: 5a3d0cb
  13. Commit: 5f1e67a
  14. Fixing a regression with return_all_scores introduced in huggingface#17606 (huggingface#17906)
    
    Fixing a regression with `return_all_scores` introduced in huggingface#17606
    
    - The legacy test actually tested `return_all_scores=False` (the actual
      default) instead of `return_all_scores=True` (the actual weird case).
    
    This commit adds the correct legacy test and fixes it.
    
    Tmp legacy tests.
    
    Actually fix the regression (also contains lists)
    
    Less diffed code.
    Narsil authored Jun 28, 2022
    Commit: 776855c

Commits on Jun 29, 2022

  1. Commit: 6aae59d
  2. Commit: babd7b1
  3. Fix the Conda package build (huggingface#16737)

    * Fix the Conda package build
    
    * Update build.sh
    
    * Update release-conda.yml
    bryant1410 authored Jun 29, 2022
    Commit: 9041547
  4. Remove render tags (huggingface#17897)

    Co-authored-by: Niels Rogge <[email protected]>
    NielsRogge and Niels Rogge authored Jun 29, 2022
    Commit: e113c5c
  5. Commit: b814275
  6. TF: XLA beam search + most generation-compatible models are now also XLA-generate-compatible (huggingface#17857)
    
    * working beam search 🎉
    
    * XLA generation compatible with ALL classes
    
    * add xla generation slow test
    gante authored Jun 29, 2022
    Commit: e6d27ca
  7. TF implementation of RegNets (huggingface#17554)

    * chore: initial commit
    
    Copied the torch implementation of regnets and porting the code to tf step by step. Also introduced an output layer which was needed for regnets.
    
    * chore: porting the rest of the modules to tensorflow
    
    did not change the documentation yet, yet to try the playground on the model
    
    * Fix initilizations (#1)
    
    * fix: code structure in few cases.
    
    * fix: code structure to align tf models.
    
    * fix: layer naming, bn layer still remains.
    
    * chore: change default epsilon and momentum in bn.
    
    * chore: styling nits.
    
    * fix: cross-loading bn params.
    
    * fix: regnet tf model, integration passing.
    
    * add: tests for TF regnet.
    
    * fix: code quality related issues.
    
    * chore: added rest of the files.
    
    * minor additions..
    
    * fix: repo consistency.
    
    * fix: regnet tf tests.
    
    * chore: reorganize dummy_tf_objects for regnet.
    
    * chore: remove checkpoint var.
    
    * chore: remov unnecessary files.
    
    * chore: run make style.
    
    * Update docs/source/en/model_doc/regnet.mdx
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * chore: PR feedback I.
    
    * fix: pt test. thanks to @ydshieh.
    
    * New adaptive pooler (huggingface#3)
    
    * feat: new adaptive pooler
    
    Co-authored-by: @Rocketknight1
    
    * chore: remove image_size argument.
    
    Co-authored-by: matt <[email protected]>
    
    Co-authored-by: matt <[email protected]>
    
    * Empty-Commit
    
    * chore: remove image_size comment.
    
    * chore: remove playground_tf.py
    
    * chore: minor changes related to spacing.
    
    * chore: make style.
    
    * Update src/transformers/models/regnet/modeling_tf_regnet.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * Update src/transformers/models/regnet/modeling_tf_regnet.py
    
    Co-authored-by: amyeroberts <[email protected]>
    
    * chore: refactored __init__.
    
    * chore: copied from -> taken from./g
    
    * adaptive pool -> global avg pool, channel check.
    
    * chore: move channel check to stem.
    
    * pr comments - minor refactor and add regnets to doc tests.
    
    * Update src/transformers/models/regnet/modeling_tf_regnet.py
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * minor fix in the xlayer.
    
    * Empty-Commit
    
    * chore: removed from_pt=True.
    
    Co-authored-by: Sayak Paul <[email protected]>
    Co-authored-by: Sylvain Gugger <[email protected]>
    Co-authored-by: matt <[email protected]>
    Co-authored-by: amyeroberts <[email protected]>
    Co-authored-by: NielsRogge <[email protected]>
    6 people authored Jun 29, 2022
    Commit: a7eba83
  8. Fix job links in Slack report (huggingface#17892)

    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 29, 2022
    Commit: 5cdfff5
  9. Commit: 47b9165
  10. Commit: 8f40077
  11. Add MVP model (huggingface#17787)

    * Add MVP model
    
    * Update README
    
    * Remove useless module
    
    * Update docs
    
    * Fix bugs in tokenizer
    
    * Remove useless test
    
    * Remove useless module
    
    * Update vocab
    
    * Remove specifying
    
    * Remove specifying
    
    * Add #Copied ... statement
    
    * Update paper link
    
    * Remove useless TFMvp
    
    * Add #Copied ... statement
    
    * Fix style in test mvp model
    
    * Fix some typos
    
    * Fix properties of unset special tokens in non verbose mode
    
    * Update paper link
    
    * Update MVP doc
    
    * Update MVP doc
    
    * Fix README
    
    * Fix typos in docs
    
    * Update docs
    StevenTang1998 authored Jun 29, 2022
    Commit: 3cff4cc
  12. Fix img seg tests (load checkpoints from hf-internal-testing) (huggingface#17939)
    
    * Revert "Skip failing test until they are fixed."
    
    This reverts commit 8f40077.
    
    * Use `tiny-detr` checkpts from `hf-internal-testing`
    mishig25 authored Jun 29, 2022
    Commit: 77b7667
  13. Fix all is_torch_tpu_available issues (huggingface#17936)

    * Fix all is_torch_tpu_available
    muellerzr authored Jun 29, 2022
    Commit: 7c4c6f6
  14. Commit: 4c722e9
  15. Use explicit torch version in deepspeed CI (huggingface#17942)

    * use explicit torch version
    
    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 29, 2022
    Commit: 9fe2403
  16. Commit: d444edb
  17. PyTorch 1.12.0 for scheduled CI (huggingface#17949)

    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jun 29, 2022
    Commit: b089cca
  18. ExplicitEnum subclass str (JSON dump compatible) (huggingface#17933)

    * ExplicitEnum subclass str (JSON dump compatible)
    
    * allow union if one of the types is str
    Bram Vanroy authored Jun 29, 2022
    Commit: bc019b0
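A small example of why subclassing str matters here: a plain Enum member is not JSON-serializable, while a str-subclassing member dumps as its value (the enum below is illustrative, not the actual transformers class):

```python
import json
from enum import Enum

class SaveStrategy(str, Enum):
    NO = "no"
    STEPS = "steps"
    EPOCH = "epoch"

# works because every member is also a str instance
print(json.dumps({"save_strategy": SaveStrategy.STEPS}))  # {"save_strategy": "steps"}
```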
  19. Commit: 5feac3d
  20. add MobileViT model (huggingface#17354)

    * add MobileViT
    
    * fixup
    
    * Update README.md
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * remove empty line
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * use clearer variable names
    
    * rename to MobileViTTransformerLayer
    
    * no longer inherit from nn.Sequential
    
    * fixup
    
    * fixup
    
    * not sure why this got added twice
    
    * rename organization for checkpoints
    
    * fix it up
    
    * Update src/transformers/models/mobilevit/__init__.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Update src/transformers/models/mobilevit/configuration_mobilevit.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Update src/transformers/models/mobilevit/configuration_mobilevit.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Update src/transformers/models/mobilevit/configuration_mobilevit.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Update tests/models/mobilevit/test_modeling_mobilevit.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Update src/transformers/models/mobilevit/modeling_mobilevit.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Update src/transformers/models/mobilevit/modeling_mobilevit.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Update src/transformers/models/mobilevit/modeling_mobilevit.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Update src/transformers/models/mobilevit/modeling_mobilevit.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * code style improvements
    
    * fixup
    
    * Update docs/source/en/model_doc/mobilevit.mdx
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * Update docs/source/en/model_doc/mobilevit.mdx
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * Update src/transformers/models/mobilevit/configuration_mobilevit.py
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * Update src/transformers/models/mobilevit/configuration_mobilevit.py
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * download labels from hub
    
    * rename layers
    
    * rename more layers
    
    * don't compute loss in separate function
    
    * remove some nn.Sequential
    
    * replace nn.Sequential with new MobileViTTransformer class
    
    * replace nn.Sequential with MobileViTMobileNetLayer
    
    * fix pruning since model structure changed
    
    * fixup
    
    * fix doc comment
    
    * remove custom resize from feature extractor
    
    * fix ONNX import
    
    * add to doc tests
    
    * use center_crop from image_utils
    
    * move RGB->BGR flipping into image_utils
    
    * fix broken tests
    
    * wrong type hint
    
    * small tweaks
    
    Co-authored-by: NielsRogge <[email protected]>
    Co-authored-by: Sylvain Gugger <[email protected]>
    3 people authored Jun 29, 2022
    Commit: fbc7598
  21. Fix huggingface#17893, removed dead code (huggingface#17917)

    * Removed dead position_id code, fix huggingface#17893
    
    * Removed unused var
    
    * Now ignores removed (dead) dict key for backward comp
    clefourrier authored Jun 29, 2022
    Commit: eb1493b
  22. Flax t5 Encoder (huggingface#17784)

    * first draft adding Flax-t5-encoder and Flax-mt5-encoder
    
    * imports
    
    * after make fixup
    
    * flax t5 encoder test
    
    * black on test
    
    * make fix-copies
    
    * clean
    
    * all_model_classes -> tuple
    
    * clean test
    
    * is_encoder_decoder=False in t5-enc tester
    
    * remove file docstring before FlaxT5Encoder
    
    * black
    
    * isort
    
    * commit suggestions on src/transformers/models/t5/modeling_flax_t5.py
    
    Co-authored-by: Suraj Patil <[email protected]>
    
    * commit suggestions on src/transformers/models/t5/modeling_flax_t5.py
    
    Co-authored-by: Suraj Patil <[email protected]>
    
    * Apply suggestions from code review
    
    Co-authored-by: Suraj Patil <[email protected]>
    
    * remove _get_encoder_module
    
    * self.decoder_seq_length -> self.encoder_seq_length as t5-enc does not have decoder
    
    * bugfix - self.module_class is class itself, not instance;
    
    * docs for mt5 and t5
    
    * call -> __call__ in t5 doc
    
    * FlaxMT5EncoderModel to TYPE_HINT
    
    * run doc-builder to allow change the files
    
    Co-authored-by: Suraj Patil <[email protected]>
    crystina-z and patil-suraj authored Jun 29, 2022
    Commit: 692e61e

Commits on Jun 30, 2022

  1. Fix GPT-NeoX-20B past handling, attention computation (huggingface#17811)
    
    * Fix GPT-NeoX-20B past handling, swap attention computation to hopefully avoid NaN, update docs
    
    * 20B tests
    zphang authored Jun 30, 2022
    Commit: 205bc41
  2. Unifying training argument type annotations (huggingface#17934)

    * doc: Unify training arg type annotations
    
    * wip: extracting enum type from Union
    
    * blackening
    jannisborn authored Jun 30, 2022
    Commit: 4f8361a
  3. [Pipelines] Add revision tag to all default pipelines (huggingface#17667)
    
    * trigger test failure
    
    * upload revision poc
    
    * Update src/transformers/pipelines/base.py
    
    Co-authored-by: Julien Chaumond <[email protected]>
    
    * up
    
    * add test
    
    * correct some stuff
    
    * Update src/transformers/pipelines/__init__.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * correct require flag
    
    Co-authored-by: Julien Chaumond <[email protected]>
    Co-authored-by: Sylvain Gugger <[email protected]>
    3 people authored Jun 30, 2022
    Commit: e4d2588
  4. Commit: f25457b
  5. CLI: convert sharded PT models (huggingface#17959)

    * sharded conversion; add flag to control max hidden error
    
    * better hidden name matching
    
    * Add test: load TF from PT shards
    
    * fix test (PT data must be local)
    gante authored Jun 30, 2022
    Commit: 91e1f24
  6. Commit: fe14046
  7. Add ONNX support for LayoutLMv3 (huggingface#17953)

    * Add ONNX support for LayoutLMv3
    
    * Update docstrings
    
    * Update empty description in docstring
    
    * Fix imports and type hints
    regisss authored Jun 30, 2022
    Commit: 9cb7cef
  8. feat: add pipeline registry abstraction (huggingface#17905)

    * feat: add pipeline registry abstraction
    
    - added `PipelineRegistry` abstraction
    - updates `add_new_pipeline.mdx` (english docs) to reflect the api addition
    - migrate `check_task` and `get_supported_tasks` from
      transformers/pipelines/__init__.py to
      transformers/pipelines/base.py#PipelineRegistry.{check_task,get_supported_tasks}
    
    Signed-off-by: Aaron Pham <[email protected]>
    
    * fix: update with upstream/main
    
    chore: Apply suggestions from sgugger's code review
    
    Signed-off-by: Aaron Pham <[email protected]>
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * chore: PR updates
    
    - revert src/transformers/dependency_versions_table.py from upstream/main
    - updates pipeline registry to use global variables
    
    Signed-off-by: Aaron Pham <[email protected]>
    
    * tests: add tests for pipeline registry
    
    Signed-off-by: Aaron Pham <[email protected]>
    
    * tests: add test for output warning.
    
    Signed-off-by: Aaron Pham <[email protected]>
    
    * chore: fmt and cleanup unused imports
    
    Signed-off-by: Aaron Pham <[email protected]>
    
    * fix: change imports to top of the file and address comments
    
    Signed-off-by: Aaron Pham <[email protected]>
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    aarnphm and sgugger authored Jun 30, 2022
    Commit: 49cd736
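A brief usage sketch, assuming the module-level `PIPELINE_REGISTRY` object this PR introduces in `transformers.pipelines`:

```python
from transformers.pipelines import PIPELINE_REGISTRY

# the registry now owns task discovery and validation
print(PIPELINE_REGISTRY.get_supported_tasks()[:5])
print(PIPELINE_REGISTRY.check_task("text-classification"))
```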

Commits on Jul 1, 2022

  1. skip some gpt_neox tests that require 80G RAM (huggingface#17923)

    * skip some gpt_neox tests that require 80G RAM
    
    * remove tests
    
    * fix quality
    
    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jul 1, 2022
    Commit: 14fb8a6
  2. Commit: cb42502
  3. Commit: 569b679
  4. Commit: 3a064bd
  5. fixing fsdp autowrap functionality (huggingface#17922)

    * fixing fsdp autowrap functionality
    
    * update version and quality
    
    * update torch version to latest stable version
    pacman100 authored Jul 1, 2022
    Commit: 462b7f3
  6. add ONNX support for BLOOM (huggingface#17961)

    * add onnx support for BLOOM
    
    * use TYPE_CHECKING for type annotations
    
    * fix past_shape for bloom (different from gpt2)
    
    * use logical_or instead of `+` for onnx support
    
    * bigger `atol_for_validation` for larger bloom models
    
    * copied -> taken because it's no longer an exact copy
    
    * remove "copied from" comment
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    NouamaneTazi and sgugger authored Jul 1, 2022
    Commit: b68d408
  7. Fix FlaxBigBirdEmbeddings (huggingface#17842)

    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jul 1, 2022
    Commit: 8bb2c38
  8. Commit: 664688b
  9. [Flax] Add remat (gradient checkpointing) (huggingface#17843)

    * [Flax] Add remat (gradient checkpointing)
    
    * fix variable naming in test
    
    * flip: checkpoint using a method
    
    * fix naming
    
    * fix class naming
    
    * apply PVP's suggestions from code review
    
    * make fix-copies
    
    * fix big-bird, electra, roberta
    
    * cookie-cutter
    
    * fix flax big-bird
    
    * move test to common
    sanchit-gandhi authored Jul 1, 2022
    Commit 485bbe7
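    A minimal sketch of gradient checkpointing via flax.linen.remat, the feature named in this commit; the Block and Encoder modules here are made up for illustration and are not the transformers implementation.

        import jax
        import jax.numpy as jnp
        import flax.linen as nn

        class Block(nn.Module):
            @nn.compact
            def __call__(self, x):
                return nn.relu(nn.Dense(256)(x))

        class Encoder(nn.Module):
            use_remat: bool = True

            @nn.compact
            def __call__(self, x):
                # nn.remat recomputes Block's activations in the backward pass
                # instead of storing them, trading compute for memory.
                block_cls = nn.remat(Block) if self.use_remat else Block
                for _ in range(4):
                    x = block_cls()(x)
                return x

        params = Encoder().init(jax.random.PRNGKey(0), jnp.ones((2, 256)))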
  10. XLA train step fixes (huggingface#17973)

    * Copy inputs to train and test step before modifying them, as this breaks things
    
    * Add XLA tests, fix our loss functions to be XLA-compatible
    
    * make fixup
    
    * Update loss computation test to expect vector of per-sample losses
    
    * Patch loss for TFLED
    
    * Patch loss for TFAlbert
    
    * Add a tf_legacy_loss config flag that enables old loss functions
    
    * Stop using config.get() because it's not a dict
    
    * Skip loss computation test for RAG because its loss is very strange and I'm afraid to rewrite it
    
    * make fixup
    
    * Add XLA-compatible RAG loss
    
    * Fix dtype of loss mask for TFAlbert
    
    * Fix test for XLNet too because it overrides the default one
    
    * make fixup
    
    * Fix config test
    
    * No more depending on GPU NaN behaviour
    
    * Add test, avoid potential zero division
    
    * Fix test item assignment
    
    * Fix loss computation masking test
    
    * make fixup
    
    * Fix dtype bugs
    Rocketknight1 authored Jul 1, 2022
    Commit d6cec45
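    The recurring theme in this commit - XLA-compatible losses, per-sample loss vectors, and avoiding NaN from zero division - usually comes down to keeping tensor shapes static and guarding the denominator. A hedged sketch of a masked loss written that way (not the library's hf_compute_loss):

        import tensorflow as tf

        def masked_sparse_ce(labels, logits, ignore_index=-100):
            # Keep shapes static for XLA: instead of tf.boolean_mask (dynamic shape),
            # zero out ignored positions and divide by the count of active labels.
            loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
                from_logits=True, reduction=tf.keras.losses.Reduction.NONE
            )
            active = tf.cast(labels != ignore_index, tf.float32)
            safe_labels = tf.where(labels == ignore_index, tf.zeros_like(labels), labels)
            per_token = loss_fn(safe_labels, logits) * active
            # Guard the denominator so an all-masked batch yields 0 instead of NaN.
            return tf.reduce_sum(per_token) / tf.maximum(tf.reduce_sum(active), 1.0)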
  11. Commit 009171d
  12. Commit 6f0723a
  13. Shifting labels for causal LM when using label smoother (huggingface#17987)
    
    * Shifting labels for causal LM when using label smoother
    
    When training a causal LM, the loss is computed inside the model's forward() function and
    the labels are shifted internally. However, if label smoothing is applied, the loss is
    computed in the trainer's compute_loss function and the labels are not shifted,
    which leaves the labels misaligned with their corresponding inputs. This commit
    resolves that misalignment.
    
    Resolves huggingface#17960
    
    On branch shift_labels_for_causalLM
    Changes to be committed:
    	modified:   src/transformers/trainer.py
    	modified:   src/transformers/trainer_pt_utils.py
    
    * Update trainer.py
    
    * Update src/transformers/trainer.py
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    seungeunrho and sgugger authored Jul 1, 2022
    Commit 6890d19
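    The entry above boils down to one alignment rule: for a causal LM, the logits at position t predict the token at t+1, so labels must be shifted before any externally computed loss, label smoothing included. A hedged sketch of that computation as a standalone function, not the Trainer's exact code:

        import torch
        import torch.nn.functional as F

        def label_smoothed_causal_lm_loss(logits, labels, epsilon=0.1, ignore_index=-100):
            # Align logits with the *next* token: drop the last logit and the first label.
            logits = logits[..., :-1, :].contiguous()
            labels = labels[..., 1:].contiguous()

            log_probs = F.log_softmax(logits, dim=-1)
            nll = -log_probs.gather(-1, labels.clamp_min(0).unsqueeze(-1)).squeeze(-1)
            smooth = -log_probs.mean(dim=-1)

            pad = labels.eq(ignore_index)
            nll = nll.masked_fill(pad, 0.0)
            smooth = smooth.masked_fill(pad, 0.0)
            n_active = (~pad).sum().clamp_min(1)
            return ((1.0 - epsilon) * nll.sum() + epsilon * smooth.sum()) / n_active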
  14. Exclude Databricks from notebook env only if the runtime is below 11.0 (huggingface#17988)
    
    * Exclude Databricks from notebook env only if the runtime is below 11.0
    
    * Dummy commit to trigger CI
    
    * Empty commit to trigger CI
    
    * Empty commit to trigger CI
    
    * Empty commit to trigger CI
    
    * Empty commit to trigger CI
    
    * Empty commit to trigger CI
    
    * Empty commit to trigger CI
    
    * Empty commit to trigger CI
    davidheryanto authored Jul 1, 2022
    Commit 49c8c67

Commits on Jul 4, 2022

  1. Commit a045cbd
  2. Commit 7b18702
  3. Add TF ResNet model (huggingface#17427)

    * Rough TF conversion outline
    
    * Tidy up
    
    * Fix padding differences between layers
    
    * Add back embedder - whoops
    
    * Match test file to main
    
    * Match upstream test file
    
    * Correctly pass and assign image_size parameter
    
    Co-authored-by: Sayak Paul <[email protected]>
    
    * Add in MainLayer
    
    * Correctly name layer
    
    * Tidy up AdaptivePooler
    
    * Small tidy-up
    
    More accurate type hints and remove whitespaces
    
    * Change AdaptiveAvgPool
    
    Use the AdaptiveAvgPool implementation by @Rocketknight1, which pools correctly when the input shape is not evenly divisible by the output shape, cf. https://github.com/huggingface/transformers/pull/17554/files/9e26607e22aa8d069c86b50196656012ff0ce62a#r900109509
    
    Co-authored-by: From: matt <[email protected]>
    Co-authored-by: Sayak Paul <[email protected]>
    
    * Use updated AdaptiveAvgPool
    
    Co-authored-by: matt <[email protected]>
    
    * Make AdaptiveAvgPool compatible with CPU
    
    * Remove image_size from configuration
    
    * Fixup
    
    * Tensorflow -> TensorFlow
    
    * Fix pt references in tests
    
    * Apply suggestions from code review - grammar and wording
    
    Co-authored-by: NielsRogge <[email protected]>
    
    Co-authored-by: NielsRogge <[email protected]>
    
    * Add TFResNet to doc tests
    
    * PR comments - GlobalAveragePooling and clearer comments
    
    * Remove unused import
    
    * Add in keepdims argument
    
    * Add num_channels check
    
    * grammar fix: by -> of
    
    Co-authored-by: matt <[email protected]>
    
    Co-authored-by: Matt <[email protected]>
    
    * Remove transposes - keep NHWC throughout forward pass
    
    * Fixup look sharp
    
    * Add missing layer names
    
    * Final tidy up - remove from_pt now that the weights are on the Hub
    
    Co-authored-by: Sayak Paul <[email protected]>
    Co-authored-by: matt <[email protected]>
    Co-authored-by: NielsRogge <[email protected]>
    Co-authored-by: Matt <[email protected]>
    5 people authored Jul 4, 2022
    Commit 77ea513
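    The pooling discussion in this entry resolves to a familiar Keras pattern: when the target output size is (1, 1), adaptive average pooling is just a global average pool, and keepdims preserves the NHWC layout downstream layers expect. A small sketch of that pattern (not the merged TFResNet code):

        import tensorflow as tf

        # Global average pool with keepdims=True keeps the (batch, 1, 1, channels)
        # NHWC shape instead of collapsing to (batch, channels).
        pooler = tf.keras.layers.GlobalAveragePooling2D(keepdims=True)

        features = tf.random.normal((2, 7, 7, 2048))  # NHWC feature map
        print(pooler(features).shape)                 # (2, 1, 1, 2048)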
  4. Refactor to inherit from nn.Module instead of nn.ModuleList (huggingface#17501)
    
    * Refactor to inherit from nn.Module instead of nn.ModuleList
    
    * Fix typo
    
    * Empty to trigger CI re-run
    
    Blender Bot tests are failing (this should be unrelated to this PR; they pass locally). I don't have sufficient permissions to re-run the CI workflow (in full or from failed jobs).
    amyeroberts authored Jul 4, 2022
    Commit cf2578a
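    The refactor named in the title follows a common PyTorch pattern: instead of subclassing nn.ModuleList, a plain nn.Module owns the list as an attribute. A generic sketch (the class names are placeholders, not the modules touched by this PR):

        import torch.nn as nn

        # Before: the container itself subclassed nn.ModuleList.
        class EncoderAsList(nn.ModuleList):
            def forward(self, x):
                for layer in self:
                    x = layer(x)
                return x

        # After: a plain nn.Module that owns an nn.ModuleList, keeping the public
        # interface explicit while preserving parameter registration.
        class Encoder(nn.Module):
            def __init__(self, num_layers=4, dim=128):
                super().__init__()
                self.layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_layers))

            def forward(self, x):
                for layer in self.layers:
                    x = layer(x)
                return x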
  5. Commit 3cfdefa
  6. Commit 7498db0
  7. Commit 6cb1954
  8. Return scalar losses instead of per-sample means (huggingface#18013)

    * Return scalar losses instead of per-sample means
    
    * Make loss shape (1,) instead of scalar
    
    * Allow scalar losses in test_loss_computation
    
    * Allow scalar losses in test_loss_computation
    
    * Allow scalar losses in test_loss_computation
    
    * Remove XLA loss function for RAG
    Rocketknight1 authored Jul 4, 2022
    Commit 96d833b
  9. Commit e3139ad
  10. TF: T5 can now handle a padded past (i.e. XLA generation) (huggingface#17969)
    
    * get the right slicing index for position_bias
    gante authored Jul 4, 2022
    Commit f098268

Commits on Jul 5, 2022

  1. Commit 97db5b4
  2. Commit ec07ecc
  3. Commit 5ae087c
  4. Enable Past CI (huggingface#17919)

    Co-authored-by: ydshieh <[email protected]>
    ydshieh and ydshieh authored Jul 5, 2022
    Commit f681437

Commits on Jul 6, 2022

  1. Squash commits (huggingface#17981)

    Co-authored-by: Niels Rogge <[email protected]>
    NielsRogge and Niels Rogge authored Jul 6, 2022
    Commit 22edb68
  2. Fix T5 incorrect weight decay in Trainer and official summarization example (huggingface#18002)
    
    * Add ALL_LAYERNORM_LAYERS for LayerNorm
    
    * fix bug of appending layer norm
    ADAning authored Jul 6, 2022
    Commit bf37e5c
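    The underlying issue is that T5 uses its own layer-norm class, which is missed when weight-decay exclusions are matched on nn.LayerNorm alone; that is what an ALL_LAYERNORM_LAYERS list addresses. A hedged sketch of the usual two-group optimizer setup; the helper below is illustrative, not the Trainer's internals, and in practice norm_types would include every layer-norm variant (e.g. T5LayerNorm).

        import torch
        import torch.nn as nn

        def build_optimizer(model, lr=1e-4, weight_decay=0.01, norm_types=(nn.LayerNorm,)):
            # Parameters inside normalization layers and all biases should not be decayed.
            no_decay = set()
            for mod_name, mod in model.named_modules():
                if isinstance(mod, norm_types):
                    no_decay.update(f"{mod_name}.{p}" for p, _ in mod.named_parameters())
            no_decay.update(n for n, _ in model.named_parameters() if n.endswith(".bias"))

            grouped = [
                {"params": [p for n, p in model.named_parameters() if n not in no_decay],
                 "weight_decay": weight_decay},
                {"params": [p for n, p in model.named_parameters() if n in no_decay],
                 "weight_decay": 0.0},
            ]
            return torch.optim.AdamW(grouped, lr=lr)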
  3. Commit 360719a
  4. Commit be79cd7
  5. Doc to dataset (huggingface#18037)

    * Link to the Datasets doc
    
    * Remove unwanted file
    sgugger authored Jul 6, 2022
    Commit 2e90c3d
  6. Commit 870ff9e

Commits on Jul 7, 2022

  1. Commit 1b5ea74
  2. Sort doc toc (huggingface#18034)

    * Add script to sort doc ToC
    
    * Style and fixes
    
    * Add check to quality job
    sgugger authored Jul 7, 2022
    Commit 1b749a7
  3. Added Command for windows VENV activation in installation docs (huggingface#18008)
    
    * Added command for Windows venv activation
    
    * Changed Linux and macOS specification
    darthvader2 authored Jul 7, 2022
    Commit 91c4a3a
  4. Commit 2544c14
  5. Drop columns after loading samples in prepare_tf_dataset (huggingface#17967)
    
    * Drop columns after loading samples, rather than before, to avoid breaking transforms
    
    * make fixup
    
    * Add workaround so this PR can work with current datasets version
    Rocketknight1 authored Jul 7, 2022
    Commit de46cde
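    The ordering matters because an on-the-fly transform may need columns that are not model inputs; if those columns are dropped before the samples are loaded, the transform breaks. A generic `datasets` illustration of the same idea (not prepare_tf_dataset's internals; the toy tokenizer is made up):

        from datasets import Dataset

        ds = Dataset.from_dict({"text": ["a b c", "d e"], "label": [0, 1]})

        def tokenize_on_the_fly(batch):
            # Needs the raw "text" column, so it must still exist when rows are fetched.
            batch["input_ids"] = [[len(tok) for tok in t.split()] for t in batch["text"]]
            return batch

        ds = ds.with_transform(tokenize_on_the_fly)

        sample = ds[0]                                                  # transform runs on the full row
        model_inputs = {k: sample[k] for k in ("input_ids", "label")}   # drop extra columns afterwards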

Commits on Jul 8, 2022

  1. Fix slow CI by pinning resampy (huggingface#18077)

    * Fix slow CI by pinning resampy
    
    * Actually put it in the speech dependencies
    sgugger authored Jul 8, 2022
    Commit 9bd3968
  2. Fix type issue in using bucketing with Trainer (huggingface#18051)

    * Fix type issue in using bucketing with Trainer
    
    - Fix type issues in LengthGroupedSampler,
      DistributedLengthGroupedSampler
    
    refs: huggingface#18003
    
    * Change logging type in LengthGroupedSampler
    
    - Change `logger.warning` to `logger.info`
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Change logging type in DistributedLengthGroupedSampler
    
    - Change `logger.warning` to `logger.info`
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Remove redundant clause in LengthGroupedSampler
    
    - Use `elif`
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Remove redundant clause in DistributedLengthGroupedSampler
    
    - Use `elif`
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    
    * Apply black, isort to modified codes in the script
    
    Co-authored-by: Sylvain Gugger <[email protected]>
    seopbo and sgugger authored Jul 8, 2022
    Commit 94ca7d2
  3. Commit 7c046c5
  4. Make predict() close progress bars after finishing (huggingface#17952) (huggingface#18078)
    
    * Make Trainer.predict call on_evaluate (huggingface#17952)
    
    * Add on_predict
    
    * Small fix
    
    * Small and different fix
    
    * Add tests
    neverix authored Jul 8, 2022
    Commit 8b332a6
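    With the on_predict event added above, a callback can react when Trainer.predict finishes. A hedged sketch, assuming the hook is dispatched like the existing on_evaluate (older releases will not have it):

        from transformers import TrainerCallback

        class ReportOnPredict(TrainerCallback):
            def on_predict(self, args, state, control, metrics=None, **kwargs):
                # Called once prediction is done; `metrics` holds the computed test metrics.
                print(f"predict() finished at step {state.global_step}: {metrics}")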

Commits on Jul 9, 2022

  1. Remove onnx conflicts.

    fadi212 committed Jul 9, 2022
    Commit b63183f
  2. fix: remove the bug.

    fadi212 committed Jul 9, 2022
    Commit 6bb0faa

Commits on Sep 9, 2022

  1. wip

    albertoandreottiATgmail committed Sep 9, 2022
    Commit cb2e4aa

Commits on Sep 10, 2022

  1. WIP

    fadi212 authored Sep 10, 2022
    Commit eff30e0
  2. Made export() compatible

    fadi212 authored Sep 10, 2022
    Commit ceeb731