
Loading GGUF support #2 (Draft)

LysandreJik wants to merge 253 commits into main
Conversation

LysandreJik (Owner):

WIP

@LysandreJik (Owner, Author) left a comment:

Ok, good first PR! Let's clean it up a bit. I want to take a look at the changes in the from_pretrained method to tidy things up, as it's currently making changes in several places.

Also, I wonder if we can't change the loading methods to return only the metadata, and not the tensors, in some situations. Since that read is done sequentially, and the config and tokenizer only need metadata, we could save a lot of time by not requiring the tensors to load.
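A minimal sketch of what that could look like, assuming the GGUF v2+ header layout (magic, version, tensor count, metadata key/value count) and reusing the _gguf_read_value helper quoted further down; the function name load_gguf_metadata is hypothetical, not the PR's actual API:

import struct

def load_gguf_metadata(path):
    # Read only the GGUF header and metadata key/values; stop before tensor data.
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError(f"{path} is not a valid GGUF file")
        version, n_tensors, n_kv = struct.unpack("<IQQ", f.read(4 + 8 + 8))
        metadata = {}
        for _ in range(n_kv):
            key = _gguf_read_value(f, DATA_TYPES["string"])
            value_type = struct.unpack("<I", f.read(4))[0]
            metadata[key] = _gguf_read_value(f, value_type)
        # Tensor infos and tensor data follow; a config/tokenizer-only code path
        # can return here without ever touching them.
        return metadata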

Comment on lines 1402 to 1405
class GGUFTokenizer:
    def __init__(self, dict_):
        # Expose every parsed GGUF tokenizer field as an attribute.
        for k, v in dict_.items():
            setattr(self, k, v)
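For context, this is just a thin namespace wrapper over the parsed tokenizer metadata: each dict key becomes an attribute. A minimal usage sketch (the field names here are illustrative, not necessarily the PR's):

tok = GGUFTokenizer({"tokens": ["<unk>", "<s>", "</s>"], "scores": [0.0, -1.0, -1.0]})
print(tok.tokens[1])   # "<s>"
print(tok.scores[2])   # -1.0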
LysandreJik (Owner, Author):

Need to rename that/make it clearer

LysandreJik (Owner, Author):

These modifications should live under integrations as well

Comment on lines 1410 to 1411
# requires_backends(self, "gguf")
# super().__init__()
LysandreJik (Owner, Author):

Need to make that better as well
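A hedged guess at what the cleaned-up version might look like, assuming the standard transformers requires_backends utility (which raises an informative ImportError when the backend is missing); the exact super().__init__ arguments depend on the final class hierarchy:

def __init__(self, *args, **kwargs):
    # Fail early with a clear error message if `gguf` is not installed.
    requires_backends(self, "gguf")
    super().__init__(*args, **kwargs)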

Comment on lines 176 to 201
def _gguf_read_value(f, data_type):
    if data_type == DATA_TYPES["string"]:
        length = struct.unpack("<Q", f.read(8))[0]
        return f.read(length).decode("utf-8")

    elif data_type == DATA_TYPES["uint32"]:
        return struct.unpack("<I", f.read(4))[0]

    elif data_type == DATA_TYPES["uint64"]:
        return struct.unpack("<Q", f.read(8))[0]

    elif data_type == DATA_TYPES["int32"]:
        return struct.unpack("<i", f.read(4))[0]

    elif data_type == DATA_TYPES["float32"]:
        return struct.unpack("<f", f.read(4))[0]

    elif data_type == DATA_TYPES["array"]:
        data_type, count = struct.unpack("<IQ", f.read(4 + 8))
        return [_gguf_read_value(f, data_type) for _ in range(count)]

    elif data_type == DATA_TYPES["bool"]:
        # This should correspond to `GGUF_METADATA_VALUE_TYPE_BOOL`:
        # 1-byte value where 0 is false and 1 is true.
        return struct.unpack("<b", f.read(1))[0]

    else:
        raise NotImplementedError(f"Data type {data_type} not implemented")
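For reference, the DATA_TYPES mapping this function indexes into is not shown in the diff; a reconstruction following the gguf_metadata_value_type enum from the GGUF specification would look like this (the int64/float64 entries are listed for completeness even though the reader above does not handle them yet):

DATA_TYPES = {
    "uint8": 0,
    "int8": 1,
    "uint16": 2,
    "int16": 3,
    "uint32": 4,
    "int32": 5,
    "float32": 6,
    "bool": 7,
    "string": 8,
    "array": 9,
    "uint64": 10,
    "int64": 11,
    "float64": 12,
}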
LysandreJik (Owner, Author):

(younes) IMO this and the method below are doing basically the same thing as _gguf_parse_value and load_gguf_checkpoint_in_pytorch_model.

We should clean that up

Comment on lines 3141 to 3172
from .modeling_gguf_pytorch_utils import load_and_convert_gguf_config

if not is_gguf_available():
    raise ValueError(
        "You need to have `gguf` installed in order to convert GGUF weights. `pip install gguf`"
    )

# Case 1: the GGUF file is present locally
if os.path.isfile(from_gguf):
    gguf_path = from_gguf
# Case 2: the GGUF path is a location on the Hub
# Load from URL or cache if already cached
else:
    cached_file_kwargs = {
        "cache_dir": cache_dir,
        "force_download": force_download,
        "proxies": proxies,
        "resume_download": resume_download,
        "local_files_only": local_files_only,
        "token": token,
        "user_agent": user_agent,
        "revision": revision,
        "subfolder": subfolder,
        "_raise_exceptions_for_gated_repo": False,
        "_raise_exceptions_for_missing_entries": False,
        "_commit_hash": commit_hash,
    }

    gguf_path = cached_file(pretrained_model_name_or_path, from_gguf, **cached_file_kwargs)

config = load_and_convert_gguf_config(gguf_path)
model_kwargs = kwargs
LysandreJik (Owner, Author):

I would likely move all of that into a separate method, so as not to add too much code to the already bloated from_pretrained.
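A minimal sketch of that extraction (the helper name _resolve_gguf_file is hypothetical, not the PR's final code):

def _resolve_gguf_file(pretrained_model_name_or_path, from_gguf, **cached_file_kwargs):
    # Return a local path to the GGUF file, downloading from the Hub if needed.
    if not is_gguf_available():
        raise ValueError(
            "You need to have `gguf` installed in order to convert GGUF weights. `pip install gguf`"
        )
    if os.path.isfile(from_gguf):
        return from_gguf
    return cached_file(pretrained_model_name_or_path, from_gguf, **cached_file_kwargs)

from_pretrained would then shrink to two lines: gguf_path = _resolve_gguf_file(...) followed by config = load_and_convert_gguf_config(gguf_path).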

Collaborator:

Makes sense! Done!

Comment on lines 3196 to 3224
if from_gguf is not None and hf_quantizer is not None:
    raise ValueError(
        "You cannot combine quantization and loading a model from a GGUF file. Make sure you did not pass a "
        "`quantization_config` and that you are not loading a quantized model from the Hub."
    )
LysandreJik (Owner, Author):

Once the file is loaded into a PyTorch state dict, quantization cannot be applied?

LysandreJik (Owner, Author):

(happy to not support that for now, indeed)

Collaborator:

Might be complicated, as we would only support the quantization schemes that do not require data calibration, and it would require many patches and if/else checks everywhere :/

LysandreJik (Owner, Author):

yuck

src/transformers/models/auto/auto_factory.py (outdated; resolved)
src/transformers/models/cohere/modeling_cohere.py (outdated; resolved)
@LysandreJik LysandreJik changed the base branch from main to rename_ex_file April 19, 2024 15:41
@LysandreJik LysandreJik changed the base branch from rename_ex_file to main April 19, 2024 15:41
hiyouga and others added 3 commits April 19, 2024 17:45

…ned (huggingface#30299): updates to modeling_utils.py and test_modeling_utils.py
TF port of SwiftFormer (squashed commit messages omitted: layer-by-layer conversion to TFSwiftFormer, tests, and doc updates)
Add resources (squashed commit messages omitted)
@99991 commented Apr 22, 2024:

I'm happy to see my code live in 🤗 transformers!

I added support for Q2_K, Q3_K and Q5_K yesterday. Feel free to copy as well.

99991/pygguf@a417edb
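For readers unfamiliar with the GGML quantization formats: the K-quants above are fairly involved, but the core idea is easiest to see on Q8_0, the simplest scheme, where each block of 32 weights stores one fp16 scale followed by 32 int8 values (34 bytes per block). A rough numpy sketch of the dequantization, not the pygguf code itself:

import numpy as np

def dequantize_q8_0(data: bytes, n_elements: int) -> np.ndarray:
    # GGML Q8_0 blocks: 2 bytes (fp16 scale) + 32 bytes (int8 quants) = 34 bytes.
    n_blocks = n_elements // 32
    raw = np.frombuffer(data, dtype=np.uint8).reshape(n_blocks, 34)
    scales = raw[:, :2].copy().view(np.float16).astype(np.float32)  # (n_blocks, 1)
    quants = raw[:, 2:].copy().view(np.int8).astype(np.float32)     # (n_blocks, 32)
    return (scales * quants).reshape(n_elements)                    # w = d * q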

younesbelkada and others added 22 commits April 22, 2024 11:32

Squashed commit messages omitted; among other things, the commits cover: doc updates (llava_next.md, seggpt.md), interpolate_pos_encoding support for vitmatte/vivit/beit/blip/data2vec, a warning for negative pad tokens, an FSDP config for CPU-RAM-efficient loading (huggingface#30002), cache-position fixes, text generation pipeline docstring updates, stop-string support in generate(), DETA tied-weight fixes, SDPA/FlashAttention-2 support for the wav2vec2 model family, and EETQ quantizer support.
younesbelkada and others added 30 commits May 13, 2024 12:08

Squashed commit messages omitted; among other things, the commits cover: interpolated position encodings for the BLIP family, Falcon 11B support, ms_deform_attn kernels for GroundingDino, custom 4D attention-mask fixes, multi-device generation fixes, pipeline device handling, a TF port of IDEFICS, greedy assistant decoding (huggingface#30778), CI notification-service updates (huggingface#30699), a model-deprecation utility, a ROCm 6.0.2 / MI300 CI update, cache standardization in idefics2, and a watermarking logits processor with detector.