# Allow saving, loading and pushing adapter compositions together (#771)
Closes #441; closes #747.

This PR introduces a set of new methods for saving, loading and pushing
entire adapter compositions with one command:
- `save_adapter_setup()`
- `load_adapter_setup()`
- `push_adapter_setup_to_hub()`

The save and push methods take two main parameters:
- `adapter_setup`: the adapter composition to be saved. Identical to
what can be specified for `active_adapters`
- `head_setup`: for models with heads, the head setup to save along with
the adapters. Identical to what can be specified for `active_head`

Docs: [Saving and loading adapter compositions](https://github.com/adapter-hub/adapters/blob/04e69957a2bfc8093e2593186f7ebb2e71f88ec9/docs/loading.md#saving-and-loading-adapter-compositions)

### Example

```python
from adapters import AutoAdapterModel, SeqBnConfig
from adapters.composition import BatchSplit, Fuse, Stack

model = AutoAdapterModel.from_pretrained("roberta-base")

# create a complex setup
model.add_adapter("a", config=SeqBnConfig())
model.add_adapter("b", config=SeqBnConfig())
model.add_adapter("c", config=SeqBnConfig())
model.add_adapter_fusion(["a", "b"])
model.add_classification_head("head_a")
model.add_classification_head("head_b")
adapter_setup = Stack(Fuse("a", "b"), "c")
head_setup = BatchSplit("head_a", "head_b", batch_sizes=[1, 1])
model.set_active_adapters(adapter_setup)
model.active_head = head_setup

# save
model.save_adapter_setup("checkpoint", adapter_setup, head_setup=head_setup)

# push
model.push_adapter_setup_to_hub("calpt/random_adapter_setup_test", adapter_setup, head_setup=head_setup)

# re-load
# model2 = AutoAdapterModel.from_pretrained("roberta-base")
# model2.load_adapter_setup("checkpoint", set_active=True)
```

---------

Co-authored-by: Timo Imhof <[email protected]>
calpt and TimoImhof authored Jan 8, 2025
1 parent 7c2357f commit 9edc20d
Showing 9 changed files with 497 additions and 8 deletions.
2 changes: 2 additions & 0 deletions docs/adapter_composition.md
@@ -125,6 +125,8 @@ model.active_adapters = ac.Fuse("d", "e", "f")

To learn how training an _AdapterFusion_ layer works, check out [this Colab notebook](https://colab.research.google.com/github/Adapter-Hub/adapters/blob/main/notebooks/03_Adapter_Fusion.ipynb) from the `adapters` repo.

To save and upload the full composition setup with adapters and fusion layer in one line of code, check out the docs on [saving and loading adapter compositions](loading.md#saving-and-loading-adapter-compositions).

### Retrieving AdapterFusion attentions

Finally, it is possible to retrieve the attention scores computed by each fusion layer in a forward pass of the model.
36 changes: 36 additions & 0 deletions docs/loading.md
@@ -94,3 +94,39 @@ We will go through the different arguments and their meaning one by one:
To load the adapter using a custom name, we can use the `load_as` parameter.

- Finally, `set_active` will directly activate the loaded adapter for usage in each model forward pass. Otherwise, you have to manually activate the adapter via `set_active_adapters()`.

## Saving and loading adapter compositions

In addition to saving and loading individual adapters, you can also save, load and share entire [compositions of adapters](adapter_composition.md) with a single line of code.
_Adapters_ provides three methods for this purpose that work very similarly to those for single adapters:

- [`save_adapter_setup()`](adapters.ModelWithHeadsAdaptersMixin.save_adapter_setup) to save an adapter composition along with prediction heads to the local file system.
- [`load_adapter_setup()`](adapters.ModelWithHeadsAdaptersMixin.load_adapter_setup) to load a saved adapter composition from the local file system or the Model Hub.
- [`push_adapter_setup_to_hub()`](adapters.hub_mixin.PushAdapterToHubMixin.push_adapter_setup_to_hub) to upload an adapter setup along with prediction heads to the Model Hub. See our [Hugging Face Model Hub guide](huggingface_hub.md) for more.

As an example, this is how you would save and load an AdapterFusion setup of three adapters with a prediction head:

```python
from adapters import AutoAdapterModel, SeqBnConfig
from adapters.composition import Fuse

# Create an AdapterFusion
model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.load_adapter("sentiment/sst-2@ukp", config=SeqBnConfig(), with_head=False)
model.load_adapter("nli/multinli@ukp", config=SeqBnConfig(), with_head=False)
model.load_adapter("sts/qqp@ukp", config=SeqBnConfig(), with_head=False)
model.add_adapter_fusion(["sst-2", "mnli", "qqp"])
model.add_classification_head("clf_head")
adapter_setup = Fuse("sst-2", "mnli", "qqp")
head_setup = "clf_head"
model.set_active_adapters(adapter_setup)
model.active_head = head_setup

# Train AdapterFusion ...

# Save
model.save_adapter_setup("checkpoint", adapter_setup, head_setup=head_setup)

# Push to Hub
model.push_adapter_setup_to_hub("<user>/fusion_setup", adapter_setup, head_setup=head_setup)

# Re-load
# model.load_adapter_setup("checkpoint", set_active=True)
```
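
Once pushed, the same setup can be restored on a fresh model directly from the Hub. A minimal sketch, assuming the placeholder repository id `<user>/fusion_setup` from the example above:

```python
from adapters import AutoAdapterModel

# Re-instantiate the base model and load the full composition (adapters, fusion layer and head) from the Hub
model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.load_adapter_setup("<user>/fusion_setup", set_active=True)
```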
2 changes: 1 addition & 1 deletion docs/quickstart.md
@@ -105,7 +105,7 @@ model = AutoAdapterModel.from_pretrained(example_path)
model.load_adapter(example_path)
```

Similar to how the weights of the full model are saved, the `save_adapter()` will create a file for saving the adapter weights and a file for saving the adapter configuration in the specified directory.
Similar to how the weights of the full model are saved, [`save_adapter()`](adapters.ModelWithHeadsAdaptersMixin.save_adapter) will create a file for saving the adapter weights and a file for saving the adapter configuration in the specified directory.

Finally, if we have finished working with adapters, we can restore the base Transformer to its original form by deactivating and deleting the adapter:

35 changes: 35 additions & 0 deletions src/adapters/composition.py
@@ -1,4 +1,5 @@
import itertools
import sys
import warnings
from collections.abc import Sequence
from typing import List, Optional, Set, Tuple, Union
@@ -45,6 +46,31 @@ def parallel_channels(self):
    def flatten(self) -> Set[str]:
        return set(itertools.chain(*[[b] if isinstance(b, str) else b.flatten() for b in self.children]))

    def _get_save_kwargs(self):
        return None

    def to_dict(self):
        save_dict = {
            "type": self.__class__.__name__,
            "children": [
                c.to_dict() if isinstance(c, AdapterCompositionBlock) else {"type": "single", "children": [c]}
                for c in self.children
            ],
        }
        if kwargs := self._get_save_kwargs():
            save_dict["kwargs"] = kwargs
        return save_dict

    @classmethod
    def from_dict(cls, data):
        children = []
        for child in data["children"]:
            if child["type"] == "single":
                children.append(child["children"][0])
            else:
                children.append(cls.from_dict(child))
        return getattr(sys.modules[__name__], data["type"])(*children, **data.get("kwargs", {}))


class Parallel(AdapterCompositionBlock):
    def __init__(self, *parallel_adapters: List[str]):
@@ -80,12 +106,18 @@ def __init__(self, *split_adapters: List[Union[AdapterCompositionBlock, str]], s
        super().__init__(*split_adapters)
        self.splits = splits if isinstance(splits, list) else [splits] * len(split_adapters)

    def _get_save_kwargs(self):
        return {"splits": self.splits}


class BatchSplit(AdapterCompositionBlock):
    def __init__(self, *split_adapters: List[Union[AdapterCompositionBlock, str]], batch_sizes: Union[List[int], int]):
        super().__init__(*split_adapters)
        self.batch_sizes = batch_sizes if isinstance(batch_sizes, list) else [batch_sizes] * len(split_adapters)

    def _get_save_kwargs(self):
        return {"batch_sizes": self.batch_sizes}


class Average(AdapterCompositionBlock):
    def __init__(
@@ -105,6 +137,9 @@ def __init__(
        else:
            self.weights = [1 / len(average_adapters)] * len(average_adapters)

    def _get_save_kwargs(self):
        return {"weights": self.weights}


# Mapping each composition block type to the allowed nested types
ALLOWED_NESTINGS = {
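
The new `to_dict()`/`from_dict()` helpers give every composition block a plain-dict representation, which is what makes whole setups serializable. A minimal round-trip sketch based on the code above; the adapter names are placeholders:

```python
from adapters.composition import AdapterCompositionBlock, Fuse, Stack

# Serialize a nested composition to a plain dict ...
setup = Stack(Fuse("a", "b"), "c")
data = setup.to_dict()
# data == {"type": "Stack", "children": [{"type": "Fuse", ...}, {"type": "single", "children": ["c"]}]}

# ... and rebuild an equivalent block from it
restored = AdapterCompositionBlock.from_dict(data)
assert isinstance(restored, Stack) and isinstance(restored.children[0], Fuse)
```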
95 changes: 92 additions & 3 deletions src/adapters/hub_mixin.py
@@ -4,6 +4,8 @@

from transformers.utils.generic import working_or_temp_dir

from .composition import AdapterCompositionBlock


logger = logging.getLogger(__name__)

@@ -35,7 +37,7 @@
from adapters import AutoAdapterModel
model = AutoAdapterModel.from_pretrained("{model_name}")
adapter_name = model.load_adapter("{adapter_repo_name}", set_active=True)
adapter_name = model.{load_fn}("{adapter_repo_name}", set_active=True)
```
## Architecture & Training
@@ -66,6 +68,7 @@ def _save_adapter_card(
        language: Optional[str] = None,
        license: Optional[str] = None,
        metrics: Optional[List[str]] = None,
        load_fn: str = "load_adapter",
        **kwargs,
    ):
        # Key remains "adapter-transformers", see: https://github.com/huggingface/huggingface.js/pull/459
@@ -103,6 +106,7 @@ def _save_adapter_card(
            model_name=self.model_name,
            dataset_name=dataset_name,
            head_info=head_info,
            load_fn=load_fn,
            adapter_repo_name=adapter_repo_name,
            architecture_training=kwargs.pop("architecture_training", DEFAULT_TEXT),
            results=kwargs.pop("results", DEFAULT_TEXT),
@@ -133,8 +137,6 @@ def push_adapter_to_hub(
        Args:
            repo_id (str): The name of the repository on the model hub to upload to.
            adapter_name (str): The name of the adapter to be uploaded.
            organization (str, optional): Organization in which to push the adapter
                (you must be a member of this organization). Defaults to None.
            datasets_tag (str, optional): Dataset identifier from https://huggingface.co/datasets. Defaults to
                None.
            local_path (str, optional): Local path used as clone directory of the adapter repository.
@@ -156,6 +158,8 @@
                Branch to push the uploaded files to.
            commit_description (`str`, *optional*):
                The description of the commit that will be created
            adapter_card_kwargs (Optional[dict], optional): Additional arguments to pass to the adapter card text generation.
                Currently includes: tags, language, license, metrics, architecture_training, results, citation.
        Returns:
            str: The url of the adapter repository on the model hub.
@@ -190,3 +194,88 @@ def push_adapter_to_hub(
                revision=revision,
                commit_description=commit_description,
            )

    def push_adapter_setup_to_hub(
        self,
        repo_id: str,
        adapter_setup: Union[str, list, AdapterCompositionBlock],
        head_setup: Optional[Union[bool, str, list, AdapterCompositionBlock]] = None,
        datasets_tag: Optional[str] = None,
        local_path: Optional[str] = None,
        commit_message: Optional[str] = None,
        private: Optional[bool] = None,
        token: Optional[Union[bool, str]] = None,
        overwrite_adapter_card: bool = False,
        create_pr: bool = False,
        revision: str = None,
        commit_description: str = None,
        adapter_card_kwargs: Optional[dict] = None,
    ):
        """Upload an adapter setup to HuggingFace's Model Hub.

        Args:
            repo_id (str): The name of the repository on the model hub to upload to.
            adapter_setup (Union[str, list, AdapterCompositionBlock]): The adapter setup to be uploaded. Usually an adapter composition block.
            head_setup (Optional[Union[bool, str, list, AdapterCompositionBlock]], optional): The head setup to be uploaded.
            datasets_tag (str, optional): Dataset identifier from https://huggingface.co/datasets. Defaults to
                None.
            local_path (str, optional): Local path used as clone directory of the adapter repository.
                If not specified, will create a temporary directory. Defaults to None.
            commit_message (:obj:`str`, `optional`):
                Message to commit while pushing. Will default to :obj:`"add config"`, :obj:`"add tokenizer"` or
                :obj:`"add model"` depending on the type of the class.
            private (:obj:`bool`, `optional`):
                Whether or not the repository created should be private (requires a paying subscription).
            token (`bool` or `str`, *optional*):
                The token to use as HTTP bearer authorization for remote files. If `True`, will use the token generated
                when running `huggingface-cli login` (stored in `~/.huggingface`). Will default to `True` if `repo_url`
                is not specified.
            overwrite_adapter_card (bool, optional): Overwrite an existing adapter card with a newly generated one.
                If set to `False`, will only generate an adapter card, if none exists. Defaults to False.
            create_pr (bool, optional):
                Whether or not to create a PR with the uploaded files or directly commit.
            revision (`str`, *optional*):
                Branch to push the uploaded files to.
            commit_description (`str`, *optional*):
                The description of the commit that will be created.
            adapter_card_kwargs (Optional[dict], optional): Additional arguments to pass to the adapter card text generation.
                Currently includes: tags, language, license, metrics, architecture_training, results, citation.

        Returns:
            str: The url of the adapter repository on the model hub.
        """
        use_temp_dir = not os.path.isdir(local_path) if local_path else True

        # Create repo or retrieve an existing repo
        repo_id = self._create_repo(repo_id, private=private, token=token)

        # Commit and push
        logger.info('Pushing adapter setup "%s" to model hub at %s ...', adapter_setup, repo_id)
        with working_or_temp_dir(working_dir=local_path, use_temp_dir=use_temp_dir) as work_dir:
            files_timestamps = self._get_files_timestamps(work_dir)
            # Save adapter setup and optionally create model card
            if head_setup is not None:
                save_kwargs = {"head_setup": head_setup}
            else:
                save_kwargs = {}
            self.save_adapter_setup(work_dir, adapter_setup, **save_kwargs)
            if overwrite_adapter_card or not os.path.exists(os.path.join(work_dir, "README.md")):
                adapter_card_kwargs = adapter_card_kwargs or {}
                self._save_adapter_card(
                    work_dir,
                    str(adapter_setup),
                    repo_id,
                    datasets_tag=datasets_tag,
                    load_fn="load_adapter_setup",
                    **adapter_card_kwargs,
                )
            return self._upload_modified_files(
                work_dir,
                repo_id,
                files_timestamps,
                commit_message=commit_message,
                token=token,
                create_pr=create_pr,
                revision=revision,
                commit_description=commit_description,
            )
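
Building on the example from the PR description, the new `datasets_tag` and `adapter_card_kwargs` arguments control the generated adapter card when pushing a setup. A hedged sketch; the repository id, dataset tag and card metadata below are placeholders, and `model`, `adapter_setup` and `head_setup` come from the earlier example:

```python
model.push_adapter_setup_to_hub(
    "<user>/fusion_setup",
    adapter_setup,
    head_setup=head_setup,
    datasets_tag="glue",
    adapter_card_kwargs={
        "tags": ["text-classification"],
        "language": "en",
        "license": "apache-2.0",
    },
)
```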