Improve LLM Backbone Explanations #766

Merged · 5 commits · Jul 1, 2024
@@ -16,7 +16,7 @@ Follow the relevant steps below to create an experiment in H2O LLM Studio.
4. Provide a meaningful **Experiment name**.

5. Define the parameters. The most important parameters are:
-- **LLM Backbone**: This parameter determines the LLM architecture to use. It is the foundation model that you continue training. H2O LLM Studio has a predefined list of recommended types of foundation models, but you can also use [Hugging Face models](https://huggingface.co/models).
+- **LLM Backbone**: This parameter determines the LLM architecture to use. It is the foundation model that you continue training. H2O LLM Studio has a predefined list of recommended foundation models available in the dropdown list. You can also type in the name of a [Hugging Face model](https://huggingface.co/models) that is not in the list, for example: `h2oai/h2o-danube2-1.8b-sft` or the path of a local folder that has the model you would like to fine-tune.
- **Mask Prompt Labels**: This option controls whether to mask the prompt labels during training and only train on the loss of the answer.
- Hyperparameters such as **Learning rate**, **Batch size**, and number of epochs determine the training process. You can refer to the tooltips that are shown next to each hyperparameter in the GUI to learn more about them.
- **Evaluate Before Training**: This option lets you evaluate the model before training, which can help you judge the quality of the LLM backbone before fine-tuning.
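As a quick aside to the **LLM Backbone** bullet above: before launching a long experiment, it can be worth checking that the backbone name you typed actually resolves. A minimal sketch — the `h2oai/h2o-danube2-1.8b-sft` id comes from the text above, and any other Hub id or local folder path works the same way:

```python
from transformers import AutoConfig, AutoTokenizer

# The same string you would type into the LLM Backbone field.
backbone = "h2oai/h2o-danube2-1.8b-sft"

# Fetching the config and tokenizer is cheap compared to the full weights
# and confirms that the id (or local folder) resolves.
config = AutoConfig.from_pretrained(backbone)
tokenizer = AutoTokenizer.from_pretrained(backbone)
print(config.model_type, config.architectures)
```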
@@ -23,7 +23,7 @@ To publish a trained model to Hugging Face Hub:

6. Click **Export**.

-![export model to hugging face](export-model-to-huggingface.png)
+![export model to Hugging Face](export-model-to-huggingface.png)
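Clicking **Export** performs the upload for you. For orientation, a roughly equivalent manual upload with the `huggingface_hub` client could look like the sketch below; the folder path and repo id are placeholders, not values from this PR:

```python
from huggingface_hub import HfApi

# Assumes you are already authenticated, e.g. via `huggingface-cli login`.
api = HfApi()

# Push a locally exported model folder to a Hub model repository.
api.upload_folder(
    folder_path="./my-exported-model",   # placeholder: local export folder
    repo_id="your-username/your-model",  # placeholder: target Hub repo
    repo_type="model",
)
```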

## Download a model

@@ -34,7 +34,7 @@ Use the following code snippet to utilize the converted model in Jupyter Notebook:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "path_to_downloaded_model" # either local folder or huggingface model name
model_name = "path_to_downloaded_model" # either local folder or Hugging Face model name

# Important: The prompt needs to be in the same format the model was trained with.
# You can find an example prompt in the experiment logs.
```
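The diff cuts this snippet short. A plausible continuation — a sketch, not necessarily the exact code in the docs — loads the tokenizer and model and generates an answer:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "path_to_downloaded_model"  # either local folder or Hugging Face model name

# Example format only: copy the real prompt template from the experiment logs.
prompt = "<|prompt|>How are you?</s><|answer|>"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # assumes a GPU; drop for CPU-only inference
    device_map="auto",          # requires the `accelerate` package
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```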
documentation/docs/tooltips/experiments/_llm-backbone.mdx (6 changes: 3 additions & 3 deletions)
@@ -1,5 +1,5 @@
The **LLM Backbone** option is the most important setting as it sets the pretrained model weights.

-- Usually, it is good to use smaller architectures for quicker experiments and larger models when aiming for the highest accuracy
-- If possible, leverage backbones pre-trained closely to your use case
-- Any huggingface model can be used here (not limited to the ones in the dropdown list)
+- Use smaller models for quicker experiments and larger models for higher accuracy
+- Aim to leverage models pre-trained on tasks similar to your use case when possible
+- Select a model from the dropdown list or type in the name of a Hugging Face model of your preference
@@ -20,7 +20,7 @@ You will also need to download the classification head, either manually, or by running the code below:
```python
from huggingface_hub import hf_hub_download

model_name = "{{repo_id}}" # either local folder or huggingface model name
model_name = "{{repo_id}}" # either local folder or Hugging Face model name
hf_hub_download(repo_id=model_name, filename="classification_head.pth", local_dir="./")
```

@@ -29,7 +29,7 @@ You can make classification predictions by following the example below:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "{{repo_id}}" # either local folder or huggingface model name
model_name = "{{repo_id}}" # either local folder or Hugging Face model name
# Important: The prompt needs to be in the same format the model was trained with.
# You can find an example prompt in the experiment logs.
prompt = "{{text_prompt_start}}How are you?{{end_of_sentence}}{{text_answer_separator}}"
```
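This classification snippet is likewise truncated. A heavily hedged sketch of one way to finish it, assuming `classification_head.pth` holds the weight tensor of a linear layer that maps the backbone's last hidden state to class logits (check the repository's model card for the exact usage):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "{{repo_id}}"  # either local folder or Hugging Face model name
prompt = "{{text_prompt_start}}How are you?{{end_of_sentence}}{{text_answer_separator}}"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)

# Assumption: the .pth file is a plain weight tensor of shape (num_classes, hidden_size).
head_weights = torch.load("classification_head.pth", map_location="cpu")
head = torch.nn.Linear(head_weights.shape[1], head_weights.shape[0], bias=False)
head.weight.data = head_weights

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    last_hidden = model(**inputs).hidden_states[-1]  # final layer, all tokens
logits = head(last_hidden[:, -1, :])  # classify from the last token's state
print(logits)
```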
@@ -37,7 +37,7 @@ You will also need to download the classification head, either manually, or by running the code below:
```python
from huggingface_hub import hf_hub_download

model_name = "{{repo_id}}" # either local folder or huggingface model name
model_name = "{{repo_id}}" # either local folder or Hugging Face model name
hf_hub_download(repo_id=model_name, filename="classification_head.pth", local_dir="./")
```

@@ -46,7 +46,7 @@ You can make classification predictions by following the example below:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "{{repo_id}}" # either local folder or huggingface model name
model_name = "{{repo_id}}" # either local folder or Hugging Face model name
# Important: The prompt needs to be in the same format the model was trained with.
# You can find an example prompt in the experiment logs.
prompt = "{{text_prompt_start}}How are you?{{end_of_sentence}}{{text_answer_separator}}"
```
@@ -79,7 +79,7 @@ You may also construct the pipeline from the loaded model and tokenizer yourself:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "{{repo_id}}" # either local folder or huggingface model name
model_name = "{{repo_id}}" # either local folder or Hugging Face model name
# Important: The prompt needs to be in the same format the model was trained with.
# You can find an example prompt in the experiment logs.
messages = {{sample_messages}}
```
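The hunk is truncated here as well. A sketch of completing the manual pipeline construction, assuming `{{sample_messages}}` expands to a standard chat-messages list:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_name = "{{repo_id}}"  # either local folder or Hugging Face model name

# Stand-in for the {{sample_messages}} placeholder above.
messages = [{"role": "user", "content": "How are you?"}]

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
generate = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Render the messages with the tokenizer's chat template, then generate.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(generate(prompt, max_new_tokens=256)[0]["generated_text"])
```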
@@ -13,7 +13,7 @@ pip install transformers=={{transformers_version}}
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "{{repo_id}}" # either local folder or huggingface model name
model_name = "{{repo_id}}" # either local folder or Hugging Face model name
# Important: The prompt needs to be in the same format the model was trained with.
# You can find an example prompt in the experiment logs.
prompt = "{{text_prompt_start}}How are you?{{end_of_sentence}}{{text_answer_separator}}"
```
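Also truncated; a sketch of a typical sequence-to-sequence continuation under the same placeholders:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "{{repo_id}}"  # either local folder or Hugging Face model name
prompt = "{{text_prompt_start}}How are you?{{end_of_sentence}}{{text_answer_separator}}"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

inputs = tokenizer(prompt, return_tensors="pt")
# Seq2seq models emit the answer as decoder output, separate from the prompt.
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```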
@@ -32,7 +32,7 @@ For inference, you can use the following code snippet:
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "{{repo_id}}" # either local folder or huggingface model name
model_name = "{{repo_id}}" # either local folder or Hugging Face model name
# Important: The prompt needs to be in the same format the model was trained with.
# You can find an example prompt in the experiment logs.
prompt = "{{text_prompt_start}}How are you?{{end_of_sentence}}{{text_answer_separator}}"
```