Adding two notebooks on fine-tuning gemini 1.5 using new experimental google gen ai sdk #1516
Conversation
conventional-commit-lint bot: 🤖 I detect that the PR title and the commit message differ and there's only one commit. To use the PR title for the commit history, you can use Github's automerge feature with squashing, or use --
ReviewNB: Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB.
Hi @erwinh85 and reviewers,
I'm currently reviewing this pull request and will post my detailed review in a few minutes. In the meantime, here's a quick summary to help everyone get up to speed:
This PR adds two Jupyter notebooks demonstrating how to fine-tune the Gemini 1.5 Flash model for question answering using the new experimental Google Gen AI SDK. The notebooks cover various aspects, including:
- Setting up the environment and installing necessary packages.
- Authenticating with Google Cloud and initializing the Gen AI SDK.
- Preparing the SQuAD dataset for fine-tuning (including data normalization and formatting into JSONL).
- Establishing a baseline by evaluating the performance of the base Gemini model.
- Running the fine-tuning job on Vertex AI using the specified configurations.
- Monitoring the training and evaluation metrics using TensorBoard.
- Evaluating the performance of the fine-tuned model.
The main changes involve the addition of two new .ipynb files (including gemini/tuning/gen_ai_sdk_supervised_finetuning_using_gemini_qa.ipynb) containing comprehensive code and explanations for the fine-tuning process. The notebooks guide users through each step, from data preparation to model evaluation, providing a practical example of how to leverage the new SDK for supervised fine-tuning; a sketch of the JSONL formatting step follows below.
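To make the JSONL preparation step concrete, here is a minimal sketch of writing question/answer pairs in a chat-style tuning format; the DataFrame squad_df, the column names, and the exact record layout are illustrative assumptions rather than code taken from the notebooks, so adapt them to the schema the notebooks actually use:

import json

import pandas as pd

# Hypothetical input: one row per SQuAD example with plain-text question/answer columns.
squad_df = pd.DataFrame(
    {"question": ["Who wrote Hamlet?"], "answer": ["William Shakespeare"]}
)

with open("squad_train.jsonl", "w") as f:
    for _, row in squad_df.iterrows():
        record = {
            "contents": [
                {"role": "user", "parts": [{"text": row["question"]}]},
                {"role": "model", "parts": [{"text": row["answer"]}]},
            ]
        }
        f.write(json.dumps(record) + "\n")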
Now for a little fun, a haiku to brighten your day:
Models learn and grow,
Fine-tuning's gentle touch guides,
Answers clear and bright.
Looking forward to the discussion!
This pull request introduces a new notebook that demonstrates how to fine-tune Gemini 1.5 Flash for question answering using the Google Gen AI SDK. The notebook is well-structured and provides a comprehensive walkthrough of the fine-tuning process, including data preparation, model training, evaluation, and usage. The use of the SQuAD dataset is appropriate for the task, and the explanations provided throughout the notebook are generally clear and helpful. However, there are a few areas where the notebook could be improved, particularly in terms of code efficiency, error handling, and data handling.
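On the error-handling point, one option is to wrap the call the notebook already makes in a small retry helper; this is a sketch that assumes the client and systemInstruct objects defined earlier in the notebook, not code from the PR itself:

import time

def get_predictions_with_retry(question: str, model_version: str, max_retries: int = 3) -> str:
    # Retry the same generate_content call the notebook makes, backing off between attempts.
    # The bare Exception should be narrowed to the SDK's transient/retryable error types.
    for attempt in range(max_retries):
        try:
            response = client.models.generate_content(
                model=model_version,
                contents=question,
                config={
                    "system_instruction": systemInstruct,
                    "temperature": 0.3,
                },
            )
            return response.text
        except Exception:
            if attempt == max_retries - 1:
                raise
            time.sleep(2**attempt)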
"source": [ | ||
"def get_predictions(question: str, model_version: str) -> str:\n", | ||
"\n", | ||
" prompt = question\n", | ||
" base_model = model_version\n", | ||
"\n", | ||
" response = client.models.generate_content(\n", | ||
" model = base_model,\n", | ||
" contents = prompt,\n", | ||
" config={\n", | ||
" 'system_instruction': systemInstruct,\n", | ||
" 'temperature': 0.3,\n", | ||
" },\n", | ||
" )\n", | ||
"\n", | ||
" return response.text" |
The get_predictions function unnecessarily re-initializes the GenerativeModel with the system instruction in every call. This is inefficient, as the model and system instruction remain the same throughout the notebook. It's better to initialize the model once outside the function and reuse it for all predictions. This will significantly reduce overhead and improve performance, especially when making multiple predictions.
"source": [ | |
"def get_predictions(question: str, model_version: str) -> str:\n", | |
"\n", | |
" prompt = question\n", | |
" base_model = model_version\n", | |
"\n", | |
" response = client.models.generate_content(\n", | |
" model = base_model,\n", | |
" contents = prompt,\n", | |
" config={\n", | |
" 'system_instruction': systemInstruct,\n", | |
" 'temperature': 0.3,\n", | |
" },\n", | |
" )\n", | |
"\n", | |
" return response.text" | |
def get_predictions(question: str, model: genai.GenerativeModel) -> str: | |
prompt = question | |
response = model.generate_content( | |
contents = prompt, | |
config={ | |
'temperature': 0.3, | |
}, | |
) | |
return response.text |
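One practical follow-up, whichever signature is kept: pandas' .apply passes only the question, so any extra argument has to be bound first. A minimal sketch, assuming the notebook's original get_predictions(question, model_version) signature and a hypothetical model_id string naming the model to call:

from functools import partial

# Bind the model identifier once so .apply can call a one-argument function.
predict_with_model = partial(get_predictions, model_version=model_id)
test_df["predicted_answer"] = test_df["input_question"].apply(predict_with_model)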
"# Apply the get_prediction() function to the 'question_column'\n", | ||
"test_df[\"predicted_answer\"] = test_df[\"input_question\"].apply(get_predictions)\n", | ||
"test_df.head(2)" |
The current implementation uses .apply to generate predictions for each question, which can be inefficient for large datasets due to the overhead of individual API calls. Batch prediction is a more efficient approach, as it allows sending multiple questions to the API at once, reducing overhead and improving performance. Consider using the batch_generate_content method for batch prediction.
"# Apply the get_prediction() function to the 'question_column'\n", | |
"test_df[\"predicted_answer\"] = test_df[\"input_question\"].apply(get_predictions)\n", | |
"test_df.head(2)" | |
def batch_get_predictions(questions: list[str], model: genai.GenerativeModel) -> list[str]: | |
responses = model.batch_generate_content( | |
contents=questions, | |
config={ | |
'temperature': 0.3, | |
}, | |
) | |
return [response.text for response in responses] | |
predictions = batch_get_predictions(test_df['input_question'].tolist(),tuned_model) | |
test_df["predicted_answer"] = predictions | |
test_df.head(2) |
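If a batch method like the one suggested above is not available in the SDK version being used, a client-side alternative is to issue the per-question calls concurrently. This is a sketch assuming the notebook's get_predictions helper and a hypothetical model_id string; it changes throughput, not the per-request API:

from concurrent.futures import ThreadPoolExecutor

# Fan the per-question requests out over a small thread pool instead of
# issuing them strictly one at a time. Mind the API's rate limits when
# choosing max_workers.
questions = test_df["input_question"].tolist()
with ThreadPoolExecutor(max_workers=8) as executor:
    predictions = list(executor.map(lambda q: get_predictions(q, model_id), questions))
test_df["predicted_answer"] = predictions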
"train_dataset = f\"\"\"{BUCKET_URI}/squad_train.jsonl\"\"\"\n", | ||
"validation_dataset = f\"\"\"{BUCKET_URI}/squad_train.jsonl\"\"\"\n", | ||
"\n", | ||
"training_dataset= {\n", | ||
" 'gcs_uri': train_dataset,\n", | ||
"}\n", | ||
"\n", | ||
"validation_dataset = types.TuningValidationDataset(gcs_uri=validation_dataset)" |
The validation dataset path is the same as the training dataset path. This is likely a mistake. Using the same data for training and validation will not give you a true measure of the model's performance on unseen data. It's crucial to use a separate validation dataset to evaluate how well the model generalizes. Please double-check the paths and ensure they point to the correct training and validation datasets.
"train_dataset = f\"\"\"{BUCKET_URI}/squad_train.jsonl\"\"\"\n", | |
"validation_dataset = f\"\"\"{BUCKET_URI}/squad_train.jsonl\"\"\"\n", | |
"\n", | |
"training_dataset= {\n", | |
" 'gcs_uri': train_dataset,\n", | |
"}\n", | |
"\n", | |
"validation_dataset = types.TuningValidationDataset(gcs_uri=validation_dataset)" | |
train_dataset = f"""{BUCKET_URI}/squad_train.jsonl""" | |
validation_dataset = f"""{BUCKET_URI}/squad_validation.jsonl""" | |
training_dataset= { | |
'gcs_uri': train_dataset, | |
} | |
validation_dataset = types.TuningValidationDataset(gcs_uri=validation_dataset) |
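To produce a genuinely separate validation file, the prepared examples can be split before the JSONL files are written and uploaded. A sketch; the DataFrame squad_df, the write_jsonl helper, and the 90/10 split are illustrative assumptions, not code from the notebooks:

# Hold out 10% of the prepared examples for validation.
validation_df = squad_df.sample(frac=0.1, random_state=42)
train_df = squad_df.drop(validation_df.index)

# write_jsonl is a hypothetical helper that writes one tuning record per row,
# in the same format used for the training file.
write_jsonl(train_df, "squad_train.jsonl")
write_jsonl(validation_df, "squad_validation.jsonl")

# Then copy both files to the tuning bucket, e.g. from a notebook cell:
# !gsutil cp squad_train.jsonl squad_validation.jsonl {BUCKET_URI}/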
"# Apply the get_prediction() function to the 'question_column'\n", | ||
"test_df[\"predicted_answer\"] = test_df[\"input_question\"].apply(get_predictions)\n", | ||
"test_df.head(2)" |
Similar to a previous comment, using .apply for predictions can be inefficient. Consider batching the requests.
"# Apply the get_prediction() function to the 'question_column'\n", | |
"test_df[\"predicted_answer\"] = test_df[\"input_question\"].apply(get_predictions)\n", | |
"test_df.head(2)" | |
# Apply the get_prediction() function to the 'question_column' | |
predictions = batch_get_predictions(test_df['input_question'].tolist(),tuned_model) | |
test_df["predicted_answer"] = predictions | |
test_df.head(2) |
LGTM
Description

Thank you for opening a Pull Request!
Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
- Follow the CONTRIBUTING Guide.
- Check that you are listed in CODEOWNERS for the file(s).
- Run nox -s format from the repository root to format.

Fixes #<issue_number_goes_here> 🦕