Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add Gemini-pro-1.5 to GeminiTextGenerator Tuning and Support score() method in Gemini-pro-1.5 #1208

Merged
merged 23 commits into from
Dec 13, 2024

Conversation

shuoweil
Copy link
Contributor

@shuoweil shuoweil commented Dec 11, 2024

feat: add Gemini-pro-1.5 to GeminiTextGenerator Tuning and Support score() method in Gemini-pro-1.5

Bug: b/381936588 and b/344891364

Shuowei Li added 10 commits December 3, 2024 00:16
merge to main branch to avoid conflict
merge to main branch to avoid conflict
Update ll.TextEmbeddingGenerator to 005, update feature, docs, and test.
The default vlaue is kept as text-embedding-004 for better customer
experience.

Bug:381935784
merge to main branch to avoid conflict
…ethod in Gemini-pro-1.5 \n Bug: b/381936588 and b/344891364
…ethod in Gemini-pro-1.5 \n Bug: b/381936588 and b/344891364
@shuoweil shuoweil requested a review from GarrettWu December 11, 2024 18:51
@shuoweil shuoweil self-assigned this Dec 11, 2024
@shuoweil shuoweil requested review from a team as code owners December 11, 2024 18:51
@product-auto-label product-auto-label bot added size: s Pull request size is small. api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. labels Dec 11, 2024
Copy link
Contributor

@GarrettWu GarrettWu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR title has typos.

It should be adding new features instead of "update".

@@ -892,8 +892,12 @@ def fit(
Returns:
GeminiTextGenerator: Fitted estimator.
"""
if self._bqml_model.model_name.startswith("gemini-1.5"):
raise NotImplementedError("Fit is not supported for gemini-1.5 model.")
# Support gemini-1.5 and gemini-pro
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move the consts to the top of the file for better organization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

supported_models = ["gemini-pro", "gemini-1.5-pro-002", "gemini-1.5-flash-002"]
if self.model_name not in supported_models:
raise NotImplementedError(
"Score is not supported models other than gemini-pro or gemini-1.5 model."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

be more explicit to "gemini-1.5-pro-002" and "gemini-1.5-flash-002".

Since other gemini 1.5 endpoints aren't supported.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

supported_models = ["gemini-pro", "gemini-1.5-pro-002", "gemini-1.5-flash-002"]
if self.model_name not in supported_models:
raise NotImplementedError(
"Score is not supported models other than gemini-pro or gemini-1.5 model."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in fit() in stead of score().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am working on two bugs, b/381936588 and b/344891364, both fit() and score() needed to be updated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean the message is incorrect

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if self.model_name not in supported_models:
raise NotImplementedError(
"Score is not supported models other than gemini-pro or gemini-1.5 model."
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't leave a comment at unchanged lines. Need to update line 905 to each endpoint respectively. (still use gemini-1.0-pro-002 for gemini-pro, but issuing a warning for that case maybe more appropriate)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -1009,7 +1013,7 @@ def score(
"text_generation", "classification", "summarization", "question_answering"
] = "text_generation",
) -> bpd.DataFrame:
"""Calculate evaluation metrics of the model. Only "gemini-pro" model is supported for now.
"""Calculate evaluation metrics of the model. Only "gemini-pro" and "gemini-1.5" models are supported for now.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same, more explicit to "gemini-1.5-pro-002" and "gemini-1.5-flash-002"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if self._bqml_model.model_name.startswith("gemini-1.5"):
raise NotImplementedError("Score is not supported for gemini-1.5 model.")
# Support gemini-1.5 and gemini-pro
supported_models = ["gemini-pro", "gemini-1.5-pro-002", "gemini-1.5-flash-002"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move and share consts to top.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not seeing the change? I mean add const variables to the top of the file like https://github.com/googleapis/python-bigquery-dataframes/blob/main/bigframes/ml/llm.py#L67

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

supported_models = ["gemini-pro", "gemini-1.5-pro-002", "gemini-1.5-flash-002"]
if self.model_name not in supported_models:
raise NotImplementedError(
"Score is not supported models other than gemini-pro or gemini-1.5 model."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same, more explicit.

Also the sentence seems awkward.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message is ill-formed. Maybe just "score() method only supports xxx models".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@pytest.mark.flaky(retries=2)
def test_llm_gemini_pro_score(llm_fine_tune_df_default_index):
model = llm.GeminiTextGenerator(model_name="gemini-pro")
# test score() function for "gemini-pro" and "gemini-1.5" model
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: test name already indicates it. The comment seems redundant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

"gemini-1.5-flash-002",
),
)
def test_llm_gemini_score(llm_fine_tune_df_default_index, model_name):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add for next test "test_llm_gemini_pro_score_params" as well

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@shuoweil shuoweil changed the title feat: Update lUpdate GeminiTextGenerator Tuning and Support score() method in Gemini-pro-1.5 feat: add Gemini-pro-1.5 to GeminiTextGenerator Tuning and Support score() method in Gemini-pro-1.5 Dec 11, 2024
@product-auto-label product-auto-label bot added size: m Pull request size is medium. and removed size: s Pull request size is small. labels Dec 12, 2024
_GEMINI_1P5_PRO_002_ENDPOINT,
_GEMINI_1P5_FLASH_002_ENDPOINT,
)
_GEMINI_SCORE_ENDPOINTS = (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we merge the 2 lists? They are identical, don't think they will diverge.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@shuoweil shuoweil added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Dec 13, 2024
@bigframes-bot bigframes-bot removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Dec 13, 2024
@shuoweil shuoweil added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Dec 13, 2024
@bigframes-bot bigframes-bot removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Dec 13, 2024
@shuoweil shuoweil merged commit 298fc73 into main Dec 13, 2024
22 checks passed
@shuoweil shuoweil deleted the shuowei-text-generator-tuning branch December 13, 2024 23:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. size: m Pull request size is medium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants