docs: add a code sample for creating a kmeans model #267

Merged
merged 40 commits into from
Feb 27, 2024

SalemJorden
Contributor

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

@SalemJorden requested a review from tswast December 11, 2023 20:46
@SalemJorden requested review from a team as code owners December 11, 2023 20:46
@SalemJorden requested a review from parthea December 11, 2023 20:46
@product-auto-label bot added the size: m, api: bigquery, and samples labels Dec 11, 2023

snippet-bot bot commented Dec 12, 2023

Here is the summary of changes.

You are about to add 3 region tags.

This comment is generated by snippet-bot.
If you find problems with this result, please file an issue at:
https://github.com/googleapis/repo-automation-bots/issues.
To update this comment, add snippet-bot:force-run label or use the checkbox below:

  • Refresh this comment

@googleapis locked as resolved and limited conversation to collaborators Jan 3, 2024
@SalemJorden requested a review from tswast February 5, 2024 15:59
Collaborator

@tswast left a comment

Looking good, but a few comments regarding the comments.

# value.
cluster_model = KMeans(n_clusters=4)
cluster_model.fit(stationstats)

Collaborator

Let's do a to_gbq() here to save the model to a permanent location.

It should look very similar to the getting started tutorial:

# The model.fit() call above created a temporary model.
# Use the to_gbq() method to write to a permanent location.
model.to_gbq(
    your_model_id,  # For example: "bqml_tutorial.sample_model"
    replace=True,
)

Contributor Author

Added to_gbq() to save the model.

@@ -0,0 +1,151 @@
# Copyright 2023 Google LLC
Collaborator

It's 2024 now. These headers should reflect when the text was first written.

Suggested change
- # Copyright 2023 Google LLC
+ # Copyright 2024 Google LLC

cluster_model = KMeans(n_clusters=4)
cluster_model.fit(stationstats)
cluster_model.to_gbq(
your_gcp_project_id, # For example: "bqml_tutorial.sample_model"
Collaborator

Project ID isn't sufficient. We want a model ID.

# from BigQuery, but you could also use the `cluster_model` object from
# previous steps.
cluster_model = bpd.read_gbq_model(
your_gcp_project_id,
Collaborator

Not project ID. This should be the same model ID as the one we use in to_gbq above (once you change that).
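
Both review comments above turn on the difference between a project ID and a model ID. As a rough illustration only, the sketch below uses a hypothetical helper, `looks_like_model_id` (not part of bigframes), to check that a string has the `dataset.model` shape of the example "bqml_tutorial.sample_model". Accepting an optional leading project part is an assumption about fully qualified IDs, and the set of allowed characters is deliberately simplified.

```python
import re

# Hypothetical helper, not a bigframes API. It only checks the *shape* of the
# string: "dataset.model" or (assumed) "project.dataset.model".
# Simplification: each part may contain letters, digits, "_" and "-" only.
_ID_PART = r"[A-Za-z0-9_\-]+"
_MODEL_ID = re.compile(rf"(?:{_ID_PART}\.)?{_ID_PART}\.{_ID_PART}")


def looks_like_model_id(value: str) -> bool:
    """Return True if value is shaped like a model ID rather than a bare project ID."""
    return _MODEL_ID.fullmatch(value) is not None


if __name__ == "__main__":
    print(looks_like_model_id("bqml_tutorial.sample_model"))  # dataset.model
    print(looks_like_model_id("bigframes-load-testing"))      # bare project ID
```

A check like this could guard the argument passed to both to_gbq() and read_gbq_model(), so the same model ID is used in each call.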

Collaborator

@tswast left a comment

LGTM, but let's wait for e2e tests to pass before merging.

# limitations under the License.


def test_kmeans_sample(project_id: str, random_model_id: str):
Collaborator

The test failure on samples/snippets/create_kmeans_model_test.py is a real one. We need to create a random_model_id_eu fixture.

>           raise exceptions.from_http_response(response)
E           google.api_core.exceptions.NotFound: 404 POST https://bigquery.googleapis.com/bigquery/v2/projects/bigframes-load-testing/jobs?prettyPrint=false: Not found: Dataset bigframes-load-testing:python_bigquery_dataframes_samples_snippets_20240223223225_52ccfc was not found in location EU
E
E           Location: EU
E           Job ID: 3a8974a6-1007-483a-97f0-2e9c267f17e4

.nox/samples-3-9/lib/python3.9/site-packages/google/cloud/_http/__init__.py:494: NotFound
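
The log shows the job running in location EU while the randomly named dataset was not found there, which is why a random_model_id_eu fixture is requested. Below is a minimal sketch of the ID-generation half only, assuming the dataset name in the log follows a prefix_timestamp_hex pattern. The function names and the `kmeans_` prefix are hypothetical, and creating the EU dataset itself (which needs a BigQuery client) is left out.

```python
import datetime
import uuid
from typing import Optional


def random_suffix(now: Optional[datetime.datetime] = None) -> str:
    """Return '<YYYYmmddHHMMSS>_<6 hex chars>', mirroring the name in the log."""
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    return f"{now:%Y%m%d%H%M%S}_{uuid.uuid4().hex[:6]}"


def make_random_model_id(project_id: str, dataset_id_eu: str) -> str:
    """Build a unique model ID inside a dataset assumed to already exist in the EU."""
    # The "kmeans_" prefix is illustrative, not taken from the pull request.
    return f"{project_id}.{dataset_id_eu}.kmeans_{random_suffix()}"


if __name__ == "__main__":
    # "my_eu_dataset" is a placeholder for a dataset created in the EU location.
    print(make_random_model_id("bigframes-load-testing", "my_eu_dataset"))
```

In conftest.py, a pytest fixture could wrap a function like make_random_model_id and delete the model after the test; that wiring is not shown here.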

Collaborator

@tswast left a comment

Love it! Thanks for your work on this.

@tswast added the automerge label Feb 27, 2024
@tswast merged commit 4291d65 into main Feb 27, 2024
13 of 14 checks passed
@tswast deleted the Salem branch February 27, 2024 21:14
@gcf-merge-on-green bot removed the automerge label Feb 27, 2024