CI/CD should include tests for endpoints used in example notebooks #456

Closed
3 tasks
kheal opened this issue Jan 24, 2024 · 15 comments

@kheal

kheal commented Jan 24, 2024

We want to make sure the example notebooks in this repo: https://github.com/microbiomedata/notebook_hackathons do not break with any changes or pushes to the NMDC-runtime API.

The following endpoints are used (with example tests).

@kheal
Author

kheal commented Jan 24, 2024

@brynnz22 - can you also add an example of the endpoint you used to download the TSVs of the taxonomic information? I couldn't figure out an easy way to get the URL for this step. The URL I'm talking about is in code chunk 30 in this notebook.

@brynnz22
Contributor

@kheal that url was just taken from the metadata retrieved using the metadata collection endpoint that you already mentioned.

@kheal
Copy link
Author

kheal commented Jan 25, 2024

Great - so the three endpoints I point to above will cover the API calls we've used in the notebook, correct? @brynnz22

@brynnz22
Contributor

Yep! That should be right.

@PeopleMakeCulture
Collaborator

Related to #301

@dwinston dwinston moved this from Bench to Lineup in Polyneme mixset Jan 25, 2024
@dwinston dwinston moved this from Lineup to At bat in Polyneme mixset Jan 25, 2024
@PeopleMakeCulture PeopleMakeCulture moved this from At bat to On base in Polyneme mixset Jan 29, 2024
@PeopleMakeCulture
Collaborator

PeopleMakeCulture commented Feb 2, 2024

@kheal Is this the example notebook you're referring to in your first comment? https://github.com/microbiomedata/notebook_hackathons/tree/main/taxonomic_dist_by_soil_layer

Could you share the relevant code chunks in this notebook?

@dwinston dwinston removed their assignment Feb 2, 2024
@kheal
Author

kheal commented Feb 2, 2024

@PeopleMakeCulture - that is one of the notebooks.

These two also use the runtime API: https://github.com/microbiomedata/notebook_hackathons/tree/main/NEON_soil_metadata and https://github.com/microbiomedata/notebook_hackathons/tree/main/bioscales_biogeochemical_metadata (in both the R and python versions, for a total of 5 notebooks).

Do you want/need me to point to each chunk in each notebook (5 notebooks total) that pings the API?

@PeopleMakeCulture
Collaborator

@kheal Gotcha. The notebook links should be enough. Thanks!

@dwinston
Collaborator

dwinston commented Feb 2, 2024

@kheal @brynnz22 are the notebooks all "quick"? We could potentially just run them all to make sure they don't error, with e.g. papermill:

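import papermill as pm

# Placeholder list (not from the notebooks repo); replace with real paths.
nb_filenames = ['example.ipynb']

for nb_filename in nb_filenames:
    try:
        # Run the notebook top to bottom and save the executed copy.
        pm.execute_notebook(
            nb_filename,
            'output_' + nb_filename,
            parameters=dict(parameter_name='value')
        )
    except pm.exceptions.PapermillExecutionError as e:
        print("An error occurred during execution:", e)
        # Custom error handling or cleanup code here
        raise  # re-raise so pytest reports the failure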

@kheal
Author

kheal commented Feb 2, 2024

@dwinston

Unfortunately no. This notebook takes a couple of hours (in part because there is no easy API route to go from biosample ids to data objects, see #355).

The get requests I have at the top of this thread are type examples and should be sufficient as tests to make sure the endpoints are still good.
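A minimal sketch of what such an endpoint check could look like, assuming each endpoint answers a plain GET with a JSON body. The function names and the injectable `fetch` argument are illustrative only; the actual endpoint URLs are the ones listed at the top of this thread and are not reproduced here.

```python
import json
import urllib.request
from typing import Callable, Dict, Tuple


def default_fetch(url: str, timeout: float = 30.0) -> Tuple[int, str]:
    """Perform a GET request and return (status_code, body_text)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")


def check_endpoint(
    url: str,
    fetch: Callable[[str], Tuple[int, str]] = default_fetch,
) -> Dict[str, object]:
    """Report whether the endpoint answered with status 200 and valid JSON.

    `fetch` is injectable so the check can be exercised without network
    access (hypothetical design choice, not part of the runtime's test suite).
    """
    report: Dict[str, object] = {"url": url, "ok": False, "detail": ""}
    try:
        status, body = fetch(url)
    except Exception as exc:  # network errors, HTTP errors, timeouts
        report["detail"] = f"request failed: {exc}"
        return report
    if status != 200:
        report["detail"] = f"unexpected status {status}"
        return report
    try:
        json.loads(body)
    except ValueError:
        report["detail"] = "response body is not valid JSON"
        return report
    report["ok"] = True
    return report
```

In a pytest suite, each endpoint URL could be passed through `check_endpoint` and the test would assert that `report["ok"]` is true, printing `report["detail"]` on failure.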

@shreddd
Collaborator

shreddd commented Feb 8, 2024

We should discuss potential notebook testing options as well, and what makes the most sense. I have some folks in my group who have experience with other Jupyter testing tools like nbmake (https://github.com/treebeardtech/nbmake)

Also fine with papermill but the typical papermill use case I've seen is centered around running a notebook job in parallel where it gets parameterized across different inputs. Just want to make sure we are using the right tool for the job.

I've also seen this (from the same people at Netflix that made papermill) - https://github.com/nteract/testbook

@kheal
Author

kheal commented Feb 8, 2024

I should have noted that the other four notebooks in these locations
https://github.com/microbiomedata/notebook_hackathons/tree/main/NEON_soil_metadata and https://github.com/microbiomedata/notebook_hackathons/tree/main/bioscales_biogeochemical_metadata are pretty quick and it'd be great to have those tested in the CI/CD.
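One way to run only the quick notebooks is to collect them by directory and hand the list to a notebook test runner. The sketch below assumes nbmake is the runner (via its `pytest --nbmake` plugin and `--nbmake-timeout` option) and that the slow notebook is the one under `taxonomic_dist_by_soil_layer`, which this thread implies but does not state outright. The helper names are hypothetical, and the R notebooks would additionally need an R kernel installed in CI.

```python
import sys
from pathlib import Path
from typing import Iterable, List


def select_notebooks(root: Path, slow_dirs: Iterable[str]) -> List[Path]:
    """Return all .ipynb files under root, skipping slow directories.

    A notebook is skipped when any component of its path relative to
    root matches a name in slow_dirs, or when it is a Jupyter checkpoint.
    """
    slow = set(slow_dirs)
    selected = []
    for nb in sorted(root.rglob("*.ipynb")):
        rel_parts = nb.relative_to(root).parts
        if ".ipynb_checkpoints" in rel_parts:
            continue
        if slow.intersection(rel_parts):
            continue
        selected.append(nb)
    return selected


def build_nbmake_command(notebooks: Iterable[Path], timeout_s: int = 600) -> List[str]:
    """Build (but do not run) a pytest command that executes the notebooks."""
    return [
        sys.executable, "-m", "pytest",
        "--nbmake",
        f"--nbmake-timeout={timeout_s}",  # per-cell timeout in seconds
        *[str(nb) for nb in notebooks],
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` in a CI step; a non-zero exit code then fails the job.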

@dwinston dwinston moved this from On base to Lineup in Polyneme mixset Feb 9, 2024
@PeopleMakeCulture PeopleMakeCulture moved this from Lineup to On base in Polyneme mixset Mar 21, 2024
@PeopleMakeCulture PeopleMakeCulture moved this from On base to Lineup in Polyneme mixset Mar 21, 2024
@PeopleMakeCulture PeopleMakeCulture moved this from Lineup to Bench in Polyneme mixset Mar 21, 2024
@PeopleMakeCulture PeopleMakeCulture moved this from Bench to Lineup in Polyneme mixset Apr 10, 2024
@PeopleMakeCulture PeopleMakeCulture moved this from Lineup to Bench in Polyneme mixset Apr 10, 2024
@kheal
Author

kheal commented Jul 22, 2024

FYI @dwinston @shreddd @PeopleMakeCulture - I've added some CI/CD to check the notebooks' stability via gh actions to the notebook repo (microbiomedata/nmdc_notebooks#63). I'm not sure exactly how to add these checks (or something similar) to test changes to the runtime during dev, but I wanted to let you all know. I've scheduled the test for 1/week (as well as for PRs within the https://github.com/microbiomedata/nmdc_notebooks repo).

@PeopleMakeCulture
Collaborator

@kheal IIRC all runtime API endpoints in the notebooks are currently covered by existing API endpoint tests, but we do not perform an end-to-end run of the notebooks as part of the runtime's CI pipeline (which is what you have in microbiomedata/nmdc_notebooks#63 in the notebooks repo).

It is possible for an event in one repo (e.g. a pull request) to trigger a gh action in another. Here is a discussion on how to implement this, with examples.
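One known mechanism for this is GitHub's `repository_dispatch` event: a job in the source repo sends a POST to the target repo's `/dispatches` REST endpoint, and a workflow in the target repo listens with `on: repository_dispatch`. This may or may not be the approach in the linked discussion. The sketch below only builds the request; the function name, the event type and the payload keys are made up for illustration.

```python
import json
import urllib.request
from typing import Optional

GITHUB_API = "https://api.github.com"


def build_dispatch_request(
    owner: str,
    repo: str,
    event_type: str,
    token: str,
    client_payload: Optional[dict] = None,
) -> urllib.request.Request:
    """Build the POST request that fires a repository_dispatch event.

    The token must have permission to write to the target repo.
    """
    url = f"{GITHUB_API}/repos/{owner}/{repo}/dispatches"
    body = {
        "event_type": event_type,                # matched by `types:` in the workflow
        "client_payload": client_payload or {},  # free-form data for the workflow
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

A runtime CI step could call `build_dispatch_request("microbiomedata", "nmdc_notebooks", "runtime-changed", token)` and send it with `urllib.request.urlopen(req)`; here `runtime-changed` is a hypothetical event name that the notebooks workflow would need to list under `on: repository_dispatch: types:`.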

I'm pulling this ticket off the bench for now

@PeopleMakeCulture PeopleMakeCulture moved this from Bench to Lineup in Polyneme mixset Jul 24, 2024
@kheal
Author

kheal commented Nov 13, 2024

I think this testing is well covered now - no need to keep this issue open in my opinion.

@kheal kheal closed this as completed Nov 13, 2024
5 participants