Convert Swagger tag metadata (freeform blurbs) into parameter annotations #665

Closed
eecavanna opened this issue Sep 2, 2024 · 10 comments · Fixed by #804
Assignees: eecavanna
Labels: cleanup 🧹 (Related to cleaning up code, documentation, etc.), documentation (Improvements or additions to documentation)

eecavanna (Collaborator) commented Sep 2, 2024

I volunteered to attempt to transfer the endpoint documentation that is currently implemented as freeform blurbs for the "sections" (i.e. tags) in Swagger UI into parameter-specific annotations on the endpoints themselves.

Today, the freeform blurbs are stored in a list variable named tags_metadata in the file nmdc_runtime/api/main.py:

tags_metadata = [
{
"name": "sites",
"description": (
"""A site corresponds to a physical place that may participate in job execution.
A site may register data objects and capabilities with NMDC. It may claim jobs to execute, and it may
update job operations with execution info.
A site must be able to service requests for any data objects it has registered.
A site may expose a "put object" custom method for authorized users. This method facilitates an
operation to upload an object to the site and have the site register that object with the runtime
system.
"""
),
},
{
"name": "users",
"description": (
"""Endpoints for user identification.
Currently, accounts for use of the runtime API are created manually by system administrators.
"""
),
},
{
"name": "workflows",
"description": (
"""A workflow is a template for creating jobs.
Workflow jobs are typically created by the system via trigger associations between
workflows and object types. A workflow may also require certain capabilities of sites
in order for those sites to claim workflow jobs.
"""
),
},
{
"name": "capabilities",
"description": (
"""A workflow may require an executing site to have particular capabilities.
These capabilities go beyond the simple ability to access the data object resources registered with
the runtime system. Sites register their capabilities, and sites are only able to claim workflow
jobs if they are known to have the capabilities required by the workflow.
"""
),
},
{
"name": "object types",
"description": (
"""An object type is an object annotation that is useful for triggering workflows.
A data object may be annotated with one or more types, which in turn can be associated with
workflows through trigger resources.
The data-object type system may be used to trigger workflow jobs on a subset of data objects when a
new version of a workflow is deployed. This could be done by minting a special object type for the
occasion, annotating the subset of data objects with that type, and registering the association of
object type to workflow via a trigger resource.
"""
),
},
{
"name": "triggers",
"description": (
"""A trigger is an association between a workflow and a data object type.
When a data object is annotated with a type, perhaps shortly after object registration, the NMDC
Runtime will check, via trigger associations, for potential new jobs to create for any workflows.
"""
),
},
{
"name": "jobs",
"description": """A job is a resource that isolates workflow configuration from execution.
Rather than directly creating a workflow operation by supplying a workflow ID along with
configuration, NMDC creates a job that pairs a workflow with configuration. Then, a site can claim a
job ID, allowing the site to execute the intended workflow without additional configuration.
A job can have multiple executions, and a workflow's executions are precisely the executions of all
jobs created for that workflow.
A site that already has a compatible job execution result can preempt the unnecessary creation of a
job by pre-claiming it. This will return like a claim, and now the site can register known data
object inputs for the job without the risk of the runtime system creating a claimable job of the
pre-claimed type.
""",
},
{
"name": "objects",
"description": (
"""\
A [Data Repository Service (DRS)
object](https://ga4gh.github.io/data-repository-service-schemas/preview/release/drs-1.1.0/docs/#_drs_datatypes)
represents content necessary for a workflow job to execute, and/or output from a job execution.
An object may be a *blob*, analogous to a file, or a *bundle*, analogous to a folder. Sites register
objects, and sites must ensure that these objects are accessible to the NMDC data broker.
An object may be associated with one or more object types, useful for triggering workflows.
"""
),
},
{
"name": "operations",
"description": """An operation is a resource for tracking the execution of a job.
When a job is claimed by a site for execution, an operation resource is created.
An operation is akin to a "promise" or "future" in that it should eventually resolve to either a
successful result, i.e. an execution resource, or to an error.
An operation is parameterized to return a result type, and a metadata type for storing progress
information, that are both particular to the job type.
Operations may be paused, resumed, and/or cancelled.
Operations may expire, i.e. not be stored indefinitely. In this case, it is recommended that
execution resources have longer lifetimes / not expire, so that information about successful results
of operations are available.
""",
},
{
"name": "queries",
"description": (
"""A query is an operation (find, update, etc.) against the metadata store.
Metadata -- for studies, biosamples, omics processing, etc. -- is used by sites to execute jobs,
as the parameterization of job executions may depend not only on the content of data objects, but
also on objects' associated metadata.
Also, the function of many workflows is to extract or produce new metadata. Such metadata products
should be registered as data objects, and they may also be supplied by sites to the runtime system
as an update query (if the latter is not done, the runtime system will sense the new metadata and
issue an update query).
"""
),
},
{
"name": "metadata",
"description": """
The [metadata endpoints](https://api.microbiomedata.org/docs#/metadata) can be used to get and filter metadata from
collection set types (including [studies](https://nmdc-documentation.readthedocs.io/en/latest/reference/metadata/Study.html),
[biosamples](https://nmdc-documentation.readthedocs.io/en/latest/reference/metadata/Biosample.html),
[data objects](https://nmdc-documentation.readthedocs.io/en/latest/reference/metadata/DataObject.html), and
[activities](https://nmdc-documentation.readthedocs.io/en/latest/reference/metadata/Activity.html)).<br/>
The __metadata__ endpoints allow users to retrieve metadata from the data portal using the various GET endpoints
that are slightly different than the __find__ endpoints, but some can be used similarly. As with the __find__ endpoints,
parameters for the __metadata__ endpoints that do not have a red ___* required___ next to them are optional. <br/>
Unlike the compact syntax used in the __find__ endpoints, the syntax for the filter parameter of the metadata endpoints
uses [MongoDB-like language querying](https://www.mongodb.com/docs/manual/tutorial/query-documents/).
The applicable parameters of the __metadata__ endpoints, with acceptable syntax and examples, are in the table below.
<details>
<summary>More Details</summary>
| Parameter | Description | Syntax | Example |
| :---: | :-----------: | :-------: | :---: |
| collection_name | The name of the collection to be queried. For a list of collection names please see the [Database class](https://microbiomedata.github.io/nmdc-schema/Database/) of the NMDC Schema | String | `biosample_set` |
| filter | Allows conditions to be set as part of the query, returning only results that satisfy the conditions | [MongoDB-like query language](https://www.mongodb.com/docs/manual/tutorial/query-documents/). All strings should be in double quotation marks. | `{"lat_lon.latitude": {"$gt": 45.0}, "ecosystem_category": "Plants"}` |
| max_page_size | Specifies the maximum number of documents returned at a time | Integer | `25`
| page_token | Specifies the token of the page to return. If unspecified, the first page is returned. To retrieve a subsequent page, the value received as the `next_page_token` from the bottom of the previous results can be provided as a `page_token`. ![next_page_token](../_static/images/howto_guides/api_gui/metadata_page_token_param.png) | String | `nmdc:sys0ae1sh583`
| projection | Indicates the desired attributes to be included in the response. Helpful for trimming down the returned results | Comma-separated list of attributes that belong to the documents in the collection being queried | `name, ecosystem_type` |
| doc_id | The unique identifier of the item being requested. For example, the identifier of a biosample or an extraction | Curie e.g. `prefix:identifier` | `nmdc:bsm-11-ha3vfb58` |<br/>
<br/>
</details>
""",
},
{
"name": "find",
"description": """
The [find endpoints](https://api.microbiomedata.org/docs#/find:~:text=Find%20NMDC-,metadata,-entities.) are provided with
NMDC metadata entities already specified - where metadata about [studies](https://nmdc-documentation.readthedocs.io/en/latest/reference/metadata/Study.html),
[biosamples](https://nmdc-documentation.readthedocs.io/en/latest/reference/metadata/Biosample.html),
[data objects](https://nmdc-documentation.readthedocs.io/en/latest/reference/metadata/DataObject.html), and
[activities](https://nmdc-documentation.readthedocs.io/en/latest/reference/metadata/Activity.html) can be retrieved using GET requests.
Each endpoint is unique and requires the applicable attribute names to be known in order to structure a query in a meaningful way.
Please note that endpoints with parameters that do not have a red ___* required___ label next to them are optional.<br/>
The applicable parameters of the ___find___ endpoints, with acceptable syntax and examples, are in the table below.
<details><summary>More Details</summary>
| Parameter | Description | Syntax | Example |
| :---: | :-----------: | :-------: | :---: |
| filter | Allows conditions to be set as part of the query, returning only results that satisfy the conditions | Comma separated string of attribute:value pairs. Can include comparison operators like >=, <=, <, and >. May use a `.search` after the attribute name to conduct a full text search of the field that are of type string. e.g. `attribute:value,attribute.search:value` | `ecosystem_category:Plants, lat_lon.latitude:>35.0` |
| search | Not yet implemented | Coming Soon | Not yet implemented |
| sort | Specifies the order in which the query returns the matching documents | Comma separated string of attribute:value pairs, where the value can be empty, `asc`, or `desc` (for ascending or descending order) e.g. `attribute` or `attribute:asc` or `attribute:desc`| `depth.has_numeric_value:desc, ecosystem_type` |
| page | Specifies the desired page number among the paginated results | Integer | `3` |
| per_page | Specifies the number of results returned per page. Maximum allowed is 2,000 | Integer | `50` |
| cursor | A bookmark for where a query can pick up where it has left off. To use cursor paging, set the `cursor` parameter to `*`. The results will include a `next_cursor` value in the response's `meta` object that can be used in the `cursor` parameter to retrieve the subsequent results ![next_cursor](../_static/images/howto_guides/api_gui/find_cursor.png) | String | `*` or `nmdc:sys0zr0fbt71` |
| group_by | Not yet implemented | Coming Soon | Not yet implemented |
| fields | Indicates the desired attributes to be included in the response. Helpful for trimming down the returned results | Comma-separated list of attributes that belong to the documents in the collection being queried | `name, ess_dive_datasets` |
| study_id | The unique identifier of a study | Curie e.g. `prefix:identifier` | `nmdc:sty-11-34xj1150` |
| sample_id | The unique identifier of a biosample | Curie e.g. `prefix:identifier` | `nmdc:bsm-11-w43vsm21` |
| data_object_id | The unique identifier of a data object | Curie e.g. `prefix:identifier` | `nmdc:dobj-11-7c6np651` |
| activity_id | The unique identifier for an NMDC workflow execution activity | Curie e.g. `prefix:identifier` | `nmdc:wfmgan-11-hvcnga50.1`|<br/>
<br/>
</details>
""",
},
{
"name": "runs",
"description": (
"[WORK IN PROGRESS] Run simple jobs. "
"For off-site job runs, keep the Runtime appraised of run events."
),
},
]
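
For reference, the two filter syntaxes documented in the `metadata` and `find` blurbs above can be exercised roughly as follows. This is an illustrative sketch only: the `/nmdcschema/{collection_name}` and `/biosamples` paths are assumptions based on the linked API docs, so verify them against https://api.microbiomedata.org/docs before relying on them; the parameter names and example values come from the tables in the blurbs.

```python
# Illustrative sketch of the two filter styles described in the blurbs above.
# Assumption: the endpoint paths below match the live API docs; the parameters
# (filter, max_page_size, per_page) are taken from the tables in the blurbs.
import json
import requests

BASE = "https://api.microbiomedata.org"

# "metadata" style: MongoDB-like filter document against a named collection.
metadata_resp = requests.get(
    f"{BASE}/nmdcschema/biosample_set",
    params={
        "filter": json.dumps(
            {"lat_lon.latitude": {"$gt": 45.0}, "ecosystem_category": "Plants"}
        ),
        "max_page_size": 25,
    },
)

# "find" style: compact attribute:value pairs, with comparison operators.
find_resp = requests.get(
    f"{BASE}/biosamples",
    params={"filter": "ecosystem_category:Plants, lat_lon.latitude:>35.0", "per_page": 50},
)

print(metadata_resp.status_code, list(metadata_resp.json().keys()))
print(find_resp.status_code, list(find_resp.json().keys()))
```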

An example of "parameter-specific annotations" is shown in this issue: #651
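
Concretely, the difference between the current tag-level blurbs and the desired parameter-level annotations looks roughly like this. This is a minimal sketch, assuming FastAPI's standard `openapi_tags` mechanism and `Query(description=...)`; the route, parameter names, and descriptions are hypothetical, not the actual Runtime code.

```python
# Minimal sketch of "tag blurb" vs. "parameter annotation" in FastAPI.
# Assumption: the Runtime registers tags_metadata via the standard
# `openapi_tags` argument; the endpoint and parameters below are hypothetical.
from typing import Optional

from fastapi import FastAPI, Query

tags_metadata = [
    {"name": "find", "description": "Find NMDC metadata entities."},
]

# Status quo: documentation lives in the tag blurb shown at the top of the
# Swagger UI section.
app = FastAPI(openapi_tags=tags_metadata)


@app.get("/hypothetical/find", tags=["find"])
def find_things(
    # Target of this issue: documentation attached to the parameter itself,
    # so Swagger UI shows it next to the input field.
    filter: Optional[str] = Query(
        None,
        description="Comma-separated attribute:value pairs, e.g. `ecosystem_category:Plants`.",
    ),
    per_page: int = Query(
        25, ge=1, le=2000, description="Number of results per page (maximum 2,000)."
    ),
):
    return {"filter": filter, "per_page": per_page}
```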

eecavanna added the documentation and cleanup 🧹 labels on Sep 2, 2024
eecavanna self-assigned this on Sep 2, 2024
eecavanna changed the title from "Convert tag metadata (freeform blurbs) into parameter annotations (Swagger)" to "Convert Swagger tag metadata (freeform blurbs) into parameter annotations" on Sep 2, 2024
eecavanna (Collaborator, Author) commented:

Some of the tag metadata was copied (at least partially) from this how-to guide: https://github.com/microbiomedata/NMDC_documentation/blob/main/docs/howto_guides/api_gui.md (via commit 1a9ff23).

eecavanna (Collaborator, Author) commented Sep 2, 2024

The FindRequest class (within which the definitions of the find-related parameters are consolidated) is implemented as a Pydantic model. According to https://stackoverflow.com/questions/64364499/set-description-for-query-parameter-in-swagger-doc-using-pydantic-model-fastapi, a Pydantic model can't be used directly to declare an endpoint's query parameters in a way that makes Swagger UI show the per-parameter metadata, although there is a workaround. At this point, I don't know why FindRequest was implemented as a Pydantic model (maybe it was to take advantage of some Pydantic feature).
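
For the record, the workaround described in the linked answer boils down to re-declaring the model's fields as `fastapi.Query` parameters inside a dependency. Below is a minimal sketch under that assumption, using a hypothetical `FindParams` stand-in rather than the real FindRequest fields.

```python
# Sketch of the Depends()-based workaround (hypothetical FindParams model;
# not the actual FindRequest fields).
from typing import Optional

from fastapi import Depends, FastAPI, Query
from pydantic import BaseModel

app = FastAPI()


class FindParams(BaseModel):
    filter: Optional[str] = None
    per_page: int = 25


def find_params(
    # Each field is re-declared as a Query so its description reaches the
    # generated OpenAPI spec (and therefore Swagger UI).
    filter: Optional[str] = Query(None, description="Filter conditions."),
    per_page: int = Query(25, le=2000, description="Results per page (max 2,000)."),
) -> FindParams:
    return FindParams(filter=filter, per_page=per_page)


@app.get("/hypothetical/find", tags=["find"])
def find_things(params: FindParams = Depends(find_params)):
    # The endpoint still receives a single validated object.
    return params
```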

ssarrafan commented:

Moving to next sprint @eecavanna

eecavanna (Collaborator, Author) commented Sep 18, 2024

I'll defer this until after the Berkeley Schema Roll Out (unless the Runtime code freeze allows for this type of change via an exception process—TBD).

CC: @aclum

turbomam (Member) commented:

@eecavanna @ssarrafan

I got an email in my LBL inbox that claimed to be a comment on this issue from a user named "levente". It consisted of a suggestion to click on a bit.ly link.

It was tagged as spam by Gmail.

Did any of you get that? Does it concern you?

eecavanna (Collaborator, Author) commented:

Hi @turbomam,

If the username was Klevente12 (which contains the substring "levente") and the comment date was September 2nd (or a few days before that), then yes: I saw a spam comment from that user and reported it to GitHub (the company), which confirmed it violated their terms. Here's an excerpt from the follow-up message I got from GitHub:

Our review of the account named in your report has concluded. We have determined that one or more violations of GitHub’s Terms of Service have occurred and have taken appropriate action in response.

When I see spam comments like that, I (a) refrain from engaging with them and (b) report them to GitHub using the action menu on the comment.

At this point, it's not something I'm worried about (any more than spam on other public forums).

eecavanna (Collaborator, Author) commented:

These are some web pages (related to this task) I want to preserve links to before I reboot my laptop:

ssarrafan commented:

@eecavanna can this be moved out of the sprint and labeled as backlog?

eecavanna (Collaborator, Author) commented:

If there is a board for the "sprint after next," I'd like this to be moved there instead of to a backlog. Otherwise, backlog is OK with me. The task is partially done; it just got lowered in priority in favor of other tasks.

ssarrafan commented:

Moved to sprint 48. Thanks Eric.
