Automate API endpoint development for new plots and tables #75

Open
logstar opened this issue Jun 10, 2022 · 0 comments

Currently, API endpoint development for new plots and tables is a complex procedure. As an example, the development procedure for new DNA methylation plots and tables is briefly described in #69 and #70.

Ideally, developers could add a new plot and table to the API simply by adding a plotting function. From that plotting function, the API framework would automatically generate the required code and data. Because each plot is generated from a table, the table generation function would be included in the plotting function.
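
As a rough illustration of this idea (not the project's actual design, and written in Python rather than the R used by the API), the sketch below shows one way a framework could derive both a table endpoint and a plot endpoint from a single registered plotting function. All names here (`api_plot`, `PlotResult`, `ENDPOINTS`, the endpoint paths, and the toy data) are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

@dataclass
class PlotResult:
    """What a developer-written plotting function returns (hypothetical type)."""
    table: List[dict]  # the table the plot is generated from
    figure: str        # placeholder for a rendered plot

# Hypothetical registry mapping endpoint paths to handler functions.
ENDPOINTS: Dict[str, Callable[..., Any]] = {}

def api_plot(name: str):
    """Register a plotting function and derive table and plot endpoints from it."""
    def decorator(fn: Callable[..., PlotResult]) -> Callable[..., PlotResult]:
        ENDPOINTS[f"/{name}/table"] = lambda **params: fn(**params).table
        ENDPOINTS[f"/{name}/plot"] = lambda **params: fn(**params).figure
        return fn
    return decorator

# Toy in-memory data standing in for a database table (made-up values).
ROWS = [
    {"gene": "GENE_A", "sample": "s1", "beta": 0.2},
    {"gene": "GENE_A", "sample": "s2", "beta": 0.7},
    {"gene": "GENE_B", "sample": "s1", "beta": 0.5},
]

@api_plot("methylation")
def methylation_plot(gene: str) -> PlotResult:
    # Table generation lives inside the plotting function.
    table = [row for row in ROWS if row["gene"] == gene]
    # Stand-in for real plotting code: describe the plot as text.
    figure = f"{gene}: {len(table)} points"
    return PlotResult(table=table, figure=figure)

table_rows = ENDPOINTS["/methylation/table"](gene="GENE_A")
plot_text = ENDPOINTS["/methylation/plot"](gene="GENE_A")
```

In this sketch the developer only writes `methylation_plot`; the two endpoint handlers are generated by the decorator. A real framework would also need to generate the database tables and tests described below, which this sketch does not attempt.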

This ticket is for discussing options to automate API endpoint development. The overall direction is based on discussions with @jharenza and @chinwallaa.

The following is a brief description of the current development procedure:

For any new table and plot to be added to the API, example plots and tables need to be generated using https://github.com/PediatricOpenTargets/OpenPedCan-analysis data. The code for generating the example plots and tables is then integrated into the API. For example, CNV example plot creation is tracked by d3b-center/ticket-tracker-OPC#349.

However, integrating the code for generating example plots and tables is time-consuming, because developers generally need to work through the following steps:

  • Convert data release files into database tables.
    • Clean up data.
    • Handle independent samples, all cohorts, n sample filtering. If a plot needs more than one data file, align the files so that there are no duplicated or missing links.
    • Load processed data into the database.
    • Dump the database for deployment on a remote server.
  • Write R code to generate plots via API:
    • Query the database. Performance may need to be evaluated, possibly by adding columns in the data development step.
    • Process query results. There may be missing or duplicated records.
    • Plot. The code for generating example plots needs to be heavily refactored to handle all possible cases, e.g., plot width/height, title/legend location, and text length. The refactoring step usually takes much longer than developing the example plots.
  • Write tests for new plots.
  • Evaluate new test plots and tables.
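
To make the data preparation steps above more concrete, here is a minimal Python sketch of checking two data files for duplicated or missing links, loading the cleaned rows into a database, and dumping it. It uses an in-memory SQLite database purely for illustration; the database engine, the table and column names (`expression`, `sample_id`, `tpm`), and the helper functions `check_links` and `load_and_dump` are assumptions, not the API's actual schema or code.

```python
import sqlite3
from collections import Counter

def check_links(left_keys, right_keys):
    """Return (duplicated, missing) keys when linking two data files.

    duplicated: keys that occur more than once within either file.
    missing: keys that occur in only one of the two files.
    """
    left_counts, right_counts = Counter(left_keys), Counter(right_keys)
    duplicated = sorted(
        {key
         for counts in (left_counts, right_counts)
         for key, n in counts.items()
         if n > 1}
    )
    missing = sorted(set(left_counts) ^ set(right_counts))
    return duplicated, missing

def load_and_dump(rows):
    """Load (sample_id, tpm) rows into an in-memory SQLite table and return a SQL dump."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE expression (sample_id TEXT PRIMARY KEY, tpm REAL)")
    conn.executemany("INSERT INTO expression VALUES (?, ?)", rows)
    conn.commit()
    dump = "\n".join(conn.iterdump())
    conn.close()
    return dump

# Made-up example inputs: one file has a duplicated sample, the other an unmatched one.
expression_rows = [("s1", 1.5), ("s2", 0.3), ("s2", 0.4)]
histology_samples = ["s1", "s2", "s3"]

duplicated, missing = check_links(
    [sample_id for sample_id, _ in expression_rows], histology_samples
)

# Keep only rows whose key is neither duplicated nor missing before loading.
clean_rows = [
    row for row in expression_rows
    if row[0] not in duplicated and row[0] not in missing
]
sql_dump = load_and_dump(clean_rows)
```

Dropping every duplicated or unmatched record is only one possible policy; which records to keep would depend on the independent-sample and cohort rules for each plot.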

cc @taylordm @kelseykeith @afarrel @ewafula
