📝 Sql query tutorial #642

chris-s-friedman · 2021-12-07T19:20:23Z

Adds a tutorial on one method for sourcing data from a PostgreSQL database, and moves the tutorial on sourcing from Study Creator into the same directory as the tutorial on sourcing data from SQL.

This is a potential solution for #627

🚚 move index for sourcing data from study creator now it's in its own dir

fiendish · 2021-12-07T21:42:08Z

that's a very clever trick

docs/source/tutorial/sourcing_data/query_sql.rst

nicholasvk · 2021-12-08T14:21:21Z

Very cool! Just want to chime in with some history/background toward developing standards on when it makes sense to use SQL directly. We thought about doing this for the CBTN refresh work which relies on data sources in the D3b warehouse. However, after discussions as a team with Allison/Bailey, we decided that we should write SQL output to Data Tracker via the study creator API for audit purposes and then use the files on data tracker for actual ingest. Database tables will change over time, rows will be added/deleted and/or updated, so having a record of exactly what is ingested for a given run seemed important at the time. A bonus is that Data Tracker makes this history easily accessible to non-ADAPT team members. You can review the code (authored by Meen and Avi) for that here:
https://github.com/d3b-center/d3b-warehouse-kids-first-refresh/blob/main/etl_from_eig_into_warehouse/dumping.py

Is this the best workflow? Are there other solutions to explore for this? Absolutely, just wanted to add to the conversation with history on the CBTN side.
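As a rough illustration of the audit idea described above (this is not the linked `dumping.py` and does not use the Study Creator API), the sketch below writes a query result to a timestamped CSV file and records a checksum next to it, so there is a fixed record of what a given run ingested. The function name `snapshot_query`, the manifest fields, and the use of `sqlite3` as a stand-in for the warehouse connection are all assumptions made for the example.

```python
import csv
import hashlib
import json
import sqlite3
from datetime import datetime, timezone
from pathlib import Path


def snapshot_query(conn, query, out_dir):
    """Write the rows returned by `query` to a CSV file plus a JSON manifest.

    Hypothetical helper: `conn` is any DB-API style connection that supports
    `conn.execute` (sqlite3 is used here only as a stand-in).
    """
    cursor = conn.execute(query)
    columns = [col[0] for col in cursor.description]
    rows = cursor.fetchall()

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    csv_path = out_dir / f"snapshot_{stamp}.csv"

    # Write a header row followed by the data rows.
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(columns)
        writer.writerows(rows)

    # Record enough detail to check later that the file is unchanged.
    manifest = {
        "file": csv_path.name,
        "query": query,
        "row_count": len(rows),
        "sha256": hashlib.sha256(csv_path.read_bytes()).hexdigest(),
    }
    csv_path.with_suffix(".json").write_text(json.dumps(manifest, indent=2))
    return csv_path, manifest
```

The resulting CSV is what would be uploaded and ingested; the manifest is only a convenience for confirming later that the ingested file matches the one that was dumped.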

chris-s-friedman · 2021-12-08T16:51:57Z

@nicholasvk I think those are really good points about:

- being able to audit what data is used for ingest
- the volatility inherent in using SQL as a data source, i.e. rows being added, deleted, or updated
- making it easy for non-technical folks to have eyes on the data being used for ingest

My thinking in proposing a tutorial on how to query SQL in an ingest is for two specific circumstances:

- ingesting information about genomic files from AWS scrapes. Instead of having ingesters generate AWS manifests, we can pull directly from the file_metadata schema in Postgres
- providing a tutorial for external users who use the ingest library and don't have the Data Tracker, but do use databases as source data

That said, I think I should prepend this tutorial with a note about your three points and how querying directly from SQL impacts them. Perhaps even say "instead of querying SQL, you may want to create a static view of your database in a single file".
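For readers who want a picture of the direct-query approach being discussed, here is a minimal sketch. It is not the code from `query_sql.rst`; it only shows the general shape of running a parameterized query and handing the rows on as dictionaries. The helper `fetch_records`, the table `genomic_files`, and its columns are hypothetical, and `sqlite3` stands in for a PostgreSQL driver, whose placeholder syntax and cursor API may differ.

```python
import sqlite3


def fetch_records(conn, query, params=()):
    """Run a parameterized query and return each row as a plain dict.

    Hypothetical helper: `conn` is a sqlite3 connection here; with PostgreSQL
    a driver connection would take its place.
    """
    cursor = conn.execute(query, params)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


if __name__ == "__main__":
    # Build a tiny in-memory table so the example runs without a server.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE genomic_files (s3_key TEXT, size INTEGER)")
    conn.executemany(
        "INSERT INTO genomic_files VALUES (?, ?)",
        [("a.cram", 5), ("b.bam", 20)],
    )
    # Parameters are passed separately rather than formatted into the SQL.
    for record in fetch_records(
        conn, "SELECT s3_key, size FROM genomic_files WHERE size > ?", (10,)
    ):
        print(record)
```

Because the rows come straight from a live table, two runs of the same ingest can see different data, which is the trade-off the note at the top of the tutorial is meant to flag.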

docs/source/tutorial/sourcing_data/query_sql.rst

@nicholasvk

🚨 remove trailing whitespace

✏️ fix spelling and capitalization
Co-authored-by: Giovanni Santia <[email protected]>

✨ add note about considering not querying sql directly
thanks @nicholasvk for the suggestion

🚨 remove more trailing whitespace

nicholasvk

Updated documentation for SQL considerations looks great!

## Release 1.11.0

### Summary

- Emojis: ? x3, ✨ x3
- Categories: Additions x3, Other Changes x3

### New features and changes

- [#645](#645) - add gru-npu consent group - [1da6dd7](1da6dd7) by [chris-s-friedman](https://github.com/chris-s-friedman)
- [#644](#644) - ✨ specify external IDs for clinical markers - [c1f6c1c](c1f6c1c) by [chris-s-friedman](https://github.com/chris-s-friedman)
- [#643](#643) - ✨ Add new sequencing center Tempus - [87f5cdf](87f5cdf) by [youngnm](https://github.com/youngnm)
- [#642](#642) - Sql query tutorial - [92889ef](92889ef) by [chris-s-friedman](https://github.com/chris-s-friedman)
- [#641](#641) - ✨ Add NIH and Methylation constants - [6a1e70b](6a1e70b) by [youngnm](https://github.com/youngnm)
- [#639](#639) - ✨ add CSIR sequencing Center - [81e5752](81e5752) by [chris-s-friedman](https://github.com/chris-s-friedman)

🚚 move study creator file to be in a dir about sourcing files

26bee0c

🚚 move index for sourcing data from study creator now it's in its own dir

chris-s-friedman requested a review from a team as a code owner December 7, 2021 19:20

chris-s-friedman requested review from Christina-J-Diaz and youngnm December 7, 2021 19:21

chris-s-friedman force-pushed the sql_query_tutorial branch from a4e17da to 6a58c39 Compare December 7, 2021 19:28

gsantia reviewed Dec 7, 2021

docs/source/tutorial/sourcing_data/query_sql.rst

chris-s-friedman force-pushed the sql_query_tutorial branch from ce2c0ac to a34c23f Compare December 8, 2021 13:29

chris-s-friedman commented Dec 8, 2021

docs/source/tutorial/sourcing_data/query_sql.rst

chris-s-friedman force-pushed the sql_query_tutorial branch from a34c23f to 91bdb12 Compare December 8, 2021 21:17

chris-s-friedman force-pushed the sql_query_tutorial branch from 91bdb12 to 75afaa0 Compare December 8, 2021 21:19

chris-s-friedman requested review from nicholasvk and gsantia December 8, 2021 21:20

gsantia approved these changes Dec 8, 2021

nicholasvk approved these changes Dec 8, 2021

chris-s-friedman merged commit 92889ef into master Dec 9, 2021

chris-s-friedman deleted the sql_query_tutorial branch December 9, 2021 00:04

chris-s-friedman mentioned this pull request Feb 23, 2022

🏷 Release 1.11.0 #646

Merged

fiendish changed the title ~~Sql query tutorial~~ 📝 Sql query tutorial Feb 23, 2022
