Soil contig tax #24

brynnz22 · 2023-12-20T21:09:30Z

Added a Python notebook to look at how the taxonomic distribution of contigs differs by soil layer (mineral vs. organic) in Colorado. This uses NMDC metadata to access and analyze metagenome data.

Merge branch 'main' into soil-contig-tax

kheal · 2023-12-20T23:32:15Z

@brynnz22 Overall, an impressive feat to pull together all these API calls and merges in a consumable way. Wow! The notebook rendered fine without running the calls or accessing the pkl files, so I think we are good on that front. We should, though, add a note to the README for the Google Colab that rerunning this in an interactive environment is not recommended and will likely break for these reasons.

Do you use the 'taxonomic_dist_by_soil_layer/python/mongodb_query.txt.js' file? Maybe I'm missing it, but if not, please remove it from the branch.

In the first Markdown cell, the word 'object' is misspelled.

Once we have the TSV URLs, I think it would be useful to show a single sample's results before concatenating them all together. That dataframe should have soil horizon, biosample id, geo_loc, taxa, and count.
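A minimal sketch of what such a single-sample table could look like, assuming the TSV has three headerless columns (contig id, taxonomic placement, count). The function name `single_sample_table`, the example rows, and the metadata values are hypothetical and only for illustration; in the notebook the TSV would be read from the sample's URL rather than from an in-memory string.

```python
import io

import pandas as pd

# Hypothetical stand-in for one sample's contig taxonomy TSV (assumed to have
# no header row); the real file would be read from the sample's TSV URL.
EXAMPLE_TSV = (
    "contig_1\tBacteria;Proteobacteria\t3\n"
    "contig_2\tBacteria;Acidobacteria\t5\n"
    "contig_3\tBacteria;Proteobacteria\t2\n"
)


def single_sample_table(tsv_source, biosample_id, soil_horizon, geo_loc):
    """Summarize one sample's contig file as per-taxon counts plus metadata."""
    contigs = pd.read_csv(
        tsv_source, sep="\t", header=None, names=["contig_id", "taxa", "count"]
    )
    # Collapse contigs that share a taxonomic placement within this sample.
    per_taxon = contigs.groupby("taxa", as_index=False)["count"].sum()
    # Attach the sample-level metadata as constant columns at the front.
    per_taxon.insert(0, "soil_horizon", soil_horizon)
    per_taxon.insert(1, "biosample_id", biosample_id)
    per_taxon.insert(2, "geo_loc", geo_loc)
    return per_taxon


sample_df = single_sample_table(
    io.StringIO(EXAMPLE_TSV),
    biosample_id="example-biosample-1",  # hypothetical identifier
    soil_horizon="M horizon",            # hypothetical label
    geo_loc="USA: Colorado",             # hypothetical location string
)
print(sample_df)
```

Showing one table like this before the concatenation step makes it easier to check that the metadata columns line up with the right sample.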

Biologically, it's not correct to add together counts between samples, so I think we need to revisit the last couple of code chunks so they make a bit more sense. I have a couple of ideas for this that shouldn't be too painful (hopefully!).

brynnz22 · 2023-12-22T01:19:58Z

@kheal addressing your points above, I:

edited the README to explain that running in the interactive environment is not recommended
We should keep the mongodb_query.txt.js file because it helped inform the API request traversals. It is also helpful for informing the endpoint being created.
I fixed the misspelling of object
I printed a snippet of a TSV to show what it looks like
Finally, we discussed the last point and agreed that the way I did it was correct.

I also created a second plot faceted by locations in Colorado

Thanks for the feedback :)

kheal · 2023-12-22T20:15:50Z

In md cell 35, "Example of what the TSV contig taxa file looks like", we decided the third column is not a percent (as a percent it would add up to > 100%), so that text should be edited. How about something along the lines of "The first column is the identifier of a single contig, the second is the taxonomic placement of the contig, and the third is a simple count"? In py cell 36, I would rename the percent column to count. Also, we will need to calculate relative abundance per sample per taxon, and then calculate the average relative abundance per horizon, as we discussed.
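The two-step calculation described above could be sketched roughly as follows. This is an illustration under assumptions, not the notebook's actual code: the column names, sample ids, horizon labels, and counts are all made up, and a taxon that is absent from a sample is assumed to have a relative abundance of zero in that sample.

```python
import pandas as pd

# Hypothetical long-format table: one row per (sample, taxon) with a raw count.
counts = pd.DataFrame({
    "biosample_id": ["s1", "s1", "s2", "s2", "s2", "s3", "s3"],
    "soil_horizon": ["M", "M", "M", "M", "M", "O", "O"],
    "taxa": ["A", "B", "A", "B", "C", "A", "B"],
    "count": [10, 30, 50, 30, 20, 20, 60],
})

# Step 1: relative abundance per sample per taxon. Each count is divided by
# the total for its own sample, so samples with more counts do not dominate.
sample_totals = counts.groupby("biosample_id")["count"].transform("sum")
counts["rel_abundance"] = counts["count"] / sample_totals

# Step 2: average relative abundance per horizon. Pivoting to one row per
# sample fills taxa missing from a sample with 0 before the mean is taken.
per_sample = counts.pivot_table(
    index=["soil_horizon", "biosample_id"],
    columns="taxa",
    values="rel_abundance",
    fill_value=0,
)
mean_by_horizon = per_sample.groupby(level="soil_horizon").mean()
print(mean_by_horizon)
```

Normalizing within each sample first means the horizon averages compare proportions rather than raw totals, which avoids adding counts across samples.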

brynnz22 · 2024-01-02T23:23:05Z

Okay! I think we are good! Thanks again for all of your help!!

brynnz22 · 2024-01-03T21:18:43Z

@kheal I believe I got the nbviewer to work now: https://nbviewer.org/github/microbiomedata/notebook_hackathons/blob/soil-contig-tax/taxonomic_dist_by_soil_layer/python/taxonomic_dist_soil_layer.ipynb .

kheal

Everything looks good; we should be good to merge now. Cheers!

kheal · 2024-01-03T21:51:49Z

closes #8!

brynnz22 and others added 14 commits November 15, 2023 15:35

add mongodb initial query

208e930

save query as txt file?

6ebd0a2

add initial requests

97fc752

merge main

3a96df4

Merge branch 'main' into soil-contig-tax

Format MongoDB query (text)

8cae54b

change mongo file to .txt.js

91f77be

:wq

0bbcd63

Merge branch 'main' into soil-contig-tax

change mongo query to .txt.js

819507b

change mongo file to .txt.js

b24f519

Merge branch 'main' into soil-contig-tax

a83bf10

updated contig files. Copy is df instead of looping

1808b37

Merge branch 'main' into soil-contig-tax

1a686e7

drop_duplicates for faster joining, save as csv to avoid rerunning cells

9082c36

fix large files

8849675

brynnz22 requested a review from kheal December 20, 2023 21:11

add Readme for taxonomic dist notebook

05dc456

brynnz22 added 2 commits December 21, 2023 17:14

fixes to notebook, add gitignore pickle file

c1f70c8

edit readme

505fe8a

fix notebook for relative abundances, add dill to requirements

66d79f1

brynnz22 added 3 commits January 2, 2024 16:04

save widget state

8d7e379

try plotly offline

e33b14b

fix plotly renderings?

9feb0a5

Modify readmes for NEON soil taxonomy notebooks

2a290ea

kheal approved these changes Jan 3, 2024

kheal merged commit 99adf59 into main Jan 3, 2024

kheal deleted the soil-contig-tax branch May 29, 2024 22:05
