Ken Nakatsu edited this page Apr 15, 2024 · 46 revisions

Welcome to the sRNAfrag wiki!

Thank you for checking out our GitHub repo!

All code is built by Ken Nakatsu, Emory University, '25. The wiki was generated with the assistance of ChatGPT. Please email me at [email protected] or [email protected] if you find any issues.

sRNAfrag is a pipeline to analyze small RNA fragmentation from small RNA sequencing data. It is packaged with a set of tools to work with GTF files specifically in the context of small RNAs.

Overview

Currently we have over 60 functions documented! They generate annotations (specifically, they label copy IDs in the genome so that full-genome alignment can be better utilized), compare databases, and so much more!

Usage - sRNAfrag, sRNA Fragmentation Pipeline

For installation, please see the main page.

  1. FAQ - For some common problems I came across during development.
  2. Scripts - For the scripts that can be used to process sRNAfrag outputs.
  3. Reproduce Results - Reproduce results from our paper.
  4. Minimal Tutorial - An example of how you can use sRNAfrag.

Usage - sRNAscripts, Scripts to work with GTF files

  1. alias_work.py - To work with chromosomal aliases
  2. conversion_tools.py - To convert files to and from GTF, FASTA, and more.
  3. gtf_generation.py - To create new annotation files from sequences and GTF files
  4. gtf_modifiers.py - To modify existing GTF files, e.g. changing keys or adding parent-child relationships
  5. gtf_groundtruth.py - To generate a ground-truth dataset of sequences with different levels of mismatch
  6. gtf_descriptors.py - To describe annotation files, e.g. the number of times certain attributes appear
  7. Use Cases - How these scripts can be used to parse annotations.
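
As a rough illustration of the kind of summary described for gtf_descriptors.py (how often certain attributes appear in an annotation file), here is a minimal standalone sketch in plain Python. It does not use or reproduce the sRNAscripts API: the function names `parse_attributes` and `count_attribute_values`, the `biotype` key, and the example records are all hypothetical. It also assumes the common GTF attribute layout of `key "value";` pairs in the ninth tab-separated column, with no semicolons inside values.

```python
from collections import Counter


def parse_attributes(field):
    """Parse a GTF attribute column of the form: key "value"; key "value";"""
    attributes = {}
    for item in field.strip().split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece after the trailing semicolon
        key, _, value = item.partition(" ")
        attributes[key] = value.strip().strip('"')
    return attributes


def count_attribute_values(gtf_lines, key):
    """Count how often each value of `key` appears across GTF records."""
    counts = Counter()
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) < 9:
            continue  # not a complete nine-column GTF record
        attributes = parse_attributes(columns[8])
        if key in attributes:
            counts[attributes[key]] += 1
    return counts


# Hypothetical records, made up purely for illustration.
example_gtf = [
    'chr1\texample\texon\t100\t172\t.\t+\t.\tgene_id "tRNA-1"; biotype "tRNA";',
    'chr1\texample\texon\t500\t571\t.\t-\t.\tgene_id "tRNA-2"; biotype "tRNA";',
    'chr2\texample\texon\t900\t980\t.\t+\t.\tgene_id "snoRNA-1"; biotype "snoRNA";',
]

biotype_counts = count_attribute_values(example_gtf, "biotype")
print(biotype_counts)
```

To run the same count on a real annotation, pass an open file handle instead of the list, since the function only iterates over lines. For the actual functions and their options, see the gtf_descriptors.py page.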

Contact

Please feel free to contact me:

Ken Nakatsu

[email protected]

Email me if you have any questions regarding the wiki, design choices, or issues. Please include sRNAfrag in the subject line.

Feel free to request annotation file generation.

Table Info

sRNA_tables

Development Ideas and To-Do List. Please email me if you'd like to help (or if you'd like to turn this development into a cross-lab group effort) :)

  • Use Neo4j to manage and query graphs.
  • Do more extensive graph-based analyses.
  • Make it more intuitive to compare across datasets.
  • Implement isoform handling.
  • Simulate isoform generation.
  • Model fitting.