Promoter discovery using natural language processing

This repo contains code for promoter discovery and biodiversity mining using natural language processing. This effort is being pursued through three specific aims:

Aim 1: Develop a natural language processing-based model for promoter identification
Aim 2: Extend model to identify inducible promoter sequences
Aim 3: Experimentally validate promoter predictions
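
This README does not say how sequences are turned into "words" for the model in Aim 1. One common way to treat DNA as text is to split each sequence into overlapping k-mers. The sketch below is only an illustration under that assumption: the function name `kmer_tokens` and the default `k=3` are hypothetical and are not taken from this repo.

```python
def kmer_tokens(sequence, k=3):
    """Split a DNA sequence into overlapping k-mers ("words").

    Assumption: overlapping k-mers with stride 1; the repo may tokenize differently.
    """
    sequence = sequence.upper()
    # A sequence shorter than k yields an empty list.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


if __name__ == "__main__":
    # Short example sequence, chosen only for illustration.
    print(kmer_tokens("TATAAT"))  # ['TAT', 'ATA', 'TAA', 'AAT']
```

The resulting token lists could then be fed to any text model that expects word sequences.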

Promoter sequences were collected from three main databases: EPDnew, RegulonDB, and DBTBS.

Directory structure

data

All data files are found in and/or will be written to data/

data/DBTBS/
- Contains raw data from DBTBS: Bacillus subtilis promoter database
data/EPDnew/
- Contains raw data from EPDnew: Eukaryote promoter database (promoter data for 15 different organisms)
data/RegulonDB/
- Contains raw data from RegulonDB: Escherichia coli promoter database
data/parsed_promoter_data/
- Promoter data parsed from each database
data/20191114promoter_identification_ML_curation
- Manually curated information on other state-of-the-art ML models for promoter prediction
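
The layout above can be checked with a short script before running any parsing code. This is a minimal sketch, assuming it is run from the repository root. The helper name `missing_data_dirs` is hypothetical, and only the sub-directories listed above with a trailing slash are checked.

```python
from pathlib import Path

# Sub-directories of data/ named in this README.
EXPECTED_DATA_DIRS = [
    "DBTBS",
    "EPDnew",
    "RegulonDB",
    "parsed_promoter_data",
]


def missing_data_dirs(repo_root):
    """Return the expected data/ sub-directories that do not exist under repo_root."""
    data_dir = Path(repo_root) / "data"
    return [name for name in EXPECTED_DATA_DIRS if not (data_dir / name).is_dir()]


if __name__ == "__main__":
    # Assumption: the current working directory is the repository root.
    for name in missing_data_dirs("."):
        print(f"missing: data/{name}/")
```

If nothing is printed, all four directories are present.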

src

All code is found in and/or will be written to src/, as either notebooks or scripts

Notebooks:

src/notebooks/20191125_promoter_database_parsing.ipynb
- Notebook containing code for parsing promoter data

figs

All raw and edited figures will be written to figs/
