GitHub - hbctraining/Accessing_public_genomic_data: Tutorials on accessing public reference and genomic data

Accessing genomic reference and experimental sequencing data

Audience	Computational skills required	Duration
Biologists	Beginner bash	3 hour workshop (~3 hours of trainer-led time)

Description

This repository has teaching materials for a 3-hour, hands-on workshop, "Accessing genomic reference and experimental sequencing data", led at a relaxed pace.

For many types of sequencing analyses, we need access to public data stored in various databases and repositories. This workshop will discuss the types of genomic reference data available through public databases such as Ensembl, NCBI, and UCSC, and step through how to find and download this data. The workshop will also explore how to find and download publicly available experimental data, such as data (FASTQ files and count matrices) from published papers, using the GEO and SRA repositories. While most of the workshop will access data using a web browser, downloading data from the SRA will require beginner knowledge of the command-line interface.
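As a rough illustration of how a genome build shows up when locating reference data, the sketch below assembles the address of a genome FASTA file on the Ensembl FTP site from a species name, an assembly (build) name, and an Ensembl release number. The function name `ensembl_fasta_url` and the example values (human, GRCh38, release 110) are assumptions made for illustration and are not taken from the workshop lessons; the path layout follows the pattern commonly seen on the Ensembl FTP site, and `primary_assembly` files are only provided for some species, so check the site listing before relying on a generated address.

```python
def ensembl_fasta_url(species: str, assembly: str, release: int) -> str:
    """Build the URL of a primary-assembly genome FASTA on the Ensembl FTP site.

    Hypothetical helper for illustration: the path layout is assumed from the
    pattern commonly used on ftp.ensembl.org and may differ between releases.
    """
    species_dir = species.lower()            # directory name, e.g. "homo_sapiens"
    file_prefix = species_dir.capitalize()   # file name prefix, e.g. "Homo_sapiens"
    return (
        f"https://ftp.ensembl.org/pub/release-{release}/fasta/"
        f"{species_dir}/dna/{file_prefix}.{assembly}.dna.primary_assembly.fa.gz"
    )


# Example values are assumptions: human, build GRCh38, Ensembl release 110.
url = ensembl_fasta_url("homo_sapiens", "GRCh38", 110)
print(url)
```

The build name and the release number are separate parts of the address: the same genome build can appear in many Ensembl releases, each with updated annotation.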

Learning Objectives

Understanding what a genome build is
Identifying differences in reference data available from Ensembl, NCBI, and UCSC
Finding and downloading experiment-appropriate genome reference data
Finding and downloading publicly available experimental sequence data

These materials were developed for a trainer-led workshop, but are also amenable to self-guided learning.

Lessons

Click here for links to lessons and the suggested schedule using the HMS-RC O2 cluster
Click here for links to lessons and the suggested schedule using the FAS-RC Odyssey cluster

Dataset

We will be demonstrating how to access the data in the lessons.

Installation Requirements

Mac users: No installation requirements.

Windows users: GitBash

These materials have been developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC). These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

