Michele Cosi edited this page Oct 29, 2024 · 43 revisions

Exploring Tools for Data Analysis and AI Applications in Biosciences and Genomics


Fall 2024 Workshop: Bioinformatics & Genomics: From Data Analysis to AI Applications

This workshop provides graduate students in public universities with the necessary skills and tools to analyze biological data using high-performance computing resources.

Participants will gain hands-on experience with industry-standard command-line interface (CLI) tools for DNA and RNA sequencing analysis, sequence manipulation and alignment, and pipeline management for automating complex workflows. They will also learn about differential expression analysis for identifying genes with altered expression levels, data visualization techniques for presenting results effectively, and the basics of artificial intelligence (AI) and machine learning (ML) in bioinformatics.

Upon completion of this workshop, participants will be able to use these tools and methods to address real-world biological questions and contribute to bioinformatics research.

Required Skills

| Skill | Description |
| --- | --- |
| Basic understanding of biology | This workshop assumes a basic understanding of biological concepts, such as DNA, RNA, genes, and genomes. |
| Familiarity with the command line (optional, but helpful) | While not required, familiarity with the command line will help you navigate the tools covered in the workshop. |
| Enthusiasm for learning new computational skills | A strong interest in learning new computational skills is essential for success in this workshop. |

Workshop Program

Time: Tuesdays at 3 PM (for the Zoom link and in-person information, please sign up on the University of Arizona Data Science Institute DataLab website)

All sessions are recorded and uploaded to the University of Arizona's DataLab YouTube channel, where you can also find the other DataLab series: Natural Language Processing (NLP), Generative AI, NextGen Geospatial.

| Date | Title | Topic | Description | Instructors | Material Link/Recording |
| --- | --- | --- | --- | --- | --- |
| 09/03 | Exploring Analysis Platforms and High-Performance Computing Resources for Bioinformatics | Platforms and high-performance computing resources for bioinformatics (HPC/CyVerse) | Learn about the infrastructure behind bioinformatics analysis: high-performance computing (HPC) clusters; explore CyVerse, a user-friendly platform for accessing HPC resources and running bioinformatics analyses. | Michele Cosi / Carlos Lizárraga | Material, Recording |
| 09/10 | Intro to Essential Tools for Every Bioinformatician | CLI tools for DNA and RNA-seq (BLAST, SAMtools, FastQC/MultiQC, STAR/TopHat/Bowtie2, BWA, HISAT2, GATK, BUSCO) | Master the CLI, a bioinformatics fundamental; learn essential tools such as BLAST, SAMtools, and FastQC/MultiQC, and aligners such as STAR, TopHat, Bowtie2, BWA, and HISAT2; become proficient in GATK for variant calling and BUSCO for genome completeness assessment. | Michele Cosi | Material, Recording |
| 09/17 | An Introduction to Sequence Manipulation, Alignment, and Assessment for Bioscience Analyses | Sequence manipulation, alignment, and assessment (BLAST, BWA, SAMtools, STAR/TopHat/Bowtie2, FastQC/MultiQC) | Deepen your knowledge of sequence manipulation for preparing sequencing data; improve your alignment skills using various tools, and learn how to assess alignment quality. | Michele Cosi | Material, Recording |
| 09/24 | Using Nextflow for Streamlining Bioscience Analysis Pipelines | Pipeline management: Nextflow | Learn automation with Nextflow, a tool that simplifies complex bioinformatics analyses; acquire the skills to design and run efficient pipelines, saving time and reducing errors. | Michele Cosi | Material, Recording |
| 10/01 | Using Snakemake for Streamlining Data Analysis Pipelines | Pipeline management: Snakemake | Repeat the pipeline tasks from the previous session, this time using Snakemake as an alternative. | Michele Cosi | Material, Recording |
| 10/08 | A Beginner's Guide to Gene Expression: Diving into DESeq Analysis | Differential expression analysis (DESeq) | Master DESeq to identify genes with significant expression changes between biological conditions; learn to interpret results for biological insights. | Michele Cosi / Carlos Lizárraga | Material, Recording |
| 10/15 | Fundamentals of Data Visualization | Presenting and reading data (PCA plot, volcano plot, heatmap, ggplot2) | Create compelling data visualizations using PCA plots, volcano plots, heatmaps, and ggplot2; present your bioinformatics findings effectively. | Carlos Lizárraga | Material, Recording |
| 10/22 | Explore Current AI/ML Trends and Tools in Bioinformatics | AI/ML in bioinformatics | Explore AI and ML applications in bioinformatics; understand their role in biological research and in solving complex challenges. | Carlos Lizárraga | Link |
| 10/29 | Using MLflow to Track Machine Learning Projects in Bioinformatics | MLflow | Explore MLflow, a platform for managing the machine learning lifecycle; learn practical skills for experiment tracking, model deployment, and collaboration in bioinformatics research. | Artin Majdi / Carlos Lizárraga | Link |


Updated: 08/22/2024 (M. Cosi)

UArizona Data Lab, Data Science Institute, University of Arizona.

CC BY-NC-SA 4.0