Learn about read counts, RPKM, volcano plots, heatmaps, differential gene expression, biomaRt annotation, and pathway analysis with DAVID (Database for Annotation, Visualization and Integrated Discovery). Lecture 3 explains the downstream analysis of BAM files from RNA sequencing data using free and open-source bioinformatics software.
Previous lectures:
https://youtu.be/PlqDQBl22DI (Part 1) and
https://youtu.be/hWCDwLByxvs (Part 2)
Code on GitHub: https://gist.github.com/DannyArends/c70f21208438cd1305162f25435922f7
Presentation on OneDrive: https://1drv.ms/b/s!AtYWSYRMmSHZh4h_dY5wmLuOnIgMUQ?e=RrGqPG
Thanks for taking an interest in my channel 😄 If you've made it this far down, support me by giving a like or subscribing. Join me during my live streams on Thursday afternoons on YouTube, or follow me on Twitch @ https://www.twitch.tv/dannyarends
Chapters:
00:00:00 - Sound check and introduction
00:00:51 - Overview for today
00:02:48 - Quick overview of the project data
00:05:01 - Adjusting the pipeline, input from the command line
00:07:36 - Executing the pipeline for each SRA sample
00:10:26 - Copying BAM files to Windows using a shared folder in VirtualBox
00:14:50 - Preparing a new R script, required packages for today
00:18:00 - Building the exons per gene database
00:20:48 - Adding the install script to the GitHub gist
00:22:28 - Computing the length in base pairs of all transcripts
00:29:03 - Loading the BAM alignment files into R
00:35:16 - Extracting raw read counts from BAM files
00:44:07 - Computing RPKM values in R
00:52:38 - Violin plots of the RPKM distribution per sample
00:55:38 - Quantile normalization & log2 transformation of RPKM values
01:04:05 - Computing differential gene expression and fold change
01:09:02 - Creating the volcano plot
01:15:04 - Subsetting up/down regulated genes, creating a clustered heatmap
01:24:23 - Creating an overview table for publication / supplemental files
01:29:27 - biomaRt for gene annotation of up/down regulated genes
01:41:00 - Exporting the data to Excel
01:43:33 - Pathway over-representation analysis using DAVID
01:44:50 - Microsoft Excel trick: Prevent genes turning into dates
01:47:43 - Pathway over-representation analysis using DAVID (Continued)
01:50:35 - Making sense of KEGG pathway results
01:55:04 - Overview, Questions, and Outro
#rnaseq #howto #bioinformatics #computationalbiology #academicyoutube #sequencing #ngs #nextgenerationsequencing #educationalvideos #biostreaming #academia #software #volcanoplot #foldchange #geneexpression #statistics