Perhaps you just looked over the resources for your local high-performance computer and learned that you'll have to use something called SLURM. WHAT!?!?! Yup. You've reached that point where you have more data, or a more demanding analysis, than your own computer can handle. The solution is to turn to a high-performance computer (HPC) to run your analysis. To do that, you will often need to use a workload manager like SLURM or TORQUE. In this episode of Code Club, Pat demonstrates three different ways to submit jobs to your #HPC using #SLURM: interactively, as a single batch job, and as a batch array job. This episode is part of a wider effort to demonstrate the utility of the #mikropml package his lab created to facilitate machine learning analyses. The data come from a microbiome study his lab published that looked for biomarkers associated with colorectal cancer.
In this episode, Pat will use functions from the mikropml R package and data-handling functions from dplyr in #RStudio. The accompanying blog post can be found at https://www.riffomonas.org/code_club/2021-07-21-slurm.
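As a rough sketch of what those three submission styles look like on the command line (the account, partition, script names, and resource requests below are placeholders for illustration, not the exact ones used in the episode):

# Interactive mode: request a shell on a compute node and work there directly
srun --account=example0 --partition=standard --time=01:00:00 --pty bash

# Batch mode, single command: put the job in a script (here, run_model.sh)...
#!/bin/bash
#SBATCH --job-name=run_model
#SBATCH --time=02:00:00
#SBATCH --mem=4G
Rscript code/run_model.R

# ...and submit it from the login node
sbatch run_model.sh

# Batch mode, array job: add an --array directive to the script so SLURM runs
# it once per index, exposing each index as $SLURM_ARRAY_TASK_ID
#SBATCH --array=1-100
Rscript code/run_model.R $SLURM_ARRAY_TASK_ID

Submit the array script with sbatch as before; squeue -u $USER will show the queued and running jobs.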
Resource guide for using command line and SLURM from the University of Michigan Advanced Research Computing Technology Services:
https://arc.umich.edu/wp-content/uploads/sites/4/2020/05/Great-Lakes-Cheat-Sheet.pdf
If you're interested in taking an upcoming 3-day R workshop, email me at
[email protected]!
R: https://r-project.org
RStudio: https://rstudio.com
Raw data: https://github.com/riffomonas/raw_data/releases/latest
Workshops: https://www.mothur.org/wiki/workshops
You can also find complete tutorials for learning R with the tidyverse using...
Microbial ecology data: https://www.riffomonas.org/minimalR/
General data: https://www.riffomonas.org/generalR/
0:00 Introduction
1:57 High performance computers
8:54 Workload management tools
12:49 Interactive mode
16:51 Batch mode - single command
21:49 Batch mode - array job
26:26 Recap