Identifying the "best" or "optimal" regression model is a mix of art and science. In this lecture we discuss several methods for comparing models: the coefficient of determination (R-squared), F-tests and partial F-tests, model selection criteria (highlighting AIC, AICc, and BIC), and Mallows' Cp. These are illustrated in R (with corresponding SAS code) using all possible subsets regression, where every possible combination of candidate variables is fit and the resulting models are compared.
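As a rough illustration of the all possible subsets idea described above, here is a minimal Python sketch. It is not the lecture's code (the lecture uses R and SAS). The data are simulated, the variable names x1, x2, x3 and the helper functions ols_rss and all_subsets are hypothetical, and the criteria use one common Gaussian least squares convention: AIC = n log(RSS/n) + 2p, where p counts the intercept and slopes. Other conventions (for example, also counting the error variance as a parameter) shift the values and can change AICc slightly.

```python
import itertools
import math
import random


def ols_rss(X, y):
    """Residual sum of squares for a least squares fit of y on X plus an intercept."""
    n = len(y)
    A = [[1.0] + list(row) for row in X]  # design matrix with intercept column
    k = len(A[0])
    # Augmented normal equations: (A'A | A'y)
    M = [[sum(A[i][r] * A[i][c] for i in range(n)) for c in range(k)]
         + [sum(A[i][r] * y[i] for i in range(n))] for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(k):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    beta = [M[r][k] / M[r][r] for r in range(k)]
    fitted = [sum(b * a for b, a in zip(beta, A[i])) for i in range(n)]
    return sum((y[i] - fitted[i]) ** 2 for i in range(n))


def all_subsets(X, y, names):
    """Fit every subset of predictors and return one dict of criteria per model."""
    n, m = len(y), len(names)
    ybar = sum(y) / n
    tss = sum((v - ybar) ** 2 for v in y)
    mse_full = ols_rss(X, y) / (n - m - 1)  # error variance estimate from the full model
    rows = []
    for size in range(m + 1):
        for idx in itertools.combinations(range(m), size):
            rss = ols_rss([[row[j] for j in idx] for row in X], y)
            p = size + 1  # number of coefficients, including the intercept
            aic = n * math.log(rss / n) + 2 * p
            rows.append({
                "vars": tuple(names[j] for j in idx),
                "r2": 1 - rss / tss,
                "adj_r2": 1 - (rss / (n - p)) / (tss / (n - 1)),
                "aic": aic,
                "aicc": aic + 2 * p * (p + 1) / (n - p - 1),
                "bic": n * math.log(rss / n) + p * math.log(n),
                "cp": rss / mse_full - n + 2 * p,
            })
    return rows


# Simulated example: y depends on x1 and x2; x3 is pure noise.
random.seed(1)
names = ["x1", "x2", "x3"]
X = [[random.gauss(0, 1) for _ in names] for _ in range(50)]
y = [2 + 3 * r[0] - 2 * r[1] + random.gauss(0, 1) for r in X]

results = all_subsets(X, y, names)
for row in sorted(results, key=lambda d: d["bic"]):
    print(row["vars"], round(row["adj_r2"], 3), round(row["aic"], 1),
          round(row["bic"], 1), round(row["cp"], 2))
```

The models are printed from lowest to highest BIC. Sorting by a different key (for example "aicc" or "cp") shows how the criteria can rank the same set of candidate models differently. With m candidate variables this fits 2^m models, so it is only practical for a modest number of predictors.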
This video is part of the Biostatistical Methods I (BIOS 6611) course in the Department of Biostatistics and Informatics at the University of Colorado-Anschutz Medical Campus, taught by Dr. Alex Kaizer. Slides and additional material are available at https://www.alexkaizer.com/bios_6611.
Table of Contents:
00:00 - Intro Song
00:15 - Welcome
00:46 - Background and Model Purpose by Type
03:38 - General Model Selection Considerations
05:27 - Methods for Model Selection: R-squared
07:03 - F-tests and Partial F-tests
08:01 - Model Selection Criterion: AIC
09:38 - Model Selection Criterion: AICc
10:53 - Model Selection Criterion: BIC
12:20 - Model Selection Criterion: Mallows' Cp
15:27 - All Possible Subsets Regression
18:05 - Example
19:09 - Example: R Code
20:38 - Example: SAS Code
21:02 - Example: Results
24:19 - Closing Thoughts