Data wrangling is often the most time-consuming part of data science and applied statistics. Two tidyverse packages, tidyr and dplyr, make data manipulation easier: they help keep your R code clean and clear, and they reduce the cognitive load of common but often complex data science tasks.
`dplyr` docs: dplyr.tidyverse.org/reference/
- http://dplyr.tidyverse.org/reference/setops.html
- http://dplyr.tidyverse.org/reference/join.html
----------------
Pt. 1: What is data wrangling? Intro, Motivation, Outline, Setup
https://youtu.be/jOd65mR1zfw
- 01:44 Intro and what’s covered
Ground Rules:
- 02:40 What’s a tibble
- 04:50 Use View
- 05:25 The pipe operator
- 07:20 What do I mean by data wrangling?
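As a rough companion to the ground rules above, here is a minimal sketch of a tibble and the pipe. The `scores` data is a made-up toy example, not the dataset used in the videos.

```r
library(tibble)
library(dplyr)

# Hypothetical toy data, invented for illustration
scores <- tibble(
  student = c("Ana", "Ben", "Cy"),
  score   = c(90, 72, 85)
)

# View(scores) would open a spreadsheet-style viewer in an interactive
# session such as RStudio; it is left commented out so the script runs anywhere.

# Without the pipe, nested calls have to be read inside-out
nested <- arrange(filter(scores, score > 80), desc(score))

# With the pipe, the same steps read top to bottom
piped <- scores %>%
  filter(score > 80) %>%
  arrange(desc(score))
```

Both versions should give the same result; the piped form is the style used for the rest of the series.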
Pt. 2: Tidy Data and tidyr
https://youtu.be/1ELALQlO-yM
- 00:48 Goal 1: Making your data suitable for R
- 01:40 `tidyr` “Tidy” Data introduced and motivated
- 08:10 `tidyr::gather`
- 12:30 `tidyr::spread`
- 15:23 `tidyr::unite`
- 15:23 `tidyr::separate`
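A minimal sketch of the four `tidyr` functions listed above, assuming a small invented table (`wide`) with one column per year. Newer `tidyr` releases also provide `pivot_longer` and `pivot_wider` as successors to `gather` and `spread`, but the older functions shown in the video still work.

```r
library(tibble)
library(tidyr)

# Hypothetical "wide" data: one column per year (invented for illustration)
wide <- tibble(
  country = c("A", "B"),
  `2019`  = c(10, 20),
  `2020`  = c(11, 22)
)

# gather: collapse the year columns into key/value pairs (wide -> long)
long <- gather(wide, key = "year", value = "cases", `2019`, `2020`)

# spread: the inverse, one column per distinct value of `year` (long -> wide)
wide_again <- spread(long, key = year, value = cases)

# unite: paste two columns into one
united <- unite(long, col = "country_year", country, year, sep = "_")

# separate: split one column back into two
separated <- separate(united, col = country_year,
                      into = c("country", "year"), sep = "_")
```

In this toy case `separated` has the same columns and values as `long`, which shows that `unite` and `separate` undo each other, just as `gather` and `spread` do.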
Pt. 3: Data manipulation tools: `dplyr`
https://youtu.be/Zc_ufg4uW4U
- 00:40 Setup
- 02:00 `dplyr::select`
- 03:40 `dplyr::filter`
- 05:05 `dplyr::mutate`
- 07:05 `dplyr::summarise`
- 08:30 `dplyr::arrange`
- 09:55 Combining these tools with the pipe (Setup for the Grammar of Data Manipulation)
- 11:45 `dplyr::group_by`
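The verbs above can be chained with the pipe roughly as follows. This is only a sketch: the `sales` table and its column names are invented for illustration and do not come from the videos.

```r
library(tibble)
library(dplyr)

# Hypothetical toy data, invented for illustration
sales <- tibble(
  region = c("East", "East", "West", "West"),
  units  = c(5, 15, 10, 30),
  price  = c(2, 2, 3, 3)
)

by_region <- sales %>%
  select(region, units, price) %>%          # keep only the columns we need
  filter(units >= 10) %>%                   # keep rows that meet a condition
  mutate(revenue = units * price) %>%       # add a computed column
  group_by(region) %>%                      # set up per-group operations
  summarise(total_revenue = sum(revenue)) %>% # collapse each group to one row
  arrange(desc(total_revenue))              # sort, largest first
```

Each verb takes a data frame and returns a data frame, which is what lets them compose this way.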
Pt. 4: Working with Two Datasets: Binds, Set Operations, and Joins
https://youtu.be/AuBgYDCg1Cg
Combining two datasets together
- 00:42 `dplyr::bind_cols`
- 01:27 `dplyr::bind_rows`
- 01:42 Set operations: `dplyr::union`, `dplyr::intersect`, `dplyr::setdiff`
- 02:15 Joining data: `dplyr::left_join`, `dplyr::inner_join`, `dplyr::right_join`, `dplyr::full_join`
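A short sketch of the three ways of combining two datasets covered in this part. The tables `a`, `b`, and `lookup` are hypothetical and exist only to make the row counts easy to check.

```r
library(tibble)
library(dplyr)

# Hypothetical toy tables, invented for illustration
a      <- tibble(id = c(1, 2, 3), x = c("a", "b", "c"))
b      <- tibble(id = c(2, 3, 4), x = c("b", "c", "d"))
lookup <- tibble(id = c(2, 3, 4), y = c(TRUE, FALSE, TRUE))

# Binds: paste tables together by position, with no matching on keys
stacked      <- bind_rows(a, b)                  # 6 rows, duplicates kept
side_by_side <- bind_cols(a, select(lookup, y))  # 3 rows, columns id, x, y

# Set operations: compare whole rows of tables with the same columns
in_both   <- intersect(a, b)  # rows present in a and in b
in_either <- union(a, b)      # distinct rows from a or b
only_in_a <- setdiff(a, b)    # rows in a that are not in b

# Joins: match rows on a key column
left  <- left_join(a, lookup, by = "id")   # all rows of a; y is NA where unmatched
inner <- inner_join(a, lookup, by = "id")  # only ids found in both tables
right <- right_join(a, lookup, by = "id")  # all rows of lookup
full  <- full_join(a, lookup, by = "id")   # all ids from either table
```

Note that `bind_cols` matches rows purely by position, so a join is usually the safer choice when the two tables share a key.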
______________________________________________________________
Cheatsheets: https://www.rstudio.com/resources/cheatsheets/
Documentation:
`tidyr` docs: tidyr.tidyverse.org/reference/
- `tidyr` vignette: https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html
`dplyr` docs: http://dplyr.tidyverse.org/reference/
- `dplyr` one-table vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
- `dplyr` two-table (join operations) vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/two-table.html
______________________________________________________________