Data wrangling is too often the most time-consuming part of data science and applied statistics. Two tidyverse packages, `tidyr` and `dplyr`, make data manipulation tasks easier: they help keep your code clean and clear and reduce the cognitive load of common but often complex data science tasks.
`dplyr` docs: dplyr.tidyverse.org/reference/
- http://dplyr.tidyverse.org/reference/union.html
- http://dplyr.tidyverse.org/reference/intersect.html
- http://dplyr.tidyverse.org/reference/set_diff.htm
----------------
Pt. 1: What is data wrangling? Intro, Motivation, Outline, Setup
https://youtu.be/jOd65mR1zfw
01:44 Intro and what’s covered
Ground Rules
02:40 What’s a tibble
04:50 Use View
05:25 The Pipe operator:
07:20 What do I mean by data wrangling?
Pt. 2: Tidy Data and tidyr
https://youtu.be/1ELALQlO-yM
00:48 Goal 1: Making your data suitable for R
01:40 `tidyr`: “Tidy” data introduced and motivated
08:10 `tidyr::gather`
12:30 `tidyr::spread`
15:23 `tidyr::unite`
15:23 `tidyr::separate`
Pt. 3: Data manipulation tools: `dplyr`
https://youtu.be/Zc_ufg4uW4U
00:40 Setup
02:00 `dplyr::select`
03:40 `dplyr::filter`
05:05 `dplyr::mutate`
07:05 `dplyr::summarise`
08:30 `dplyr::arrange`
09:55 Combining these tools with the pipe (setup for the Grammar of Data Manipulation)
11:45 `dplyr::group_by`
Pt. 4: Working with Two Datasets: Binds, Set Operations, and Joins
https://youtu.be/AuBgYDCg1Cg
Combining two datasets
00:42 `dplyr::bind_cols`
01:27 `dplyr::bind_rows`
01:42 Set operations
`dplyr::union`, `dplyr::intersect`, `dplyr::setdiff`
02:15 Joining data
`dplyr::left_join`, `dplyr::inner_join`, `dplyr::right_join`, `dplyr::full_join`
______________________________________________________________
Cheatsheets: https://www.rstudio.com/resources/cheatsheets/
Documentation:
`tidyr` docs: tidyr.tidyverse.org/reference/
- `tidyr` vignette: https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html
`dplyr` docs: http://dplyr.tidyverse.org/reference/
- `dplyr` one-table vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html
- `dplyr` two-table (join operations) vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/two-table.html
______________________________________________________________