A brief maintenance accident takes a turn for the worse when GitHub's database automatically fails over and breaks the website.
Sources:
https://github.blog/2018-10-30-oct21-post-incident-analysis/
https://github.blog/2016-12-08-orchestrator-github/
https://github.blog/2018-06-20-mysql-high-availability-at-github/
https://news.ycombinator.com/item?id=18272928
https://www.reddit.com/r/programming/comments/9q94am/github_major_service_outage/
https://hub.packtpub.com/github-down-for-over-7-hours-due-to-failure-in-its-data-storage-system/
https://github.blog/2017-10-12-evolution-of-our-data-centers/
Chapters:
0:00 Part 1: Intro
1:25 Part 2: GitHub's database explained
3:40 Part 3: The 43 seconds
5:04 Part 4: Fail back or not?
6:54 Part 5: Recovery process
10:32 Part 6: Aftermath
Notes:
- Funnily enough, in a blog post published 4 months before the incident (https://github.blog/2018-06-20-mysql-high-availability-at-github/), GitHub specifically explained how cross-data-center failovers could be carried out successfully.
Music:
- Hitman by Kevin MacLeod
- Blue Mood by Robert Munzinger
- Pixelland by Kevin MacLeod
- Dumb as a Box by Dan Lebowitz
Twitter: https://twitter.com/kevinfaang/
Instagram: https://instagram.com/kevinfaang_yt/