Today's discussion focused on the evolution of SRE from its origins at Google to its growing adoption by enterprises, and the reasons behind this shift. We also covered applying SRE principles to platform engineering with the Reliability Map, which provides a simple “Tech Tree” to track your progress, work across your entire organisation, and visualise dependencies.
So, you want to make your platform more reliable. But how? Where do you start, and what does that really mean? And what comes next? 🤔
The r9y.dev map helps you:
✅ Assess your current reliability posture.
✅ Define where you want to be.
✅ Build a roadmap with the right tools, processes, and cultural shifts to get there.
It's your guide to improving system resilience in a structured and actionable way! 🚀
Additionally, we explored:
🔹 Gaining leadership support and making reliability a priority
🔹 The myths we tell ourselves to justify incidents and failures
🔹 SLOs as the bridge between development and operations
🔹 Why the traditional reliability pyramid model falls short
▬▬▬▬▬▬ 👋 Meet our Guest 👋 ▬▬▬▬▬▬
🎩 Steve McGhee: Reliability Advocate, SRE 🎩
Steve was an SRE at Google for about 10 years, working in Android, YouTube, and Google Cloud. He then joined a company to build reliable systems on the Cloud. Now he's back at Google, helping more companies do that.
▬▬▬▬▬▬ 🛠️ About Reliability Map 🛠️ ▬▬▬▬▬▬
Reliability Engineering is vital to the successful operation of any platform. R9y provides a simple “Tech Tree” that you can follow across multiple areas of your business to reach the number of “nines” you need to deliver value to your customers.
👉 https://www.r9y.dev/
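For a rough sense of what a number of “nines” means in practice, here is a minimal sketch that converts an availability target into a downtime allowance. The function name and the 30-day window are assumptions made for illustration; they are not taken from r9y.dev or from the video.

```python
# Illustrative only: turn an availability target ("nines") into allowed downtime.
# `allowed_downtime_minutes` and the 30-day default window are hypothetical choices.

def allowed_downtime_minutes(nines: int, window_days: int = 30) -> float:
    """Return the downtime allowance in minutes for a target of `nines` nines."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    unavailability = 10 ** -nines  # e.g. 3 nines (99.9%) -> 0.001
    return window_days * 24 * 60 * unavailability


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        minutes = allowed_downtime_minutes(n)
        print(f"{n} nines: {minutes:.2f} minutes of downtime per 30 days")
```

Three nines over 30 days works out to roughly 43 minutes, and each extra nine cuts that allowance by a factor of ten, which is why the target should match what your customers actually need.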
▬▬▬▬▬▬ 🤝 Book a Call 🤝 ▬▬▬▬▬▬
If you'd like to be our next guest, schedule a convenient time slot via https://calendly.com/saimsafder14/30min. We'll discuss further details. Alternatively, you can contact me via Twitter or LinkedIn (details below).
▬▬▬▬▬▬ 👋 Contact me 👋 ▬▬▬▬▬▬
➡ Twitter: https://twitter.com/cloudnativeboy
➡ LinkedIn: https://www.linkedin.com/in/saim-safder
▬▬▬▬▬▬ 🔔 Other Channels 🔔 ▬▬▬▬▬▬
🎤 Cloud Native Islamabad: @CloudNativeIslamabad
▬▬▬▬▬▬ ⏱️ Chapters ⏱️ ▬▬▬▬▬▬
0:00 Agenda
1:04 Welcoming Steve 👋
2:00 DevOps vs SRE vs Platform Engineering
6:20 SRE Today vs Past
10:00 Cloud Native and SRE Adaptation
11:30 Origin of Reliability Map
18:00 Proliferation of Tools and Lack of Assessment
20:00 Beginner's Guide to Reliability Map
26:20 Capabilities for building a Reliable Platform
30:00 People and Culture in Reliability Map
31:50 Reliability Map - UI Enhancements
32:48 More SRE Coming Soon
33:34 Thanks for watching