It is 7 AM; you awake after a night of uninterrupted slumber. Being on call, you check for issues. Was your pager out of batteries? Nope, things are quiet.
Imagine a world where outages are a myth, where a failure occurs but there is no customer impact and no engineer is engaged. This is the aspiration of Reliability Engineering: to operate complex distributed systems effectively, without customer-facing outages or heavy operational burden.
In this 101 talk, I will share the basics every team should know to start their reliability journey off on the right foot.
Copyright © 2024 CTO Connection, All Rights Reserved