It is 7 AM; you awake after a night of uninterrupted slumber. Being on call, you check for issues. Was your pager out of batteries? Nope, things are quiet.
Imagine a world where outages are a myth, where a failure occurs but there is no customer impact and no engineer is engaged. This is the aspiration of Reliability Engineering: to operate complex distributed systems effectively, without customer-facing outages or a heavy operational burden.
In this 101 talk, I will share the basics every team should know to start their reliability journey off on the right foot.
VIDEOS RELATED TO MANAGING
Camille Fournier, Head of Platform Engineering at Two Sigma
Russell Smith, CTO at Rainforest QA
Randy Shoup, VP Engineering and Chief Architect at eBay
Heidi Waterhouse, Transformation Advocate at LaunchDarkly
Mai Irie, Director of Engineering at Spring Health
Johnny Ray Austin, CTO at Till
Juan Pablo Buriticá, Head of Engineering, LATAM at Stripe
Jeff Smith, Senior Research Engineering Manager at Facebook Artificial Intelligence Research (FAIR)
Lisa van Gelder, SVP, Engineering at Spring Health
Dalia Havens, VP of Engineering at Netlify
Jeff Ammons, Director of Engineering at One Medical
Gil Shklarski, CTO at Flatiron Health
Copyright © 2024 CTO Connection, All Rights Reserved