May 18, 2022

Oops 1 - IT Outage & Degradation Response

We are very much looking forward to welcoming you to the first Oops Meetup - The London Outage Ops & Incident Response Group.

Location: WeShape - 20 St Dunstans Hill, Monument, London EC3R 8HL

Date and Time: 18th of May 2022 - 18:00

We are very excited to have the below speakers for this event:

6:45pm - Introductions from organisers

7pm - Talk: Rodrigo Campos - Director of Engineering at Meta and Facebook - Meta Production Engineering and Building a Culture of Reliability

In this talk we'll discuss how we were able to build company-wide awareness of the importance of reliability and how to develop strong ownership and better engineering practices. I'll cover how we use a federated structure to own reliability across multiple domains, how we deal with incidents, how we increase resiliency by continuously testing our systems, and how we ensure that our incident review process is productive and blameless. I'll talk about real-world scenarios and how we deal with large-scale outages. Rodrigo will provide a first-hand perspective on the major incident at Meta in October 2021 and how engineering engages in these situations.

7:40pm - We’ll also be joined by an all-star panel to discuss a range of topics around IT outages in 2022.

Marilyn Kruger - Service Delivery Manager at IG (Compere)
Peter Raftery - Director of Technical Operations at Light & Wonder
Daniel Cook - Senior Incident Manager at Trainline
Alex Hibbitt - Head of Site Reliability Engineering at Photobox

8:30pm - Networking, food and drinks - Rooftop bar

Find out more and register on Meetup.
