I’m super excited to present From Oops to Ops: Incident Response with Jupyter Notebooks on Oct 28, 2021. I have presented this topic before to data professionals, but this session at All Day DevOps (ADDO) 2021 is aimed at the broader Site Reliability Engineering audience.
If you would like to follow along with the slides and the session notes, you can find them on my GitHub repo:
From Oops to Ops: Incident Response with Jupyter Notebooks
Abstract:
What if you could apply software engineering practices to your troubleshooting guides (TSGs) / knowledge base for your team’s on-call or for your customers? What if you could reduce stress and mistakes in your incident response workflow and take a more scientific approach?
Join this session to learn about the TSG Ops framework, which uses Jupyter Notebooks for executable and automatable troubleshooting guides / knowledge bases. TSG Ops rethinks incident response by applying software engineering practices to curating TSGs and by encouraging a scientific approach to troubleshooting. We will share what we have learned on our journey implementing TSG Ops with open source technology, both internally in Azure Data and externally for our Azure Data customers.