Agents do not necessarily need to do the entire job of an SRE. They can deliver tremendous value by doing portions of the job. i.e. bringing in relevant data for a given alert or writing post-mortems. There are aspects of the role that are ripe for LLM-powered tools
I really can’t think of anything more counter-productive than AI post-mortems.
The whole point of them is for the team to understand what were wrong and how processes can be improved in order to prevent issues happening again. The idea of throwing away all the detail and nuance and having some AI generated summary will only make the SRE field worse.
Also really don’t understand the benefit of LLM for bringing in relevant data. Would much prefer that be statically defined i.e. when a database query takes 10x longer, bring me the dashboards for OS and hardware as well.