Caught What the Others Missed: A conversation with Maria Sivenkova

“The runbook is better because two humans and several AIs each caught what the others missed”

Remediation runbooks live or die on precision. One wrong flag, one missed CVE, one unverified command, and a fix that was supposed to close a vulnerability can open a new one instead. We sat down with Maria Sivenkova to talk about how she uses LLMs in her writing process, where she still doesn’t trust them, and how a recent runbook update for n8n became a case study in what human-AI collaboration can look like when it’s done right.

What drew you to vulnerability management?

Maria: I like contributing to people and organisations becoming less vulnerable to cyberattacks. Each remediation runbook I help create makes someone more resilient. Making the world a safer place — that’s a meaningful mission.

Has LLM-assisted writing ever caught something you would have missed on your own?

Maria: Yes, and I have a recent example. I was writing a remediation runbook for CVE-2025-20064, a UEFI firmware vulnerability affecting Intel reference platforms. At one point I asked Claude whether my existing runbook would also cover eight other CVEs listed in the same Intel advisory.

In the past, that cross-referencing would have taken me an hour of careful, manual comparison. This time I had a structured breakdown within seconds: where the CVEs diverged, their different severity scores, the different modules they affected, different CIA impact profiles, and a clear recommendation on how to scope the document.

That said, I don’t trust the output blindly. The analysis still goes through review, and I make the final call on every finding.

What do you still not trust LLMs to do in your writing workflow?

Maria: Hallucination is a residual risk with these models — it’s a property of how they work, not a flaw you can fully engineer away. In regulated or technical contexts, that risk has to be documented and managed against your organisation’s risk appetite.

For us, the answer is human verification at the end of the pipeline: someone who actually runs the commands on real systems. An LLM gets you eighty or ninety percent of the way there, very fast. But that last stretch still needs a human — someone who will feel the consequences of getting it wrong, where a bad command could brick a server or leave a vulnerability wide open.

That verification step is really the heart of this conversation — the collaboration between you and Lazuardi N. Putra, our DevOps engineer, on updating a runbook for n8n. Can you walk us through it?

Maria: Gladly. I started the way I always do: reading the official documentation, cross-checking other sources, and running a rigorous discussion with several LLMs — Claude, ChatGPT, and Gemini.

Then Lazuardi took the first draft and put it through its paces. He examined it critically and ran every command himself. His verification pass surfaced two gaps and one bug.

I still remember the bug: --no-colors, an invalid flag in PM2 v7, which throws a hard error on unknown options. The fix was simple: remove --no-colors or replace it with --raw. But without Lazuardi’s verification, that command would have failed in production. And this was after multiple LLM passes had already declared the runbook correct.

I fixed the gaps too, because what one DevOps engineer catches is usually representative of what others would catch as well. I acted on his review, corrected what the LLMs and I had gotten wrong between us, and the result is now live — on the Autobahn Security platform, on LinkedIn, and on our blog, thanks to Claudia K. Putri’s editorial work.

The runbook is better because two humans and several AIs each caught what the others missed.

Author: Maria Sivenkova, Senior Technical Communicator at Autobahn Security
