ai-agents · mcp · incident-management

Your AI agent already knows your system better than ours ever will

Every incident management vendor is building their own AI. We think that's backwards. Your agent already has the context. It just needs an API to act on incidents.

Niketa Sharma · Mar 28, 2026 · 8 min read

Every incident management vendor just shipped an AI agent. PagerDuty has one. incident.io has one. Even Linear just announced that agents are their entire future.

The pitch is always the same: "Our AI understands your incidents."

Here's the problem. Their AI doesn't know your codebase. It doesn't know that your payments service was rewritten last month, or that deploy #4,271 changed the retry logic, or that the last three outages were all caused by the same Redis connection pool. Their AI reads your incident titles and severity levels. That's it.

Your agent, the one running in Cursor or Claude Code or your custom pipeline, already knows all of that. It's read your code. It's seen your commits. It's helped you debug at 2 AM.

It just can't create an incident, page someone, or update a timeline. That's not an AI problem. That's an API problem.

The captive agent trap

Here's what's happening across the industry right now.

PagerDuty builds an AI agent that lives inside PagerDuty. It can summarize incidents and suggest runbooks, but only PagerDuty incidents, only PagerDuty runbooks. It doesn't know your deploy pipeline or your architecture.

incident.io builds a copilot that helps during incidents. It's useful inside their product. But it doesn't connect to your IDE, your CI/CD, your monitoring dashboards, or the agent that already knows your system.

Linear just announced agents as a core part of their product. Skills, automations, code intelligence, all built into Linear. Their framing is "the shared product system that turns context into execution."

Each of these is a captive agent. It lives inside the vendor's product, operates on the vendor's data, and sees your world through the vendor's lens.

The pitch sounds good in a demo. In practice, you end up with five different AI agents across five different tools, none of which talk to each other, each with a partial view of what's actually happening.

Why your own agent has more context

Think about what your agent already knows when an alert fires.

It's read the service that's failing. It knows the recent changes, can grep for the function that's throwing errors, and can tell you what changed in the last three deploys. It knows the payments service calls the billing service which calls Stripe. If it's been in your repo for a few weeks, it's picked up your deploy cadence, your branch strategy, how you test things. It's seen your postmortems.

An agent with access to Datadog or Grafana can correlate the alert with metrics, logs, and traces before anyone opens a browser tab.

No vendor-built AI will ever have this context. They'd need access to your entire codebase, your deploy history, your monitoring stack, and your team's communication patterns. That's not something you hand to every SaaS tool you use.

The API problem, not the AI problem

When your agent sees an alert, it can diagnose what's wrong. What it can't do without the right API is act on it.

It can't create an incident in your system of record. It can't check who's on call and page them. It can't escalate when no one responds. It can't log what it found to the timeline so the human responder walks in with full context.

This is an integration problem. The agent needs an API that lets it participate in the incident lifecycle the same way a human would.

That's what we built. (If you're weighing whether to build this yourself, we wrote up the three-year TCO math on build vs buy.)

npx @runframe/mcp-server --setup

A tightly scoped MCP server for the incident lifecycle, plus a full REST API. Your agent creates incidents, acknowledges them, pages responders, logs findings, escalates, and drafts postmortems. Doesn't matter if the agent is Claude, GPT, a custom model, or something that doesn't exist yet.
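To make the shape of this concrete, here's a minimal sketch of what a scoped incident-lifecycle tool set and a single MCP tool call could look like. The tool names and argument fields are illustrative assumptions, not Runframe's actual API; the JSON-RPC `tools/call` envelope is the part defined by the MCP spec.

```typescript
// Hypothetical tool set for an incident-lifecycle MCP server.
// Names are illustrative, not Runframe's real tool names.
const incidentTools = [
  "create_incident",
  "acknowledge_incident",
  "page_responder",
  "log_finding",
  "escalate_incident",
  "draft_postmortem",
];

// MCP tool invocations are JSON-RPC 2.0 requests with method "tools/call".
function buildToolCall(id: number, name: string, args: Record<string, unknown>) {
  if (!incidentTools.includes(name)) {
    throw new Error(`unknown tool: ${name}`);
  }
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name, arguments: args },
  };
}

const call = buildToolCall(1, "create_incident", {
  title: "Elevated latency on payments service",
  severity: "sev2",
  suspected_cause: "retry logic change in recent deploy",
});
```

Six tools is the whole surface area: the agent discovers them once, cheaply, and every subsequent step is one small request.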

What this looks like in practice

An engineer is working in Cursor. A Datadog alert fires for elevated latency on the payments service.

Their agent, which already has the repo open, checks recent deploys, finds a retry logic change merged two hours ago, and creates an incident in Runframe with the relevant context. It checks who's on call, pages them with a summary that includes the suspected commit, and logs everything to the incident timeline.

The on-call engineer opens Slack, sees the page, and finds a timeline that already contains the alert details, the suspected root cause, the relevant commit, and a link to the diff. They're diagnosing in 30 seconds instead of 10 minutes.

No vendor AI did this. The engineer's own agent did, because it had the context and the API to act.
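The diagnosis half of that flow is simple enough to sketch. This is a toy version of "correlate the alert with recent deploys and draft the page summary" under assumed data shapes; the `Deploy` interface and the three-hour suspect window are illustrative, not anything Runframe prescribes.

```typescript
// Hypothetical sketch of the alert-to-page flow described above.
interface Deploy {
  sha: string;
  summary: string;
  minutesAgo: number;
}

// The most recent deploy inside the alert window is the first suspect.
function findSuspectDeploy(deploys: Deploy[], windowMinutes = 180): Deploy | undefined {
  return deploys
    .filter((d) => d.minutesAgo <= windowMinutes)
    .sort((a, b) => a.minutesAgo - b.minutesAgo)[0];
}

// Draft the summary the on-call engineer sees in the page.
function draftPageSummary(alert: string, suspect?: Deploy): string {
  const cause = suspect
    ? `Suspected cause: ${suspect.summary} (${suspect.sha})`
    : "No recent deploy in window; cause unknown";
  return `${alert}. ${cause}.`;
}

const summary = draftPageSummary(
  "Elevated latency on payments service",
  findSuspectDeploy([
    { sha: "a1b2c3d", summary: "retry logic change", minutesAgo: 120 },
    { sha: "9f8e7d6", summary: "README update", minutesAgo: 2880 },
  ]),
);
```

The real value is that the agent fills in `deploys` from the repo it already has open, which is exactly the context a vendor-hosted AI never sees.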

Captive vs. open: the architectural bet

This is a real architectural decision, not a marketing angle.

Captive agents are built by the vendor, trained on the vendor's data, and locked to the vendor's product. Easy to demo. Hard to extend. When you switch tools, the AI doesn't come with you.

Open agents are yours. They run in your IDE, your CI/CD, your custom pipelines. They use whatever model you want. When you switch vendors, the agent stays.

| | Captive agent | Open agent |
|---|---|---|
| Context | Only what the vendor sees | Your entire codebase + infra |
| Model | Vendor's choice | Your choice |
| Portability | Locked to vendor | Works across tools |
| Customization | Vendor's features | Your workflows |
| Cost | Bundled (opaque) | You control spend |

Cursor, Claude Code, VS Code, Windsurf all support MCP. The agent that helps you write code is the same agent that should help you respond to incidents. The industry is heading that direction whether any individual vendor likes it or not.

"But isn't MCP dead?"

You've seen the posts. Perplexity's CTO moved away from MCP. Eric Holmes wrote "MCP is dead. Long live the CLI." A database MCP server with 106 tools burned 54,600 tokens just on tool discovery before doing anything useful. Security researchers found OAuth flaws, prompt injection vectors, and tool poisoning across open MCP servers.

These are real criticisms. And they mostly apply to MCP servers that shouldn't be MCP servers.

A database with 106 query tools? That's a bad MCP server. Of course the token overhead is brutal. You're asking the agent to discover and evaluate 106 tools it probably doesn't need. A CLI wrapper for git commands? Probably better as a CLI.

Runframe's MCP server is tightly scoped to one domain: the incident lifecycle. Create, acknowledge, escalate, page, resolve. An agent doesn't need to evaluate 106 options. It needs to manage an incident. The tool discovery overhead is minimal because the tool set is focused.
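The back-of-envelope math makes the point, using the numbers cited above. Treating the per-tool token cost as a rough average (an assumption, not a measurement), a six-tool server at the same density would cost an order of magnitude less in discovery overhead:

```typescript
// Discovery overhead cited for the 106-tool database server above.
const dbServer = { tools: 106, discoveryTokens: 54_600 };

// Rough average cost per tool definition in the context window.
const tokensPerTool = dbServer.discoveryTokens / dbServer.tools;

// A scoped incident-lifecycle server with ~6 tools at the same density.
const scopedTools = 6;
const scopedOverhead = Math.round(tokensPerTool * scopedTools);
```

Roughly 515 tokens per tool, so a six-tool server pays about 3,100 tokens for discovery instead of 54,600.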

The critics are right that MCP isn't the answer for everything. But they're wrong that it's dead. 97 million monthly SDK downloads. 17,000+ servers. OpenAI, Google, Microsoft, and AWS all adopted it. The Linux Foundation is stewarding it as an open standard. Bloomberg cut deployment timelines from days to minutes.

What actually died is the hype phase. The "just add MCP to everything" era. What replaced it is pragmatic adoption: use MCP where agent-driven tool discovery matters, use direct APIs where the workflow is stable and known.

Incident management is one of the places where MCP fits well. An agent doesn't know ahead of time whether it'll need to create an incident, or just check who's on call, or escalate. The workflow depends on what's happening. That's what tool discovery is for.

And for teams that prefer direct API calls? We ship a full REST API too. Same capabilities, different interface. Use whatever your agent prefers.
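For the REST path, the same create-incident action is just an HTTP request your pipeline builds directly. This sketch only constructs the request; the endpoint path, field names, and auth scheme are illustrative assumptions rather than Runframe's documented API.

```typescript
// Hypothetical: the same create-incident action as a direct REST call.
// Path, fields, and auth header are assumed shapes, not the real API.
function buildCreateIncidentRequest(baseUrl: string, token: string) {
  return {
    url: `${baseUrl}/v1/incidents`,
    method: "POST" as const,
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      title: "Elevated latency on payments service",
      severity: "sev2",
    }),
  };
}

const req = buildCreateIncidentRequest("https://api.example.com", "rf_token");
```

Same capability as the MCP tool call, no tool discovery step: the right choice when the workflow is fixed ahead of time.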

What we're building first

We're not starting with a Runframe AI agent. We're starting with the API and MCP server that lets your agent operate through us.

Your team is already choosing its AI stack. Claude, GPT, open-source models, custom agents wired into deploy pipelines. The bigger gap right now isn't another vendor AI — it's that your agent can't create an incident, page someone, or write a postmortem.

That's what we're fixing first. An incident management platform that your existing agent can operate through. A system of record with a clean API and MCP support, so the agent you already trust can participate in the incident lifecycle.

That's Runframe.

The bottom line

Linear just called themselves "the shared product system that turns context into execution." That's a good line. Here's ours: Runframe is the incident system of record that your agents operate through.

Not our agent. Yours. We provide the API, the MCP server, the data model, and the notification system. Your agent provides the context.

Every incident management vendor is racing to build their own AI. PagerDuty, incident.io, Rootly, they're all shipping captive agents that live inside their products. We think this gets the architecture wrong. The best AI for your incidents is the one that already knows your code, your deploys, and your team's patterns. That's your agent, not ours.

What your agent needs is access. A clean API and MCP server that lets it participate in the incident lifecycle. We built that.

npx @runframe/mcp-server --setup

One MCP server, scoped to the incident lifecycle. Works with Cursor, Claude Code, VS Code, and Claude Desktop. Here's how to set it up.
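Most MCP clients (Claude Desktop, Cursor, and others) register servers through a `mcpServers` block in their config file. The exact file location varies by client, and the entry below is a sketch of the typical shape rather than Runframe's documented setup:

```json
{
  "mcpServers": {
    "runframe": {
      "command": "npx",
      "args": ["@runframe/mcp-server"]
    }
  }
}
```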

Common questions

Doesn't this mean Runframe has no AI features?
We have AI-powered postmortem drafts (your choice of Claude or GPT). But our primary AI strategy is being the platform your agents interact with, not building a competing agent.
What if I don't use AI agents yet?
Runframe works the same as any incident management tool. Slack integration, on-call scheduling, escalation policies, postmortems. The MCP server and API are there when you're ready.
Which AI models work with the MCP server?
Any model that supports MCP or can make API calls. Claude, GPT, Gemini, open-source models, custom agents. The MCP server is model-agnostic.
Isn't MCP dead?
The hype phase is over. The "add MCP to everything" era. What's left is pragmatic adoption: 97M monthly SDK downloads, Linux Foundation governance, adoption by every major AI provider. MCP makes sense where workflows are dynamic and tool discovery matters. Incident management is exactly that. For stable pipelines, we also ship a full REST API.
How is this different from just having an API?
The MCP server handles tool discovery and structured inputs/outputs — it's a higher-level interface than raw REST calls, built for how agents actually work. If your agent prefers direct API calls, the v1 REST API covers the same capabilities.
What about data privacy? Does my agent send incident data to a model?
Your agent, your model, your data flow. We don't sit in the middle. The MCP server talks to Runframe's API. What your agent does with the data depends on your model provider and your configuration.


