AI Product Case Study · Raoul Kahn

MCP Product Decision Copilot

An AI agent that orchestrates product metrics, roadmap constraints, and OKR alignment to deliver structured Ship / Delay / Kill recommendations for feature requests.

Raoul Kahn · raoulkahn.com/portfolio
[Architecture diagram: product metrics, roadmap state, and OKRs flow into the AI agent, which produces a Ship, Delay, or Kill recommendation]

The Problem

Product managers make prioritization decisions by manually pulling data from multiple systems — analytics dashboards, project management tools, OKR trackers — then synthesizing it in their heads or in spreadsheets. The data exists, but the workflow is fragmented.

What if an AI agent could pull all three signals and deliver a structured recommendation? Not to replace PM judgment, but to accelerate it — surfacing the relevant data, identifying conflicts between signals, and framing the tradeoffs so the PM can make a faster, better-informed decision.

This project demonstrates how MCP (Model Context Protocol) enables AI agents to orchestrate multiple data sources and deliver structured product recommendations — the same pattern that would power internal decision tools at any product-led company.

Architecture

Three MCP tools connected to Claude Desktop, each reading from structured data files. The agent calls all three, cross-references the signals, and synthesizes a recommendation.

get_metrics(feature_area)

Product Metrics

MAU, adoption rate, retention, NPS, revenue influence, support tickets, feature requests, competitive positioning

get_roadmap(quarter)

Roadmap State

Sprint capacity, committed deliverables, tech debt, dependencies, team bandwidth, risk factors

get_okrs(quarter)

OKR Alignment

Company objectives, team key results, progress tracking, strategic themes, priority levels

The MCP server runs locally and exposes these tools via the Model Context Protocol. Claude Desktop discovers the tools automatically and calls them when a feature request is submitted. No external APIs, no database — the tools read from structured JSON files that simulate what a production system would pull from real analytics and project management platforms.
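A minimal sketch of what such a server could look like. The real project uses the FastMCP SDK (`from mcp.server.fastmcp import FastMCP`, an `@mcp.tool()` decorator on each function, and `mcp.run()` at the bottom); here the decorators are shown as comments and the JSON data is inlined so the sketch is self-contained. Field names and values are assumptions, not the project's actual files.

```python
# Sketch of the MCP server's three tools. With FastMCP, each function
# below would carry an @mcp.tool() decorator so Claude Desktop can
# discover and call it; the real server would json.load() these
# structures from its data files instead of inlining them.

# Inlined stand-in for the structured JSON files (shapes are assumed).
DATA = {
    "metrics": {
        "collaboration": {"mau": 48000, "nps": 41, "feature_requests": 312},
    },
    "roadmap": {
        "Q1": {"capacity_weeks": 8, "committed": ["real-time collaboration"]},
    },
    "okrs": {
        "Q1": {"objectives": ["Win enterprise collaboration deals"]},
    },
}

# @mcp.tool()
def get_metrics(feature_area: str) -> dict:
    """Return product metrics for a feature area."""
    return DATA["metrics"].get(feature_area, {})

# @mcp.tool()
def get_roadmap(quarter: str) -> dict:
    """Return roadmap state for a quarter."""
    return DATA["roadmap"].get(quarter, {})

# @mcp.tool()
def get_okrs(quarter: str) -> dict:
    """Return OKR alignment data for a quarter."""
    return DATA["okrs"].get(quarter, {})
```

Because each tool is just a typed Python function, swapping the inlined dict for `json.load()` over real files changes nothing about the tool surface Claude sees.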

▶ Watch the Demo

Live demo: Claude calling 3 MCP tools and delivering a structured recommendation

How It Works

01

Submit Feature Request

The PM describes a feature being considered — "Should we build AI-powered smart templates for our collaboration suite?"

02

Agent Pulls Data

Claude automatically calls all three MCP tools — metrics, roadmap, and OKRs — to gather the relevant signals for that feature area.

03

Cross-Reference Signals

The agent identifies where signals align (strong demand + capacity + OKR fit) and where they conflict (strong demand but no capacity).

04

Structured Recommendation

Delivers a Ship / Delay / Kill decision with reasoning, risks, dependencies, metrics impact, and recommended next steps.
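For step 02 to happen, Claude Desktop has to know the local server exists. Registration typically goes through a `claude_desktop_config.json` entry along these lines (the server name and script path are assumptions for this project):

```json
{
  "mcpServers": {
    "product-decision-copilot": {
      "command": "python",
      "args": ["/path/to/server.py"]
    }
  }
}
```

Once registered, Claude Desktop launches the server, lists its tools, and decides on its own when a feature-request prompt warrants calling them.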

Structured Output

Decision: Ship / Delay / Kill
Confidence: High / Medium / Low
Reasoning: Why the data supports this decision
Risks: Top 2-3 risks identified
Dependencies: What needs to be true first
Metrics Impact: Expected effect on KPIs
Next Step: One concrete recommended action
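One way to pin this schema down on the server side is a small dataclass the agent's output can be validated against. This is a sketch under assumed field names, not the project's actual code:

```python
from dataclasses import dataclass, field

# Fixed vocabularies for the two enumerated fields.
DECISIONS = {"Ship", "Delay", "Kill"}
CONFIDENCE = {"High", "Medium", "Low"}

@dataclass
class Recommendation:
    decision: str          # Ship / Delay / Kill
    confidence: str        # High / Medium / Low
    reasoning: str         # why the data supports this decision
    risks: list[str] = field(default_factory=list)        # top 2-3 risks
    dependencies: list[str] = field(default_factory=list) # what must be true first
    metrics_impact: str = ""                              # expected effect on KPIs
    next_step: str = ""                                   # one concrete action

    def __post_init__(self):
        # Reject values outside the fixed vocabularies.
        if self.decision not in DECISIONS:
            raise ValueError(f"invalid decision: {self.decision}")
        if self.confidence not in CONFIDENCE:
            raise ValueError(f"invalid confidence: {self.confidence}")
```

Validating the enumerated fields keeps the agent's free-form reasoning inside a structure downstream tooling can rely on.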

Demo Scenarios

Four feature requests, each producing a different recommendation based on the underlying data. The agent doesn't always say yes — and the most valuable recommendations are the ones that say not yet or no with clear reasoning.

Ship

Real-Time Collaboration

"Our users are asking for real-time collaboration features — live cursors, co-editing, and inline commenting. Should we build this for Q1?"

Every signal aligned: 312 feature requests with co-editing as the #1 ask, $380K ARR at risk from enterprise deals contingent on the feature, a committed Q1 OKR (P-KR1), and a feasible 9-week build with the WebSocket infrastructure already planned.

Key signal: The rare case where demand, revenue, OKRs, and capacity all point to the same answer.

Kill

Integration Marketplace

"Our enterprise clients are requesting custom API integrations. Should we prioritize building an open integration marketplace?"

The feature area is in crisis, not growth mode. NPS at -12 and declining, 23% of users experiencing sync failures, 3x churn rate for affected users. Building a marketplace on an unreliable foundation would amplify the problem. Fix reliability first.

Key signal: "You don't open a restaurant to the public while the kitchen is on fire."

Delay

Workflow Automation

"We're getting requests to build an advanced workflow automation engine with conditional logic and triggers. Should we prioritize this for Q1?"

Strongest demand signal of any feature area (487 requests, $520K at-risk ARR), but architecturally blocked — legacy engine requires 6-week rewrite before any feature work. 16-18 total weeks needed vs. 8 available. The right move: design and scope in Q1, ship as Q2's anchor initiative.

Key signal: Strong demand doesn't override architectural reality. Sequence correctly or risk shipping on a broken foundation.

Tradeoff

AI-Powered Reporting

"Our sales team says AI-powered reporting and predictive analytics would close more enterprise deals. Should we build this?"

The sales team's framing was slightly off. Users aren't asking for AI reports — they're asking for smarter task prioritization (118 votes, #1 request). Full AI build needs ML infrastructure that doesn't exist. But a rule-based "smart prioritization" v1 ships in 4-5 weeks and solves 80% of the pain without any ML debt.

Key signal: The agent reframed the question — the right answer wasn't in the original ask.

What the Agent Revealed

Beyond the individual recommendations, the agent surfaced patterns that would take a PM hours to piece together manually.

01

Signals often conflict

Strong user demand doesn't mean "build now." Workflow automation had the strongest demand signal (487 requests, $520K at risk) but was the wrong Q1 bet due to architectural blockers. The agent caught what enthusiasm alone would miss.

02

The right answer isn't always in the original question

The AI reporting scenario showed the agent reframing a sales team's request into what users actually needed — task prioritization, not predictive analytics. The best PM tools challenge the premise, not just answer it.

03

Cross-referencing surfaces hidden dependencies

The integration marketplace scenario looked reasonable until the agent pulled support data showing -12 NPS and critical sync failures. The demand existed, but the foundation couldn't support it. No single data source tells that story alone.

Limitations & Production Considerations

Mock data, not live systems

This demo uses static JSON files simulating product data. In production, the MCP tools would connect to live analytics platforms (Amplitude, Mixpanel), project management tools (Jira, Linear), and OKR systems (Lattice, Ally.io). The architecture is the same — only the data source changes.
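One way to keep that swap cheap is to hide each source behind a small interface: the demo ships a JSON-backed implementation, and production drops in an API-backed one with the same signature. A sketch (class names and the injected data are illustrative assumptions):

```python
from typing import Protocol

class MetricsSource(Protocol):
    """Anything that can answer a metrics lookup for a feature area."""
    def get_metrics(self, feature_area: str) -> dict: ...

class JsonMetricsSource:
    """Demo implementation: serves the static JSON data."""
    def __init__(self, data: dict):
        # In the demo this dict would come from json.load() on the data
        # file; it is injected here so the sketch runs standalone.
        self.data = data

    def get_metrics(self, feature_area: str) -> dict:
        return self.data.get(feature_area, {})

class LiveMetricsSource:
    """Production implementation would call the analytics platform here."""
    def get_metrics(self, feature_area: str) -> dict:
        raise NotImplementedError("wire up the analytics API client")

# The MCP tool depends only on the protocol, never a concrete source,
# so switching from mock to live data is a one-line change at startup.
def metrics_tool(source: MetricsSource, feature_area: str) -> dict:
    return source.get_metrics(feature_area)
```

The same pattern applies to the roadmap and OKR tools, which is why the architecture survives the move from mock files to Amplitude, Jira, or Lattice unchanged.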

Agent as accelerator, not replacement

The agent synthesizes data and frames tradeoffs, but it doesn't make the final call. A PM still needs to apply context the data doesn't capture — team morale, political dynamics, strategic bets that haven't materialized in metrics yet. The value is in faster, better-informed decisions, not automated ones.

Reasoning varies across runs

The underlying data is deterministic (same JSON, same numbers), but Claude's reasoning and phrasing will differ each time. The conclusions remain consistent because the data points the same direction — but the specific wording, structure, and emphasis will vary.

This project demonstrates how I approach AI agent design: structured data inputs, cross-signal reasoning, and product-level judgment — not just tool calls.

The agent pattern here applies anywhere PMs need to synthesize multiple data sources into a decision: feature prioritization, resource allocation, go/no-go launches, and investment reviews.

Technical Details

Protocol
Model Context Protocol (MCP)
Server
Python + FastMCP SDK
Client
Claude Desktop
Model
Claude Opus 4.6
Tools
3 MCP tools (metrics, roadmap, OKRs)
Data
4 scenarios × 3 data sources (JSON)
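As an illustration of the data layout, one scenario's metrics file might look like this, using figures from the integration marketplace scenario (the field names are assumptions, not the project's actual schema):

```json
{
  "feature_area": "integrations",
  "nps": -12,
  "sync_failure_rate": 0.23,
  "churn_multiplier_affected_users": 3
}
```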
Tags: MCP, Model Context Protocol, AI agents, tool orchestration, product management, feature prioritization, Claude Desktop, Python, FastMCP, structured reasoning