Skills demonstrated through building.
I built ProductIntel to explore what product management looks like when AI can participate in discovery, specification, prioritization, execution, observability, and continuous learning. Every skill below was developed while designing and shipping it as a 21-module platform deployed to production. Not theoretical knowledge. Working, deployed evidence.
AI-first UX design
Designing interfaces where AI leads the experience: recommendations before controls, progressive trust, human-in-the-loop patterns.
- 3-level AI-first framework: L1 Recommend, L2 Adaptive UI, L3 Autonomous Execution
- AI Triage as default landing, with narrative briefing before manual backlog
- Team Creation Wizard: AI pre-fills everything, humans override
- Deploy Risk Assessment: visualized risk scoring before destructive actions
Specification precision
Writing instructions so explicit that machines reproduce your exact intent, leaving no gaps for hallucination to fill.
- 21 agent specs with explicit capabilities, constraints, and output schemas
- Per-agent tool scoping via toolsEnabled arrays (principle of least privilege)
- Spec Pipeline: 4-level maturity system with LLM enrichment for agent-ready stories
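As a rough illustration of what an explicit agent spec with tool scoping could look like, here is a minimal sketch. Only the `toolsEnabled` name comes from the list above; the `AgentSpec` shape, the tool names, and both helper functions are hypothetical and are not ProductIntel's actual schema.

```typescript
// Hypothetical agent spec shape; only `toolsEnabled` is named in the text above.
interface AgentSpec {
  name: string;
  capabilities: string[];
  constraints: string[];
  toolsEnabled: string[]; // allow-list: any tool not listed here is denied
  outputSchema: { format: "json" | "markdown"; requiredFields: string[] };
}

// Least privilege: a tool call is allowed only if the spec lists it explicitly.
function canUseTool(spec: AgentSpec, tool: string): boolean {
  return spec.toolsEnabled.some((enabled) => enabled === tool);
}

// Report which required fields are absent from an agent's JSON output.
function missingFields(spec: AgentSpec, output: Record<string, unknown>): string[] {
  return spec.outputSchema.requiredFields.filter((field) => !(field in output));
}

// Example spec with made-up tool names.
const analystSpec: AgentSpec = {
  name: "analyst",
  capabilities: ["summarize research", "rank opportunities"],
  constraints: ["read-only access to the knowledge base"],
  toolsEnabled: ["kb.search", "story.read"],
  outputSchema: { format: "json", requiredFields: ["summary", "confidence"] },
};
```

The allow-list is deny-by-default: adding a new tool to the platform gives no agent access to it until a spec opts in.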
Task decomposition
Breaking complex objectives into agent-sized tasks. Knowing which subtasks an agent handles solo vs. which need human oversight.
- 5-stage intelligence pipeline: Discovery, Analyst, Prototype, Persona, Product Owner
- Pointer-based handoff protocol with 40x token reduction vs. full-document passing
- 4 team topologies: Solo, Supervised, Collaborative, Hybrid
- Agent execution pipeline: story, worktree, implement, test, PR, with clear handoff points
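The pointer-based handoff in the list above can be sketched roughly as follows. This is an illustration under assumptions, not the actual protocol: the `HandoffPointer` shape, the in-memory `artifactStore`, and all function names are hypothetical, and character counts stand in for tokens.

```typescript
// Hypothetical handoff envelope: a pointer (UUID) plus a short summary.
interface HandoffPointer {
  artifactId: string; // UUID of the stored artifact
  kind: string;       // e.g. "discovery-report"
  summary: string;    // short digest the next agent reads first
}

// In-memory stand-in for a database table of artifacts (assumption).
const artifactStore: Record<string, string> = {};

// The producing agent stores the full document and hands off only the pointer.
function saveArtifact(id: string, kind: string, body: string, summary: string): HandoffPointer {
  artifactStore[id] = body;
  return { artifactId: id, kind, summary };
}

// The receiving agent dereferences the pointer only when the summary is not enough.
function loadArtifact(pointer: HandoffPointer): string {
  const body = artifactStore[pointer.artifactId];
  if (body === undefined) {
    throw new Error(`Unknown artifact: ${pointer.artifactId}`);
  }
  return body;
}

// Rough size comparison between passing the full body and passing the pointer.
function handoffSizeRatio(body: string, pointer: HandoffPointer): number {
  return body.length / JSON.stringify(pointer).length;
}
```

The ratio depends entirely on document size and summary length, so the sketch makes no claim about reproducing the 40x figure.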
Adoption-aware design
Designing AI products for the user who actually shows up, not the one the builder wishes existed. Serving teams across the adoption curve in one product.
- Dual-mode workflow: Today (daily cycle, agent-native) and Work (sprint-based) on shared data and AI infrastructure
- Company-level work-mode preference (agent / dual / human) swaps surfaces at runtime. No migration required.
- Same artifact schema underneath, so spec-enrichment, smart-assign, and agent execution work in either mode
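One way to picture the runtime surface swap is a lookup keyed by the company's work-mode preference. The three mode values come from the list above; the `Surface` names, the `Company` shape, and the mapping itself (agent to Today, human to Work, dual to both) are assumptions made for illustration.

```typescript
type WorkMode = "agent" | "dual" | "human";
type Surface = "today" | "work";

// Assumed mapping from preference to visible surfaces.
const SURFACES_BY_MODE: Record<WorkMode, Surface[]> = {
  agent: ["today"],
  dual: ["today", "work"],
  human: ["work"],
};

// Hypothetical company record; only the preference matters here.
interface Company {
  name: string;
  workMode: WorkMode;
}

// Surfaces are derived at read time, so changing the preference needs no data migration.
function visibleSurfaces(company: Company): Surface[] {
  return SURFACES_BY_MODE[company.workMode].slice();
}

// Changing mode returns a new company value; stored artifacts are untouched.
function withWorkMode(company: Company, workMode: WorkMode): Company {
  return { name: company.name, workMode };
}
```

Because the surfaces are computed from a single field, both modes can read and write the same underlying artifacts.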
Context architecture
Organizing knowledge so AI agents find exactly what they need. The "Dewey Decimal System" for your AI, covering persistent vs. ephemeral context.
- Context Engine: getRelevantContext(), getRecommendationContext(), getStoryContext()
- Tiered onboarding: 5 min (~70% AI effectiveness) to 30 min (~95%)
- KB Quality Score: coverage assessment across 7 categories with gap identification
- Scoped retrieval per feature, so consumer chat vs. internal agents see different context
RAG pipeline design
End-to-end retrieval-augmented generation: embedding strategy, hybrid search, prompt assembly, and output grounding.
- Full embedding pipeline with pgvector, stale marking, and re-embedding triggers
- Hybrid search: full-text + vector similarity with configurable weighting
- Production-deployed Consumer Chat: external-facing RAG chatbot with scoped retrieval
- Knowledge Chat: multi-turn RAG conversation with source attribution
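A minimal sketch of hybrid ranking with configurable weighting, assuming both scores have already been normalized to the range 0 to 1. In production this would most likely live in SQL next to pgvector; the in-memory version below, including the `ScoredChunk` shape and the weighted-sum fusion, is an assumption for illustration only.

```typescript
// Hypothetical retrieval candidate with pre-normalized scores in [0, 1].
interface ScoredChunk {
  id: string;
  textScore: number;   // full-text relevance
  vectorScore: number; // embedding similarity
}

// Weighted sum: vectorWeight = 1 is pure vector search, 0 is pure full-text.
function hybridScore(chunk: ScoredChunk, vectorWeight: number): number {
  if (vectorWeight < 0 || vectorWeight > 1) {
    throw new RangeError("vectorWeight must be between 0 and 1");
  }
  return vectorWeight * chunk.vectorScore + (1 - vectorWeight) * chunk.textScore;
}

// Return the top `limit` chunks by combined score without mutating the input.
function rankHybrid(chunks: ScoredChunk[], vectorWeight: number, limit: number): ScoredChunk[] {
  return chunks
    .slice()
    .sort((a, b) => hybridScore(b, vectorWeight) - hybridScore(a, vectorWeight))
    .slice(0, limit);
}
```

A weighted sum is only one fusion strategy; rank-based methods such as reciprocal rank fusion avoid the need to normalize scores first.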
Multi-agent orchestration
Designing how multiple AI agents collaborate: handoff protocols, shared state, error propagation, and recovery strategies.
- AgentWeave: purpose-built orchestration engine with pipeline, solo, and collaborative modes
- 21 agents across 7 teams with defined topologies and supervisor styles
- Training Arena: A/B testing different orchestration configurations against benchmarks
- Team config export to Claude Code, Cursor, GitHub Copilot, Aider, and Continue.dev
Prompt engineering at scale
Moving beyond one-off prompts to systematic prompt management: database-stored, admin-editable, multi-provider compatible.
- 21 agent prompts stored in database, editable from admin UI, not hardcoded
- Model config table: prompts paired with per-feature model selection
- Multi-provider compatibility: same patterns work across Anthropic Claude and Google Gemini
- Structured output schemas: agents return typed JSON or Markdown with specific formats
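To show how stored prompts and per-feature model selection might come together at call time, here is a small sketch. The two in-memory tables stand in for database tables, and every name in it (`PromptRow`, `ModelConfigRow`, `resolveCall`, the feature keys, the tier labels) is hypothetical rather than taken from the actual schema.

```typescript
type Tier = "lite" | "standard" | "advanced";
type Provider = "anthropic" | "google";

// Hypothetical row shapes for the prompt and model config tables.
interface PromptRow {
  agent: string;
  systemPrompt: string;
  outputFormat: "json" | "markdown";
}
interface ModelConfigRow {
  feature: string;
  provider: Provider;
  tier: Tier;
}

// In-memory stand-ins for database tables that an admin UI would edit.
const promptTable: Record<string, PromptRow> = {
  analyst: { agent: "analyst", systemPrompt: "You rank product opportunities.", outputFormat: "json" },
};
const modelConfigTable: Record<string, ModelConfigRow> = {
  "spec-enrichment": { feature: "spec-enrichment", provider: "anthropic", tier: "standard" },
  "ai-triage": { feature: "ai-triage", provider: "google", tier: "lite" },
};

interface ResolvedCall {
  systemPrompt: string;
  outputFormat: "json" | "markdown";
  provider: Provider;
  tier: Tier;
}

// Pair the agent's stored prompt with the model selected for the feature.
function resolveCall(agent: string, feature: string): ResolvedCall {
  const prompt = promptTable[agent];
  const model = modelConfigTable[feature];
  if (prompt === undefined) throw new Error(`No prompt stored for agent "${agent}"`);
  if (model === undefined) throw new Error(`No model config for feature "${feature}"`);
  return {
    systemPrompt: prompt.systemPrompt,
    outputFormat: prompt.outputFormat,
    provider: model.provider,
    tier: model.tier,
  };
}
```

Keeping the prompt and the model choice in separate tables means either can change without touching the other or redeploying code.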
Evaluation & quality
Building systematic ways to measure whether AI output is actually correct, not just fluent. Catching the subtle failures.
- Model Calibration: golden test cases with side-by-side comparison across providers
- Training Arena: LLM-as-judge evaluation scoring completeness, precision, feasibility
- Inference Inspector: full retrieval attribution to verify source relevance
- Agent defect tracking with auto-creation on build/test failure
Trust & security
Designing boundaries between human and machine. Building blast radius assessments. Ensuring AI systems fail safely.
- 8-scanner security pipeline: npm audit, auth guards, SQL injection, XSS, RLS, and more
- Git worktree isolation for agent execution: disposable branches, never main
- Per-agent tool scoping, team guardrails, and configurable review policies
- RLS write-blocking on all 85+ tables for the demo user, as belt-and-suspenders protection
Cost & token economics
Calculating the ROI of every AI operation and matching the right model to the right task. Token spend is the new infrastructure cost to optimize.
- 3-tier model config: Lite (Haiku/Flash), Standard (Sonnet), Advanced (Opus) per feature
- Budget service: monthly spend tracking, burn rate, projection, stories-remaining
- Per-agent token tracking logged to every run with cost estimation
- AI Triage surfaces estimated token cost per recommended action
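The budget figures listed above reduce to a few lines of arithmetic. The sketch below assumes a simple linear projection and takes prices as inputs; the `BudgetSnapshot` and `TokenPrice` shapes and all function names are hypothetical, and any prices passed in are placeholders rather than real provider rates.

```typescript
// Hypothetical snapshot of one company's spend so far this month.
interface BudgetSnapshot {
  monthlyBudgetUsd: number;
  spentUsd: number;
  dayOfMonth: number;         // days elapsed, 1-based
  daysInMonth: number;
  avgCostPerStoryUsd: number; // historical average per completed story
}

// Placeholder price inputs in USD per million tokens.
interface TokenPrice {
  inputPerMillionUsd: number;
  outputPerMillionUsd: number;
}

// Average daily spend so far.
function burnRatePerDay(b: BudgetSnapshot): number {
  return b.spentUsd / b.dayOfMonth;
}

// Linear projection: assumes the rest of the month burns at the same rate.
function projectedMonthEndSpend(b: BudgetSnapshot): number {
  return burnRatePerDay(b) * b.daysInMonth;
}

// How many more average-cost stories the remaining budget covers.
function storiesRemaining(b: BudgetSnapshot): number {
  const remainingUsd = Math.max(0, b.monthlyBudgetUsd - b.spentUsd);
  return Math.floor(remainingUsd / b.avgCostPerStoryUsd);
}

// Cost estimate for a single agent run from its logged token counts.
function estimateRunCostUsd(inputTokens: number, outputTokens: number, price: TokenPrice): number {
  return (inputTokens * price.inputPerMillionUsd + outputTokens * price.outputPerMillionUsd) / 1000000;
}
```

For example, with a 500 USD budget, 120 USD spent by day 10 of a 30-day month, and an average of 2.50 USD per story, the burn rate is 12 USD per day, the projection is 360 USD, and 152 stories remain.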
AI observability
Monitoring AI systems in production: not just uptime, but output quality, cost trends, and retrieval attribution.
- Inference Inspector: per-trace observability with plain-English explanations
- Workforce Monitor: real-time activity, queue depth, token burn rate
- Agent run logging: full prompt + response + metadata captured per execution
- Security scanner with daily automated runs, auto-assignment, and auto-resolution
Product decisions are more valuable than features. Each of these was a deliberate choice with a specific rationale and measurable outcome.
Pointer-based context passing
Why: Full-document passing bloats context windows and wastes tokens. Agents pass only UUIDs and summaries, forcing clean boundaries.
Outcome: 40x token reduction in multi-agent pipelines while maintaining output quality.
Context investment as the AI quality lever
Why: AI effectiveness isn't a model problem. It's a context problem. The more product context the system has, the better its outputs. But most teams don't measure or budget for it.
Outcome: Built as Product Context Setup, a tiered onboarding flow where each tier maps to a measurable AI quality target: a 5-minute Quick Start reaches ~70% effectiveness, a 30-minute full walkthrough reaches ~95%.
Recommendations before controls
Why: Traditional tools show a wall of data and expect humans to find the signal. AI-first means the system does the analysis and presents actions.
Outcome: AI Triage as default Work page, delivering a narrative briefing with cost estimates, risk, and recommended actions.
Anti-platform architecture (fork, don't rent)
Why: Multi-tenant SaaS locks customers into shared infrastructure with no customization. Each company should own their instance.
Outcome: Module manifest + schema split + config-driven agents, customizable without code changes.
Dual-mode workflow on shared infrastructure
Why: Companies sit at different points on the AI adoption curve. A product that only serves the destination fails the team using it today.
Outcome: Today + Work modules coexist on the same data model, selectable per company. Traditional teams adopt AI-as-assist; agent-native teams run daily cycles; hybrid teams use both.
ProductIntel is live with a guided demo tour. No signup required.