AI Product Skills

Skills demonstrated through building

Every skill below was developed by designing and shipping ProductIntel, a 21-module AI platform deployed to production. Not theoretical knowledge. Working, deployed evidence.

Product Design

AI-First UX Design

Designing interfaces where AI leads the experience: recommendations before controls, progressive trust, human-in-the-loop patterns.

3-level AI-first framework: L1 Recommend, L2 Adaptive UI, L3 Autonomous Execution
AI Triage as default landing, with narrative briefing before manual backlog
Team Creation Wizard: AI pre-fills everything, humans override
Deploy Risk Assessment: visualized risk scoring before destructive actions

Specification Precision

Writing instructions so explicit that machines reproduce your exact intent, leaving no gaps for hallucination to fill.

21 agent specs with explicit capabilities, constraints, and output schemas
Per-agent tool scoping via toolsEnabled arrays (principle of least privilege)
Spec Pipeline: 4-level maturity system with LLM enrichment for agent-ready stories
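
As a rough illustration of per-agent tool scoping, the sketch below checks a requested tool against an agent's toolsEnabled array before dispatch. Only the toolsEnabled field name comes from the list above; the AgentSpec shape, the function names, and the tool names are assumptions made for the example, not ProductIntel's actual code.

```typescript
// Hypothetical agent spec: only the `toolsEnabled` field name comes from the text.
interface AgentSpec {
  name: string;
  toolsEnabled: string[];
}

// Least privilege: a tool is usable only if it is explicitly listed for the agent.
function canUseTool(agent: AgentSpec, tool: string): boolean {
  return agent.toolsEnabled.some((enabled) => enabled === tool);
}

// Throws before dispatch so an out-of-scope call never reaches the tool.
function assertToolAllowed(agent: AgentSpec, tool: string): void {
  if (!canUseTool(agent, tool)) {
    throw new Error(`Agent "${agent.name}" is not permitted to call "${tool}"`);
  }
}

// Example agent with two illustrative (made-up) tool names.
const analyst: AgentSpec = {
  name: "analyst",
  toolsEnabled: ["search_kb", "read_story"],
};
```

With this shape, anything not listed is denied by default, so adding a new tool to the platform does not silently widen any agent's reach.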

Task Decomposition

Breaking complex objectives into agent-sized tasks. Knowing which subtasks an agent handles solo vs. which need human oversight.

5-stage intelligence pipeline: Discovery, Analyst, Prototype, Persona, Product Owner
Pointer-based handoff protocol with 40x token reduction vs. full-document passing
4 team topologies: Solo, Supervised, Collaborative, Hybrid
Agent execution pipeline: story, worktree, implement, test, PR, with clear handoff points

AI Systems

Context Architecture

Organizing knowledge so AI agents find exactly what they need. The "Dewey Decimal System" for your AI, covering persistent vs. ephemeral context.

Context Engine: getRelevantContext(), getRecommendationContext(), getStoryContext()
Tiered onboarding: 5 min (~70% AI effectiveness) to 30 min (~95%)
KB Quality Score: coverage assessment across 7 categories with gap identification
Scoped retrieval per feature, so consumer chat vs. internal agents see different context
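
Scoped retrieval can be pictured as a filter applied before any ranking happens. The sketch below is a minimal version under assumed names: the Scope values, the ContextItem shape, and getScopedContext are hypothetical, and the signatures of the Context Engine functions listed above are not given in the source.

```typescript
// Hypothetical scopes for the two audiences mentioned above.
type Scope = "consumer_chat" | "internal_agents";

// Hypothetical knowledge item tagged with the scopes allowed to see it.
interface ContextItem {
  id: string;
  text: string;
  scopes: Scope[];
}

// Returns only the items visible to the requesting feature.
function getScopedContext(items: ContextItem[], scope: Scope): ContextItem[] {
  return items.filter((item) => item.scopes.some((s) => s === scope));
}

// Made-up sample data for illustration.
const knowledge: ContextItem[] = [
  { id: "faq-1", text: "Public pricing overview", scopes: ["consumer_chat", "internal_agents"] },
  { id: "roadmap-1", text: "Internal roadmap notes", scopes: ["internal_agents"] },
];
```

Filtering before retrieval, rather than after generation, means out-of-scope text never enters the prompt in the first place.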

RAG Pipeline Design

End-to-end retrieval-augmented generation: embedding strategy, hybrid search, prompt assembly, and output grounding.

Full embedding pipeline with pgvector, stale marking, and re-embedding triggers
Hybrid search: full-text + vector similarity with configurable weighting
Production-deployed Consumer Chat: external-facing RAG chatbot with scoped retrieval
Knowledge Chat: multi-turn RAG conversation with source attribution
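
One common way to combine full-text and vector results with a configurable weight is a weighted sum of normalized scores. The sketch below assumes both scores are already scaled to [0, 1]; the Candidate shape, the hybridRank name, and the vectorWeight parameter are illustrative, and the source does not say which fusion formula ProductIntel uses.

```typescript
// Hypothetical candidate returned by both search paths.
interface Candidate {
  id: string;
  textScore: number;   // full-text relevance, assumed normalized to [0, 1]
  vectorScore: number; // vector similarity, assumed normalized to [0, 1]
}

interface RankedResult {
  id: string;
  score: number;
}

// Weighted-sum fusion: vectorWeight = 1 is pure vector search, 0 is pure full-text.
function hybridRank(candidates: Candidate[], vectorWeight: number): RankedResult[] {
  if (vectorWeight < 0 || vectorWeight > 1) {
    throw new RangeError("vectorWeight must be between 0 and 1");
  }
  return candidates
    .map((c) => ({
      id: c.id,
      score: vectorWeight * c.vectorScore + (1 - vectorWeight) * c.textScore,
    }))
    .sort((a, b) => b.score - a.score); // highest combined score first
}
```

Shifting the weight changes which kind of match wins: exact keyword hits rank higher at low weights, semantic matches at high weights.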

Multi-Agent Orchestration

Designing how multiple AI agents collaborate: handoff protocols, shared state, error propagation, and recovery strategies.

AgentWeave: purpose-built orchestration engine with pipeline, solo, and collaborative modes
21 agents across 7 teams with defined topologies and supervisor styles
Training Arena: A/B testing different orchestration configurations against benchmarks
Team config export to Claude Code, Cursor, GitHub Copilot, Aider, and Continue.dev

Prompt Engineering at Scale

Moving beyond one-off prompts to systematic prompt management: database-stored, admin-editable, and multi-provider compatible.

21 agent prompts stored in database, editable from admin UI, not hardcoded
Model config table: prompts paired with per-feature model selection
Multi-provider compatibility: same patterns work across Anthropic Claude and Google Gemini
Structured output schemas: agents return typed JSON or Markdown with specific formats
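
To make "typed JSON" concrete, the sketch below validates a raw model response against an expected shape before anything downstream consumes it. The TriageOutput fields and the parseAgentOutput name are invented for illustration; the actual schemas are not shown in the source.

```typescript
// Hypothetical output shape; not a schema taken from ProductIntel.
interface TriageOutput {
  summary: string;
  riskScore: number;
  actions: string[];
}

// Type guard for arrays whose elements are all strings.
function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((v) => typeof v === "string");
}

// Parses raw model text and rejects anything that does not match the shape.
function parseAgentOutput(raw: string): TriageOutput {
  const data: unknown = JSON.parse(raw); // throws SyntaxError on malformed JSON
  if (typeof data !== "object" || data === null) {
    throw new Error("Agent output must be a JSON object");
  }
  const record = data as { [key: string]: unknown };
  const summary = record["summary"];
  const riskScore = record["riskScore"];
  const actions = record["actions"];
  if (typeof summary !== "string") throw new Error("summary must be a string");
  if (typeof riskScore !== "number") throw new Error("riskScore must be a number");
  if (!isStringArray(actions)) throw new Error("actions must be an array of strings");
  return { summary, riskScore, actions };
}
```

Rejecting malformed output at the boundary keeps a fluent but wrongly shaped response from propagating into later pipeline stages.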

Operations

Evaluation & Quality

Building systematic ways to measure whether AI output is actually correct, not just fluent. Catching the subtle failures.

Model Calibration: golden test cases with side-by-side comparison across providers
Training Arena: LLM-as-judge evaluation scoring completeness, precision, feasibility
Inference Inspector: full retrieval attribution to verify source relevance
Agent defect tracking with auto-creation on build/test failure

Trust & Security

Designing boundaries between human and machine. Building blast radius assessments. Ensuring AI systems fail safely.

8-scanner security pipeline: npm audit, auth guards, SQL injection, XSS, RLS, and more
Git worktree isolation for agent execution: disposable branches, never main
Per-agent tool scoping, team guardrails, and configurable review policies
RLS write-blocking on all 85+ tables for the demo user, as belt-and-suspenders protection

Cost & Token Economics

Calculating the ROI of every AI operation. Right model for the right task. Token spend is the new infrastructure cost to optimize.

3-tier model config: Lite (Haiku/Flash), Standard (Sonnet), Advanced (Opus) per feature
Budget service: monthly spend tracking, burn rate, projection, stories-remaining
Per-agent token tracking logged to every run with cost estimation
AI Triage surfaces estimated token cost per recommended action
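
The budget metrics named above (burn rate, projection, stories-remaining) reduce to simple arithmetic. The sketch below shows one plausible set of formulas; the field names, the linear projection, and the average-cost-per-story input are assumptions, not a description of how the Budget service actually computes them.

```typescript
// Hypothetical inputs; all field names are illustrative.
interface BudgetSnapshot {
  monthlyBudgetUsd: number;
  spentUsd: number;
  daysElapsed: number;       // days of the month completed so far
  daysInMonth: number;
  avgCostPerStoryUsd: number;
}

interface BudgetReport {
  burnRatePerDayUsd: number;
  projectedMonthEndUsd: number;
  storiesRemaining: number;
}

function summarizeBudget(s: BudgetSnapshot): BudgetReport {
  if (s.daysElapsed <= 0 || s.avgCostPerStoryUsd <= 0) {
    throw new RangeError("daysElapsed and avgCostPerStoryUsd must be positive");
  }
  // Burn rate: average daily spend so far this month.
  const burnRatePerDayUsd = s.spentUsd / s.daysElapsed;
  // Projection: assumes spending continues at the same daily rate.
  const projectedMonthEndUsd = burnRatePerDayUsd * s.daysInMonth;
  // Stories remaining: whole stories that fit in the unspent budget.
  const remainingUsd = Math.max(0, s.monthlyBudgetUsd - s.spentUsd);
  const storiesRemaining = Math.floor(remainingUsd / s.avgCostPerStoryUsd);
  return { burnRatePerDayUsd, projectedMonthEndUsd, storiesRemaining };
}
```

For example, $100 spent after 10 days of a 30-day month against a $300 budget gives a $10/day burn rate, a $300 month-end projection, and 50 stories remaining at $4 per story.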

AI Observability

Monitoring AI systems in production: not just uptime, but output quality, cost trends, and retrieval attribution.

Inference Inspector: per-trace observability with plain-English explanations
Workforce Monitor: real-time activity, queue depth, token burn rate
Agent run logging: full prompt + response + metadata captured per execution
Security scanner with daily automated runs, auto-assignment, and auto-resolution

Key Design Decisions

Product decisions are more valuable than features. Each of these was a deliberate choice with a specific rationale and measurable outcome.

Pointer-based context passing

Why

Full-document passing bloats context windows and wastes tokens. Agents pass only UUIDs and summaries, forcing clean boundaries.

Outcome

40x token reduction in multi-agent pipelines while maintaining output quality.
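
A minimal sketch of the idea, assuming an in-memory store in place of a database: the producing agent saves its full output and hands the next agent only an ID and a short summary, and the consumer resolves the pointer only when it needs the full text. All names here (HandoffPointer, saveArtifact, resolvePointer) are hypothetical, and the truncated-text summary stands in for whatever summarization the real pipeline uses.

```typescript
// Hypothetical handoff payload: an ID plus a short summary, never the full document.
interface HandoffPointer {
  artifactId: string;
  summary: string;
}

// In-memory stand-in for a persistent artifact store (assumption for the example).
const artifactStore: { [id: string]: string } = {};

// Stores the full content and returns only the lightweight pointer.
function saveArtifact(id: string, content: string): HandoffPointer {
  artifactStore[id] = content;
  // Placeholder summary: the first 80 characters of the content.
  return { artifactId: id, summary: content.slice(0, 80) };
}

// The receiving agent fetches the full document only if it actually needs it.
function resolvePointer(pointer: HandoffPointer): string {
  const content = artifactStore[pointer.artifactId];
  if (content === undefined) {
    throw new Error(`Unknown artifact: ${pointer.artifactId}`);
  }
  return content;
}
```

The payload size stays roughly constant regardless of document length, which is where the token savings in a multi-stage pipeline would come from.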

Tiered onboarding tied to AI effectiveness

Why

AI effectiveness isn't a model problem. It's a context problem. Measuring context quality makes AI investment decisions tangible.

Outcome

Quantifiable framework: 5 min onboarding = ~70% AI effectiveness, 30 min = ~95%.

Recommendations before controls

Why

Traditional tools show a wall of data and expect humans to find the signal. AI-first means the system does the analysis and presents actions.

Outcome

AI Triage as default Work page, delivering a narrative briefing with cost estimates, risk, and recommended actions.

Anti-platform architecture (fork, don't rent)

Why

Multi-tenant SaaS locks customers into shared infrastructure with no customization. Each company should own its instance.

Outcome

Module manifest + schema split + config-driven agents, customizable without code changes.

See these skills in action

ProductIntel is live with a guided demo tour. No signup required.