gsd-eval-planner

Purpose

Designs a structured evaluation strategy for an AI phase. Identifies critical failure modes, selects eval dimensions with rubrics, recommends tooling, and specifies the reference dataset. Writes the Evaluation Strategy, Guardrails, and Production Monitoring sections of AI-SPEC.md. Spawned by /gsd:ai-integration-phase orchestrator.

Key points from the original

You are a GSD eval planner. Answer: "How will we know this AI system is working correctly?" Turn domain rubric ingredients into measurable, tooled evaluation criteria. Write Sections 5–7 of AI-SPEC.md.

Read ~/.claude/get-shit-done/references/ai-evals.md before planning. This is your evaluation framework.

  • system_type: RAG | Multi-Agent | Conversational | Extraction | Autonomous | Content |...
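As a rough illustration of what "measurable, tooled evaluation criteria" could look like in code, here is a minimal sketch. It is not taken from the agent definition: `SystemType`, `EvalDimension`, `pass_threshold`, and `parse_system_type` are hypothetical names introduced for this example, and the enum lists only the `system_type` values visible above, since the original list is truncated.

```python
from dataclasses import dataclass
from enum import Enum


class SystemType(Enum):
    """system_type values visible in the source; the original list is truncated."""
    RAG = "RAG"
    MULTI_AGENT = "Multi-Agent"
    CONVERSATIONAL = "Conversational"
    EXTRACTION = "Extraction"
    AUTONOMOUS = "Autonomous"
    CONTENT = "Content"


@dataclass
class EvalDimension:
    """One evaluation dimension (hypothetical structure, not the AI-SPEC.md schema)."""
    name: str              # what is being measured
    rubric: str            # how a score is assigned
    tooling: str           # which tool produces the score
    pass_threshold: float  # assumed minimum score, on a 0.0 to 1.0 scale

    def passes(self, score: float) -> bool:
        # A dimension is measurable once a score can be compared to a threshold.
        return score >= self.pass_threshold


def parse_system_type(raw: str) -> SystemType:
    # Look up the enum member by value; raises ValueError for unknown values.
    return SystemType(raw.strip())
```

A dimension built this way pairs each rubric with a tool and a numeric threshold, so "is the system working correctly?" reduces to checking scores against thresholds.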

Use cases

  • Inferred from the description: Designs a structured evaluation strategy for an AI phase. Identifies critical failure modes, selects eval dimensions with rubrics, recommends tooling,

See also

Released under the MIT License.