benchmark-models
Cross-model benchmark for gstack skills. (gstack)
Purpose
Cross-model benchmark for gstack skills. (gstack)
Triggers
- cross model benchmark
- compare claude gpt gemini
- benchmark skill across models
- which model should I use
Core flow / sections
- When to invoke this skill
- Preamble (run first)
- Plan Mode Safe Operations
- Skill Invocation During Plan Mode
- Skill routing
- Artifacts Sync (skill start)
- Model-Specific Behavioral Patch (claude)
- Voice
Key points from the original
When to invoke this skill
Runs the same prompt through Claude, GPT (via Codex CLI), and Gemini side-by-side — compares latency, tokens, cost, and optionally quality via LLM judge. Answers "which model is actually best for this skill?" with data instead of vibes. Separate from /benchmark, which measures web page performance. Use when: "benchmark models", "compare models", "which model is best...
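As a rough illustration of the comparison described above, the sketch below runs one prompt through several model runners and records latency, token counts, and cost. It is not the gstack implementation: the `RunResult` type, the `PRICES` table, the model names, and the `fake_runner` stub are all hypothetical, and the prices are placeholders rather than real rates. A real runner would call the relevant CLI or API and return its reported token counts.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A runner takes a prompt and returns (input_tokens, output_tokens).
Runner = Callable[[str], Tuple[int, int]]

@dataclass
class RunResult:
    model: str
    latency_s: float
    input_tokens: int
    output_tokens: int
    cost_usd: float

# Hypothetical prices in USD per million tokens (input, output).
# Placeholder numbers for illustration only, not real rates.
PRICES: Dict[str, Tuple[float, float]] = {
    "model-a": (1.0, 2.0),
    "model-b": (0.5, 4.0),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Convert token counts to a dollar cost using the price table."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

def run_once(model: str, runner: Runner, prompt: str) -> RunResult:
    """Time a single call and attach token and cost figures."""
    start = time.perf_counter()
    input_tokens, output_tokens = runner(prompt)
    latency = time.perf_counter() - start
    return RunResult(model, latency, input_tokens, output_tokens,
                     cost_usd(model, input_tokens, output_tokens))

def benchmark(prompt: str, runners: Dict[str, Runner]) -> List[RunResult]:
    """Run the same prompt through every runner; cheapest result first."""
    results = [run_once(name, fn, prompt) for name, fn in runners.items()]
    return sorted(results, key=lambda r: r.cost_usd)

def fake_runner(input_tokens: int, output_tokens: int) -> Runner:
    """Stub standing in for a real model call (assumption for the demo)."""
    return lambda prompt: (input_tokens, output_tokens)

if __name__ == "__main__":
    runners = {
        "model-a": fake_runner(1000, 1000),
        "model-b": fake_runner(1000, 1000),
    }
    for r in benchmark("Summarize this diff.", runners):
        print(f"{r.model}: {r.latency_s:.4f}s, "
              f"{r.input_tokens}+{r.output_tokens} tokens, ${r.cost_usd:.6f}")
```

The optional quality score from an LLM judge is left out here; under these assumptions it would be one more field on `RunResult`, filled in by a separate judging step.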
Use cases
- Inferred from the description: Cross-model benchmark for gstack skills. (gstack)
See also
- GitHub: gstack