benchmark-models

Cross-model benchmark for gstack skills. (gstack)

Purpose

Cross-model benchmark for gstack skills. (gstack)

Triggers

  • cross model benchmark
  • compare claude gpt gemini
  • benchmark skill across models
  • which model should I use

Core workflow / sections

  • When to invoke this skill
  • Preamble (run first)
  • Plan Mode Safe Operations
  • Skill Invocation During Plan Mode
  • Skill routing
  • Artifacts Sync (skill start)
  • Model-Specific Behavioral Patch (claude)
  • Voice

Key points from the original

When to invoke this skill

Runs the same prompt through Claude, GPT (via Codex CLI), and Gemini side by side, comparing latency, tokens, cost, and optionally quality via an LLM judge. Answers "which model is actually best for this skill?" with data instead of vibes. Separate from /benchmark, which measures web page performance. Use when: "benchmark models", "compare models", "which model is best...
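The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the `Runner` type, `BenchmarkResult`, `run_benchmark`, and the per-million-token price table are all hypothetical names introduced here. Each runner stands in for whatever actually calls a model and is assumed to return the output text plus input and output token counts.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical: a runner takes a prompt and returns
# (output_text, input_tokens, output_tokens).
Runner = Callable[[str], Tuple[str, int, int]]


@dataclass
class BenchmarkResult:
    model: str
    latency_s: float
    input_tokens: int
    output_tokens: int
    cost_usd: float


def run_benchmark(
    prompt: str,
    runners: Dict[str, Runner],
    prices_per_mtok: Dict[str, Tuple[float, float]],
) -> List[BenchmarkResult]:
    """Run the same prompt through every runner and collect metrics.

    prices_per_mtok maps model name -> (input price, output price) in USD
    per million tokens. These values are supplied by the caller (assumption).
    """
    results = []
    for model, runner in runners.items():
        start = time.perf_counter()
        _output, tokens_in, tokens_out = runner(prompt)
        latency = time.perf_counter() - start

        price_in, price_out = prices_per_mtok[model]
        cost = (tokens_in * price_in + tokens_out * price_out) / 1_000_000

        results.append(
            BenchmarkResult(model, latency, tokens_in, tokens_out, cost)
        )
    # Cheapest first; sorting by latency instead would be equally reasonable.
    return sorted(results, key=lambda r: r.cost_usd)
```

Passing the runners in as plain callables keeps the timing and cost logic independent of how each model is actually invoked, so stub runners can be used to check the arithmetic before any real model is called. The optional quality score from an LLM judge is left out of this sketch.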

Applicable scenarios

  • Inferred from the description: Cross-model benchmark for gstack skills. (gstack)

See also

Released under the MIT License.