benchmark-models

Cross-model benchmark for gstack skills. (gstack)

Positioning

Cross-model benchmark for gstack skills. (gstack)

Triggers

cross model benchmark
compare claude gpt gemini
benchmark skill across models
which model should I use

Core workflow / sections

When to invoke this skill
Preamble (run first)
Plan Mode Safe Operations
Skill Invocation During Plan Mode
Skill routing
Artifacts Sync (skill start)
Model-Specific Behavioral Patch (claude)
Voice

Key points from the original

When to invoke this skill

Runs the same prompt through Claude, GPT (via Codex CLI), and Gemini side by side, then compares latency, tokens, cost, and optionally quality via an LLM judge. Answers "which model is actually best for this skill?" with data instead of vibes. Separate from /benchmark, which measures web page performance. Use when: "benchmark models", "compare models", "which model is best...
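
The comparison described above can be pictured with a minimal sketch. This is not the skill's actual implementation: the `benchmark` function, the `Runner` contract, the stub runners, and the per-million-token prices below are all hypothetical placeholders chosen for illustration. It only shows the general shape of the idea: send one prompt to several models, time each call, and derive cost from token counts.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical runner contract: prompt -> (output text, input tokens, output tokens).
RunResult = Tuple[str, int, int]
Runner = Callable[[str], RunResult]


@dataclass
class BenchmarkRow:
    model: str
    latency_s: float
    input_tokens: int
    output_tokens: int
    cost_usd: float


def benchmark(
    prompt: str,
    runners: Dict[str, Runner],
    prices_per_mtok: Dict[str, Tuple[float, float]],
) -> List[BenchmarkRow]:
    """Run the same prompt through every runner and collect comparable metrics."""
    rows = []
    for model, run in runners.items():
        start = time.perf_counter()
        _text, tokens_in, tokens_out = run(prompt)
        latency = time.perf_counter() - start
        # Prices are (input, output) in USD per million tokens.
        price_in, price_out = prices_per_mtok[model]
        cost = (tokens_in * price_in + tokens_out * price_out) / 1_000_000
        rows.append(BenchmarkRow(model, latency, tokens_in, tokens_out, cost))
    # Cheapest first; sorting by latency instead would be equally reasonable.
    return sorted(rows, key=lambda r: r.cost_usd)


def fake_runner(tokens_out: int) -> Runner:
    """Stub standing in for a real model call (assumption: no network access)."""
    def run(prompt: str) -> RunResult:
        return ("stub output", len(prompt.split()), tokens_out)
    return run


# Placeholder model names and prices, not real provider pricing.
runners = {"model-a": fake_runner(200), "model-b": fake_runner(100)}
prices = {"model-a": (1.0, 2.0), "model-b": (3.0, 6.0)}

rows = benchmark("Summarize this diff in one sentence", runners, prices)
for row in rows:
    print(
        f"{row.model}: {row.latency_s:.4f}s, "
        f"{row.input_tokens}+{row.output_tokens} tokens, ${row.cost_usd:.6f}"
    )
```

In a real run each stub would be replaced by a call to the corresponding model, and an optional LLM judge could add a quality score to each row.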

Use cases

Inferred from the description: Cross-model benchmark for gstack skills. (gstack)

See also

GitHub: gstack
