Cost Optimization Guide

Save 60-70% on AI costs with SpecWeave's intelligent model selection

Overview

SpecWeave automatically optimizes your AI costs by intelligently routing work to the most cost-effective model:

Opus 4.6 ($5/$25 per 1M tokens) for planning, architecture, and complex reasoning (default)
Haiku 4.5 ($1/$5 per 1M tokens) for mechanical execution and simple implementation

Result: Maximum quality with Opus 4.6 as default, with Haiku optimization for routine tasks.

How It Works

1. Automatic Model Selection

SpecWeave analyzes every task and automatically chooses the optimal model:

User: "Implement cost tracker service"
          ↓
SpecWeave analyzes:
  • Agent type: Backend developer
  • Task phase: Execution (implementation)
  • Keywords: "implement", "service"
          ↓
Decision: Use Haiku 4.5
Reasoning: Mechanical implementation task
          ↓
Cost: $0.06 (vs $0.90 with Opus)
Savings: $0.84 (93%)

2. Three-Layer Intelligence

Layer 1: Agent Preferences Each agent knows its optimal model:

All agents → Opus 4.6 (PM, Architect, Security, QA Lead, Tech Lead, etc.)
Simple tasks → Haiku (when task has detailed spec + clear instructions)

Layer 2: Phase Detection Analyzes your prompt to detect:

Planning/Design: "design", "analyze", "strategy" → Opus
Complex Implementation: architecture decisions, trade-offs → Opus
Mechanical Execution: clear spec, simple "implement X" → Haiku

Layer 3: Safe Defaults When uncertain, defaults to Opus 4.6 (maximum quality and reasoning).

3. Real-Time Cost Tracking

Every agent invocation is tracked:

{
  "agent": "frontend",
  "model": "haiku",
  "inputTokens": 5000,
  "outputTokens": 2000,
  "cost": "$0.012",
  "savings": "$0.033"
}

Viewing Your Savings

Quick Summary

/specweave:costs

Output:

═══════════════════════════════════════════════════════════════
  SpecWeave Cost Summary - All Increments
═══════════════════════════════════════════════════════════════

OVERALL SUMMARY
───────────────────────────────────────────────────────────────
  Total Cost:       $   22.50
  Total Savings:    $   22.50
  Savings %:              50.0%
  Total Sessions:           42

AGENT STATS
───────────────────────────────────────────────────────────────
  Most Expensive:    pm
  Least Expensive:   qa-engineer

COST BY INCREMENT
───────────────────────────────────────────────────────────────
  0001                          $ 12.00  (12 sessions)
  0002                          $  7.50  (15 sessions)
  0003                          $  3.00  (15 sessions)

═══════════════════════════════════════════════════════════════

Increment-Specific Report

/specweave:costs 0003

Output:

═══════════════════════════════════════════════════════════════
  Cost Report: Increment 0003
═══════════════════════════════════════════════════════════════

SUMMARY
───────────────────────────────────────────────────────────────
  Total Cost:       $    3.00
  Total Savings:    $    7.00
  Savings %:              70.0%
  Total Tokens:         125,432
  Sessions:                  15

COST BY MODEL
───────────────────────────────────────────────────────────────
  sonnet          $    1.20  ( 40.0%)
  haiku           $    1.80  ( 60.0%)

COST BY AGENT
───────────────────────────────────────────────────────────────
  pm                        $  1.00  ( 33.3%)
  architect                 $  0.50  ( 16.7%)
  frontend                  $  0.75  ( 25.0%)
  devops                    $  0.50  ( 16.7%)
  qa-engineer               $  0.25  (  8.3%)

RECENT SESSIONS
───────────────────────────────────────────────────────────────
  2025-10-31 14:32:15
  Agent: pm                  Model: opus
  Cost: $ 0.0150    Savings: $ 0.0350

  2025-10-31 13:15:42
  Agent: frontend            Model: haiku
  Cost: $ 0.0034    Savings: $ 0.0166

Example Scenarios

Scenario 1: Full-Stack Feature

Task: Build authentication system

Without SpecWeave (all Sonnet):

PM planning: $5.00
Architect design: $8.00
Frontend implementation: $12.00
Backend implementation: $15.00
QA testing: $5.00
Total: $45.00

With SpecWeave (intelligent selection):

PM planning: $5.00 (Sonnet)
Architect design: $8.00 (Sonnet)
Frontend implementation: $4.00 (Haiku) 💰 saves $8
Backend implementation: $5.00 (Haiku) 💰 saves $10
QA testing: $5.00 (Sonnet)
Total: $27.00
Savings: $18.00 (40%)

Scenario 2: Refactoring Sprint

Task: Refactor legacy code

Without SpecWeave:

All refactoring: $50.00 (Sonnet)

With SpecWeave:

Initial analysis: $5.00 (Sonnet - architecture)
Code refactoring: $15.00 (Haiku - execution) 💰 saves $30
Final review: $5.00 (Sonnet - quality)
Total: $25.00
Savings: $25.00 (50%)

Scenario 3: Documentation Generation

Task: Generate API documentation

Without SpecWeave:

Documentation: $10.00 (Sonnet)

With SpecWeave:

Strategic docs: $3.00 (Sonnet - planning)
API reference: $2.00 (Haiku - execution) 💰 saves $5
Total: $5.00
Savings: $5.00 (50%)

Manual Overrides

Force a Specific Model

Need Opus for complex reasoning? Use --model:

# Force Opus for extremely complex task
/specweave:do --model opus "Design distributed consensus algorithm"

# Force Sonnet when uncertain
/specweave:do --model sonnet "Implement feature X"

# Force Haiku for simple task
/specweave:do --model haiku "Generate test data"

Per-Agent Override

Override model for specific agent:

# Use Opus for critical security review
Task tool: agent=security, model=opus

# Use Haiku for simple diagram
Task tool: agent=diagrams, model=haiku

Exporting Cost Data

JSON Export (Machine-Readable)

/specweave:costs 0003
# Select: Export to JSON

# Output: .specweave/increments/0003/reports/cost-analysis.json

Use cases:

Import into spreadsheet
Generate custom reports
Track costs over time
Budget forecasting

CSV Export (Spreadsheet-Friendly)

/specweave:costs 0003
# Select: Export to CSV

# Output: .specweave/increments/0003/reports/cost-history.csv

Use cases:

Open in Excel/Google Sheets
Create charts and graphs
Analyze agent efficiency
Identify optimization opportunities

Best Practices

1. Let SpecWeave Decide

❌ Don't: Force model for every task

/specweave:do --model sonnet "implement X"  # Unnecessary

✅ Do: Trust automatic selection

/specweave:do "implement X"  # SpecWeave chooses Haiku

2. Monitor Costs Regularly

# Check costs after each increment
/specweave:done 0003
/specweave:costs 0003

3. Review Most Expensive Agents

# Identify cost hotspots
/specweave:costs
# Look at "Most Expensive" agent

4. Use Haiku for Iteration

When iterating rapidly:

# Initial implementation (auto-selects Haiku)
/specweave:do "implement feature X"

# Refinements (also Haiku)
/specweave:do "add error handling"
/specweave:do "improve performance"

# Final review (auto-selects Sonnet)
/specweave:validate

Privacy & Security

What's Tracked

✅ Tracked (safe, no sensitive data):

Agent names (public: pm, frontend, etc.)
Model used (sonnet/haiku/opus)
Token counts (integers)
Costs (calculated from public pricing)
Timestamps (when sessions ran)

❌ NEVER tracked:

Your prompts (could contain sensitive info)
Agent responses (could contain code/data)
API keys (never touch this)
File paths (could reveal structure)
Personal information (names, emails)

Data Location

Local only: .specweave/logs/costs.json

✅ Stays on your machine
✅ You control the data
✅ Delete anytime
✅ Never sent to external services

Since we store NO personal data:

✅ No PII (Personally Identifiable Information)
✅ No user tracking
✅ No third-party analytics
✅ Full user control

Troubleshooting

Cost Dashboard Shows $0.00

Cause: No tracked sessions yet

Solution:

Run /specweave:do to execute tasks
Wait for agent completion
Run /specweave:costs again

Savings Seem Low

Cause: Mostly planning work (uses Sonnet)

Solution:

Planning-heavy increments naturally use more Sonnet
Savings increase during implementation phases
Overall savings average 60-70% across full project

Want More Aggressive Savings

Option 1: Use more execution agents

# Instead of architect (Sonnet)
/specweave:do with agent=frontend  # Uses Haiku

Option 2: Force Haiku for simple planning

/specweave:increment --model haiku "simple feature"

FAQ

Q: Does this slow down execution?

A: No. Phase detection takes <1ms. Haiku is actually 2x faster than Sonnet.

Q: Is quality affected?

A: No. Haiku 4.5 matches Sonnet 4 quality, perfect for execution tasks. Opus 4.6 is used for all complex planning.

Q: Can I opt out?

A: Yes. Use --model sonnet to force Sonnet for every task. You'll lose savings but maintain full control.

Q: How accurate is phase detection?

A: >95% accuracy on typical prompts. When uncertain, defaults to Opus (maximum quality).

Q: Does this work with Opus?

A: Yes! Opus 4.6 is the default for planning and complex work, Haiku for execution.

Next Steps

Model Selection Guide - Deep dive into how model selection works
Cost Tracking Reference - Technical details on cost tracking

Questions? Open an issue Feedback? Start a discussion

Last updated: 2026-03-01 | SpecWeave v1.0.342

Overview​

How It Works​

1. Automatic Model Selection​

2. Three-Layer Intelligence​

3. Real-Time Cost Tracking​

Viewing Your Savings​

Quick Summary​

Increment-Specific Report​

Example Scenarios​

Scenario 1: Full-Stack Feature​

Scenario 2: Refactoring Sprint​

Scenario 3: Documentation Generation​

Manual Overrides​

Force a Specific Model​

Per-Agent Override​

Exporting Cost Data​

JSON Export (Machine-Readable)​

CSV Export (Spreadsheet-Friendly)​

Best Practices​

1. Let SpecWeave Decide​

2. Monitor Costs Regularly​

3. Review Most Expensive Agents​

4. Use Haiku for Iteration​

Privacy & Security​

What's Tracked​

Data Location​

GDPR Compliance​

Troubleshooting​

Cost Dashboard Shows $0.00​

Savings Seem Low​

Want More Aggressive Savings​

FAQ​

Q: Does this slow down execution?​

Q: Is quality affected?​

Q: Can I opt out?​

Q: How accurate is phase detection?​

Q: Does this work with Opus?​

Next Steps​