Cost Optimization Guide
Save 60-70% on AI costs with SpecWeave's intelligent model selection
Overview
SpecWeave reduces your AI costs by automatically routing each task to the cheapest model that suits it:
- Sonnet 4.5 ($3/$15 per 1M tokens) for planning and strategic work
- Haiku 4.5 ($1/$5 per 1M tokens) for execution and implementation
Result: 60-70% cost savings on execution-heavy work compared with using Sonnet for everything, while planning and review stay on Sonnet to protect quality.
How It Works
1. Automatic Model Selection
SpecWeave analyzes every task and automatically chooses the optimal model:
User: "Implement cost tracker service"
↓
SpecWeave analyzes:
• Agent type: Backend developer
• Task phase: Execution (implementation)
• Keywords: "implement", "service"
↓
Decision: Use Haiku 4.5
Reasoning: Mechanical implementation task
↓
Cost: $0.06 (vs $0.18 with Sonnet)
Savings: $0.12 (67%)
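The figures in this example follow from the per-token prices listed in the overview. The sketch below reproduces them, assuming the task consumed 20,000 input tokens and 8,000 output tokens. Those token counts and the function name `call_cost` are illustrative assumptions, not part of SpecWeave.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), taken from the overview above.
PRICES = {
    "sonnet": (Decimal("3"), Decimal("15")),
    "haiku": (Decimal("1"), Decimal("5")),
}
TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def call_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Cost in USD of one call at the listed per-million-token prices."""
    input_price, output_price = PRICES[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / TOKENS_PER_PRICE_UNIT


# Assumed token counts, chosen so the totals match the example above.
haiku_cost = call_cost("haiku", 20_000, 8_000)
sonnet_cost = call_cost("sonnet", 20_000, 8_000)
savings = sonnet_cost - haiku_cost
savings_pct = round(savings / sonnet_cost * 100)

print(f"Cost: ${haiku_cost:.2f} (vs ${sonnet_cost:.2f} with Sonnet)")
print(f"Savings: ${savings:.2f} ({savings_pct}%)")
```

Because Haiku is priced at one third of Sonnet for both input and output tokens, any task routed to Haiku saves about 67% regardless of its input/output mix.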
2. Three-Layer Intelligence
Layer 1: Agent Preferences
Each agent knows its optimal model:
- Planning agents → Sonnet (PM, Architect, Security, QA Lead)
- Execution agents → Haiku (Tech Lead, Docs Writer, Translator)
- Hybrid agents → Auto-detect (Test-Aware Planner, TDD Orchestrator)
Layer 2: Phase Detection
Analyzes your prompt to detect:
- Planning: "design", "analyze", "strategy" → Sonnet
- Execution: "implement", "build", "create" → Haiku
- Review: "validate", "test", "audit" → Sonnet
Layer 3: Safe Defaults
When uncertain, defaults to Sonnet (quality over cost).
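One way the three layers could fit together is sketched below. The agent identifiers, the precedence order (agent preference first, then phase keywords, then the default) and the rule that an ambiguous prompt counts as "uncertain" are all assumptions made for illustration. SpecWeave's actual implementation may differ.

```python
import re
from typing import Optional

# Layer 1: fixed preferences. None marks a hybrid agent that auto-detects.
# The agent identifiers are hypothetical spellings of the names listed above.
AGENT_PREFERENCES = {
    "pm": "sonnet", "architect": "sonnet", "security": "sonnet", "qa-lead": "sonnet",
    "tech-lead": "haiku", "docs-writer": "haiku", "translator": "haiku",
    "test-aware-planner": None, "tdd-orchestrator": None,
}

# Layer 2: keywords per phase, and the model each phase maps to.
PHASE_KEYWORDS = {
    "planning": {"design", "analyze", "strategy"},
    "execution": {"implement", "build", "create"},
    "review": {"validate", "test", "audit"},
}
PHASE_MODEL = {"planning": "sonnet", "execution": "haiku", "review": "sonnet"}

# Layer 3: the safe default.
DEFAULT_MODEL = "sonnet"


def detect_phase(prompt: str) -> Optional[str]:
    """Return the single phase whose keywords appear in the prompt, else None."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    matches = [phase for phase, keywords in PHASE_KEYWORDS.items() if words & keywords]
    # Zero matches or several competing matches both count as "uncertain".
    return matches[0] if len(matches) == 1 else None


def select_model(agent: str, prompt: str) -> str:
    preferred = AGENT_PREFERENCES.get(agent)
    if preferred is not None:
        return preferred              # Layer 1: agent preference
    phase = detect_phase(prompt)
    if phase is not None:
        return PHASE_MODEL[phase]     # Layer 2: phase detection
    return DEFAULT_MODEL              # Layer 3: safe default


print(select_model("backend", "Implement cost tracker service"))
print(select_model("tdd-orchestrator", "Refactor the module"))
```

In this sketch an agent without a fixed preference falls through to phase detection, and a prompt that matches no phase or several phases falls through to Sonnet.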
3. Real-Time Cost Tracking
Every agent invocation is tracked:
{
  "agent": "frontend",
  "model": "haiku",
  "inputTokens": 5000,
  "outputTokens": 2000,
  "cost": "$0.015",
  "savings": "$0.030"
}
Viewing Your Savings
Quick Summary
/specweave:costs
Output:
═══════════════════════════════════════════════════════════════
SpecWeave Cost Summary - All Increments
═══════════════════════════════════════════════════════════════
OVERALL SUMMARY
───────────────────────────────────────────────────────────────
Total Cost: $ 22.50
Total Savings: $ 22.50
Savings %: 50.0%
Total Sessions: 42
AGENT STATS
───────────────────────────────────────────────────────────────
Most Expensive: pm
Least Expensive: qa-lead
COST BY INCREMENT
───────────────────────────────────────────────────────────────
0001 $ 12.00 (12 sessions)
0002 $ 7.50 (15 sessions)
0003 $ 3.00 (15 sessions)
═══════════════════════════════════════════════════════════════
Increment-Specific Report
/specweave:costs 0003
Output:
═══════════════════════════════════════════════════════════════
Cost Report: Increment 0003
═══════════════════════════════════════════════════════════════
SUMMARY
───────────────────────────────────────────────────────────────
Total Cost: $ 3.00
Total Savings: $ 7.00
Savings %: 70.0%
Total Tokens: 125,432
Sessions: 15
COST BY MODEL
───────────────────────────────────────────────────────────────
sonnet $ 1.20 ( 40.0%)
haiku $ 1.80 ( 60.0%)
COST BY AGENT
───────────────────────────────────────────────────────────────
pm $ 1.00 ( 33.3%)
architect $ 0.50 ( 16.7%)
frontend $ 0.75 ( 25.0%)
devops $ 0.50 ( 16.7%)
qa-lead $ 0.25 ( 8.3%)
RECENT SESSIONS
───────────────────────────────────────────────────────────────
2025-10-31 14:32:15
Agent: pm Model: sonnet
Cost: $ 0.0150 Savings: $ 0.0350
2025-10-31 13:15:42
Agent: frontend Model: haiku
Cost: $ 0.0034 Savings: $ 0.0166
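The "COST BY AGENT" percentages in the report above are each agent's share of the total cost. A minimal sketch of that aggregation follows. The individual session amounts are made up so that they sum to the per-agent totals shown in the report, and the function names are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical sessions as (agent, cost in USD), chosen to sum to the report totals.
SESSIONS = [
    ("pm", Decimal("0.60")), ("pm", Decimal("0.40")),
    ("architect", Decimal("0.50")),
    ("frontend", Decimal("0.50")), ("frontend", Decimal("0.25")),
    ("devops", Decimal("0.50")),
    ("qa-lead", Decimal("0.25")),
]


def cost_by_agent(sessions):
    """Sum session costs per agent."""
    totals = defaultdict(Decimal)  # Decimal() is zero
    for agent, cost in sessions:
        totals[agent] += cost
    return dict(totals)


def percentage_shares(totals):
    """Each agent's share of the grand total, rounded to one decimal place."""
    grand_total = sum(totals.values(), Decimal(0))
    return {agent: round(float(cost / grand_total * 100), 1)
            for agent, cost in totals.items()}


totals = cost_by_agent(SESSIONS)
shares = percentage_shares(totals)
for agent, cost in totals.items():
    print(f"{agent:<10} ${cost:>5} ({shares[agent]:>5}%)")
```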
Example Scenarios
Scenario 1: Full-Stack Feature
Task: Build authentication system
Without SpecWeave (all Sonnet):
- PM planning: $5.00
- Architect design: $8.00
- Frontend implementation: $12.00
- Backend implementation: $15.00
- QA testing: $5.00
- Total: $45.00
With SpecWeave (intelligent selection):
- PM planning: $5.00 (Sonnet)
- Architect design: $8.00 (Sonnet)
- Frontend implementation: $4.00 (Haiku) 💰 saves $8
- Backend implementation: $5.00 (Haiku) 💰 saves $10
- QA testing: $5.00 (Sonnet)
- Total: $27.00
- Savings: $18.00 (40%)
Scenario 2: Refactoring Sprint
Task: Refactor legacy code
Without SpecWeave:
- All refactoring: $50.00 (Sonnet)
With SpecWeave:
- Initial analysis: $5.00 (Sonnet - architecture)
- Code refactoring: $15.00 (Haiku - execution) 💰 saves $25
- Final review: $5.00 (Sonnet - quality)
- Total: $25.00
- Savings: $25.00 (50%)
Scenario 3: Documentation Generation
Task: Generate API documentation
Without SpecWeave:
- Documentation: $10.00 (Sonnet)
With SpecWeave:
- Strategic docs: $3.00 (Sonnet - planning)
- API reference: $2.00 (Haiku - execution) 💰 saves $5
- Total: $5.00
- Savings: $5.00 (50%)
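The totals in the three scenarios can be checked with a few lines of arithmetic. The sketch below recomputes them from the line items listed above. The function name `scenario_savings` is an illustrative assumption.

```python
def scenario_savings(baseline_items, optimized_items):
    """Return (amount saved in USD, percent saved) for two lists of line items."""
    baseline = sum(baseline_items)
    optimized = sum(optimized_items)
    saved = baseline - optimized
    return saved, round(saved / baseline * 100, 1)


# Line items in USD, copied from the scenarios above.
full_stack = scenario_savings([5, 8, 12, 15, 5], [5, 8, 4, 5, 5])
refactoring = scenario_savings([50], [5, 15, 5])
documentation = scenario_savings([10], [3, 2])

for name, (saved, pct) in [("Full-stack feature", full_stack),
                           ("Refactoring sprint", refactoring),
                           ("Documentation", documentation)]:
    print(f"{name}: saves ${saved} ({pct}%)")
```

The percentage depends on how much of the work is execution: the planning and review items stay on Sonnet in every scenario, so only the implementation items shrink.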
Manual Overrides
Force a Specific Model
Need Opus for complex reasoning? Use --model:
# Force Opus for extremely complex task
/specweave:do --model opus "Design distributed consensus algorithm"
# Force Sonnet when uncertain
/specweave:do --model sonnet "Implement feature X"
# Force Haiku for simple task
/specweave:do --model haiku "Generate test data"
Per-Agent Override
Override the model for a specific agent:
# Use Opus for critical security review
Task tool: agent=security, model=opus
# Use Haiku for simple diagram
Task tool: agent=diagrams-architect, model=haiku
Exporting Cost Data
JSON Export (Machine-Readable)
/specweave:costs 0003
# Select: Export to JSON
# Output: .specweave/increments/0003/reports/cost-analysis.json
Use cases:
- Import into spreadsheet
- Generate custom reports
- Track costs over time
- Budget forecasting
CSV Export (Spreadsheet-Friendly)
/specweave:costs 0003
# Select: Export to CSV
# Output: .specweave/increments/0003/reports/cost-history.csv
Use cases:
- Open in Excel/Google Sheets
- Create charts and graphs
- Analyze agent efficiency
- Identify optimization opportunities
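If you want a spreadsheet layout other than the built-in export, the JSON data can be converted with Python's standard `csv` module. The sketch below assumes each record carries the fields shown in the "RECENT SESSIONS" output. The exact schema of SpecWeave's export files is not documented here, so the field names are assumptions.

```python
import csv
import io

# Assumed column set, based on the fields shown in the session report.
FIELDS = ["timestamp", "agent", "model", "cost", "savings"]

# Sample records copied from the "RECENT SESSIONS" output above.
SESSIONS = [
    {"timestamp": "2025-10-31 14:32:15", "agent": "pm", "model": "sonnet",
     "cost": "0.0150", "savings": "0.0350"},
    {"timestamp": "2025-10-31 13:15:42", "agent": "frontend", "model": "haiku",
     "cost": "0.0034", "savings": "0.0166"},
]


def sessions_to_csv(sessions) -> str:
    """Serialize session records to CSV text with a header row."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops any keys that are not in FIELDS.
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(sessions)
    return buffer.getvalue()


csv_text = sessions_to_csv(SESSIONS)
print(csv_text)
```

To read real data, load the exported JSON file with `json.load` and pass the resulting list of records to `sessions_to_csv`.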
Best Practices
1. Let SpecWeave Decide
❌ Don't: Force a model for every task
/specweave:do --model sonnet "implement X" # Unnecessary
✅ Do: Trust automatic selection
/specweave:do "implement X" # SpecWeave chooses Haiku
2. Monitor Costs Regularly
# Check costs after each increment
/specweave:done 0003
/specweave:costs 0003
3. Review Most Expensive Agents
# Identify cost hotspots
/specweave:costs
# Look at "Most Expensive" agent
4. Use Haiku for Iteration
When iterating rapidly:
# Initial implementation (auto-selects Haiku)
/specweave:do "implement feature X"
# Refinements (also Haiku)
/specweave:do "add error handling"
/specweave:do "improve performance"
# Final review (auto-selects Sonnet)
/specweave:validate
Privacy & Security
What's Tracked
✅ Tracked (safe, no sensitive data):
- Agent names (public: pm, frontend, etc.)
- Model used (sonnet/haiku/opus)
- Token counts (integers)
- Costs (calculated from public pricing)
- Timestamps (when sessions ran)
❌ NEVER tracked:
- Your prompts (could contain sensitive info)
- Agent responses (could contain code/data)
- API keys (never read or stored)
- File paths (could reveal structure)
- Personal information (names, emails)
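A tracking policy like this is commonly enforced with an allowlist: only explicitly approved fields are written, so anything else is dropped by default. The sketch below illustrates the idea. The field names and the `sanitize` function are assumptions, not SpecWeave's actual code.

```python
# Fields considered safe to log, mirroring the "Tracked" list above.
ALLOWED_FIELDS = {
    "timestamp", "agent", "model",
    "inputTokens", "outputTokens", "cost", "savings",
}


def sanitize(record: dict) -> dict:
    """Keep only allowlisted fields; prompts, paths and the like are dropped."""
    return {key: value for key, value in record.items() if key in ALLOWED_FIELDS}


# Hypothetical raw session data containing fields that must not be logged.
raw_session = {
    "timestamp": "2025-10-31 13:15:42",
    "agent": "frontend",
    "model": "haiku",
    "inputTokens": 5000,
    "outputTokens": 2000,
    "prompt": "Implement the login form",  # sensitive: never logged
    "filePath": "src/auth/login.tsx",      # sensitive: never logged
}

logged = sanitize(raw_session)
print(sorted(logged))
```

An allowlist fails safe: a field added to the session data later stays out of the log until someone deliberately approves it.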
Data Location
Local only: .specweave/logs/costs.json
- ✅ Stays on your machine
- ✅ You control the data
- ✅ Delete anytime
- ✅ Never sent to external services
GDPR Compliance
Since we store NO personal data:
- ✅ No PII (Personally Identifiable Information)
- ✅ No user tracking
- ✅ No third-party analytics
- ✅ Full user control
Troubleshooting
Cost Dashboard Shows $0.00
Cause: No tracked sessions yet
Solution:
- Run /specweave:do to execute tasks
- Wait for agent completion
- Run /specweave:costs again
Savings Seem Low
Cause: Mostly planning work (uses Sonnet)
Solution:
- Planning-heavy increments naturally use more Sonnet
- Savings increase during implementation phases
- Overall savings average 60-70% across a full project
Want More Aggressive Savings
Option 1: Use more execution agents
# Instead of architect (Sonnet)
/specweave:do with agent=frontend # Uses Haiku
Option 2: Force Haiku for simple planning
/specweave:increment --model haiku "simple feature"
FAQ
Q: Does this slow down execution?
A: No. Phase detection takes <1ms. Haiku is actually 2x faster than Sonnet.
Q: Is quality affected?
A: No. Haiku 4.5 matches Sonnet 3.5 quality, perfect for execution tasks. Sonnet 4.5 is still used for all complex planning.
Q: Can I opt out?
A: Yes. Use --model sonnet to force Sonnet for every task. You'll lose savings but maintain full control.
Q: How accurate is phase detection?
A: >95% accuracy on typical prompts. When uncertain, defaults to Sonnet (quality over cost).
Q: Does this work with Opus?
A: Yes, as a manual override: pass --model opus (see Manual Overrides). Automatic selection currently chooses only between Sonnet (planning) and Haiku (execution).
Next Steps
- Model Selection Guide - Deep dive into how model selection works
- Cost Tracking Reference - Technical details on cost tracking
Questions? Open an issue. Feedback? Start a discussion.
Last updated: 2025-10-31 | SpecWeave v0.4.0