Lazy Plugin Loading (Upcoming v1.1)
99% token reduction for non-SpecWeave work through conditional plugin activation.
This feature is planned for v1.1. The specification is complete and approved. Track progress in increment 0171.
The Problem
Currently, SpecWeave installs all 13 plugins (~48 skills) at startup, consuming ~60,000 tokens even when you're doing non-SpecWeave work. This creates several issues:
| Problem | Impact |
|---|---|
| Context bloat | Only 108 of ~48 skills (43%) are shown due to token limits |
| Wasted tokens | ~60,000 tokens consumed even when SpecWeave isn't needed |
| Slower startup | All plugins loaded regardless of user intent |
| Reduced quality | Important skills get truncated from context |
Evidence: <!-- Showing 108 of ~48 skills due to token limits --> appears in system prompts.
The Solution
A lazy loading architecture that:
- Installs only a lightweight router skill (~500 tokens) by default
- Detects SpecWeave intent from user prompts using keyword matching
- Hot-reloads full plugins only when needed (leveraging Claude Code 2.1.0+ features)
- Uses context forking for heavy skills to isolate their context
- Provides migration path for existing installations
Token Savings
| Scenario | Current | After v1.1 | Savings |
|---|---|---|---|
| Non-SpecWeave work | ~60,000 tokens | ~500 tokens | 99% |
| SpecWeave work | ~60,000 tokens | ~60,000 (loaded on demand) | 0% |
| Mixed session | ~60,000 tokens | ~30,000 avg | 50% |
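The percentages in the table follow directly from the token counts. A minimal sketch of that arithmetic, using the approximate figures quoted above (the function and constant names are illustrative, not part of SpecWeave):

```python
def savings_percent(before_tokens: int, after_tokens: int) -> float:
    """Return the share of tokens saved relative to the baseline, in percent."""
    return (before_tokens - after_tokens) / before_tokens * 100


BASELINE_TOKENS = 60_000  # approximate cost of loading every plugin at startup
ROUTER_TOKENS = 500       # approximate cost of the router skill alone

scenarios = [
    ("Non-SpecWeave work", ROUTER_TOKENS),
    ("SpecWeave work", BASELINE_TOKENS),
    ("Mixed session", 30_000),
]
for label, after in scenarios:
    print(f"{label}: {savings_percent(BASELINE_TOKENS, after):.1f}% saved")
```

The first scenario works out to roughly 99.2%, which the table rounds to 99%.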
How It Works
Architecture
┌─────────────────────────────────────────────────────────────────────┐
│                      LAZY LOADING ARCHITECTURE                      │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌──────────────────┐     ┌──────────────────────────────────────┐  │
│  │  Router Skill    │     │  Skills Cache                        │  │
│  │  (~500 tokens)   │     │  ~/.specweave/skills-cache/          │  │
│  │                  │     │                                      │  │
│  │  - Keyword       │────▶│  ├── specweave/                      │  │
│  │    detection     │     │  │   ├── increment/                  │  │
│  │  - Install       │     │  │   ├── architect/                  │  │
│  │    trigger       │     │  │   └── ... (~48 skills)            │  │
│  │  - State track   │     │  ├── specweave/                      │  │
│  └──────────────────┘     │  │   ├── lib/integrations/github/    │  │
│           │               │  │   ├── lib/integrations/jira/      │  │
│           │               │  │   └── lib/integrations/ado/       │  │
│           │               └──────────────────────────────────────┘  │
│           │                                  │                      │
│           ▼                                  ▼                      │
│  ┌──────────────────┐     ┌──────────────────────────────────────┐  │
│  │  Active Skills   │◀────│  Hot-Reload Copy                     │  │
│  │  ~/.claude/      │     │  (on detection)                      │  │
│  │  skills/         │     │                                      │  │
│  │                  │     │  cp -r cache/* ~/.claude/skills/     │  │
│  │  Loaded on       │     │  → Activates immediately             │  │
│  │  demand only     │     │  → No restart needed                 │  │
│  └──────────────────┘     └──────────────────────────────────────┘  │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
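The hot-reload step in the diagram is essentially a recursive copy from the cache into the active skills directory. The following is a rough sketch of that step, not the actual SpecWeave implementation: the function name is hypothetical, and the two directories are passed in as parameters rather than hard-coded.

```python
import shutil
from pathlib import Path


def activate_cached_skills(cache_dir: Path, skills_dir: Path) -> list[str]:
    """Copy every cached skill directory into the active skills directory.

    Assumption: each skill lives in its own subdirectory of cache_dir.
    Returns the names of the directories that were copied.
    """
    skills_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for entry in sorted(cache_dir.iterdir()):
        if entry.is_dir():
            # dirs_exist_ok lets a repeated load overwrite an earlier copy
            shutil.copytree(entry, skills_dir / entry.name, dirs_exist_ok=True)
            copied.append(entry.name)
    return copied
```

A caller would pass something like `Path.home() / ".specweave" / "skills-cache"` and `Path.home() / ".claude" / "skills"`, matching the paths shown in the diagram.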
Keyword Detection
The router skill detects SpecWeave intent using these keywords:
Commands:
sw:,specweave,increment
Files:
spec.md,tasks.md,plan.md,metadata.json
Concepts:
living docs,living documentation,feature planning,sprint planning,acceptance criteria,user story
Workflow:
backlog,kanban,scrum
Integrations:
jira sync,github sync,ado sync
Advanced:
auto mode,parallel auto,tdd mode
Detection is case-insensitive and takes under 50ms.
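As an illustration of the matching described above, here is a minimal sketch that lowercases the prompt and checks it against the keyword list. It assumes plain substring matching, which the specification does not confirm; the function name is hypothetical.

```python
# Keyword list taken from the categories above.
KEYWORDS = (
    "sw:", "specweave", "increment",
    "spec.md", "tasks.md", "plan.md", "metadata.json",
    "living docs", "living documentation", "feature planning",
    "sprint planning", "acceptance criteria", "user story",
    "backlog", "kanban", "scrum",
    "jira sync", "github sync", "ado sync",
    "auto mode", "parallel auto", "tdd mode",
)


def detect_specweave_intent(prompt: str) -> list[str]:
    """Return the keywords found in the prompt, matched case-insensitively."""
    lowered = prompt.lower()
    # Assumption: substring matching; a real router might use word boundaries.
    return [kw for kw in KEYWORDS if kw in lowered]


print(detect_specweave_intent("Update Tasks.md and run Jira Sync"))
```

One consequence of substring matching is that "incremental" would also trigger on "increment", so a production router may want stricter boundaries.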
Claude Code Features Used
| Feature | Version | How We Use It |
|---|---|---|
| Skill Hot-Reload | 2.1.0 | Skills in ~/.claude/skills/ activate immediately without restart |
| Context Forking | 2.1.0 | context: fork runs heavy skills in isolated sub-agent |
| Setup Hook | 2.1.10 | Runs on --init for conditional plugin setup |
| MCP list_changed | 2.1.0 | Alternative: MCP server can dynamically update tools |
| Nested Discovery | 2.1.0 | Auto-discovers skills from .claude/skills/ subdirectories |
User Experience
For New Users
After v1.1, specweave init will:
- Install only the router skill (~500 tokens)
- Cache full plugins at ~/.specweave/skills-cache/
- Inform user about lazy loading behavior
- Full install option:
specweave init --full
For Existing Users
After upgrading to v1.1, lazy loading will be enabled by default. To force a full refresh:
specweave refresh-plugins --force
Manual Control
Power users can manually manage plugin loading using the vskill CLI:
# Install plugins using vskill (RECOMMENDED)
npx vskill install --repo anton-abyzov/specweave --plugin sw # Core Skill Fabric (GitHub/JIRA/ADO integrations included)
npx vskill install --repo anton-abyzov/vskill --plugin mobile # Mobile development
# List installed plugins
vskill list
# Remove a plugin
vskill remove mobile
Available plugins:
| Short Name | Install Command | Description |
|---|---|---|
| sw | npx vskill install --repo anton-abyzov/specweave --plugin sw | Core SpecWeave functionality |
| sw-router | npx vskill install --repo anton-abyzov/specweave --plugin sw-router | Agent routing |
| mobile | vskill install --repo anton-abyzov/vskill --plugin mobile | Mobile development |
| marketing | vskill install --repo anton-abyzov/vskill --plugin marketing | Marketing & social media |
| google-workspace | vskill install --repo anton-abyzov/vskill --plugin google-workspace | Google Workspace CLI |
| productivity | vskill install --repo anton-abyzov/vskill --plugin productivity | Personal productivity |
| skills | vskill install --repo anton-abyzov/vskill --plugin skills | Skill discovery |
Context Forking for Heavy Skills
Skills larger than 200 lines will use context: fork in their frontmatter:
---
name: pm
description: Product Manager expertise...
context: fork
model: opus
---
This runs the skill in an isolated sub-agent, preventing context bloat in the main conversation. Results return to the main conversation when the forked skill completes.
Skills that will use forking:
- PM Agent
- Architect Agent
- QA Lead Agent
- Tech Lead Agent
- TDD Orchestrator
- And 10+ more heavy skills
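The 200-line rule can be expressed as a simple size check. This is a sketch only: the helper name is hypothetical, and it assumes the threshold counts every line of the skill file, frontmatter included.

```python
FORK_THRESHOLD_LINES = 200  # threshold stated above


def needs_context_fork(skill_text: str, threshold: int = FORK_THRESHOLD_LINES) -> bool:
    """Return True when a skill file has more lines than the threshold."""
    return len(skill_text.splitlines()) > threshold


short_skill = "line\n" * 120
heavy_skill = "line\n" * 450
print(needs_context_fork(short_skill), needs_context_fork(heavy_skill))
```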
State Tracking
Loading state is tracked at ~/.specweave/state/plugins-loaded.json:
{
"version": "1.0.0",
"lazyMode": true,
"loadedAt": "2026-01-18T12:00:00Z",
"loadedPlugins": [
{
"name": "sw",
"loadedAt": "2026-01-18T12:00:00Z",
"trigger": "User mentioned 'increment'",
"skillCount": 50
}
],
"cachedPlugins": ["sw", "sw-github", "sw-jira", ...],
"analytics": {
"totalLoads": 42,
"avgLoadTimeMs": 850,
"tokensSaved": 2500000
}
}
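To show how a file of this shape might be maintained, the sketch below appends a load record and rewrites the JSON. The field names come from the example above; the function itself is hypothetical and the analytics block is left untouched.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def record_plugin_load(state_path: Path, name: str, trigger: str, skill_count: int) -> dict:
    """Append a load record to the state file, creating the file if missing."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    if state_path.exists():
        state = json.loads(state_path.read_text(encoding="utf-8"))
    else:
        # Assumption: a fresh state file starts in lazy mode with nothing loaded.
        state = {"version": "1.0.0", "lazyMode": True, "loadedPlugins": []}
    state["loadedAt"] = now
    state.setdefault("loadedPlugins", []).append(
        {"name": name, "loadedAt": now, "trigger": trigger, "skillCount": skill_count}
    )
    state_path.parent.mkdir(parents=True, exist_ok=True)
    state_path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    return state
```

Calling it with `Path.home() / ".specweave" / "state" / "plugins-loaded.json"` would target the location named above.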
Graceful Degradation
If hot-reload fails:
- Clear error message shown to user
- "Restart Claude Code" option offered
- Failure logged to ~/.specweave/logs/lazy-loading.log
- Retry mechanism attempts up to 3 times
- Fallback:
npx vskill install --repo anton-abyzov/specweave --plugin sw
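The retry behaviour in the list above could look roughly like the following. This is a sketch under assumptions: the load step is passed in as a callable, failures are assumed to surface as OSError, and the logger name is illustrative rather than the real log configuration.

```python
import logging
from typing import Callable

logger = logging.getLogger("lazy-loading")  # hypothetical logger name


def load_with_retry(load: Callable[[], None], max_attempts: int = 3) -> bool:
    """Run the load step up to max_attempts times; return True on success."""
    for attempt in range(1, max_attempts + 1):
        try:
            load()
            return True
        except OSError as exc:
            # Assumption: hot-reload failures are file-system errors.
            logger.warning("hot-reload attempt %d/%d failed: %s", attempt, max_attempts, exc)
    return False
```

When the function returns False, the caller would show the error message, offer the restart option, and point the user at the fallback command.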
Cross-Platform Support
macOS/Linux: Bash scripts (default)
Windows: PowerShell alternative with:
- Auto-detection of available shell
- Long path support (>260 chars)
- Same functionality as Bash version
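Shell auto-detection can be approximated by probing the PATH for known executables. The sketch below is one possible approach, not the planned implementation: the candidate names (`pwsh`, `powershell`, `bash`) and the function name are assumptions.

```python
import platform
import shutil
from typing import Callable, Optional


def pick_shell(
    system: Optional[str] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the first available shell for the platform, or None."""
    system = system or platform.system()
    # Assumption: prefer PowerShell on Windows and Bash everywhere else.
    candidates = ["pwsh", "powershell"] if system == "Windows" else ["bash"]
    for name in candidates:
        if which(name):
            return name
    return None


print(pick_shell())
```

Passing `system` and `which` as parameters keeps the detection logic testable without touching the real environment.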
Related Features
- MCP Tool Search - Current built-in Claude Code feature for deferred tool loading
- AI Agents and Skills - Claude Code 2.1.0+ feature for isolated sub-agents
- Context Precision - Current progressive disclosure approach
Timeline
- Planning: Complete (January 2026)
- Implementation: Planned for v1.1
- Target: 12-20 days of development time
Feedback
Have suggestions for the lazy loading feature? Open an issue or join our Discord.