Lazy Plugin Loading (Upcoming v1.1)

99% token reduction for non-SpecWeave work through conditional plugin activation.

Status

This feature is planned for v1.1. The specification is complete and approved. Track progress in increment 0171.

The Problem

Currently, SpecWeave installs all 13 plugins (~~48 skills) at startup, consuming ~60,000 tokens even when you're doing non-SpecWeave work. This creates several issues:

Problem	Impact
Context bloat	Only 108 of ~48 skills (43%) are shown due to token limits
Wasted tokens	~60,000 tokens consumed even when SpecWeave isn't needed
Slower startup	All plugins loaded regardless of user intent
Reduced quality	Important skills get truncated from context

Evidence:  appears in system prompts.

The Solution

A lazy loading architecture that:

Installs only a lightweight router skill (~500 tokens) by default
Detects SpecWeave intent from user prompts using keyword matching
Hot-reloads full plugins only when needed (leveraging Claude Code 2.1.0+ features)
Uses context forking for heavy skills to isolate their context
Provides migration path for existing installations

Token Savings

Scenario	Current	After v1.1	Savings
Non-SpecWeave work	~60,000 tokens	~500 tokens	99%
SpecWeave work	~60,000 tokens	~60,000 (loaded on demand)	0%
Mixed session	~60,000 tokens	~30,000 avg	50%

How It Works

Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                     LAZY LOADING ARCHITECTURE                        │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  ┌──────────────────┐     ┌──────────────────────────────────────┐ │
│  │  Router Skill    │     │  Skills Cache                         │ │
│  │  (~500 tokens)   │     │  ~/.specweave/skills-cache/           │ │
│  │                  │     │                                        │ │
│  │  - Keyword       │────▶│  ├── specweave/                       │ │
│  │    detection     │     │  │   ├── increment/                   │ │
│  │  - Install       │     │  │   ├── architect/                   │ │
│  │    trigger       │     │  │   └── ... (~48 skills)             │ │
│  │  - State track   │     │  ├── specweave/                       │ │
│  └──────────────────┘     │  │   ├── lib/integrations/github/     │ │
│           │               │  │   ├── lib/integrations/jira/       │ │
│           │               │  │   └── lib/integrations/ado/        │ │
│           │               └──────────────────────────────────────┘ │
│           │                              │                          │
│           ▼                              ▼                          │
│  ┌──────────────────┐     ┌──────────────────────────────────────┐ │
│  │  Active Skills   │◀────│  Hot-Reload Copy                      │ │
│  │  ~/.claude/      │     │  (on detection)                       │ │
│  │  skills/         │     │                                        │ │
│  │                  │     │  cp -r cache/* ~/.claude/skills/      │ │
│  │  Loaded on       │     │  → Activates immediately              │ │
│  │  demand only     │     │  → No restart needed                  │ │
│  └──────────────────┘     └──────────────────────────────────────┘ │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

Keyword Detection

The router skill detects SpecWeave intent using these keywords:

Commands:

sw:, specweave, increment

Files:

spec.md, tasks.md, plan.md, metadata.json

Concepts:

living docs, living documentation
feature planning, sprint planning
acceptance criteria, user story

Workflow:

backlog, kanban, scrum

Integrations:

jira sync, github sync, ado sync

Advanced:

auto mode, parallel auto, tdd mode

Detection is case-insensitive and takes under 50ms.

Claude Code Features Used

Feature	Version	How We Use It
Skill Hot-Reload	2.1.0	Skills in `~/.claude/skills/` activate immediately without restart
Context Forking	2.1.0	`context: fork` runs heavy skills in isolated sub-agent
Setup Hook	2.1.10	Runs on `--init` for conditional plugin setup
MCP list_changed	2.1.0	Alternative: MCP server can dynamically update tools
Nested Discovery	2.1.0	Auto-discovers skills from `.claude/skills/` subdirectories

User Experience

For New Users

After v1.1, specweave init will:

Install only the router skill (~500 tokens)
Cache full plugins at ~/.specweave/skills-cache/
Inform user about lazy loading behavior
Full install option: specweave init --full

For Existing Users

Lazy loading is now enabled by default. To force a full refresh:

specweave refresh-plugins --force

Manual Control

Power users can manually manage plugin loading using the vskill CLI:

# Install plugins using vskill (RECOMMENDED)
npx vskill install --repo anton-abyzov/specweave --plugin sw       # Core Skill Fabric
npx vskill install --repo anton-abyzov/vskill --plugin mobile     # Mobile development
npx vskill install --repo anton-abyzov/specweave --plugin sw      # GitHub/JIRA/ADO included in sw
npx vskill install --repo anton-abyzov/specweave --plugin sw      # All integrations in core

# List installed plugins
vskill list

# Remove a plugin
vskill remove mobile

Available plugins:

Short Name	Install Command	Description
`sw`	`npx vskill install --repo anton-abyzov/specweave --plugin sw`	Core SpecWeave functionality
`sw-router`	`npx vskill install --repo anton-abyzov/specweave --plugin sw-router`	Agent routing
`mobile`	`vskill install --repo anton-abyzov/vskill --plugin mobile`	Mobile development
`marketing`	`vskill install --repo anton-abyzov/vskill --plugin marketing`	Marketing & social media
`google-workspace`	`vskill install --repo anton-abyzov/vskill --plugin google-workspace`	Google Workspace CLI
`productivity`	`vskill install --repo anton-abyzov/vskill --plugin productivity`	Personal productivity
`skills`	`vskill install --repo anton-abyzov/vskill --plugin skills`	Skill discovery

Context Forking for Heavy Skills

Skills larger than 200 lines will use context: fork in their frontmatter:

---
name: pm
description: Product Manager expertise...
context: fork
model: opus
---

This runs the skill in an isolated sub-agent, preventing context bloat in the main conversation. Results return to the main conversation when the forked skill completes.

Skills that will use forking:

PM Agent
Architect Agent
QA Lead Agent
Tech Lead Agent
TDD Orchestrator
And 10+ more heavy skills

State Tracking

Loading state is tracked at ~/.specweave/state/plugins-loaded.json:

{
  "version": "1.0.0",
  "lazyMode": true,
  "loadedAt": "2026-01-18T12:00:00Z",
  "loadedPlugins": [
    {
      "name": "sw",
      "loadedAt": "2026-01-18T12:00:00Z",
      "trigger": "User mentioned 'increment'",
      "skillCount": 50
    }
  ],
  "cachedPlugins": ["sw", "sw-github", "sw-jira", ...],
  "analytics": {
    "totalLoads": 42,
    "avgLoadTimeMs": 850,
    "tokensSaved": 2500000
  }
}

Graceful Degradation

If hot-reload fails:

Clear error message shown to user
"Restart Claude Code" option offered
Failure logged to ~/.specweave/logs/lazy-loading.log
Retry mechanism attempts up to 3 times
Fallback: npx vskill install --repo anton-abyzov/specweave --plugin sw

Cross-Platform Support

macOS/Linux: Bash scripts (default)

Windows: PowerShell alternative with:

Auto-detection of available shell
Long path support (>260 chars)
Same functionality as Bash version

MCP Tool Search - Current built-in Claude Code feature for tool deferred loading
AI Agents and Skills - Claude Code 2.1.0+ feature for isolated sub-agents
Context Precision - Current progressive disclosure approach

Timeline

Planning: Complete (January 2026)
Implementation: Planned for v1.1
Target: 12-20 days development time

Feedback

Have suggestions for the lazy loading feature? Open an issue or join our Discord.

The Problem​

The Solution​

Token Savings​

How It Works​

Architecture​

Keyword Detection​

Claude Code Features Used​

User Experience​

For New Users​

For Existing Users​

Manual Control​

Context Forking for Heavy Skills​

State Tracking​

Graceful Degradation​

Cross-Platform Support​

Related Features​

Timeline​

Feedback​