CLI & Hook Integration Testing Guide

How to write reliable integration tests for CLI commands, shell hooks, and terminal tools. Based on best practices from OpenClaw, pnpm, oclif/Salesforce, and Bubbletea.

The Testing Pyramid for CLI Tools

        /  E2E  \          5%  - Full workflows (init → increment → done)
       /  CLI    \        15%  - Spawn process, verify output + filesystem
      / Integration\      20%  - Hook execution, config parsing, sync flows
     /    Unit      \     60%  - Pure functions, parsers, validators

Core Helpers

SpecWeave provides purpose-built helpers in tests/test-utils/:

1. Home Directory Isolation (`temp-home.ts`)

Problem: CLI tests that touch ~/.specweave/ or ~/.claude/ pollute real user config.

Solution: Override HOME to a temp directory for the duration of the test.

import { withIsolatedHome, getIsolatedEnv } from '../test-utils/temp-home.js';

it('should work in isolated home', async () => {
  const { homePath, restore } = await withIsolatedHome('my-test');
  try {
    // HOME is now a temp dir with .specweave/ and .claude/ created
    const { stdout } = await execAsync('node bin/specweave.js --version', {
      env: getIsolatedEnv(homePath),
    });
    expect(stdout).toBeTruthy();
  } finally {
    await restore(); // ALWAYS restore in finally block
  }
});

For child processes, use getIsolatedEnv() which also strips NODE_OPTIONS to prevent debugger interference:

const env = getIsolatedEnv(homePath, { CI: 'true', MY_VAR: 'test' });
await execAsync(command, { env });

2. Working Directory Isolation (`isolated-test-dir.ts`)

Problem: Tests that create .specweave/ structures can collide with the real project.

Solution: Create disposable working directories in os.tmpdir().

import { createIsolatedTestDir, createSpecweaveStructure } from '../test-utils/isolated-test-dir.js';

it('should manipulate increments safely', async () => {
  const { testDir, cleanup } = await createIsolatedTestDir('increment-test');
  try {
    await createSpecweaveStructure(testDir);
    // testDir/.specweave/increments/ exists
    // NEVER touches project .specweave/
  } finally {
    await cleanup();
  }
});

3. Output Normalization (`normalize-output.ts`)

Problem: CLI output contains ANSI color codes, spinner artifacts, and platform-specific line endings.

Solution: Normalize before asserting.

import { normalizeOutput, extractJson, stripAnsi } from '../test-utils/normalize-output.js';

// Strip ANSI codes and normalize line endings
const clean = normalizeOutput(stdout); // strip ANSI + \r\n → \n + trim

// Extract JSON from mixed output (log lines + JSON)
const json = extractJson<{ decision: string }>(stdout);
expect(json?.decision).toBe('approve');

4. Hook Test Harness (`hook-test-harness.ts`)

Problem: Hooks are bash scripts; testing them requires proper process spawning.

Solution: Use the HookTestHarness class.

import { HookTestHarness } from '../test-utils/hook-test-harness.js';

it('should execute hook and return JSON', async () => {
  const harness = new HookTestHarness(testDir, hookPath);
  const result = await harness.execute({ CI: 'true' });

  expect(result.exitCode).toBe(0);
  const json = extractJson(result.stdout);
  expect(json).toHaveProperty('decision');
});

Vitest Configuration Tiers

SpecWeave separates tests by speed and resource requirements:

Config	Command	Timeout	Workers	Purpose
`vitest.config.ts`	`npm test`	10s	default	All tests
`vitest.unit.config.ts`	`npm run test:unit:fast`	5s	max	Unit tests only
`vitest.e2e.config.ts`	`npm run test:e2e:cli`	30s	2-4	CLI + e2e tests

Why separate configs? CLI tests spawn real processes (7-10s each). Running them with unit tests in parallel causes flakiness and slows the feedback loop.

Common Patterns

Pattern 1: Test CLI command output

import { exec } from 'child_process';
import { promisify } from 'util';
const execAsync = promisify(exec);

describe('CLI Commands', { timeout: 30000 }, () => {
  it('should show version', async () => {
    const { stdout } = await execAsync('node bin/specweave.js --version', {
      env: getIsolatedEnv(homePath),
    });
    expect(normalizeOutput(stdout)).toMatch(/^\d+\.\d+\.\d+$/);
  });
});

Pattern 2: Test CLI creates files

it('should create config on init', async () => {
  await execAsync('node bin/specweave.js init --adapter=claude --language=en', {
    cwd: workDir,
    env: getIsolatedEnv(homePath, { CI: 'true' }),
  });

  const configPath = path.join(workDir, '.specweave', 'config.json');
  const config = JSON.parse(await fs.readFile(configPath, 'utf-8'));
  expect(config.adapters?.default).toBe('claude');
});

Pattern 3: Test exit codes

it('should fail for unknown command', async () => {
  try {
    await execAsync('node bin/specweave.js bad-command', {
      env: getIsolatedEnv(homePath),
    });
    expect.unreachable('Expected non-zero exit');
  } catch (error: any) {
    expect(error.code).not.toBe(0);
  }
});

Pattern 4: Test hook with stdin

it('should process user prompt input', async () => {
  const input = JSON.stringify({ prompt: '/sw:progress' });
  const hookPath = 'plugins/specweave/hooks/user-prompt-submit.sh';

  const result = await execAsync(
    `echo '${input.replace(/'/g, "'\\''")}' | bash "${hookPath}"`,
    { cwd: projectRoot, env: getIsolatedEnv(homePath) }
  );

  const json = extractJson(result.stdout);
  expect(json).toHaveProperty('decision');
});

Pattern 5: Golden file testing (for stable output)

Inspired by Bubbletea's *.golden pattern and Turborepo's Insta snapshots:

it('should match expected help output', async () => {
  const { stdout } = await execAsync('node bin/specweave.js --help', {
    env: getIsolatedEnv(homePath),
  });

  // Use Vitest snapshots as golden files
  expect(normalizeOutput(stdout)).toMatchSnapshot();
});

Run with --update to regenerate: npx vitest run --update

Anti-Patterns

Do NOT: String-match file contents instead of executing

// BAD - doesn't verify the hook actually works
const hookContent = fs.readFileSync(hookPath, 'utf-8');
expect(hookContent).toContain('specweave detect-intent');

// GOOD - actually runs the hook
const harness = new HookTestHarness(testDir, hookPath);
const result = await harness.execute();
expect(result.exitCode).toBe(0);

Do NOT: Use process.cwd() in test paths

// BAD - unreliable in parallel execution, can delete real files
const testDir = process.cwd();

// GOOD - isolated temp directory
const { testDir, cleanup } = await createIsolatedTestDir('my-test');

Do NOT: Forget to strip NODE_OPTIONS

// BAD - will break in VSCode debugger
await execAsync(command, { env: process.env });

// GOOD - strips debugger flags
await execAsync(command, { env: getIsolatedEnv(homePath) });

Do NOT: Use default timeout for CLI tests

// BAD - 5s timeout will fail for init (takes 7-10s)
describe('CLI Tests', () => { ... });

// GOOD - explicit longer timeout
describe('CLI Tests', { timeout: 30000 }, () => { ... });

Comparison with Industry Tools

Feature	SpecWeave	OpenClaw	pnpm	oclif
Home isolation	`withIsolatedHome()`	`temp-home.ts`	`pnpm_tmp`	TestSession
Output normalization	`normalizeOutput()`	`normalize-text.ts`	N/A	N/A
Config tiers	3 configs	6 configs	2 modes	nyc auto
Coverage target	40%	70%	merged lcov	nyc auto
Process spawning	`execAsync` + `getIsolatedEnv`	`spawn` + env snapshot	CLI invocation	In-process
JSON extraction	`extractJson()`	Manual parsing	N/A	`--json` flag

The Testing Pyramid for CLI Tools​

Core Helpers​

1. Home Directory Isolation (temp-home.ts)​

2. Working Directory Isolation (isolated-test-dir.ts)​

3. Output Normalization (normalize-output.ts)​

4. Hook Test Harness (hook-test-harness.ts)​

Vitest Configuration Tiers​

Common Patterns​

Pattern 1: Test CLI command output​

Pattern 2: Test CLI creates files​

Pattern 3: Test exit codes​

Pattern 4: Test hook with stdin​

Pattern 5: Golden file testing (for stable output)​

Anti-Patterns​

Do NOT: String-match file contents instead of executing​

Do NOT: Use process.cwd() in test paths​

Do NOT: Forget to strip NODE_OPTIONS​

Do NOT: Use default timeout for CLI tests​

Comparison with Industry Tools​

Further Reading​