Skip to main content

Infrastructure & Operations

Understanding how systems are deployed, scaled, and operated in production.


Overview

Infrastructure and operations terms cover the deployment, scaling, monitoring, and maintenance of software systems. These concepts enable teams to run reliable, scalable applications in production environments while minimizing downtime and operational overhead.

Core Concepts

Infrastructure as Code (IaC)

Terraform

  • Declarative infrastructure management tool
  • Define infrastructure using code (not manual clicks)
  • Version control for infrastructure changes
  • SpecWeave increments often include Terraform modules

Infrastructure as Code

  • Managing infrastructure through code files
  • Benefits: version control, repeatability, automation
  • Tools: Terraform, CloudFormation, Pulumi
  • When to use: production systems, multi-environment deployments

Container Orchestration

Kubernetes

  • Container orchestration platform
  • Automates deployment, scaling, and management
  • Industry standard for cloud-native applications
  • SpecWeave plugin: specweave-kubernetes for K8s-specific features

Docker

  • Container platform for packaging applications
  • Benefits: consistency, portability, isolation
  • Used for development environments and production deployments
  • SpecWeave uses Docker for local testing

Continuous Integration/Deployment

CI/CD

  • Automated build, test, and deployment pipeline
  • CI: Continuous Integration (automated testing)
  • CD: Continuous Deployment (automated releases)
  • SpecWeave supports GitHub Actions, GitLab CI, Jenkins

GitHub Actions

  • GitHub's built-in CI/CD platform
  • Workflow automation using YAML files
  • Free tier for public repositories
  • SpecWeave provides Action examples in increment planning

Monitoring & Observability

Observability

  • Understanding system behavior from external outputs
  • Three pillars: logs, metrics, traces
  • Tools: Prometheus, Grafana, Datadog
  • SpecWeave plugin: specweave-observability (planned)

When to Use These Terms

TermUse WhenDon't Use When
TerraformManaging cloud resources (AWS, GCP, Azure)Simple scripts, single-server setups
KubernetesContainer orchestration, multi-service deploymentsSingle-container apps, simple deployments
DockerPackaging applications, development consistencyMonolithic apps with simple deployment
CI/CDAutomated testing/deployment, team collaborationSolo projects, infrequent releases
ObservabilityProduction systems, debugging complex issuesLocal development, simple scripts

Real-World Examples

Startup Infrastructure Journey

Phase 1: MVP (0-3 months) - Manual Deployment

  • Infrastructure: Single EC2 instance
  • Deployment: SSH + git pull
  • Database: RDS PostgreSQL
  • Monitoring: CloudWatch
  • Why: Fast to set up, cheap
  • Result: Launch MVP with 100 users

Phase 2: Growth (3-12 months) - Docker + Basic CI/CD

  • Infrastructure: 3 EC2 instances + load balancer
  • Deployment: GitHub Actions + Docker Compose
  • Database: RDS with read replica
  • Monitoring: Prometheus + Grafana
  • Why: Better reliability, automated deployments
  • Result: 10K users, 99.5% uptime

Phase 3: Scale (12+ months) - Kubernetes + Full IaC

  • Infrastructure: EKS cluster (Kubernetes)
  • IaC: Terraform for all resources
  • Deployment: ArgoCD + Helm charts
  • Database: Aurora PostgreSQL cluster
  • Monitoring: Datadog (logs + metrics + traces)
  • Why: Auto-scaling, high availability, team autonomy
  • Result: 100K users, 99.95% uptime

SpecWeave Infrastructure Example

# Increment 0012: Production Infrastructure

## Acceptance Criteria
- **AC-US1-01**: Deploy API to Kubernetes cluster (P1)
- **AC-US1-02**: Configure auto-scaling (2-10 pods) (P1)
- **AC-US1-03**: Set up Prometheus metrics (P1)
- **AC-US1-04**: Configure CI/CD pipeline (P1)

## Infrastructure Stack
- **Container Platform**: Kubernetes (EKS)
- **IaC Tool**: Terraform
- **CI/CD**: GitHub Actions
- **Monitoring**: Prometheus + Grafana

## Terraform Modules
```hcl
# kubernetes.tf
resource "aws_eks_cluster" "main" {
name = "specweave-prod"
role_arn = aws_iam_role.eks_cluster.arn

vpc_config {
subnet_ids = aws_subnet.private[*].id
}
}

resource "aws_eks_node_group" "main" {
cluster_name = aws_eks_cluster.main.name
node_group_name = "specweave-workers"
node_role_arn = aws_iam_role.eks_nodes.arn

scaling_config {
desired_size = 3
max_size = 10
min_size = 2
}
}

GitHub Actions Workflow

# .github/workflows/deploy.yml
name: Deploy to Production

on:
push:
branches: [main]

jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3

- name: Build Docker image
run: docker build -t specweave-api:${{ github.sha }} .

- name: Push to ECR
run: |
aws ecr get-login-password | docker login --username AWS --password-stdin $ECR_REGISTRY
docker push specweave-api:${{ github.sha }}

- name: Deploy to Kubernetes
run: |
kubectl set image deployment/api api=specweave-api:${{ github.sha }}
kubectl rollout status deployment/api

Kubernetes Deployment

# k8s/api-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: api
spec:
replicas: 3
selector:
matchLabels:
app: specweave-api
template:
metadata:
labels:
app: specweave-api
spec:
containers:
- name: api
image: specweave-api:latest
ports:
- containerPort: 3000
resources:
requests:
cpu: 100m
memory: 256Mi
limits:
cpu: 500m
memory: 512Mi
livenessProbe:
httpGet:
path: /health
port: 3000
initialDelaySeconds: 30
periodSeconds: 10

---

## How SpecWeave Uses Infrastructure Terms

### 1. Increment Planning with Infrastructure

When creating infrastructure-related increments:

```bash
/specweave:increment "Deploy API to Kubernetes"

The Architect agent:

  • Suggests IaC tools (Terraform vs CloudFormation)
  • Recommends container orchestration platform
  • Includes infrastructure diagrams
  • Plans CI/CD pipeline integration

2. Plugin System for Infrastructure

specweave-kubernetes plugin (planned):

  • Kubernetes-specific skills (deployment, scaling, monitoring)
  • Helm chart generation
  • K8s configuration validation
  • kubectl command suggestions

specweave-terraform plugin (planned):

  • Terraform module templates
  • State management guidance
  • Provider-specific best practices

3. Living Documentation

Infrastructure is documented in:

.specweave/docs/internal/
├── architecture/
│ ├── infrastructure.md # Infrastructure overview
│ └── diagrams/
│ └── deployment-diagram.mmd
├── operations/
│ ├── runbook-api.md # Operational runbook
│ └── performance-tuning.md # Performance guide
└── delivery/
└── ci-cd-pipeline.md # CI/CD setup

4. Infrastructure Increments

Typical infrastructure increments in SpecWeave:

  • 0005-docker-setup: Containerize application
  • 0008-kubernetes-deploy: Deploy to K8s cluster
  • 0012-terraform-infrastructure: IaC setup
  • 0015-monitoring-setup: Observability stack


Learn More

Guides

  • Kubernetes Fundamentals (coming soon)
  • Infrastructure as Code with Terraform (coming soon)
  • CI/CD Best Practices (coming soon)

Books

  • "Kubernetes in Action" by Marko Lukša
  • "Terraform: Up & Running" by Yevgeniy Brikman
  • "Site Reliability Engineering" by Google
  • "The DevOps Handbook" by Gene Kim
  • "Infrastructure as Code" by Kief Morris

External Resources


Navigation: