Infrastructure & Operations
Understanding how systems are deployed, scaled, and operated in production.
Overview
Infrastructure and operations terms cover the deployment, scaling, monitoring, and maintenance of software systems. These concepts enable teams to run reliable, scalable applications in production environments while minimizing downtime and operational overhead.
Core Concepts
Infrastructure as Code (IaC)
- Declarative infrastructure management tool
- Define infrastructure using code (not manual clicks)
- Version control for infrastructure changes
- SpecWeave increments often include Terraform modules
- Managing infrastructure through code files
- Benefits: version control, repeatability, automation
- Tools: Terraform, CloudFormation, Pulumi
- When to use: production systems, multi-environment deployments
Container Orchestration
- Container orchestration platform
- Automates deployment, scaling, and management
- Industry standard for cloud-native applications
- SpecWeave plugin:
specweave-kubernetesfor K8s-specific features
- Container platform for packaging applications
- Benefits: consistency, portability, isolation
- Used for development environments and production deployments
- SpecWeave uses Docker for local testing
Continuous Integration/Deployment
- Automated build, test, and deployment pipeline
- CI: Continuous Integration (automated testing)
- CD: Continuous Deployment (automated releases)
- SpecWeave supports GitHub Actions, GitLab CI, Jenkins
- GitHub's built-in CI/CD platform
- Workflow automation using YAML files
- Free tier for public repositories
- SpecWeave provides Action examples in increment planning
Monitoring & Observability
Observability
- Understanding system behavior from external outputs
- Three pillars: logs, metrics, traces
- Tools: Prometheus, Grafana, Datadog
- SpecWeave plugin:
specweave-observability(planned)
When to Use These Terms
| Term | Use When | Don't Use When |
|---|---|---|
| Terraform | Managing cloud resources (AWS, GCP, Azure) | Simple scripts, single-server setups |
| Kubernetes | Container orchestration, multi-service deployments | Single-container apps, simple deployments |
| Docker | Packaging applications, development consistency | Monolithic apps with simple deployment |
| CI/CD | Automated testing/deployment, team collaboration | Solo projects, infrequent releases |
| Observability | Production systems, debugging complex issues | Local development, simple scripts |
Real-World Examples
Startup Infrastructure Journey
Phase 1: MVP (0-3 months) - Manual Deployment
- Infrastructure: Single EC2 instance
- Deployment: SSH + git pull
- Database: RDS PostgreSQL
- Monitoring: CloudWatch
- Why: Fast to set up, cheap
- Result: Launch MVP with 100 users
Phase 2: Growth (3-12 months) - Docker + Basic CI/CD
- Infrastructure: 3 EC2 instances + load balancer
- Deployment: GitHub Actions + Docker Compose
- Database: RDS with read replica
- Monitoring: Prometheus + Grafana
- Why: Better reliability, automated deployments
- Result: 10K users, 99.5% uptime
Phase 3: Scale (12+ months) - Kubernetes + Full IaC
- Infrastructure: EKS cluster (Kubernetes)
- IaC: Terraform for all resources
- Deployment: ArgoCD + Helm charts
- Database: Aurora PostgreSQL cluster
- Monitoring: Datadog (logs + metrics + traces)
- Why: Auto-scaling, high availability, team autonomy
- Result: 100K users, 99.95% uptime
SpecWeave Infrastructure Example
# Increment 0012: Production Infrastructure
## Acceptance Criteria
- **AC-US1-01**: Deploy API to Kubernetes cluster (P1)
- **AC-US1-02**: Configure auto-scaling (2-10 pods) (P1)
- **AC-US1-03**: Set up Prometheus metrics (P1)
- **AC-US1-04**: Configure CI/CD pipeline (P1)
## Infrastructure Stack
- **Container Platform**: Kubernetes (EKS)
- **IaC Tool**: Terraform
- **CI/CD**: GitHub Actions
- **Monitoring**: Prometheus + Grafana
## Terraform Modules
```hcl
# kubernetes.tf
resource "aws_eks_cluster" "main" {
name = "specweave-prod"
role_arn = aws_iam_role.eks_cluster.arn
vpc_config {
subnet_ids = aws_subnet.private[*].id
}
}
resource "aws_eks_node_group" "main" {
cluster_name = aws_eks_cluster.main.name
node_group_name = "specweave-workers"
node_role_arn = aws_iam_role.eks_nodes.arn
scaling_config {
desired_size = 3
max_size = 10
min_size = 2
}
}
GitHub Actions Workflow
# .github/workflows/deploy.yml
name: Deploy to Production
on:
push:
branches: [main]
jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Build Docker image
run: docker build -t specweave-api:${{ github.sha }} .
- name: Push to ECR
run: |
aws ecr get-login-password | docker login --username AWS --password-stdin $ECR_REGISTRY
docker push specweave-api:${{ github.sha }}
- name: Deploy to Kubernetes
run: |
kubectl set image deployment/api api=specweave-api:${{ github.sha }}
kubectl rollout status deployment/api
Kubernetes Deployment
# k8s/api-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: api
spec:
replicas: 3
selector:
matchLabels:
app: specweave-api
template:
metadata:
labels:
app: specweave-api
spec:
containers:
- name: api
image: specweave-api:latest
ports:
- containerPort: 3000
resources:
requests:
cpu: 100m
memory: 256Mi
limits:
cpu: 500m
memory: 512Mi
livenessProbe:
httpGet:
path: /health
port: 3000
initialDelaySeconds: 30
periodSeconds: 10
---
## How SpecWeave Uses Infrastructure Terms
### 1. Increment Planning with Infrastructure
When creating infrastructure-related increments:
```bash
/specweave:increment "Deploy API to Kubernetes"
The Architect agent:
- Suggests IaC tools (Terraform vs CloudFormation)
- Recommends container orchestration platform
- Includes infrastructure diagrams
- Plans CI/CD pipeline integration
2. Plugin System for Infrastructure
specweave-kubernetes plugin (planned):
- Kubernetes-specific skills (deployment, scaling, monitoring)
- Helm chart generation
- K8s configuration validation
- kubectl command suggestions
specweave-terraform plugin (planned):
- Terraform module templates
- State management guidance
- Provider-specific best practices
3. Living Documentation
Infrastructure is documented in:
.specweave/docs/internal/
├── architecture/
│ ├── infrastructure.md # Infrastructure overview
│ └── diagrams/
│ └── deployment-diagram.mmd
├── operations/
│ ├── runbook-api.md # Operational runbook
│ └── performance-tuning.md # Performance guide
└── delivery/
└── ci-cd-pipeline.md # CI/CD setup
4. Infrastructure Increments
Typical infrastructure increments in SpecWeave:
- 0005-docker-setup: Containerize application
- 0008-kubernetes-deploy: Deploy to K8s cluster
- 0012-terraform-infrastructure: IaC setup
- 0015-monitoring-setup: Observability stack
Related Categories
- Architecture & Design - System design decisions
- DevOps & Tools - Development workflows
- Security & Compliance - Production security
Learn More
Guides
- Kubernetes Fundamentals (coming soon)
- Infrastructure as Code with Terraform (coming soon)
- CI/CD Best Practices (coming soon)
Books
- "Kubernetes in Action" by Marko Lukša
- "Terraform: Up & Running" by Yevgeniy Brikman
- "Site Reliability Engineering" by Google
- "The DevOps Handbook" by Gene Kim
- "Infrastructure as Code" by Kief Morris
External Resources
- Kubernetes Documentation
- Terraform Registry
- AWS Well-Architected Framework
- The Twelve-Factor App
- CNCF Cloud Native Landscape
Navigation: