HashiCorp Nomad is the orchestrator you reach for when Kubernetes feels like overkill. It schedules containers, VMs, Java apps, static binaries—basically anything—across your infrastructure with a fraction of Kubernetes’s operational complexity.

I switched to Nomad after wrestling with Kubernetes for a 10-server deployment. K8s is powerful but complex: etcd clusters, CNI plugins, ingress controllers, service meshes. For our needs (schedule some Docker containers, maybe some batch jobs), it was like using a crane to hang a picture. Nomad felt like using a hammer—simple tool, right job.

The operational model is elegant: run Nomad servers (consensus via Raft), run Nomad clients (execute workloads), submit job specs (HCL files), done. No separate scheduler, no node controllers, no 50 CRDs to learn.

Created by HashiCorp, Nomad integrates beautifully with Consul (service discovery) and Vault (secrets). Together they form a cohesive infrastructure stack.

Nomad vs Kubernetes

Feature                Nomad                            Kubernetes
Learning curve         Low                              High
Operational overhead   Minimal                          Significant
Workload types         Containers, VMs, binaries, Java  Containers (primarily)
Cluster size           10-10,000 nodes                  100-5,000 nodes
Community size         Smaller                          Massive
Cloud-native features  Basic                            Extensive
Good for               Small-to-medium deployments      Large, complex systems

Choose Nomad when:

  • Team size <20 people
  • Running diverse workloads (not just containers)
  • Want operational simplicity
  • Multi-cloud or hybrid cloud
  • Need to schedule batch jobs, cron jobs, services together

Choose Kubernetes when:

  • Large teams need advanced features
  • Heavy investment in K8s ecosystem
  • Need extensive tooling (Helm, operators, etc.)
  • Cloud-native best practices required

See HashiCorp's Nomad vs. Kubernetes comparison for the official take.

Core Architecture

Servers - Form consensus quorum (Raft), make scheduling decisions. Run 3-5 servers for HA.

Clients - Execute tasks assigned by the servers. A cluster can run anywhere from a handful of client nodes to 10,000+.

Jobs - Declarative specs written in HCL (HashiCorp Configuration Language). Like Kubernetes Deployments but simpler.

Allocations - Running instances of tasks. Nomad places allocations on clients based on resources and constraints.

Drivers - Execute workloads: Docker, exec (raw binaries), Java, QEMU (VMs), and custom plugins.
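
For example, a task that runs a plain JAR with the Java driver (the artifact URL, jar name, and JVM options below are illustrative) looks almost the same as a container task:

# Sketch: running a JAR directly on the client, no container involved
task "billing" {
  driver = "java"

  # Fetched into the task's local/ directory
  artifact {
    source = "https://example.com/billing.jar"
  }

  config {
    jar_path    = "local/billing.jar"
    jvm_options = ["-Xmx512m"]
  }

  resources {
    cpu    = 500  # MHz
    memory = 512  # MB
  }
}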

Install Nomad (single binary!):

# Download latest release
curl -LO https://releases.hashicorp.com/nomad/1.7.5/nomad_1.7.5_linux_amd64.zip
unzip nomad_1.7.5_linux_amd64.zip
sudo mv nomad /usr/local/bin/

# Verify
nomad version

# Start dev agent (single node, not for production)
nomad agent -dev

For production, see the official Nomad deployment guide.
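
As a rough sketch (addresses, paths, and the datacenter name are illustrative), a production cluster splits into server and client agent configurations along these lines:

# server.hcl - run on 3 (or 5) dedicated server nodes
datacenter = "dc1"
data_dir   = "/opt/nomad/data"

server {
  enabled          = true
  bootstrap_expect = 3  # Wait for 3 servers before electing a leader
}

# client.hcl - run on every node that should execute workloads
datacenter = "dc1"
data_dir   = "/opt/nomad/data"

client {
  enabled = true
  servers = ["10.0.0.10:4647", "10.0.0.11:4647", "10.0.0.12:4647"]  # Server RPC addresses
}

Each agent is started with nomad agent -config=/etc/nomad.d (or wherever the config files live).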

Writing Job Specifications

Jobs are written in HCL:

# web-app.nomad.hcl
job "web-app" {
  datacenters = ["dc1"]
  type = "service"
  
  group "app" {
    count = 3  # Run 3 instances
    
    network {
      port "http" {
        to = 8080
      }
    }
    
    task "server" {
      driver = "docker"
      
      config {
        image = "nginx:1.25"
        ports = ["http"]
      }
      
      resources {
        cpu    = 500  # MHz
        memory = 256  # MB
      }
      
      service {
        name = "web-app"
        port = "http"
        
        check {
          type     = "http"
          path     = "/health"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}

Deploy:

# Validate job file
nomad job validate web-app.nomad.hcl

# Plan changes (like terraform plan)
nomad job plan web-app.nomad.hcl

# Run job
nomad job run web-app.nomad.hcl

# Check status
nomad job status web-app

# View logs
nomad alloc logs <allocation-id>
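
If the update block is omitted, changes to the job replace all allocations at once. Adding one (the values below are just a reasonable starting point) turns changes into rolling deployments with automatic rollback:

# Goes inside the job (or group) block of web-app.nomad.hcl
update {
  max_parallel     = 1      # Replace one allocation at a time
  min_healthy_time = "30s"  # An allocation must stay healthy this long
  healthy_deadline = "5m"   # Give up on an allocation after this
  auto_revert      = true   # Roll back to the last good version on failure
  canary           = 1      # Start one canary before touching the rest
}

With a canary configured, promote a healthy deployment with nomad deployment promote <deployment-id>.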

Batch Jobs

job "data-pipeline" {
  datacenters = ["dc1"]
  type = "batch"  # Runs once, exits
  
  group "etl" {
    task "extract" {
      driver = "docker"
      
      config {
        image = "python:3.11"
        command = "python"
        args = ["etl.py"]
      }
      
      # Artifact fetching
      artifact {
        source = "https://example.com/etl.py"
      }
      
      resources {
        cpu    = 2000
        memory = 4096
      }
    }
  }
}
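
Batch jobs can also act as on-demand templates. A parameterized block (the job name, image, and metadata key below are illustrative) registers the job without running it; each dispatch then creates a fresh instance:

job "image-resize" {
  datacenters = ["dc1"]
  type        = "batch"

  parameterized {
    payload       = "forbidden"    # No stdin payload for this job
    meta_required = ["IMAGE_URL"]  # Must be supplied at dispatch time
  }

  group "resize" {
    task "run" {
      driver = "docker"

      config {
        image = "resizer:latest"  # Illustrative image
        args  = ["--input", "${NOMAD_META_IMAGE_URL}"]
      }
    }
  }
}

Dispatch it with nomad job dispatch -meta IMAGE_URL=https://example.com/photo.jpg image-resize.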

Periodic Jobs (Cron)

job "backup" {
  datacenters = ["dc1"]
  type = "batch"
  
  periodic {
    cron             = "0 2 * * *"  # Daily at 2am
    prohibit_overlap = true  # Don't run if previous still running
    time_zone        = "America/New_York"
  }
  
  group "backup" {
    task "run-backup" {
      driver = "exec"
      
      config {
        command = "/usr/local/bin/backup.sh"
      }
    }
  }
}

Integration with Consul and Vault

Nomad shines when combined with Consul and Vault:

Service Discovery with Consul

Nomad registers service blocks with Consul automatically; with Connect enabled it also injects an Envoy sidecar for mTLS:

task "api" {
  driver = "docker"
  
  config {
    image = "api:latest"
    ports = ["http"]
  }
  
  service {
    name = "api"
    port = "http"
    tags = ["v1", "production"]
    
    # Health check
    check {
      type     = "http"
      path     = "/health"
      interval = "10s"
      timeout  = "2s"
    }
    
    # Connect for service mesh (mTLS)
    connect {
      sidecar_service {}
    }
  }
}

Other services discover via Consul DNS:

curl http://api.service.consul:8080/users
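
With Connect, consumers skip DNS entirely and declare upstreams instead; traffic flows through a local Envoy sidecar over mTLS. A sketch of the consumer side (names and ports are illustrative):

group "web" {
  network {
    mode = "bridge"
  }

  service {
    name = "web"
    port = "8080"

    connect {
      sidecar_service {
        proxy {
          upstreams {
            destination_name = "api"  # The service registered above
            local_bind_port  = 9090   # Reachable at 127.0.0.1:9090 inside this group
          }
        }
      }
    }
  }

  task "web" {
    driver = "docker"

    config {
      image = "web:latest"  # Illustrative image
    }

    env {
      API_ADDR = "http://127.0.0.1:9090"  # Talk to "api" via the sidecar
    }
  }
}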

Secrets Management with Vault

Inject secrets securely:

task "app" {
  driver = "docker"
  
  vault {
    policies = ["app-policy"]
  }
  
  template {
    # The secret path and keys below are examples (Vault KV v2 layout); adjust to your setup
    data = <<EOF
{{ with secret "secret/data/app" }}
DATABASE_URL={{ .Data.data.database_url }}
API_KEY={{ .Data.data.api_key }}
{{ end }}
EOF
    destination = "secrets/app.env"
    env         = true
  }
  
  config {
    image = "app:latest"
  }
}

Vault issues short-lived credentials, automatically renewed by Nomad.
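
For example, pointing a template at Vault's database secrets engine (the role name and environment variable names below are illustrative) gives each allocation its own short-lived database login:

# Sketch: dynamic credentials; "database/creds/app-role" must exist in Vault
template {
  data = <<EOF
{{ with secret "database/creds/app-role" }}
DB_USERNAME={{ .Data.username }}
DB_PASSWORD={{ .Data.password }}
{{ end }}
EOF
  destination = "secrets/db.env"
  env         = true
  change_mode = "restart"  # Restart the task when credentials rotate
}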

Production Best Practices

  1. Use namespaces - Isolate teams:
    nomad namespace apply -description "Dev team" dev
    nomad job run -namespace=dev app.nomad.hcl
    
  2. Enable ACLs - Secure the API (a sample developer-policy.hcl follows this list):
    nomad acl bootstrap
    nomad acl policy apply developer developer-policy.hcl
    
  3. Monitor with Prometheus - Export metrics:
    telemetry {
      prometheus_metrics = true
      publish_allocation_metrics = true
      publish_node_metrics = true
    }
    
  4. Backup state - Snapshot Raft data:
    nomad operator snapshot save backup.snap
    
  5. Use nomad plan - Validate before applying:
    nomad job plan app.nomad.hcl
    # Shows what will change, like terraform plan
    
  6. Set resource limits - Prevent resource exhaustion:
    resources {
      cpu        = 1000  # MHz
      memory     = 1024  # MB
      memory_max = 2048  # MB, hard limit
    }

  7. Implement health checks - Detect and restart failed tasks:
    service {
      check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "2s"

        check_restart {
          limit = 3
          grace = "10s"
        }
      }
    }
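
The developer-policy.hcl referenced in item 2 might look something like this minimal sketch (namespace names and capabilities are illustrative):

# Allow developers to manage jobs in the dev namespace, read-only elsewhere
namespace "dev" {
  policy = "write"
}

namespace "*" {
  policy = "read"
}

node {
  policy = "read"
}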

Conclusion

Nomad proves orchestration doesn’t have to be complex. A single binary, simple HCL job specs, and multi-workload support make it pragmatic for teams that don’t need Kubernetes’s complexity.

The HashiCorp ecosystem integration is seamless: Consul for service discovery, Vault for secrets, Terraform for infrastructure. Everything works together naturally.

For small-to-medium deployments, Nomad hits a sweet spot: powerful enough for production, simple enough that one person can manage it. When your alternative is Kubernetes and you don’t need its advanced features, Nomad is worth serious consideration.

Further Resources:

  • Nomad documentation: https://developer.hashicorp.com/nomad
  • Nomad project site: https://www.nomadproject.io


Based on Nomad orchestration notes from May 2024, expanded with practical patterns and operational guidance.