Advanced Kubernetes: Custom Resources and Operators
I watched a platform engineer manually provision databases for the third time that week. Copy Deployment YAML. Adjust environment variables. Create PVC. Set up Service. Wait. Repeat for Redis. Repeat for RabbitMQ. Same steps, different values, error-prone and boring.
“We should automate this,” someone said. Shell scripts worked until they didn’t—no status tracking, no self-healing, no lifecycle management when someone deleted a resource.
Custom Resource Definitions (CRDs) and operators are Kubernetes’ answer: extend the API with your domain concepts, then encode operational knowledge in a controller that reconciles desired state with actual state.
Instead of kubectl apply -f postgres-deployment.yaml, you write kubectl apply -f database.yaml with kind: Database. The operator handles the rest. It’s the pattern behind the Prometheus Operator, Istio’s CRD-driven configuration, and most database operators you’ve used.
Custom Resources: Your API, Kubernetes’ Machinery
Kubernetes stores everything as resources (Pods, Deployments, Services). CRDs let you define new resource types:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com
spec:
  group: example.com
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                databaseName:
                  type: string
                databaseUser:
                  type: string
                size:
                  type: string
                  enum: ["small", "medium", "large"]
            status:
              type: object
              properties:
                phase:
                  type: string
                message:
                  type: string
                ready:
                  type: boolean
      # Enable the /status subresource so a controller can call Status().Update().
      subresources:
        status: {}
  scope: Namespaced
  names:
    plural: databases
    singular: database
    kind: Database
Apply the CRD, and Database becomes a first-class Kubernetes resource:
kubectl apply -f crd.yaml
kubectl get databases # It works
Creating Instances
apiVersion: example.com/v1
kind: Database
metadata:
  name: my-database
spec:
  databaseName: mydb
  databaseUser: admin
  size: medium
kubectl apply -f database-instance.yaml
kubectl get databases
# NAME AGE
# my-database 5s
Without an operator, this does nothing useful—Kubernetes stores the YAML and waits. The operator is what makes it real.
The Operator Pattern: Reconciliation Loop
An operator watches custom resources and takes action to match desired state:
- Watch — observe Database resources (create, update, delete)
- Reconcile — compare desired spec vs actual cluster state
- Act — create/update/delete Deployments, Services, PVCs
- Update status — report phase (Creating, Ready, Failed)
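import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"

	// Assumed module path, matching the --repo flag used when scaffolding.
	examplev1 "github.com/example/database-operator/api/v1"
)

func (r *DatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var db examplev1.Database
	if err := r.Get(ctx, req.NamespacedName, &db); err != nil {
		// The object may have been deleted after the event was queued.
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	if db.Status.Phase == "" {
		db.Status.Phase = "Creating"
		if err := r.Status().Update(ctx, &db); err != nil {
			return ctrl.Result{}, err
		}
	}

	if err := r.createDatabase(ctx, &db); err != nil {
		db.Status.Phase = "Failed"
		db.Status.Message = err.Error()
		if statusErr := r.Status().Update(ctx, &db); statusErr != nil {
			ctrl.LoggerFrom(ctx).Error(statusErr, "unable to record Failed status")
		}
		return ctrl.Result{}, err
	}

	// Simplification: a fuller operator would wait for ready replicas first.
	db.Status.Phase = "Ready"
	db.Status.Ready = true
	return ctrl.Result{}, r.Status().Update(ctx, &db)
}

func (r *DatabaseReconciler) createDatabase(ctx context.Context, db *examplev1.Database) error {
	deployment := &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{
			Name:      db.Name,
			Namespace: db.Namespace,
		},
		Spec: appsv1.DeploymentSpec{
			Replicas: int32Ptr(1),
			Selector: &metav1.LabelSelector{
				MatchLabels: map[string]string{"app": db.Name},
			},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{
					Labels: map[string]string{"app": db.Name},
				},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{
						Name:  "postgres",
						Image: "postgres:13",
						// The official image also needs POSTGRES_PASSWORD,
						// normally sourced from a Secret (omitted here).
						Env: []corev1.EnvVar{
							{Name: "POSTGRES_DB", Value: db.Spec.DatabaseName},
							{Name: "POSTGRES_USER", Value: db.Spec.DatabaseUser},
						},
					}},
				},
			},
		},
	}

	// Reconcile runs many times, so "already exists" is not a failure.
	if err := r.Create(ctx, deployment); err != nil && !apierrors.IsAlreadyExists(err) {
		return err
	}
	return nil
}

// int32Ptr returns a pointer to i, as required by optional API fields.
func int32Ptr(i int32) *int32 { return &i }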
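Reconciliation is event-driven rather than a busy loop: the controller calls Reconcile whenever a watched object changes, and again on periodic resyncs. Delete the Deployment manually? As long as the controller also watches the Deployments it creates, the operator recreates it. Change size from small to medium? A complete operator would adjust resource requests to match (the example above only handles creation). This is Kubernetes’ superpower: declarative state with automatic correction.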
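To make the idempotency and self-healing claims concrete, here is a minimal in-memory sketch of the compare-and-correct step. It is not controller-runtime code: the cluster map stands in for the API server, and desiredFor, reconcile, and the size-to-memory table are hypothetical names and values chosen purely for illustration.

```go
package main

import "fmt"

// Database is a stripped-down stand-in for the custom resource.
type Database struct {
	Name string
	Size string // "small", "medium", or "large"
}

// Deployment is a stand-in for the child object the operator manages.
type Deployment struct {
	Name   string
	Memory string
}

// memoryBySize is an assumed mapping; real values depend on your workloads.
var memoryBySize = map[string]string{
	"small":  "256Mi",
	"medium": "1Gi",
	"large":  "4Gi",
}

// desiredFor computes the child object implied by the spec.
func desiredFor(db Database) (Deployment, error) {
	mem, ok := memoryBySize[db.Size]
	if !ok {
		return Deployment{}, fmt.Errorf("unknown size %q", db.Size)
	}
	return Deployment{Name: db.Name, Memory: mem}, nil
}

// reconcile moves the cluster toward the desired state and reports the
// action taken: "created", "updated", or "unchanged".
func reconcile(cluster map[string]Deployment, db Database) (string, error) {
	want, err := desiredFor(db)
	if err != nil {
		return "", err
	}
	have, exists := cluster[db.Name]
	switch {
	case !exists:
		cluster[db.Name] = want
		return "created", nil
	case have != want:
		cluster[db.Name] = want
		return "updated", nil
	default:
		return "unchanged", nil
	}
}
```

Calling reconcile twice in a row yields "created" and then "unchanged"; deleting the entry from the map and calling it again yields "created" once more. That is the property a real Reconcile function needs: it looks at current state every time instead of assuming what happened on the previous run.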
Operator SDK: Don’t Start From Scratch
# Install Operator SDK
curl -LO https://github.com/operator-framework/operator-sdk/releases/download/v1.28.0/operator-sdk_linux_amd64
chmod +x operator-sdk_linux_amd64 && sudo mv operator-sdk_linux_amd64 /usr/local/bin/operator-sdk
# Scaffold a new operator
operator-sdk init --domain example.com --repo github.com/example/database-operator
operator-sdk create api --group example --version v1 --kind Database --resource --controller
make generate
make manifests
The SDK generates CRD manifests, Go types, controller boilerplate, and RBAC rules. You fill in reconciliation logic. Kubebuilder (which powers the SDK) is the standard toolchain.
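As a rough idea of what the Go types look like once you fill them in, here is a simplified sketch of the file the scaffold places under api/v1. The +kubebuilder comment markers are what make generate and make manifests read to produce the enum validation, the status subresource, and an extra kubectl column. To keep the sketch compilable with the standard library alone, ObjectMeta and the apiVersion/kind fields are hand-written placeholders; the real scaffold embeds metav1.TypeMeta and metav1.ObjectMeta instead. The decode helper is hypothetical and only there to show how the JSON tags line up with the manifest.

```go
package main

import "encoding/json"

// DatabaseSpec mirrors the spec section of the CRD schema.
type DatabaseSpec struct {
	DatabaseName string `json:"databaseName"`
	DatabaseUser string `json:"databaseUser"`

	// +kubebuilder:validation:Enum=small;medium;large
	Size string `json:"size"`
}

// DatabaseStatus mirrors the status section of the CRD schema.
type DatabaseStatus struct {
	Phase   string `json:"phase,omitempty"`
	Message string `json:"message,omitempty"`
	Ready   bool   `json:"ready,omitempty"`
}

// ObjectMeta is a simplified placeholder for metav1.ObjectMeta.
type ObjectMeta struct {
	Name      string `json:"name"`
	Namespace string `json:"namespace,omitempty"`
}

// Database is the top-level custom resource type.
//
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// +kubebuilder:printcolumn:name="Phase",type=string,JSONPath=`.status.phase`
type Database struct {
	APIVersion string         `json:"apiVersion"`
	Kind       string         `json:"kind"`
	Metadata   ObjectMeta     `json:"metadata"`
	Spec       DatabaseSpec   `json:"spec"`
	Status     DatabaseStatus `json:"status,omitempty"`
}

// decode parses a JSON manifest into the typed struct.
func decode(manifest []byte) (Database, error) {
	var db Database
	err := json.Unmarshal(manifest, &db)
	return db, err
}
```

The design point is that the Go types are the source of truth: you edit the structs and markers, then regenerate the CRD YAML rather than maintaining both by hand.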
Status Updates: Tell Users What’s Happening
db.Status.Phase = "Ready"
db.Status.Message = "Database is ready"
db.Status.Ready = true
if err := r.Status().Update(ctx, &db); err != nil {
	return ctrl.Result{}, err
}
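Users run kubectl get database my-database -o yaml (or kubectl describe) and see status; add additionalPrinterColumns to the CRD and the phase shows up directly in kubectl get databases. Without status updates, they’re kubectl-describing Pods wondering why nothing works. Status is your UX.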
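One way to keep status honest is to derive it from observed facts in a single place instead of setting fields ad hoc. The sketch below assumes the controller can read the ready and desired replica counts from the child Deployment; deriveStatus, needsUpdate, ErrProvisionFailed, and the message strings are hypothetical names invented for this example.

```go
package main

import "errors"

// ErrProvisionFailed is a hypothetical failure used to exercise the sketch.
var ErrProvisionFailed = errors.New("provisioning failed")

// Status mirrors the status fields defined in the CRD.
type Status struct {
	Phase   string
	Message string
	Ready   bool
}

// deriveStatus maps observed facts to a user-facing status.
// readyReplicas and wantReplicas would come from the child Deployment;
// reconcileErr is the error, if any, from the current reconcile.
func deriveStatus(readyReplicas, wantReplicas int32, reconcileErr error) Status {
	switch {
	case reconcileErr != nil:
		return Status{Phase: "Failed", Message: reconcileErr.Error()}
	case wantReplicas > 0 && readyReplicas >= wantReplicas:
		return Status{Phase: "Ready", Message: "Database is ready", Ready: true}
	default:
		return Status{Phase: "Creating", Message: "Waiting for pods to become ready"}
	}
}

// needsUpdate reports whether the status changed, so the controller can
// skip writes that would not change anything.
func needsUpdate(old, next Status) bool {
	return old != next
}
```

Comparing old and new status before writing avoids a common trap: every status write produces an update event, and a controller that writes unconditionally can end up reconciling in a tight loop.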
Finalizers: Clean Cleanup
When someone deletes a Database, you need to clean up PVCs, backups, external resources:
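const finalizerName = "database.example.com/finalizer"

func (r *DatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var db examplev1.Database
	if err := r.Get(ctx, req.NamespacedName, &db); err != nil {
		// Once the finalizer is removed the object disappears, so NotFound is expected.
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Handle deletion first: a finalizer cannot be added to an object that is already being deleted.
	if !db.DeletionTimestamp.IsZero() {
		if containsString(db.Finalizers, finalizerName) {
			if err := r.cleanup(ctx, &db); err != nil {
				return ctrl.Result{}, err // requeued and retried with backoff
			}
			db.Finalizers = removeString(db.Finalizers, finalizerName)
			if err := r.Update(ctx, &db); err != nil {
				return ctrl.Result{}, err
			}
		}
		return ctrl.Result{}, nil
	}

	// Register the finalizer on live objects; the update triggers a fresh reconcile.
	if !containsString(db.Finalizers, finalizerName) {
		db.Finalizers = append(db.Finalizers, finalizerName)
		return ctrl.Result{}, r.Update(ctx, &db)
	}

	// Normal reconciliation...
	return ctrl.Result{}, nil
}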
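Without a finalizer, kubectl delete database removes the CR immediately and the operator never gets a chance to clean up. Unless the children carry owner references, that leaves orphaned Deployments and PVCs, and external resources such as backups are orphaned either way. Finalizers block deletion until cleanup completes.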
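The finalizer snippet calls containsString and removeString without defining them. A minimal version of those two helpers might look like the sketch below. If you would rather not maintain them yourself, controller-runtime's controllerutil package provides ContainsFinalizer, AddFinalizer, and RemoveFinalizer, which cover the same ground.

```go
package main

// containsString reports whether list contains s.
func containsString(list []string, s string) bool {
	for _, item := range list {
		if item == s {
			return true
		}
	}
	return false
}

// removeString returns a new slice with every occurrence of s removed.
// The input slice is left untouched.
func removeString(list []string, s string) []string {
	out := make([]string, 0, len(list))
	for _, item := range list {
		if item != s {
			out = append(out, item)
		}
	}
	return out
}
```

Returning a fresh slice instead of editing in place matters when other finalizers are present: those entries must survive, and the caller's original slice should not be reshuffled underneath it.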
Production Lessons
- Idempotent reconciliation: running Reconcile twice produces the same result
- Owner references: child resources (Deployments) owned by the parent (Database) so garbage collection removes them with it
- RBAC: operator ServiceAccount with minimal required permissions
- Leader election: multiple operator replicas, only one reconciles at a time
- Webhook validation: reject invalid specs before they reach reconciliation
- Version your CRDs: v1alpha1 → v1beta1 → v1, with conversion webhooks
- Test with envtest: exercise controllers against a local API server and etcd, without a full cluster
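For the webhook validation item, the useful part to get right first is the pure validation function; wiring it into an admission webhook is mostly scaffolding, and the exact interface differs between controller-runtime versions, so it is left out here. The sketch below is a hedged illustration: validateSpec is a hypothetical helper, and the identifier rule (lowercase letters, digits, and underscores, at most 63 characters) is an assumption rather than something the CRD above enforces.

```go
package main

import (
	"fmt"
	"regexp"
)

// DatabaseSpec mirrors the spec fields from the CRD.
type DatabaseSpec struct {
	DatabaseName string
	DatabaseUser string
	Size         string
}

// identifier is an assumed naming rule: a lowercase letter or underscore,
// followed by up to 62 lowercase letters, digits, or underscores.
var identifier = regexp.MustCompile(`^[a-z_][a-z0-9_]{0,62}$`)

// validSizes matches the enum declared in the CRD schema.
var validSizes = map[string]bool{"small": true, "medium": true, "large": true}

// validateSpec returns every problem found, so users can fix them all in
// one pass instead of discovering them one rejection at a time.
func validateSpec(spec DatabaseSpec) []error {
	var errs []error
	if !identifier.MatchString(spec.DatabaseName) {
		errs = append(errs, fmt.Errorf("spec.databaseName %q is not a valid identifier", spec.DatabaseName))
	}
	if !identifier.MatchString(spec.DatabaseUser) {
		errs = append(errs, fmt.Errorf("spec.databaseUser %q is not a valid identifier", spec.DatabaseUser))
	}
	if !validSizes[spec.Size] {
		errs = append(errs, fmt.Errorf("spec.size %q must be one of small, medium, large", spec.Size))
	}
	return errs
}
```

Because the function has no Kubernetes dependencies, it can be covered by ordinary table-driven tests and reused from both the webhook and the reconciler.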
When to Build an Operator
Build an operator when:
- You deploy the same complex application repeatedly
- Operational knowledge is tribal (“ask Dave how to set up Redis”)
- You need day-2 operations (backup, upgrade, scaling) automated
- Platform team wants self-service for product teams
Skip the operator when:
- It’s a one-off deployment (Helm chart is enough)
- The application doesn’t need lifecycle management
- Nobody on the team knows Go (operators are typically Go)
Helm installs. Operators manage. Different tools, different problems.
Conclusion
CRDs extend Kubernetes’ API with your domain language. Operators encode operational expertise into software that runs 24/7. Together, they turn “ask platform team to provision a database” into “apply a YAML file and wait for Ready status.”
The manual database provisioning that started this post became kubectl apply -f database.yaml. Product teams self-served. Platform team wrote the operator once and maintained it instead of doing the same manual steps weekly.
Start with a simple CRD and a controller that creates one Deployment. Add status, finalizers, and cleanup incrementally. Use Operator SDK. Read existing operators (postgres-operator, prometheus-operator) for patterns.
Kubernetes’ power is declarative infrastructure. CRDs and operators let you declare your infrastructure—not just Pods and Services, but Databases, Queues, and Pipelines. That’s platform engineering.
Kubernetes Custom Resources and Operators from January 2021, covering CRDs and operator patterns.