
GitOps w/ FluxCD: Add Health Checks to a CI/CD Flow.

In the world of bits and bytes, the human being is the weakest link in the quality chain. Use health checks so that a Kustomization reports a failure when its Pods cannot be deployed.

A lot was achieved in the first article of this "The Right Way" series. We bootstrapped FluxCD, wrote some Source and Kustomization objects, checked in changed K8s manifests, and then saw how FluxCD automatically synced the deployed objects with the edited infrastructure code.

But what would happen if some sort of human error crept into one of our manifests? How would the Source and Kustomization controllers react?

💡
Resources and code files used in the previous article are still valid for this exercise. The example below uses the vote-deployment.yaml file in the instavote/deploy/vote folder.

Let's assume we fat-finger the name of the image in our simple K8s Deployment manifest.

Check in the code and, using either kubectl or flux, investigate the state of the Pods, Source and Kustomization Controllers.

Did either Source or Kustomization Controllers throw an error?

That can be easily found.

flux get sources git instavote
flux get kustomizations vote-dev

Notice that both report READY as True, so we can only assume that neither the Source Controller nor the Kustomization Controller discovered the error we introduced.

If both Source and Kustomization are in a READY state, the Pods MUST have updated to the fat-finger container image. Right?

As always, to see if Pods were updated to the new image, we could do a simple...

kubectl get pods -n instavote

"Houston, we do have a problem" !!

A deep dive into the Pod events list shows us the problem: the image we fat-fingered does not (surprise, surprise) exist, and thus the Pod has no recourse but to back off and report the error.
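One way to reach that events list from the command line is sketched below. The helper name `warning_events` is hypothetical and not part of the original walkthrough; it assumes kubectl is pointed at the cluster and that the Pods live in the instavote namespace. With a non-existent image, the Pod status typically shows ErrImagePull or ImagePullBackOff, and the matching events are of type Warning.

```shell
# Hypothetical helper (not from the original article): list Warning events
# in a namespace, oldest first, so image pull failures are easy to spot.
warning_events() {
  local namespace="${1:-instavote}"   # defaults to the article's namespace
  kubectl get events --namespace "$namespace" \
    --field-selector type=Warning \
    --sort-by=.lastTimestamp
}
```

Calling `warning_events instavote` should list the failed pull attempts next to the name of the Pod that raised them.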

But this is concerning.

To eliminate the low-value busyness in our work, we turn to automation. The hope is, as it should be, to offload easy-to-complete tasks to automatons and focus on the more complex solutions for which automation does not currently exist.

That's why any Engineering Team would spend their time and money adding GitOps capabilities to their overall process. However, if we still rely on manual intervention to ensure our K8s infrastructure meets our needs, why bother with GitOps/FluxCD?

Enter Health Checks

We can add simple health checks to our GitOps controllers so that errors are detected early and shared with the engineers well before they are discovered in production.

We will add the check to the Kustomization for this example since it performs the actual deployments.

Add a health check to the vote-dev Kustomization YAML

The health check will monitor the rollout of the Deployment called vote in the instavote namespace.

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: vote-dev
  namespace: flux-system
spec:
  # Report this Kustomization as Ready only once the objects listed here are healthy
  healthChecks:
  - kind: Deployment
    name: vote
    namespace: instavote
  # How often the Kustomization is reconciled
  interval: 1m0s
  path: ./deploy/vote
  prune: true
  sourceRef:
    kind: GitRepository
    name: instavote
  targetNamespace: instavote

Check in the updated vote-dev-kustomization.yaml file to the flux-infra repo.
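Once the change is pushed, Flux picks it up on its next interval. If waiting is not appealing, a reconciliation can be requested by hand, as in the sketch below. The helper name `reconcile_now` is hypothetical, and the default of `flux-system` is an assumption: it presumes the bootstrap from the first article created the usual flux-system Kustomization and that it is the one syncing the flux-infra repo.

```shell
# Hypothetical helper (an assumption, not the article's own workflow):
# ask Flux to fetch the latest commit and reconcile a Kustomization now.
reconcile_now() {
  local name="${1:-flux-system}"   # assumed name of the bootstrap Kustomization
  # --with-source also reconciles the Kustomization's source first
  flux reconcile kustomization "$name" --with-source
}
```

Running `reconcile_now` with no arguments targets flux-system; passing another name, such as `reconcile_now vote-dev`, targets that Kustomization instead.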

Did the Health Check work?

Since we added the check to the kustomization file, any errors/failures to deploy will be caught by the vote-dev kustomization object.

The health check did work: it turned the READY state of vote-dev to False and provided a message identifying the cause of the problem.
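Besides `flux get kustomizations vote-dev`, the same message can be read straight from the object's status, which is handy in scripts. The sketch below assumes the Kustomization lives in the flux-system namespace, as in the manifest above; the helper name `ready_message` is hypothetical.

```shell
# Hypothetical helper: print the message attached to the Ready condition
# of a Flux Kustomization in the flux-system namespace.
ready_message() {
  local name="${1:-vote-dev}"
  kubectl get kustomizations.kustomize.toolkit.fluxcd.io "$name" \
    --namespace flux-system \
    --output jsonpath='{.status.conditions[?(@.type=="Ready")].message}'
}
```

With the broken image still deployed, `ready_message vote-dev` should return the health check failure text rather than a successful revision message.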

Revert to a working Deployment

Change the image name in the instavote/deploy/vote/vote-deployment.yaml file back to v1.

Check in the file to the instavote repo and, after a minute, re-check the state of the vote-dev kustomization.

Things have returned to normal, and we should be able to see healthy Pods.
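To confirm the recovery without eyeballing the output, something like the following sketch could be used. It assumes the names from this article (Deployment vote in instavote, Kustomization vote-dev in flux-system); the helper name `verify_recovery` and the 120-second timeouts are arbitrary choices made for illustration.

```shell
# Hypothetical helper: succeed only if both the Deployment and the
# Kustomization that manages it report a healthy state.
verify_recovery() {
  # Wait for the vote Deployment to finish rolling out
  kubectl rollout status deployment/vote --namespace instavote --timeout=120s || return 1
  # Then wait for the Kustomization's Ready condition to become True
  kubectl wait kustomization/vote-dev --namespace flux-system \
    --for=condition=Ready --timeout=120s
}
```

Because the function returns a non-zero status when either check fails, it can gate a later pipeline step, for example `verify_recovery && echo "vote-dev is healthy"`.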

Summary: What and Why?

GitHub Folder Structure So Far

Onwards.


I write to remember, and if, in the process, I can help someone learn about Containers, Orchestration (Docker Compose, Kubernetes), GitOps, DevSecOps, VR/AR, Architecture, and Data Management, that is just icing on the cake.