Kubernetes Manifest Validation: Catch Errors Before kubectl Apply
Every team that runs Kubernetes hits the same workflow: edit a YAML manifest, run kubectl apply, watch the cluster reject it, fix the typo, repeat. The cycle ends with someone realizing the error was a single character — an indent off by two spaces, an apiVersion that moved out of beta three releases ago, a label selector that does not match the pod template. The cluster is the most expensive place to discover any of those.
I once shipped a Deployment with apiVersion: extensions/v1beta1 to a 1.16 cluster. The pipeline went green because the staging cluster still ran 1.13. The prod cluster did not. I learned about the API removal at 2am from a paging alert. The bug was in the YAML the whole time — and a 30-second local schema check would have caught it.
The four kinds of validation, ranked by when they catch what
"Kubernetes validation" is not a single thing. The manifest goes through four progressively stricter checks, and most failures can be caught long before the cluster.
- YAML well-formedness — the file parses at all. Indentation, quoting, duplicate keys. A linter catches this in your editor.
- Schema validation — the manifest matches the OpenAPI schema for that apiVersion and kind. This is what catches typos in field names, wrong value types, missing required fields. Runs offline against the schema bundle.
- Server-side dry run — kubectl apply --dry-run=server. The API server validates against its own schema and runs admission webhooks. Catches things the offline schema cannot: OPA policies, mutating webhooks, namespace existence.
- Policy validation — OPA Gatekeeper, Kyverno, or Conftest. Custom rules like "every Deployment must set resource limits" or "no privileged containers." Runs against your written policies, not the schema.
The first two run without any cluster access. The third needs a cluster but does not commit changes. The fourth needs your policy bundle. A good workflow chains them — fail fast on cheap checks, save the expensive ones for the moments they actually matter.
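A minimal sketch of that chain as one wrapper script. The validate.sh name, the k8s/ directory, the pinned 1.29.0 version, and the commented-out Conftest step are all assumptions; substitute whatever your repo actually uses.
#!/usr/bin/env bash
# validate.sh: run the cheap checks first, touch the cluster last
set -euo pipefail
MANIFEST_DIR=${1:-k8s}
# 1. YAML well-formedness: indentation, quoting, duplicate keys
yamllint "$MANIFEST_DIR"
# 2. Offline schema validation: field typos, wrong types, removed apiVersions
kubeconform -strict -kubernetes-version 1.29.0 "$MANIFEST_DIR"/*.yaml
# 3. Server-side dry run: admission webhooks, namespaces, policies (needs cluster access)
kubectl apply --dry-run=server -f "$MANIFEST_DIR"/
# 4. Policy checks, if you keep a directory of Conftest rules
# conftest test "$MANIFEST_DIR"/*.yaml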
Offline schema validation, in 30 seconds
The cheapest check is also the one most often skipped, because the tooling has shifted three times in five years. Here is what works today:
# Option 1: kubeconform — fast, CI-friendly, successor to kubeval
brew install kubeconform
kubeconform -strict deployment.yaml
# Option 2: dry run against a live cluster — the most accurate, but needs a cluster
kubectl apply --dry-run=server -f deployment.yaml
# Option 3: kustomize build + kubeconform — validates the overlays too
kustomize build overlays/prod | kubeconform -strict
kubeconform is the maintained successor to kubeval (which is archived). It runs in milliseconds and catches the schema-level mistakes that make up most CI failures. For a quick browser check without installing anything, the Kubernetes manifest validator runs the same check against the bundled schema.
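To see what -strict buys, here is a made-up Deployment with one misspelled field. The exact error text varies by kubeconform version, but the unknown key fails validation:
# replica vs replicas: a one-character typo the cluster would also reject, just later and louder
cat > typo-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replica: 3          # should be "replicas"
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27
EOF
kubeconform -strict typo-deployment.yaml
# fails with an "additional property" error on spec.replica (wording approximate)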
The apiVersion graveyard
Kubernetes removes APIs on a schedule. Every minor release retires something. The manifests in your repo from two years ago probably reference at least one API that no longer exists on a modern cluster.
The most common removals to remember:
- extensions/v1beta1 — gone in 1.16. Deployments, DaemonSets, and ReplicaSets moved to apps/v1.
- networking.k8s.io/v1beta1 Ingress — gone in 1.22. The v1 schema changed serviceName to service.name, which trips up older copy-pasted manifests.
- batch/v1beta1 CronJob — gone in 1.25. Promoted to batch/v1.
- autoscaling/v2beta2 HorizontalPodAutoscaler — gone in 1.26. The v2 schema renamed several fields.
- PodSecurityPolicy — gone in 1.25. Replaced by Pod Security Admission.
A schema validator pinned to your target cluster version will reject every one of these. Pin the validator to the actual version your cluster runs, not the latest available — validating old manifests against newer schema rules is the most common source of false positives.
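For example (the file name is invented), a manifest still on a removed group fails as soon as the validator is pinned to a modern version:
# legacy-deployment.yaml still declares apiVersion: extensions/v1beta1
kubeconform -strict -kubernetes-version 1.29.0 legacy-deployment.yaml
# fails: no schema exists for an extensions/v1beta1 Deployment in a 1.29 bundle,
# the offline equivalent of the API server rejecting it at apply time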
CRDs and the schema you do not have
Custom Resource Definitions are the hard case. The schema lives on the cluster, not in the validator's bundle. A manifest for an Istio VirtualService or a Prometheus ServiceMonitor will pass YAML lint, fail offline schema validation (no schema for that kind), and only get a real check at --dry-run=server.
Three workable strategies for CRD validation in CI:
- Skip strict mode for CRDs — kubeconform -ignore-missing-schemas will not fail on unknown kinds. You lose the check, but the build passes.
- Bundle CRD schemas — dump every installed CRD's OpenAPI schema with kubectl get crd -o json and convert it to JSON Schema. Tools like openapi2jsonschema automate this. Point kubeconform at the bundle (sketch below).
- Spin up a kind cluster in CI — install the same operators that prod runs, then kubectl apply --dry-run=server. Slow, but catches everything. Worth it for repos where CRDs dominate.
Most teams pick option 1 for speed and option 3 in a separate slower job that runs nightly or on release branches.
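A sketch of option 2, assuming cluster access and the openapi2jsonschema helper that ships in the kubeconform repo. The file names, the crd-schemas/ directory, and the exact filename template are assumptions; check the script's documentation before copying.
# Dump the installed CRDs, then convert their OpenAPI schemas to JSON Schema files
kubectl get crds -o yaml > crds.yaml
python3 openapi2jsonschema.py crds.yaml   # writes e.g. crd-schemas/virtualservice_v1beta1.json (assumed layout)
# Point kubeconform at the bundle as an extra schema source, keeping the default one
kubeconform -strict \
  -schema-location default \
  -schema-location 'crd-schemas/{{ .ResourceKind }}_{{ .ResourceAPIVersion }}.json' \
  k8s/*.yaml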
CI integration that pays for itself
A GitHub Actions step that catches 80% of YAML mistakes:
# .github/workflows/k8s-validate.yml
name: Validate Kubernetes manifests
on: [pull_request]
jobs:
  kubeconform:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install kubeconform
        run: |
          curl -L -o kubeconform.tar.gz \
            https://github.com/yannh/kubeconform/releases/latest/download/kubeconform-linux-amd64.tar.gz
          tar xf kubeconform.tar.gz
          sudo mv kubeconform /usr/local/bin/
      # Pin the cluster version — guarantees identical validation on every PR
      - name: Validate manifests
        run: |
          find k8s -name '*.yaml' -exec \
            kubeconform -strict \
            -kubernetes-version 1.29.0 \
            -ignore-missing-schemas \
            {} +
Three flags carry the weight: -strict rejects unknown fields (catches typos), -kubernetes-version pins the target schema (catches API removals before they bite production), and -ignore-missing-schemas keeps the build from failing on CRDs you have not bundled yet.
The three failures I see in almost every audit
1. Selector and template labels drift apart
A Deployment's spec.selector.matchLabels must be a subset of spec.template.metadata.labels. Schema validation does not check this — both fields look fine in isolation. The mismatch only surfaces when the cluster finally rejects the apply, long after every cheap local check has passed. The cure is a separate lint step that flags it on the laptop (kube-linter catches it).
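A minimal reproduction with invented names. kubeconform passes it because each field is valid on its own; kube-linter's built-in selector check (confirm it is enabled in the version you run) is what flags the drift.
# drift.yaml: selector and template labels disagree
cat > drift.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web-frontend    # does not satisfy the selector above
    spec:
      containers:
        - name: web
          image: nginx:1.27
EOF
kubeconform -strict drift.yaml   # passes: every field is valid in isolation
kube-linter lint drift.yaml      # flags the mismatching selector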
2. ConfigMap mounted as a file vs as a directory
Setting subPath on a ConfigMap mount turns the file from a kubelet-managed symlink, swapped in place whenever the ConfigMap changes, into a plain copy made once at container start. The schema accepts both; the bug only shows up when you change a config in prod and wait for a reload that never comes.
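The two mounts side by side, as a pod-spec fragment with invented volume and file names. Option B is the one that silently freezes the config.
# Fragment of a pod spec; names are illustrative
volumes:
  - name: app-config
    configMap:
      name: app-config
containers:
  - name: web
    image: nginx:1.27
    volumeMounts:
      # Option A: whole-ConfigMap mount. Files are symlinks the kubelet swaps
      # atomically when the ConfigMap changes, so the app can hot-reload.
      - name: app-config
        mountPath: /etc/app
      # Option B: subPath mount. A plain file copied once at container start;
      # later edits to the ConfigMap never reach it.
      # - name: app-config
      #   mountPath: /etc/app/settings.conf
      #   subPath: settings.conf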
3. Resource limits that the cluster cannot satisfy
A pod with requests.memory: 64Gi on a node pool with 16Gi nodes will validate, fail to schedule, and sit in Pending indefinitely. The fix is a policy (OPA or Kyverno) that rejects requests larger than your largest node. Not a schema concern.
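The ceiling the policy should encode is one kubectl call away (the column names are just labels):
# Allocatable memory per node; any request above the largest value here can never schedule
kubectl get nodes -o custom-columns='NODE:.metadata.name,ALLOCATABLE_MEM:.status.allocatable.memory'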
When to reach for OPA / Kyverno
Schema validation answers "is this manifest syntactically valid for this kind?" Policy validation answers "is this manifest allowed under our rules?" The two questions overlap in zero places.
Policies worth writing on day one:
- Every container must set memory and CPU limits.
- No pod runs as root unless explicitly allowlisted (runAsNonRoot: true).
- No image tagged :latest.
- Every Service of type LoadBalancer requires an annotation justifying the cost.
Conftest (built on OPA) runs Rego policies against YAML in CI without a cluster. Kyverno runs as an admission webhook on the cluster itself, and its CLI can apply the same Kyverno policies to plain YAML in CI. Either way, a policy written once is enforced in both places.
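As a sketch, here is the first rule from the list above in Kyverno's format. The policy and rule names are invented, and the pattern syntax is worth double-checking against the Kyverno docs for the version you run.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-limits            # illustrative name
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-memory-cpu-limits
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Every container must set memory and CPU limits."
        pattern:
          spec:
            containers:
              - resources:
                  limits:
                    memory: "?*"
                    cpu: "?*"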
Helm vs Kustomize — the validation surfaces look different
Both produce the same kind of YAML at the end, but the step where you can validate is different. Knowing where to insert the check saves an hour the first time the templated output goes wrong.
# Helm — values.yaml itself is not validated; only the rendered manifests can be checked
helm template my-release ./chart -f values.yaml | kubeconform -strict
# Downside: a typo in values.yaml simply renders as an empty value
# Mitigation: run helm lint as well (weak on schema, but it catches syntax)
helm lint ./chart -f values.yaml
# Kustomize — build each overlay and validate it; the base can be validated on its own too
kustomize build base/ | kubeconform -strict
kustomize build overlays/dev | kubeconform -strict
kustomize build overlays/prod | kubeconform -strict
# Differences between overlays (e.g. only prod adds a PodDisruptionBudget) are caught automatically
Helm's downside is that values.yaml is not validated against the chart's expected shape unless the chart ships a JSON Schema (values.schema.json). Few public charts do. If the team owns the chart, write the schema once — it costs an hour and catches the "why is replicas blank" bug forever.
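The schema does not need to be exhaustive to be useful. A minimal sketch, assuming the chart's values expose replicas and image.repository/image.tag (adjust the field names to the real values.yaml); helm lint and helm template both enforce the schema automatically once the file exists.
# chart/values.schema.json: a deliberately small schema covering the fields people mistype
cat > chart/values.schema.json <<'EOF'
{
  "$schema": "https://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": ["replicas", "image"],
  "properties": {
    "replicas": { "type": "integer", "minimum": 1 },
    "image": {
      "type": "object",
      "required": ["repository", "tag"],
      "properties": {
        "repository": { "type": "string" },
        "tag": { "type": "string", "not": { "const": "latest" } }
      }
    }
  }
}
EOF
helm lint ./chart -f values.yaml   # now fails if replicas is missing or set to a string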
Kustomize is structurally easier to validate because every overlay is plain YAML at every layer. The trade-off is that patches — strategic-merge or JSON 6902 — only evaluate at build time, so a malformed patch only surfaces during kustomize build, not while you are editing the patch file.
Pre-commit hooks — the cheapest place to fail
CI catches mistakes, but the developer has already pushed the branch and waited for a runner by then. A pre-commit hook catches the same mistakes 30 seconds after they were typed.
# .pre-commit-config.yaml — assumes the pre-commit framework
repos:
  - repo: https://github.com/syntaqx/git-hooks
    rev: v0.0.18
    hooks:
      - id: forbid-binary
      - id: shellcheck
  # YAML syntax check (lighter than kubeconform)
  - repo: https://github.com/adrienverge/yamllint
    rev: v1.35.1
    hooks:
      - id: yamllint
        args: [-c=.yamllint.yml]
  # kubeconform — validate only the changed manifests
  - repo: local
    hooks:
      - id: kubeconform
        name: kubeconform
        entry: kubeconform -strict -kubernetes-version 1.29.0
        language: system
        files: ^k8s/.*\.ya?ml$
Pre-commit is opt-in per developer, so it does not replace CI — but the average team member catches 80% of YAML mistakes before they ever leave the laptop. The CI step still runs as the source of truth.
Wrapping up
Catching a YAML mistake before kubectl apply costs nothing. Catching it after costs an outage, a retro, and the trust of whoever was on call. Pin a schema version, run kubeconform on every PR, write four or five Conftest rules for the obvious mistakes, and the production incidents that come from manifest typos drop to roughly zero.
For a quick check before opening a PR, paste the manifest into the Kubernetes manifest validator — same schema, no install, instant feedback.