Docker to Kubernetes: A Java Developer’s Production Guide

For Spring Boot developers in 2026, the line between development and operations has all but disappeared. If you are writing services and you cannot containerize and deploy them, you are shipping half the job. This guide covers what most teams wish they had known when they first moved from “it works on my machine” to “it works in production, at scale, every time.”

Why Java Developers Need Containerization Skills

Consider the reality of a modern service. Your microservice does not exist in isolation. It runs alongside dozens of other services, needs specific JVM settings, connects to databases and message brokers, and has to scale under load. Containers solve the environment consistency problem, and Kubernetes solves the orchestration problem. Knowing both is no longer optional: it has become table stakes for senior Java roles, and job listings increasingly treat it as a baseline rather than a bonus.

Writing Optimized Dockerfiles for Spring Boot

Most tutorials show you a naive Dockerfile. Don’t use it in production. A multi-stage build keeps the final image small and secure, and it is the pattern the official Spring and Docker documentation both recommend:

# Stage 1: Build
FROM eclipse-temurin:21-jdk-alpine AS builder
WORKDIR /app
COPY gradle/ gradle/
COPY gradlew build.gradle settings.gradle ./
RUN ./gradlew dependencies --no-daemon
COPY src/ src/
RUN ./gradlew bootJar --no-daemon -x test

# Stage 2: Runtime
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
COPY --from=builder /app/build/libs/*.jar app.jar
USER appuser
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]

Key things happening here:

Alpine base cuts the image from roughly 400MB to around 180MB
JRE, not JDK in the runtime stage — you don’t need a compiler in production
Non-root user because running as root in containers is a security risk
Dependency layer caching — the COPY gradlew and RUN dependencies steps are cached unless your build files change, which saves minutes on rebuilds

Layered JARs and Distroless: Going Further

The multi-stage build above is the baseline, but two refinements are worth knowing because they show up in well-tuned production pipelines. First, Spring Boot supports layered JARs, which split the application archive into separate layers — dependencies, Spring Boot loader, snapshot dependencies, and your own classes — ordered from least to most frequently changed. Because your application code changes far more often than your third-party dependencies, putting them in distinct image layers means a code change only invalidates the small top layer, and the large dependency layer stays cached.

# Stage 1: unpack the layered jar (the jar is built beforehand, outside this Dockerfile)
FROM eclipse-temurin:21-jre-alpine AS builder
WORKDIR /app
ARG JAR=build/libs/app.jar
COPY ${JAR} app.jar
RUN java -Djarmode=layertools -jar app.jar extract

# Stage 2: copy layers least-to-most volatile for optimal caching
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
COPY --from=builder /app/dependencies/ ./
COPY --from=builder /app/spring-boot-loader/ ./
COPY --from=builder /app/snapshot-dependencies/ ./
COPY --from=builder /app/application/ ./
# Spring Boot 3.2+ package name; earlier versions use org.springframework.boot.loader.JarLauncher
ENTRYPOINT ["java", "org.springframework.boot.loader.launch.JarLauncher"]

Second, for the strictest security posture, consider a distroless base image instead of Alpine. Distroless images contain only your application and its runtime dependencies — no shell, no package manager, nothing for an attacker to pivot into. The trade-off is real, so weigh it honestly: debugging is harder when you cannot exec a shell into the container, and many teams keep Alpine in development for that convenience while reserving distroless for production. There is no single right answer; the point is to choose deliberately rather than default to whatever the first tutorial showed.

JVM Memory Settings in Containers

This is a problem that catches teams repeatedly. The JVM doesn’t always play nice with container memory limits. If your container has a 512MB limit and the JVM tries to allocate its default heap based on the host’s memory, the OOM killer will terminate your process with no warning and no friendly stack trace — just an exit code 137.

Avoid hard-coding -Xmx in containers, because a fixed value has to be re-tuned every time the memory limit changes. Use percentage-based flags instead:

ENTRYPOINT ["java", \
  "-XX:MaxRAMPercentage=75.0", \
  "-XX:InitialRAMPercentage=50.0", \
  "-XX:+UseG1GC", \
  "-XX:+UseContainerSupport", \
  "-jar", "app.jar"]

-XX:MaxRAMPercentage=75.0 caps the JVM heap at 75% of the container’s memory limit, and the cap scales automatically whether the container has 512MB or 4GB. Why 75% rather than 100%? Because the heap is only part of the JVM’s footprint. Metaspace, compressed class space, thread stacks (roughly 1MB each by default), the code cache, and direct byte buffers all live outside the heap and have to fit in the remaining 25%, together with anything else running in the container. They are exactly what gets squeezed when you set the heap too aggressively. A container that is OOM-killed despite a healthy-looking heap is very often one where off-heap memory was forgotten.
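To make the 75/25 split concrete, here is a minimal arithmetic sketch. The class and method names are hypothetical, and the real JVM may round the heap size slightly for alignment, so treat the numbers as approximations rather than what a given JVM will report.

```java
// Illustrative arithmetic only; HeapBudget and its methods are hypothetical names.
final class HeapBudget {
    private static final long MIB = 1024L * 1024L;

    /** Approximate heap ceiling derived from a container limit and MaxRAMPercentage. */
    static long maxHeapBytes(long containerLimitBytes, double maxRamPercentage) {
        return (long) (containerLimitBytes * (maxRamPercentage / 100.0));
    }

    /** Memory left for metaspace, thread stacks, code cache, direct buffers, and so on. */
    static long offHeapBudgetBytes(long containerLimitBytes, double maxRamPercentage) {
        return containerLimitBytes - maxHeapBytes(containerLimitBytes, maxRamPercentage);
    }

    public static void main(String[] args) {
        long[] limitsMiB = {512, 1024, 4096};
        for (long limitMiB : limitsMiB) {
            long limitBytes = limitMiB * MIB;
            System.out.printf("limit=%dMiB heap<=%dMiB off-heap budget=%dMiB%n",
                    limitMiB,
                    maxHeapBytes(limitBytes, 75.0) / MIB,
                    offHeapBudgetBytes(limitBytes, 75.0) / MIB);
        }
    }
}
```

With a 512MiB limit, only 128MiB remains for everything outside the heap. A service with a few hundred threads at roughly 1MB of stack each can use most of that on its own, which is why a smaller percentage is sometimes the safer choice for small containers.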

Docker Compose for Local Development

Before jumping into Kubernetes, get your local development workflow right with Docker Compose:

# The top-level "version" key is obsolete in the Compose Specification and can be omitted
services:
  app:
    build: .
    ports:
      - "8080:8080"
    environment:
      - SPRING_PROFILES_ACTIVE=local
      - SPRING_DATASOURCE_URL=jdbc:postgresql://db:5432/myapp
      # Spring Boot 3.x reads spring.data.redis.host; Boot 2.x used SPRING_REDIS_HOST
      - SPRING_DATA_REDIS_HOST=redis
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: myapp
      POSTGRES_USER: admin
      POSTGRES_PASSWORD: secret
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U admin -d myapp"]
      interval: 5s
      timeout: 3s
      retries: 5

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

One docker compose up and your entire stack is running. Every developer on the team gets an identical environment. No more “which version of Postgres are you running?” conversations. The condition: service_healthy dependency is the detail that makes this reliable: without it, your app container races the database and fails on the first connection attempt because the Postgres container has started but isn’t ready to serve queries yet. The healthcheck makes the dependency meaningful rather than cosmetic.

Kubernetes Basics: The Mental Model

If Docker is a shipping container, Kubernetes is the port authority. Here’s the hierarchy that matters:

Concept	What It Does	Java Analogy
Pod	Runs one or more containers	A single JVM process
Deployment	Manages pod replicas and updates	A managed thread pool
Service	Stable network endpoint for pods	A load balancer / DNS entry
ConfigMap	External configuration	application.yml but outside the jar
Secret	Sensitive configuration	Credentials kept out of the jar and source control

The mental shift that trips up developers coming from a single-server world is that Kubernetes is declarative. You don’t tell it “start three pods”; you declare “I want three pods,” and a control loop continuously works to make reality match that declaration. Kill a pod and Kubernetes recreates it, not because something caught the event, but because the observed state no longer equals the desired state. Internalizing that reconciliation model explains nearly every behavior you will later find surprising.
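The reconciliation idea can be sketched as a toy control loop. This is not how Kubernetes controllers are implemented; the class, the method, and the pod naming are hypothetical, and the only point is that the loop compares observed state with desired state and acts on the difference.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a reconciliation pass; all names here are hypothetical.
final class ReconcileSketch {

    /** One pass: add or remove pods until the observed count equals the desired count. */
    static void reconcile(List<String> runningPods, int desiredReplicas) {
        int next = 0;
        while (runningPods.size() < desiredReplicas) {
            // Pick the first pod name that is not already in use.
            String name;
            do {
                name = "pod-" + next++;
            } while (runningPods.contains(name));
            runningPods.add(name);
        }
        while (runningPods.size() > desiredReplicas) {
            runningPods.remove(runningPods.size() - 1);
        }
    }

    public static void main(String[] args) {
        List<String> pods = new ArrayList<>();
        reconcile(pods, 3);              // declare "I want three pods"
        System.out.println("after first pass:  " + pods);

        pods.remove("pod-1");            // simulate a crashed pod
        reconcile(pods, 3);              // the same declaration repairs the gap
        System.out.println("after second pass: " + pods);
    }
}
```

Notice that the second call carries no information about what went wrong. The loop never handles a "pod died" event; it only sees that two is not three.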

Deploying a Spring Boot App to Kubernetes

Here’s a production-grade deployment manifest, with the reasoning behind each section:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  labels:
    app: order-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: order-service
    spec:
      containers:
        - name: order-service
          image: registry.example.com/order-service:1.4.2
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          envFrom:
            - configMapRef:
                name: order-service-config
            - secretRef:
                name: order-service-secrets
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 5
            failureThreshold: 3
---
apiVersion: v1
kind: Service
metadata:
  name: order-service
spec:
  selector:
    app: order-service
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP

A note on requests versus limits, because the distinction is load-bearing. requests is what the scheduler reserves when deciding which node has room for your pod; limits is the hard ceiling enforced at runtime. The two resources fail differently: a container that exceeds its memory limit is OOM-killed, while one that exceeds its CPU limit is throttled. Set requests and limits too far apart and you risk noisy-neighbor problems or surprise throttling; set CPU limits too tight on a JVM and the runtime will be throttled mid-garbage-collection, producing latency spikes that look mysterious until you correlate them with CPU throttling metrics. Many teams deliberately leave CPU limits off for latency-sensitive Java services while keeping memory limits firm, since memory overcommit is far more dangerous than CPU contention.

Health Checks with Spring Actuator

Kubernetes probes map directly to Spring Boot Actuator’s health groups. Add this to your application.yml:

management:
  endpoints:
    web:
      exposure:
        include: health,info,prometheus
  endpoint:
    health:
      probes:
        enabled: true
      show-details: always
  health:
    livenessState:
      enabled: true
    readinessState:
      enabled: true

Liveness probe answers: “Is the process stuck?” If it fails, Kubernetes restarts the pod.
Readiness probe answers: “Can this pod handle traffic?” If it fails, the pod is removed from the Service’s load balancer but is not restarted.

The initialDelaySeconds is critical for Java apps. Spring Boot takes time to start, and if your liveness probe starts failing before the context is ready, Kubernetes will restart the pod over and over and your deployment never stabilizes.

For slow-starting services there is a cleaner tool than padding initialDelaySeconds: a startup probe. While a startup probe is configured and has not yet succeeded, the liveness and readiness checks are disabled; once it succeeds, they take over. This lets you tolerate a long, variable cold start without weakening the liveness check that protects a healthy, running pod.

Confusing liveness and readiness is one of the most common Kubernetes mistakes, and it is worth re-reading the two definitions above until the difference is reflexive.
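The timing behind these probe settings is simple arithmetic, sketched below. The class and method names are hypothetical, the startup probe values (periodSeconds 10, failureThreshold 30) are illustrative assumptions that do not come from the manifest above, and the result is a rough window that ignores probe timeouts and scheduling jitter.

```java
// Rough probe timing arithmetic; ProbeBudget is a hypothetical helper, not a Kubernetes API.
final class ProbeBudget {

    /** Approximate seconds a container gets before repeated probe failures trigger a restart. */
    static int failureWindowSeconds(int initialDelaySeconds, int periodSeconds, int failureThreshold) {
        return initialDelaySeconds + periodSeconds * failureThreshold;
    }

    public static void main(String[] args) {
        // Liveness probe from the deployment manifest: delay 30s, period 10s, threshold 3.
        int livenessOnly = failureWindowSeconds(30, 10, 3);

        // Assumed startup probe: no delay, period 10s, threshold 30.
        int withStartupProbe = failureWindowSeconds(0, 10, 30);

        System.out.println("liveness only:      about " + livenessOnly + "s to become healthy");
        System.out.println("with startup probe: about " + withStartupProbe + "s to become healthy");
    }
}
```

The design point is that the startup probe buys a long boot window once, while the liveness probe keeps its short period and low threshold, so a pod that hangs after startup is still detected quickly.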

ConfigMaps and Secrets

Externalize your Spring config properly:

apiVersion: v1
kind: ConfigMap
metadata:
  name: order-service-config
data:
  SPRING_DATASOURCE_URL: "jdbc:postgresql://postgres-svc:5432/orders"
  # Spring Boot 3.x reads spring.data.redis.host; Boot 2.x used SPRING_REDIS_HOST
  SPRING_DATA_REDIS_HOST: "redis-svc"
  SERVER_PORT: "8080"
---
apiVersion: v1
kind: Secret
metadata:
  name: order-service-secrets
type: Opaque
stringData:
  SPRING_DATASOURCE_USERNAME: "order_svc"
  # stringData values are written in plain text; Kubernetes base64-encodes them on save
  SPRING_DATASOURCE_PASSWORD: "replace-with-real-password"
  API_SECRET_KEY: "my-secret-key"

Spring Boot automatically maps environment variables to properties. SPRING_DATASOURCE_URL becomes spring.datasource.url. No code changes needed. One honest caveat: a Kubernetes Secret is only base64-encoded, not encrypted, in etcd by default. For real protection you need encryption-at-rest enabled on the cluster, or an external secrets manager such as HashiCorp Vault or a cloud provider’s secrets service fronted by the External Secrets Operator. Treating a base64 string as if it were encrypted is a security gap that audits routinely flag.
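To see why base64 offers no protection, consider this short sketch using only the JDK's java.util.Base64. The class name is hypothetical and the value is the placeholder from the manifest above; anyone who can read the stored Secret can do the same thing.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Demonstrates that base64 is an encoding, not encryption; the class name is hypothetical.
final class SecretEncodingDemo {

    /** Encodes a plain-text value to base64, the form in which Secret data is stored. */
    static String encode(String plainText) {
        return Base64.getEncoder().encodeToString(plainText.getBytes(StandardCharsets.UTF_8));
    }

    /** Reverses the encoding with no key or password required. */
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String stored = encode("my-secret-key");
        System.out.println("stored form:  " + stored);
        System.out.println("decoded form: " + decode(stored));
    }
}
```

No key is involved at any step, which is the whole problem: access control on the Secret object and encryption at rest are what actually protect the value.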

Horizontal Pod Autoscaler

Scaling manually is for emergencies. Set up autoscaling from day one:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80

This scales between 2 and 10 pods based on CPU and memory utilization, where utilization is measured as a percentage of each container’s resource requests, not its limits. For Java apps, keep minReplicas at 2 or higher: JVM cold starts are slow, and you don’t want to be down to a single cold pod when traffic arrives.

Be aware of a subtlety that surprises many teams: memory-based autoscaling and the JVM are an awkward pair. The JVM tends to hold onto heap it has allocated rather than returning it to the OS, so memory utilization stays high even after load subsides, which can prevent the HPA from ever scaling back down. For that reason, CPU is usually the more responsive scaling signal, and serious workloads often graduate to custom or external metrics, such as requests per second or queue depth, via KEDA when raw resource utilization proves too blunt.
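The scaling behavior can be sketched with the replica formula from the Kubernetes HPA documentation: desired = ceil(current × currentMetric / targetMetric), skipped when the ratio is within a tolerance (0.1 by default), with the largest result across metrics winning. The class and method names below are hypothetical, the utilization figures are made up for illustration, and the sketch ignores stabilization windows and pod readiness.

```java
// Simplified model of the HPA replica calculation; names and sample figures are hypothetical.
final class HpaMath {
    private static final double TOLERANCE = 0.1; // documented default tolerance

    /** Desired replicas for one metric, clamped to the configured bounds. */
    static int desiredReplicas(int currentReplicas, double currentUtilization,
                               double targetUtilization, int minReplicas, int maxReplicas) {
        double ratio = currentUtilization / targetUtilization;
        if (Math.abs(ratio - 1.0) <= TOLERANCE) {
            return currentReplicas; // close enough to target: no change
        }
        int desired = (int) Math.ceil(currentReplicas * ratio);
        return Math.max(minReplicas, Math.min(maxReplicas, desired));
    }

    public static void main(String[] args) {
        // Load spike: 3 pods at 140% CPU against a 70% target.
        System.out.println("scale up to:   " + desiredReplicas(3, 140, 70, 2, 10));

        // Load gone: CPU falls to 20%, but the JVM still holds its heap at 78% memory.
        int byCpu = desiredReplicas(6, 20, 70, 2, 10);
        int byMemory = desiredReplicas(6, 78, 80, 2, 10);
        System.out.println("cpu suggests:  " + byCpu);
        System.out.println("memory keeps:  " + byMemory);
        System.out.println("HPA result:    " + Math.max(byCpu, byMemory));
    }
}
```

In the second scenario CPU alone would shrink the deployment to the minimum, but the memory metric pins it at six replicas, which is the scale-down problem described above.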

Common Pitfalls to Avoid

JVM ignoring container limits: On older JDKs (before JDK 10, or Java 8 builds earlier than 8u191), the JVM sizes itself from the host’s memory, not the container’s. Use JDK 17 or later, where -XX:+UseContainerSupport is enabled by default, and confirm the effective heap by running java -XX:+PrintFlagsFinal -version inside the container and checking MaxHeapSize.
Fat images: A naive Dockerfile with a full JDK can be 600MB+. Use multi-stage builds and JRE-only runtime images. Your CI/CD pipeline and node pull times will thank you.
No resource requests/limits: Without them, a single pod can starve the entire node, and the scheduler has no information to place it sensibly. Always set requests for both CPU and memory, and always set a memory limit.
Hardcoded config: If a database URL lives in the application.yml packaged inside your jar rather than in a ConfigMap, the artifact itself becomes environment-specific. Externalize everything that changes between environments.
Ignoring graceful shutdown: Add server.shutdown=graceful and set terminationGracePeriodSeconds in your pod spec to a value longer than Spring’s shutdown timeout (spring.lifecycle.timeout-per-shutdown-phase, 30 seconds by default). Otherwise, in-flight requests are cut off during rolling deployments, surfacing as random 5xx errors to users during every release.
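For the first pitfall, a quick way to see what the JVM believes about its environment is to ask the runtime directly. This is a minimal diagnostic sketch with a hypothetical class name; it uses only java.lang.Runtime, and the values it prints depend entirely on where it runs.

```java
// Prints the CPU count and heap ceiling the JVM has detected; the class name is hypothetical.
final class ContainerLimitsCheck {

    /** Summarizes the processor count and maximum heap as seen by this JVM. */
    static String describe() {
        Runtime runtime = Runtime.getRuntime();
        long maxHeapMiB = runtime.maxMemory() / (1024L * 1024L);
        return "availableProcessors=" + runtime.availableProcessors()
                + " maxHeapMiB=" + maxHeapMiB;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Run inside a container with a 1Gi memory limit and MaxRAMPercentage=75.0, the reported heap should land near 768MiB. A figure close to a fraction of the host's total memory suggests the JVM is not honoring the container limit.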

When Kubernetes Is Overkill: An Honest Trade-off

It would be misleading to present Kubernetes as the right answer for every Java service, so consider the counter-case before adopting it. Kubernetes carries a substantial operational tax: cluster upgrades, networking, RBAC, ingress, certificate rotation, and a control plane that someone must run and patch. For a single service, or a handful of low-traffic internal apps, that overhead dwarfs the benefit, and a simpler platform is the more professional choice — a managed container runtime such as AWS App Runner, Google Cloud Run, or Azure Container Apps will run the very same Docker image you built above with a fraction of the moving parts.

The decision really turns on scale and team capacity. Kubernetes earns its complexity when you are running many services that need shared networking, fine-grained autoscaling, sophisticated rollout strategies, and a common operational substrate, and when you have, or can fund, the platform skills to operate it well. Adopt it for that. But reaching for a full cluster to host three endpoints is resume-driven architecture, not engineering. What marks a senior developer is knowing not only how to use these tools, but also when the simpler option is the correct one.

Closing Thoughts

DevOps is a developer skill now. Infrastructure is no longer “someone else’s problem” — the teams that ship fastest are the ones where developers own the full lifecycle, from writing the code to defining how it runs in production. Docker and Kubernetes aren’t just ops tools. They’re how modern Java applications are built, tested, and delivered, and the investment in learning them pays off on every single deployment.

To recap, the patterns covered in this guide are multi-stage builds, container-aware JVM settings, well-defined probes, externalized config, and sensible autoscaling. Start with the fundamentals, iterate on your implementation, and keep measuring image size, startup time, and memory headroom so you know the changes are paying off.