Docker to Kubernetes: A Java Developer’s Production Guide
For Spring Boot developers in 2026, the line between developer and operations has all but disappeared. If you are writing services and you cannot containerize and deploy them, you are shipping half the job. This guide to Docker and Kubernetes for Java developers covers what most teams wish they had known when they first moved from “it works on my machine” to “it works in production, at scale, every time.”
Why Java Developers Need Containerization Skills
Consider the reality of a modern service. Your microservice does not exist in isolation. It runs alongside dozens of other services, needs specific JVM settings, connects to databases and message brokers, and has to scale under load. Containers solve the environment consistency problem, and Kubernetes solves the orchestration problem. Knowing both isn’t optional anymore — it has become table stakes for senior Java roles, and most job listings now treat it as a baseline rather than a bonus.
Writing Optimized Dockerfiles for Spring Boot
Most tutorials show you a naive Dockerfile. Don’t use it in production. A multi-stage build keeps the final image small and secure, and it is the pattern the official Spring and Docker documentation both recommend:
# Stage 1: Build
FROM eclipse-temurin:21-jdk-alpine AS builder
WORKDIR /app
COPY gradle/ gradle/
COPY gradlew build.gradle settings.gradle ./
RUN ./gradlew dependencies --no-daemon
COPY src/ src/
RUN ./gradlew bootJar --no-daemon -x test
# Stage 2: Runtime
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
COPY --from=builder /app/build/libs/*.jar app.jar
USER appuser
EXPOSE 8080
ENTRYPOINT ["java", "-jar", "app.jar"]
Key things happening here:
Alpine base cuts the image from roughly 400MB to around 180MB
JRE, not JDK in the runtime stage — you don’t need a compiler in production
Non-root user because running as root in containers is a security risk
Dependency layer caching: the COPY gradlew and RUN ./gradlew dependencies steps are cached unless your build files change, which saves minutes on rebuilds
Layered JARs and Distroless: Going Further
The multi-stage build above is the baseline, but two refinements are worth knowing because they show up in well-tuned production pipelines. First, Spring Boot supports layered JARs, which split the application archive into separate layers — dependencies, Spring Boot loader, snapshot dependencies, and your own classes — ordered from least to most frequently changed. Because your application code changes far more often than your third-party dependencies, putting them in distinct image layers means a code change only invalidates the small top layer, and the large dependency layer stays cached.
# Stage 1: unpack the layered jar (assumes the jar was already built at build/libs/app.jar)
FROM eclipse-temurin:21-jre-alpine AS builder
WORKDIR /app
ARG JAR=build/libs/app.jar
COPY ${JAR} app.jar
RUN java -Djarmode=layertools -jar app.jar extract
# Stage 2: runtime image. Copy layers least-to-most volatile for optimal caching
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
COPY --from=builder /app/dependencies/ ./
COPY --from=builder /app/spring-boot-loader/ ./
COPY --from=builder /app/snapshot-dependencies/ ./
COPY --from=builder /app/application/ ./
# Launcher class name for Spring Boot 3.2+; earlier versions use org.springframework.boot.loader.JarLauncher
ENTRYPOINT ["java", "org.springframework.boot.loader.launch.JarLauncher"]
Second, for the strictest security posture, consider a distroless base image instead of Alpine. Distroless images contain only your application and its runtime dependencies — no shell, no package manager, nothing for an attacker to pivot into. The trade-off is real, so weigh it honestly: debugging is harder when you cannot exec a shell into the container, and many teams keep Alpine in development for that convenience while reserving distroless for production. There is no single right answer; the point is to choose deliberately rather than default to whatever the first tutorial showed.
JVM Memory Settings in Containers
This is a problem that catches teams repeatedly. The JVM doesn’t always play nice with container memory limits. If your container has a 512MB limit and the JVM tries to allocate its default heap based on the host’s memory, the OOM killer will terminate your process with no warning and no friendly stack trace — just an exit code 137.
Stop using -Xmx in containers. Use percentage-based flags instead:
ENTRYPOINT ["java", \
    "-XX:MaxRAMPercentage=75.0", \
    "-XX:InitialRAMPercentage=50.0", \
    "-XX:+UseG1GC", \
    "-XX:+UseContainerSupport", \
    "-jar", "app.jar"]
-XX:MaxRAMPercentage=75.0 caps the JVM heap at 75% of the container’s memory limit. The remaining 25% is left for everything outside the heap: thread stacks, metaspace, and OS overhead. Because the value is a percentage, it scales automatically whether your container has 512MB or 4GB. Why 75% rather than 100%? Because the heap is only part of the JVM’s footprint. Metaspace, compressed class space, thread stacks (roughly 1MB each by default), the code cache, and direct byte buffers all live outside the heap, and they are exactly what gets squeezed when you set the heap too aggressively. A container that is OOM-killed despite a healthy-looking heap is almost always one where off-heap memory was forgotten.
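To make the percentages concrete, here is a minimal sketch of the arithmetic for a 512MB container, written as a Docker Compose fragment. The service name and the use of mem_limit and JAVA_TOOL_OPTIONS are assumptions made for illustration; the same flags can stay in the ENTRYPOINT shown above instead.

```yaml
# Hypothetical Compose service, used only to illustrate the heap arithmetic.
services:
  app:
    build: .
    mem_limit: 512m   # container memory ceiling
    environment:
      # JAVA_TOOL_OPTIONS is picked up by the JVM at startup, so the flags
      # can sit next to the limit they depend on.
      - JAVA_TOOL_OPTIONS=-XX:MaxRAMPercentage=75.0 -XX:InitialRAMPercentage=50.0
# Approximate sizes with a 512MB limit:
#   maximum heap = 0.75 * 512MB = 384MB
#   initial heap = 0.50 * 512MB = 256MB
#   left for metaspace, thread stacks, code cache, direct buffers = 128MB
```

If the service runs many threads or uses large direct buffers, 128MB of headroom may be too little, which is a reason to raise the container limit rather than the percentage.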
Docker Compose for Local Development
Before jumping into Kubernetes, get your local development workflow right with Docker Compose:
# No top-level "version" key: Docker Compose v2 treats it as obsolete and ignores it
services:
  app:
    build: .
    ports:
      - "8080:8080"
    environment:
      - SPRING_PROFILES_ACTIVE=local
      - SPRING_DATASOURCE_URL=jdbc:postgresql://db:5432/myapp
      # Spring Boot 3.x reads spring.data.redis.host (Boot 2.x used SPRING_REDIS_HOST)
      - SPRING_DATA_REDIS_HOST=redis
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: myapp
      POSTGRES_USER: admin
      POSTGRES_PASSWORD: secret
    ports:
      - "5432:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U admin -d myapp"]
      interval: 5s
      timeout: 3s
      retries: 5
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
One docker compose up and your entire stack is running. Every developer on the team gets an identical environment. No more “which version of Postgres are you running?” conversations. The condition: service_healthy dependency is the detail that makes this reliable: without it, your app container races the database and fails on the first connection attempt because Postgres has accepted the port but isn’t ready to serve queries yet. The healthcheck makes the dependency meaningful rather than cosmetic.
Kubernetes Basics: The Mental Model
If Docker is a shipping container, Kubernetes is the port authority. Here’s the hierarchy that matters:
| Concept | What It Does | Java Analogy |
|---|---|---|
| Pod | Runs one or more containers | A single JVM process |
| Deployment | Manages pod replicas and updates | A managed thread pool |
| Service | Stable network endpoint for pods | A load balancer / DNS entry |
| ConfigMap | External configuration | application.yml but outside the jar |
| Secret | Sensitive configuration | Credentials kept outside the jar |
The mental shift that trips up developers coming from a single-server world is that Kubernetes is declarative. You don’t tell it “start three pods”; you declare “I want three pods,” and a control loop continuously works to make reality match that declaration. Kill a pod and Kubernetes recreates it, not because something caught the event, but because the observed state no longer equals the desired state. Internalizing that reconciliation model explains nearly every behavior you will later find surprising.
Deploying a Spring Boot App to Kubernetes
Here’s a production-grade deployment manifest, with the reasoning behind each section:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  labels:
    app: order-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: order-service
    spec:
      containers:
        - name: order-service
          image: registry.example.com/order-service:1.4.2
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "1000m"
          envFrom:
            - configMapRef:
                name: order-service-config
            - secretRef:
                name: order-service-secrets
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 5
            failureThreshold: 3
---
apiVersion: v1
kind: Service
metadata:
  name: order-service
spec:
  selector:
    app: order-service
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP
A note on requests versus limits, because the distinction is load-bearing. requests is what the scheduler reserves when deciding which node has room for your pod; limits is the hard ceiling the kubelet enforces at runtime. Set them too far apart and you risk noisy-neighbor problems or surprise throttling; set CPU limits too tight on a JVM and the runtime will be throttled mid-garbage-collection, producing latency spikes that look mysterious until you correlate them with CPU throttling metrics. Many teams deliberately leave CPU limits off for latency-sensitive Java services while keeping memory limits firm, since memory overcommit is far more dangerous than CPU contention.
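As a rough illustration of the "firm memory limit, no CPU limit" approach described above, the container's resources block might look like the fragment below. The specific values are assumptions, not recommendations, and whether omitting the CPU limit is acceptable depends on how your cluster is shared.

```yaml
# Fragment of a container spec; values are illustrative assumptions.
resources:
  requests:
    memory: "1Gi"    # what the scheduler reserves on the node
    cpu: "500m"      # also the baseline the HPA measures CPU utilization against
  limits:
    memory: "1Gi"    # equal to the request, so the pod never relies on overcommitted memory
    # cpu limit deliberately omitted: the JVM can burst during GC and startup
    # instead of being throttled, at the cost of weaker isolation between pods
```

Setting the memory request equal to the limit trades some bin-packing efficiency for predictability, which is usually the better deal for a JVM whose heap is sized from that limit.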
Health Checks with Spring Actuator
Kubernetes probes map directly to Spring Boot Actuator’s health groups. Add this to your application.yml:
management:
  endpoints:
    web:
      exposure:
        include: health,info,prometheus
  endpoint:
    health:
      probes:
        enabled: true
      show-details: always
  health:
    livenessState:
      enabled: true
    readinessState:
      enabled: true
Liveness probe answers: “Is the process stuck?” If it fails, Kubernetes restarts the pod.
Readiness probe answers: “Can this pod handle traffic?” If it fails, the pod is removed from the Service’s load balancer but is not restarted.
The initialDelaySeconds is critical for Java apps. Spring Boot takes time to start — if your liveness probe fires before the context is ready, Kubernetes will kill the pod in a restart loop and your deployment never stabilizes. For slow-starting services there is a cleaner tool than padding initialDelaySeconds: a startup probe. A startup probe disables the liveness and readiness checks until the application has booted, then hands off to them. This lets you tolerate a long, variable cold start without weakening the liveness check that protects a healthy, running pod. Confusing liveness and readiness is one of the most common Kubernetes mistakes, and it is worth re-reading the two definitions above until the difference is reflexive.
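A startup probe of the kind described above could be sketched as follows, as a fragment of the container spec. The period and threshold values are assumptions chosen to allow a cold start of up to 150 seconds; tune them to your service's observed startup time.

```yaml
# Fragment of a container spec; timing values are illustrative assumptions.
startupProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  periodSeconds: 5
  failureThreshold: 30    # up to 30 * 5s = 150s before the container is restarted
livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  periodSeconds: 10       # no initialDelaySeconds needed: liveness waits for the startup probe
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8080
  periodSeconds: 5
  failureThreshold: 3
```

With this arrangement the liveness check can stay strict (30 seconds to detect a stuck process) while slow boots are absorbed entirely by the startup probe.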
ConfigMaps and Secrets
Externalize your Spring config properly:
apiVersion: v1
kind: ConfigMap
metadata:
  name: order-service-config
data:
  SPRING_DATASOURCE_URL: "jdbc:postgresql://postgres-svc:5432/orders"
  # Spring Boot 3.x reads spring.data.redis.host (Boot 2.x used SPRING_REDIS_HOST)
  SPRING_DATA_REDIS_HOST: "redis-svc"
  SERVER_PORT: "8080"
---
apiVersion: v1
kind: Secret
metadata:
  name: order-service-secrets
type: Opaque
# stringData takes plain-text values; the API server base64-encodes them on write
stringData:
  SPRING_DATASOURCE_USERNAME: "order_svc"
  SPRING_DATASOURCE_PASSWORD: "encrypted-password-here"
  API_SECRET_KEY: "my-secret-key"
Spring Boot automatically maps environment variables to properties. SPRING_DATASOURCE_URL becomes spring.datasource.url. No code changes needed. One honest caveat: a Kubernetes Secret is only base64-encoded, not encrypted, in etcd by default. For real protection you need encryption-at-rest enabled on the cluster, or an external secrets manager such as HashiCorp Vault or a cloud provider’s secrets service fronted by the External Secrets Operator. Treating a base64 string as if it were encrypted is a security gap that audits routinely flag.
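To see why base64 offers no protection, consider how a Secret looks when written with the data field instead of stringData. The password below is a made-up placeholder used only for this sketch.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: order-service-secrets
type: Opaque
data:
  # "cGFzc3dvcmQ=" is simply the base64 encoding of the literal string "password".
  # Anyone with read access to this Secret can decode it in one step.
  SPRING_DATASOURCE_PASSWORD: cGFzc3dvcmQ=
```

This is the form in which the value is stored unless encryption at rest is configured, which is why read access to Secrets should be restricted through RBAC as well.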
Horizontal Pod Autoscaler
Scaling manually is for emergencies. Set up autoscaling from day one:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
This scales between 2 and 10 pods based on CPU and memory utilization, where utilization is measured as a percentage of each pod’s resource requests, not its limits. For Java apps, keep minReplicas at 2 or higher: JVM cold starts are slow, and you don’t want the loss of a single pod to leave nothing serving traffic while a replacement warms up. Be aware of a subtlety that surprises many teams: memory-based autoscaling and the JVM are an awkward pair. The JVM tends to hold onto heap it has allocated rather than returning it to the OS, so memory utilization stays high even after load subsides, which can prevent the HPA from ever scaling back down. For that reason, CPU is usually the more responsive scaling signal, and serious workloads often graduate to custom or external metrics, such as requests per second or queue depth, via KEDA when raw resource utilization proves too blunt.
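If memory utilization does keep the autoscaler pinned high, one option is a CPU-only variant of the same HPA. The sketch below drops the memory metric and makes the scale-down window explicit; the 300-second value is an assumption that happens to match the Kubernetes default.

```yaml
# Sketch: CPU-only variant of the order-service HPA (memory metric removed).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the CPU request, not the limit
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # wait 5 minutes of lower load before removing pods
# Worked example of the documented HPA rule:
#   desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)
#   3 pods averaging 90% CPU with a 70% target -> ceil(3 * 90 / 70) = ceil(3.86) = 4 pods
```

Because the calculation is relative to the CPU request, changing the request in the Deployment also changes when the autoscaler reacts.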
Common Pitfalls to Avoid
JVM ignoring container limits: On older JDK versions (before JDK 10, or Java 8 builds before 8u191), the JVM reads the host’s memory, not the container’s. Always use JDK 17+, where -XX:+UseContainerSupport is enabled by default, and confirm the effective heap size with java -XX:+PrintFlagsFinal -version inside the container.
Fat images: A naive Dockerfile with a full JDK can be 600MB+. Use multi-stage builds and JRE-only runtime images. Your CI/CD pipeline and node pull times will thank you.
No resource requests/limits: Without them, a single pod can starve the entire node, and the scheduler has no information to place it sensibly. Always set requests, and always set a memory limit.
Hardcoded config: If a database URL is in your application.yml instead of a ConfigMap, you’ve made deployment environment-specific. Externalize everything that changes between environments.
Ignoring graceful shutdown: Add server.shutdown=graceful and set terminationGracePeriodSeconds in your pod spec. Otherwise, in-flight requests get killed mid-flight during rolling deployments, surfacing as random 5xx errors to users during every release.
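The graceful-shutdown item above can be sketched as two small fragments, one for application.yml and one for the Deployment. The 20-second and 30-second values are assumptions; the point is only that the pod's grace period should be longer than the time Spring is allowed to drain requests.

```yaml
# application.yml fragment
server:
  shutdown: graceful                    # stop accepting new requests, finish in-flight ones
spring:
  lifecycle:
    timeout-per-shutdown-phase: 20s     # assumed drain budget for in-flight requests
---
# Deployment fragment (pod template)
spec:
  template:
    spec:
      # Time between SIGTERM and SIGKILL; kept above the 20s drain budget
      terminationGracePeriodSeconds: 30
```

If the grace period is shorter than the drain budget, the container is killed before Spring finishes, and the 5xx errors return.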
When Kubernetes Is Overkill: An Honest Trade-off
It would be misleading to present Kubernetes as the right answer for every Java service, so consider the counter-case before adopting it. Kubernetes carries a substantial operational tax: cluster upgrades, networking, RBAC, ingress, certificate rotation, and a control plane that someone must run and patch. For a single service, or a handful of low-traffic internal apps, that overhead dwarfs the benefit, and a simpler platform is the more professional choice — a managed container runtime such as AWS App Runner, Google Cloud Run, or Azure Container Apps will run the very same Docker image you built above with a fraction of the moving parts.
The decision really turns on scale and team capacity. Kubernetes earns its complexity when you are running many services that need shared networking, fine-grained autoscaling, sophisticated rollout strategies, and a common operational substrate, and when you have, or can fund, the platform skills to operate it well. Adopt it for that. But reaching for a full cluster to host three endpoints is resume-driven architecture, not engineering. What marks a senior developer is knowing not only how to use these tools, but precisely when the simpler option is the correct one.
Closing Thoughts
DevOps is a developer skill now. Infrastructure is no longer “someone else’s problem” — the teams that ship fastest are the ones where developers own the full lifecycle, from writing the code to defining how it runs in production. Docker and Kubernetes aren’t just ops tools. They’re how modern Java applications are built, tested, and delivered, and the investment in learning them pays off on every single deployment.
The patterns covered in this guide (multi-stage builds, container-aware JVM settings, well-defined probes, externalized config, and sensible autoscaling) are what make a Java service robust and scalable once it leaves your laptop. Start with the fundamentals, iterate on your implementation, and keep measuring the results so you know each change is earning its place.