GCP Cloud Run: The Simplest Way to Run Containers
GCP Cloud Run is Google Cloud’s fully managed serverless container platform that automatically scales from zero to thousands of instances based on incoming traffic. You bring a container image; Cloud Run handles everything else: provisioning, scaling, TLS certificates, and load balancing. As a result, teams can deploy any application that fits in a container without managing infrastructure.
Cloud Run’s killer feature is scale-to-zero — when your service receives no traffic, it runs zero instances and you pay nothing. Moreover, it scales up in seconds when requests arrive, handling traffic spikes without pre-provisioning capacity. Consequently, Cloud Run is ideal for APIs, web applications, webhooks, and event-driven workloads where traffic is variable or unpredictable.
GCP Cloud Run Serverless: Deploying Your First Service
Deploy a container to Cloud Run with a single command. Cloud Run pulls your image, configures networking, provisions TLS, and makes your service available at a unique HTTPS URL. Furthermore, you can deploy from source code directly — Cloud Run uses Cloud Build to containerize your application automatically.
# Deploy from a container image
gcloud run deploy order-service \
  --image gcr.io/my-project/order-service:v1.2.0 \
  --platform managed \
  --region us-central1 \
  --memory 1Gi \
  --cpu 2 \
  --min-instances 0 \
  --max-instances 100 \
  --concurrency 80 \
  --port 8080 \
  --set-env-vars "SPRING_PROFILES_ACTIVE=production" \
  --set-secrets "DB_PASSWORD=db-password:latest" \
  --vpc-connector my-vpc-connector \
  --allow-unauthenticated

# Deploy from source (Cloud Build auto-builds)
gcloud run deploy my-api \
  --source . \
  --region us-central1

# Deploy with traffic splitting (canary)
gcloud run services update-traffic order-service \
  --to-revisions LATEST=10,order-service-v1=90 \
  --region us-central1

Service Configuration and Scaling
Cloud Run provides fine-grained control over scaling behavior, request handling, and resource allocation. The concurrency setting determines how many requests each instance handles simultaneously — higher concurrency means fewer instances, but each instance needs enough CPU and memory to serve that many requests in parallel. Additionally, setting a minimum instance count keeps your service warm and avoids cold-start latency on latency-sensitive endpoints.
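As a quick sketch, these settings can also be tuned in place on an existing service without deploying a new image (the service name and region match the deploy example above):

```shell
# Adjust scaling and concurrency on a running service; creates a new revision
gcloud run services update order-service \
  --region us-central1 \
  --concurrency 80 \
  --min-instances 1 \
  --max-instances 100
```

Each update creates a new revision, so you can roll back to the previous configuration if the change misbehaves.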
# Cloud Run service YAML configuration
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: order-service
  annotations:
    run.googleapis.com/launch-stage: GA
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/minScale: "1"
        autoscaling.knative.dev/maxScale: "100"
        run.googleapis.com/cpu-throttling: "false"
        run.googleapis.com/startup-cpu-boost: "true"
        run.googleapis.com/vpc-access-connector: my-vpc-connector
        run.googleapis.com/vpc-access-egress: private-ranges-only
    spec:
      containerConcurrency: 80
      timeoutSeconds: 300
      containers:
        - image: gcr.io/my-project/order-service:v1.2.0
          ports:
            - containerPort: 8080
          resources:
            limits:
              cpu: "2"
              memory: 1Gi
          env:
            - name: SPRING_PROFILES_ACTIVE
              value: production
          startupProbe:
            httpGet:
              path: /actuator/health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 12
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080

VPC Connectivity and Private Services
Cloud Run services can connect to resources in your VPC — databases, Redis caches, internal APIs — through VPC connectors or Direct VPC egress. Furthermore, you can restrict ingress to internal traffic only, making services accessible only from within your VPC or through a load balancer.
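A minimal sketch of both pieces, using the connector name from the examples above (the network name and IP range are illustrative placeholders):

```shell
# Create a Serverless VPC Access connector (the /28 range must be unused in the VPC)
gcloud compute networks vpc-access connectors create my-vpc-connector \
  --region us-central1 \
  --network default \
  --range 10.8.0.0/28

# Restrict the service so it only accepts traffic from inside the VPC
# or from an internal/external Application Load Balancer
gcloud run services update order-service \
  --region us-central1 \
  --ingress internal
```

With ingress set to internal, the service’s public run.app URL stops accepting external requests, so pair this with a load balancer if you still need controlled external access.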
CI/CD with Cloud Build
Automate deployments with Cloud Build triggers that build, test, and deploy on every push. Additionally, use traffic splitting for gradual rollouts and fast rollback to a known-good revision when a release misbehaves. See the Cloud Run documentation for advanced deployment patterns.
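Wiring a trigger to a repository is one command. A sketch, assuming a GitHub repo already connected to Cloud Build (the repo name and owner are placeholders):

```shell
# Run cloudbuild.yaml on every push to main
gcloud builds triggers create github \
  --repo-name my-repo \
  --repo-owner my-org \
  --branch-pattern "^main$" \
  --build-config cloudbuild.yaml
```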
Key Takeaways
- Start with scale-to-zero defaults, then set --min-instances for latency-sensitive services
- Tune --concurrency together with CPU and memory; load-test before locking values in
- Use VPC connectors or Direct VPC egress to reach private databases and caches
- Keep credentials in Secret Manager and inject them with --set-secrets rather than plain env vars
- Roll out with tagged revisions and traffic splitting so a bad release never takes 100% of traffic
# cloudbuild.yaml — CI/CD pipeline
steps:
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/order-service:$SHORT_SHA', '.']
  - name: 'gcr.io/cloud-builders/docker'
    args: ['push', 'gcr.io/$PROJECT_ID/order-service:$SHORT_SHA']
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: gcloud
    args:
      - 'run'
      - 'deploy'
      - 'order-service'
      - '--image=gcr.io/$PROJECT_ID/order-service:$SHORT_SHA'
      - '--region=us-central1'
      - '--tag=canary'
      - '--no-traffic'
  - name: 'gcr.io/google.com/cloudsdktool/cloud-sdk'
    entrypoint: gcloud
    args:
      - 'run'
      - 'services'
      - 'update-traffic'
      - 'order-service'
      - '--to-tags=canary=10'
      - '--region=us-central1'
images:
  - 'gcr.io/$PROJECT_ID/order-service:$SHORT_SHA'

In conclusion, GCP Cloud Run is the fastest path from container to production on Google Cloud. With scale-to-zero pricing, automatic TLS, and built-in traffic management, it removes operational overhead while giving you full container flexibility. Start with a simple deployment, add VPC connectivity for database access, and implement canary deployments for safe releases.