Outbox Pattern for Reliable Event Publishing
The transactional outbox pattern solves one of the hardest problems in distributed systems: how do you update a database and publish an event atomically? Without the outbox pattern, you risk either losing events (if the message broker is down after the DB commit) or publishing events for uncommitted changes (if you publish before committing). Both scenarios lead to data inconsistency across services.
This guide explains the transactional outbox pattern in depth, covering both polling-based and log-based (CDC) implementations. You will learn how to implement it in Java with Spring Boot and PostgreSQL, handle failure scenarios, and choose the right approach for your architecture.
The Dual-Write Problem
Consider a typical microservice that creates an order and publishes an OrderCreated event. The naive approach has a critical flaw:
@Service
public class OrderService {

    @Transactional
    public Order createOrder(OrderRequest request) {
        // Step 1: Save to database (not yet committed -- we are still inside the transaction)
        Order order = orderRepository.save(new Order(request));

        // Step 2: Publish event -- PROBLEM!
        // The event is sent before the transaction commits: if the commit later
        // fails, consumers see an event for an order that never existed. Moving
        // the publish after the commit just flips the failure mode: a crash or
        // broker outage between commit and publish silently loses the event.
        eventPublisher.publish(new OrderCreatedEvent(order));
        return order;
    }
}

This is called the dual-write problem. You are writing to two systems (database and message broker) without a distributed transaction. The outbox pattern eliminates this by writing the event to the same database as the business data, within the same transaction.
How the Outbox Pattern Works
The pattern introduces an “outbox” table in the same database as your business data. Instead of publishing events directly, the service writes events to the outbox table within the same transaction as the business operation. A separate process then reads from the outbox table and publishes events to the message broker.
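Before looking at the real schema and Spring code, the whole flow can be reduced to a toy in-memory sketch (illustration only, not production code — here the "transaction" is simulated by the two `add` calls simply running together, and a list stands in for Kafka):

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of the outbox flow: business write + outbox write happen together,
// and a separate relay step drains the outbox into the "broker".
public class OutboxFlowSketch {
    record OutboxRow(String aggregateId, String eventType, String payload) {}

    static final List<String> ordersTable = new ArrayList<>();
    static final List<OutboxRow> outboxTable = new ArrayList<>();
    static final List<String> broker = new ArrayList<>(); // stands in for Kafka

    // Step 1: business write and outbox write in the "same transaction"
    static void createOrder(String orderId) {
        ordersTable.add(orderId);
        outboxTable.add(new OutboxRow(orderId, "OrderCreated",
            "{\"id\":\"" + orderId + "\"}"));
    }

    // Step 2: a separate relay process publishes pending events to the broker
    static void relay() {
        for (OutboxRow row : outboxTable) {
            broker.add(row.eventType() + ":" + row.payload());
        }
        outboxTable.clear();
    }

    public static void main(String[] args) {
        createOrder("o-1");
        relay();
        System.out.println(broker); // [OrderCreated:{"id":"o-1"}]
    }
}
```

The key property is that step 1 is atomic; step 2 can crash and retry freely because the pending events survive in the outbox table.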
-- Outbox table schema
CREATE TABLE outbox_events (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    aggregate_type VARCHAR(255) NOT NULL,
    aggregate_id VARCHAR(255) NOT NULL,
    event_type VARCHAR(255) NOT NULL,
    payload JSONB NOT NULL,
    created_at TIMESTAMP NOT NULL DEFAULT NOW(),
    published_at TIMESTAMP NULL,
    retry_count INT NOT NULL DEFAULT 0
);

-- Partial index so the polling publisher only scans unpublished rows
CREATE INDEX idx_outbox_unpublished
    ON outbox_events (created_at)
    WHERE published_at IS NULL;

Implementation with Spring Boot
@Entity
@Table(name = "outbox_events")
public class OutboxEvent {

    @Id
    @GeneratedValue(strategy = GenerationType.UUID)
    private UUID id;

    private String aggregateType;
    private String aggregateId;
    private String eventType;

    @JdbcTypeCode(SqlTypes.JSON)
    private String payload;

    private Instant createdAt;
    private Instant publishedAt;
    private int retryCount;

    // Getters and setters omitted for brevity
}
@Service
public class OrderService {

    private final OrderRepository orderRepository;
    private final OutboxRepository outboxRepository;
    private final ObjectMapper objectMapper;

    @Transactional // Single transaction for both writes
    public Order createOrder(OrderRequest request) {
        // Save business data
        Order order = orderRepository.save(new Order(request));

        // Write event to outbox (same transaction!)
        OutboxEvent event = new OutboxEvent();
        event.setAggregateType("Order");
        event.setAggregateId(order.getId().toString());
        event.setEventType("OrderCreated");
        try {
            event.setPayload(objectMapper.writeValueAsString(
                new OrderCreatedPayload(order.getId(), order.getTotal(),
                    order.getCustomerId())));
        } catch (JsonProcessingException e) {
            // Rethrow unchecked so the transaction rolls back on serialization failure
            throw new IllegalStateException("Could not serialize event payload", e);
        }
        event.setCreatedAt(Instant.now());
        outboxRepository.save(event);
        return order;
    }
}

Because both the order and the outbox event are written in the same database transaction, they either both succeed or both fail. There is no window for inconsistency.
Polling Publisher Approach
The simplest way to relay outbox events to the message broker is a polling publisher — a scheduled task that periodically queries for unpublished events and sends them.
@Component
public class OutboxPollingPublisher {

    private final OutboxRepository outboxRepository;
    private final KafkaTemplate<String, String> kafkaTemplate;

    @Scheduled(fixedDelay = 500) // Poll every 500ms
    @Transactional
    public void publishPendingEvents() {
        List<OutboxEvent> events = outboxRepository
            .findByPublishedAtIsNullOrderByCreatedAtAsc(PageRequest.of(0, 100));

        for (OutboxEvent event : events) {
            try {
                String topic = "events." + event.getAggregateType().toLowerCase();
                kafkaTemplate.send(topic, event.getAggregateId(),
                    event.getPayload()).get(); // Sync send: wait for broker ack
                event.setPublishedAt(Instant.now());
                outboxRepository.save(event);
            } catch (Exception e) {
                event.setRetryCount(event.getRetryCount() + 1);
                outboxRepository.save(event);
                log.warn("Failed to publish event {}: {}",
                    event.getId(), e.getMessage());
            }
        }
    }

    // Cleanup: delete published events older than 7 days
    @Scheduled(cron = "0 0 3 * * *")
    @Transactional
    public void cleanupPublishedEvents() {
        outboxRepository.deleteByPublishedAtBefore(
            Instant.now().minus(7, ChronoUnit.DAYS));
    }
}

The polling approach is simple to implement and understand. However, it introduces latency (up to the polling interval) and creates load on the database with frequent queries. For most applications with moderate event volumes, this trade-off is acceptable.
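The publisher above only increments retry_count; a natural extension is to skip events that have failed recently. As a sketch (this helper is hypothetical, not part of the publisher above), the retry delay can grow exponentially with the stored retry count, capped to keep a poisoned event from waiting forever:

```java
import java.time.Duration;

// Hypothetical helper: compute how long to wait before retrying a failed
// outbox event, doubling the delay each attempt up to a fixed cap.
public class RetryBackoff {
    static final Duration BASE = Duration.ofSeconds(1);
    static final Duration MAX  = Duration.ofMinutes(5);

    static Duration delayFor(int retryCount) {
        // 1s, 2s, 4s, ... capped at 5 minutes; shift is bounded to avoid overflow
        long seconds = BASE.getSeconds() << Math.min(retryCount, 20);
        return seconds >= MAX.getSeconds() ? MAX : Duration.ofSeconds(seconds);
    }

    public static void main(String[] args) {
        System.out.println(delayFor(0));  // PT1S
        System.out.println(delayFor(3));  // PT8S
        System.out.println(delayFor(30)); // PT5M
    }
}
```

The polling query would then also filter on created_at plus this delay, so repeatedly failing events stop monopolizing each batch.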
Log-Based CDC Approach
For higher throughput and lower latency, use log-based Change Data Capture (CDC). Tools like Debezium read the database transaction log (WAL in PostgreSQL) and stream changes to Kafka. This approach delivers events with near-real-time latency and avoids polling queries entirely; the only additional load is the replication slot reading the WAL.
{
  "name": "outbox-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "postgres",
    "database.port": "5432",
    "database.user": "debezium",
    "database.password": "secret",
    "database.dbname": "orderservice",
    "topic.prefix": "orderservice",
    "plugin.name": "pgoutput",
    "table.include.list": "public.outbox_events",
    "transforms": "outbox",
    "transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter",
    "transforms.outbox.table.field.event.key": "aggregate_id",
    "transforms.outbox.table.fields.additional.placement": "event_type:header:eventType",
    "transforms.outbox.route.by.field": "aggregate_type",
    "transforms.outbox.route.topic.replacement": "events.${routedByValue}"
  }
}

Handling Duplicate Events
Both polling and CDC can produce duplicate events (e.g., if the publisher crashes after sending but before marking as published). Therefore, consumers must be idempotent. The standard approach is to track processed event IDs.
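Stripped of the Kafka plumbing, the idempotency check is only a few lines. A toy sketch, with a HashSet standing in for the processed-events table (the real check must share the consumer's database transaction, as in the full consumer):

```java
import java.util.HashSet;
import java.util.Set;

// Minimal sketch of an idempotent consumer: remember processed event IDs and
// skip redeliveries. A HashSet stands in for the processed_events table.
public class IdempotentConsumerSketch {
    static final Set<String> processedEventIds = new HashSet<>();
    static int stockReservations = 0;

    static void handle(String eventId) {
        if (!processedEventIds.add(eventId)) {
            return; // duplicate delivery: already processed, skip
        }
        stockReservations++; // stands in for the real side effect
    }

    public static void main(String[] args) {
        handle("evt-1");
        handle("evt-1"); // redelivered after a publisher crash
        System.out.println(stockReservations); // 1
    }
}
```

Because `Set.add` returns false for an already-present ID, the side effect runs exactly once no matter how many times the event is delivered.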
@Service
public class OrderEventConsumer {

    private final ProcessedEventRepository processedEvents;
    private final InventoryService inventoryService;
    private final ObjectMapper objectMapper;

    @KafkaListener(topics = "events.order")
    @Transactional
    public void handleOrderCreated(ConsumerRecord<String, String> record)
            throws JsonProcessingException {
        String eventId = new String(record.headers()
            .lastHeader("eventId").value(), StandardCharsets.UTF_8);

        // Idempotency check
        if (processedEvents.existsById(eventId)) {
            log.info("Duplicate event {}, skipping", eventId);
            return;
        }

        OrderCreatedPayload payload = objectMapper.readValue(
            record.value(), OrderCreatedPayload.class);
        inventoryService.reserveStock(payload.getItems());
        processedEvents.save(new ProcessedEvent(eventId, Instant.now()));
    }
}

When NOT to Use the Outbox Pattern
The outbox pattern adds complexity — an extra table, a publishing process, and idempotent consumers. If your system tolerates occasional missed events (best-effort delivery), a simple try-catch around the publish call may be sufficient. Additionally, if you use an event-sourced architecture where the event store IS the source of truth, the outbox pattern is redundant.
For systems with very low event volumes (fewer than 100 events per hour), the operational overhead of managing a CDC connector or polling publisher may not be justified. A simpler retry mechanism with a dead-letter queue could provide adequate reliability with less infrastructure.
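That simpler alternative can be sketched in a few lines of plain Java (names like publishWithRetry and deadLetterQueue are illustrative, not from any library): retry the publish a bounded number of times, then park the event for manual replay instead of maintaining an outbox table.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Consumer;

// Sketch of retry-then-dead-letter as a lighter alternative to the outbox:
// best-effort delivery with a place to recover events that never made it.
public class RetryWithDlq {
    static final Queue<String> deadLetterQueue = new ArrayDeque<>();

    static void publishWithRetry(String event, Consumer<String> publisher,
                                 int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                publisher.accept(event);
                return; // published successfully
            } catch (RuntimeException e) {
                // fall through and retry; real code would back off between attempts
            }
        }
        deadLetterQueue.add(event); // give up: an operator replays from the DLQ later
    }

    public static void main(String[] args) {
        publishWithRetry("OrderCreated:o-1",
            e -> { throw new RuntimeException("broker down"); }, 3);
        System.out.println(deadLetterQueue); // [OrderCreated:o-1]
    }
}
```

Note the trade-off this keeps: a crash between the database commit and the publish call still loses the event entirely, which is exactly the gap the outbox closes.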
Key Takeaways
- The outbox pattern guarantees atomicity between database changes and event publication by writing both in a single transaction
- Polling publishers are simple to implement but add latency; CDC with tools like Debezium offers near-real-time delivery with minimal database load
- Always design consumers to be idempotent since both approaches can produce duplicate events during failure recovery
- The outbox table should be cleaned up periodically to prevent unbounded growth
- Choose polling for simplicity with moderate volumes; choose CDC for high throughput and low latency requirements
Related Reading
- Saga Pattern in Microservices with Spring Boot
- CQRS and Event Sourcing with Axon Framework
- Kafka CDC with Debezium for Real-Time Data