If a Kafka consumer triggers a side effect, assume duplicates are possible until you have proven otherwise.
That is the safer default.
Kafka does have exactly-once features, but engineers often over-interpret what those guarantees cover. They apply inside Kafka's own processing model: the idempotent producer deduplicates broker-side retries, and transactions make a read-process-write pipeline that stays within Kafka exactly-once. They do not magically make every downstream database write or external API call duplicate-free.
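For reference, these are the standard properties involved (the `transactional.id` value here is illustrative), along with what each one actually covers:

```properties
# Producer: broker deduplicates retried produce requests.
enable.idempotence=true

# Producer: enables transactions spanning produced messages and committed offsets.
transactional.id=payments-processor-1

# Kafka Streams: exactly-once for pipelines that read from and write to Kafka.
processing.guarantee=exactly_once_v2
```

None of these settings deduplicate a write your consumer makes to an external database or API. That gap is what the rest of this article addresses.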
The Consumer Problem
The risky pattern looks like this:
- consume message
- update database
- acknowledge progress
If the service crashes after the database update but before acknowledging progress, the offset is never committed, so the broker redelivers the message and the update runs a second time.
That is not Kafka being broken. It is normal distributed systems behavior under failure.
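The failure window is easy to demonstrate without a broker. This sketch (plain Python, hypothetical names, no Kafka client) models at-least-once delivery: any message whose offset was not committed before a crash gets redelivered on restart:

```python
# Simulates at-least-once redelivery: a crash between the side effect
# and the offset commit makes the same message get processed twice.

balance = {"wal_123": 100}      # stand-in for the database
committed_offset = 0            # stand-in for the consumer group offset
topic = [("evt_1", 10)]         # one message: (event_id, amount)

def run_consumer(crash_before_commit: bool) -> None:
    global committed_offset
    for offset in range(committed_offset, len(topic)):
        event_id, amount = topic[offset]
        balance["wal_123"] -= amount      # side effect (database update)
        if crash_before_commit:
            return                        # crash: offset never committed
        committed_offset = offset + 1     # acknowledge progress

run_consumer(crash_before_commit=True)    # first attempt dies mid-flight
run_consumer(crash_before_commit=False)   # restart replays the message
print(balance["wal_123"])                 # 80, not 90: the debit ran twice
```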
Use an Idempotency Key
Every event that can trigger a side effect should carry a stable identifier:
{
  "eventId": "evt_01HS8T0X8Q5Q8X4X8H2K2YJ6FD",
  "type": "PaymentCharged",
  "walletId": "wal_123",
  "amountCents": 1250
}
Then make the side effect conditional on whether that event was already applied.
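As a first illustration of that conditional check, a minimal in-memory sketch (hypothetical field and function names; a crash wipes this set, which is exactly why the next section pushes the check into the database):

```python
# In-memory guard: apply the side effect only if this eventId is new.
applied_events: set[str] = set()
balance = {"wal_123": 100}

def handle(event: dict) -> None:
    if event["eventId"] in applied_events:
        return                                   # duplicate: already applied
    balance[event["walletId"]] -= event["amountCents"]
    applied_events.add(event["eventId"])

event = {"eventId": "evt_01HS8T0X8Q5Q8X4X8H2K2YJ6FD",
         "type": "PaymentCharged",
         "walletId": "wal_123",
         "amountCents": 10}
handle(event)
handle(event)                                    # replay is a no-op
print(balance["wal_123"])                        # 90: debited exactly once
```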
Let the Database Enforce It
A common and reliable approach is a unique constraint:
CREATE TABLE processed_events (
  event_id     text PRIMARY KEY,
  processed_at timestamptz NOT NULL DEFAULT now()
);
Then process inside one transaction:
BEGIN;

INSERT INTO processed_events (event_id)
VALUES ($1);

UPDATE wallets
SET balance = balance - $2
WHERE id = $3;

COMMIT;
If the same event is replayed, the insert fails on the unique constraint and the consumer can treat it as already handled.
That is often simpler and more trustworthy than ad hoc in-memory deduplication.
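Here is a runnable sketch of that transaction, using SQLite from Python's standard library as a stand-in for the real database (table names follow the article; the schema setup and helper are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE wallets (id TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO wallets VALUES ('wal_123', 100)")
conn.commit()

def process(event_id: str, amount: int, wallet_id: str) -> bool:
    """Apply the debit at most once; return False if already applied."""
    try:
        with conn:  # one transaction: both statements commit, or neither does
            conn.execute("INSERT INTO processed_events VALUES (?)", (event_id,))
            conn.execute("UPDATE wallets SET balance = balance - ? WHERE id = ?",
                         (amount, wallet_id))
        return True
    except sqlite3.IntegrityError:
        return False  # primary-key violation: event already handled

assert process("evt_1", 10, "wal_123") is True
assert process("evt_1", 10, "wal_123") is False   # replay rolls back, no-op
balance = conn.execute(
    "SELECT balance FROM wallets WHERE id = 'wal_123'").fetchone()[0]
print(balance)  # 90
```

The `with conn:` block commits on success and rolls back on exception, so a replayed event cannot commit a second debit even if the duplicate insert fails midway through the transaction.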
The Useful Mental Model
Do not aim for "duplicates never happen."
Aim for "duplicates do not change the final state."
That is what idempotency buys you.
Further Reading