Kafka

Apache Kafka provides high-throughput event streaming with log retention for replay and analytics.


Why Kafka

| Strength | Benefit |
| --- | --- |
| High throughput | Millions of events/second |
| Log retention | Events persist for replay |
| Partitioning | Horizontal scaling |
| Ecosystem | Kafka Streams, Connect, ksqlDB |
| Ordering | Per-partition ordering guarantees |

Trade-offs

| Concern | Consideration |
| --- | --- |
| Operational complexity | ZooKeeper (or KRaft) coordination |
| Resource heavy | More memory/disk than AMQP |
| Latency | Batching adds milliseconds |

Configuration

[bus]
backend = "kafka"

[bus.kafka]
brokers = ["localhost:9092"]
topic_prefix = "angzarr"
consumer_group = "angzarr-handlers"
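As an illustration, this section can be deserialized with serde and the toml crate. The sketch below uses hypothetical struct names and a config.toml path; it is not angzarr's actual config loader:

// Minimal sketch: parsing the [bus] section with serde + toml.
// Struct and field names are illustrative, not the project's real API.
use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct Config {
    bus: BusConfig,
}

#[derive(Debug, Deserialize)]
struct BusConfig {
    backend: String,
    kafka: Option<KafkaConfig>,
}

#[derive(Debug, Deserialize)]
struct KafkaConfig {
    brokers: Vec<String>,
    topic_prefix: String,
    consumer_group: String,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let raw = std::fs::read_to_string("config.toml")?;
    let cfg: Config = toml::from_str(&raw)?;
    assert_eq!(cfg.bus.backend, "kafka");
    println!("{:?}", cfg.bus.kafka);
    Ok(())
}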

Environment Variables

export KAFKA_BROKERS="localhost:9092"
export KAFKA_TOPIC_PREFIX="angzarr"
export BUS_BACKEND="kafka"

Topic Layout

Events are partitioned by aggregate root for ordering:

Topic: angzarr.events.player
Partition 0: [player-001 events, player-004 events, ...]
Partition 1: [player-002 events, player-005 events, ...]
Partition 2: [player-003 events, player-006 events, ...]

Topic: angzarr.events.hand
Partition 0: [hand-001 events, ...]
...

Partitioning Strategy

Events are partitioned by hash(domain + root) % partitions. All events for the same aggregate root land in the same partition, preserving per-aggregate order.
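
The sketch below illustrates that mapping with Rust's standard hasher. It is not the production code path, and Kafka's default partitioner uses a different hash (murmur2), so actual partition numbers will differ:

// Illustrative only: maps an aggregate to a partition the way described
// above (hash(domain + root) % partitions). The real partitioner and hash
// function (e.g. Kafka's murmur2) may differ.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn partition_for(domain: &str, root: &str, partitions: u32) -> u32 {
    let mut hasher = DefaultHasher::new();
    domain.hash(&mut hasher);
    root.hash(&mut hasher);
    (hasher.finish() % partitions as u64) as u32
}

fn main() {
    // Every event for player-001 hashes to the same partition,
    // so per-aggregate ordering is preserved.
    let p = partition_for("player", "player-001", 3);
    assert_eq!(p, partition_for("player", "player-001", 3));
    println!("player-001 -> partition {p}");
}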


Consumer Groups

Handlers join consumer groups for load balancing:

Consumer Group: player-projector
Consumer 1 ← Partition 0, 1
Consumer 2 ← Partition 2, 3

Consumer Group: output-projector
Consumer 1 ← Partition 0, 1, 2, 3

Each partition is consumed by exactly one consumer per group.
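
For illustration, a handler joins a group simply by sharing a group.id. The sketch below assumes the rust-rdkafka and tokio crates; the topic and group names are the examples from above, not prescribed values:

// Sketch of a consumer joining the "player-projector" group with rust-rdkafka.
// Kafka assigns this process a subset of partitions; starting a second process
// with the same group.id triggers a rebalance that splits them.
use rdkafka::config::ClientConfig;
use rdkafka::consumer::{Consumer, StreamConsumer};
use rdkafka::Message;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let consumer: StreamConsumer = ClientConfig::new()
        .set("bootstrap.servers", "localhost:9092")
        .set("group.id", "player-projector")
        .set("auto.offset.reset", "earliest")
        .create()?;

    consumer.subscribe(&["angzarr.events.player"])?;

    loop {
        let msg = consumer.recv().await?;
        println!(
            "partition {} offset {} key {:?}",
            msg.partition(),
            msg.offset(),
            msg.key()
        );
    }
}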


Retention

Configure retention for replay capabilities:

# Topic-level configuration
retention.ms=604800000 # 7 days
retention.bytes=-1 # Unlimited by size

Events remain available for replay within the retention window.


Helm Deployment

# values.yaml
bus:
  backend: kafka

kafka:
  enabled: true
  brokers:
    - kafka-0.kafka.messaging.svc.cluster.local:9092
    - kafka-1.kafka.messaging.svc.cluster.local:9092
  topicPrefix: angzarr

Testing

# Run Kafka tests (requires testcontainers)
cargo test --test bus_kafka --features kafka

# Requires podman socket
systemctl --user start podman.socket

Monitoring

Key metrics to monitor:

| Metric | Concern |
| --- | --- |
| Consumer lag | Processing falling behind |
| Under-replicated partitions | Replication issues |
| Request latency | Broker performance |
| Disk usage | Retention capacity |
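
Consumer lag can also be sampled programmatically. The sketch below, assuming the rust-rdkafka crate, compares a group's committed offset with the partition's high watermark; topic, group, and partition values are illustrative:

// Sketch: estimate lag for one partition by comparing the group's committed
// offset to the partition's high watermark.
use std::time::Duration;

use rdkafka::config::ClientConfig;
use rdkafka::consumer::{BaseConsumer, Consumer};
use rdkafka::Offset;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let consumer: BaseConsumer = ClientConfig::new()
        .set("bootstrap.servers", "localhost:9092")
        .set("group.id", "player-projector")
        .create()?;

    let topic = "angzarr.events.player";
    let partition: i32 = 0;
    let timeout = Duration::from_secs(5);

    // High watermark = offset of the next record to be written.
    let (_low, high) = consumer.fetch_watermarks(topic, partition, timeout)?;

    // Committed offset for this group on the same partition.
    let mut tpl = rdkafka::TopicPartitionList::new();
    tpl.add_partition(topic, partition);
    let committed = consumer.committed_offsets(tpl, timeout)?;
    let offset = committed
        .find_partition(topic, partition)
        .map(|p| p.offset())
        .unwrap_or(Offset::Invalid);

    if let Offset::Offset(o) = offset {
        println!("lag for {topic}[{partition}]: {}", high - o);
    } else {
        println!("no committed offset yet for {topic}[{partition}]");
    }
    Ok(())
}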

When to Use Kafka

  • High volume — Millions of events/second
  • Event replay — Analytics, debugging, new projectors
  • Cross-team sharing — Multiple teams consuming events
  • Stream processing — Kafka Streams, ksqlDB

Next Steps

  • AMQP — Simpler alternative
  • Pub/Sub — GCP managed alternative