Graceful Failure

The player’s bet was accepted. The pot updated. Then the hand discovered they’d already folded. Now what?


Distributed systems fail in the middle. A saga issues a command, the target aggregate rejects it, and now the source needs to know—and respond.

Traditional solutions—two-phase commit, distributed transactions—are complex, slow, and often unavailable across service boundaries. Event sourcing offers a different approach: let failures happen, record them, and compensate.


When a saga issues a command that gets rejected:

illustrative - compensation flow
1. Hand emits BetPlaced event
2. Saga (Hand→Table) receives event, issues DeductFromPlayerStack → Table
3. Table rejects: "Player already folded"
4. Framework sends RejectionNotification directly to Hand aggregate
5. Hand's @rejected handler emits compensation event: BetReverted

The audit trail shows exactly what happened: the attempt, the rejection, and the recovery.


sequenceDiagram
    participant Hand as Hand Aggregate
    participant Saga as Hand→Table Saga
    participant Table as Table Aggregate
    participant FW as Framework

    Hand->>Saga: BetPlaced
    Saga->>Table: DeductFromPlayerStack
    Table-->>FW: Rejected (player folded)
    FW->>Hand: RejectionNotification
    Hand->>Hand: BetReverted

The rejection notification bypasses the saga entirely. The framework routes it directly to the source aggregate using the return address stamped on the original command. The saga is stateless—it doesn’t need to know about rejections. The source aggregate decides how to compensate.
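
The routing described above can be sketched with in-memory stand-ins. This is only an illustration under stated assumptions: `Command`, `Aggregate`, `saga_hand_to_table`, `table_handle`, and `dispatch` are hypothetical names rather than framework API, and the rejection rule is hard-coded.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Command:
    name: str
    target: str          # aggregate that should handle the command
    return_address: str  # aggregate whose event triggered the command


@dataclass
class Aggregate:
    name: str
    events: List[Tuple[str, Optional[str]]] = field(default_factory=list)

    def on_rejected(self, command: Command, reason: str) -> None:
        # The source aggregate decides how to compensate.
        self.events.append(("BetReverted", reason))


def saga_hand_to_table(event_name: str, source: str) -> Command:
    # Stateless translation: event in, command out. No rejection logic here.
    return Command("DeductFromPlayerStack", target="table", return_address=source)


def table_handle(command: Command) -> Optional[str]:
    # Hypothetical business rule: always reject because the player folded.
    return "player folded"


def dispatch(command: Command, aggregates: Dict[str, Aggregate]) -> None:
    reason = table_handle(command)
    if reason is not None:
        # Route straight back to the source via the stamped return address,
        # without involving the saga.
        aggregates[command.return_address].on_rejected(command, reason)


hand = Aggregate("hand")
hand.events.append(("BetPlaced", None))
cmd = saga_hand_to_table("BetPlaced", source=hand.name)
dispatch(cmd, {"hand": hand})
print(hand.events)  # [('BetPlaced', None), ('BetReverted', 'player folded')]
```

The saga function never sees the rejection; only the command's `return_address` ties the failure back to the aggregate that has to compensate.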


Register handlers for specific rejection scenarios:

def handle_table_join_rejected(
    notification: types.Notification,
    state: PlayerState,
) -> player.FundsReleased | None:
    """Handle JoinTable rejection by releasing reserved funds.

    Returns the FundsReleased event directly (packed into an EventBook by the
    router) or ``None`` if no reservation exists for the rejected table.
    """
    # Unpack the rejection details from the notification payload, if present
    rejection = types.RejectionNotification()
    if notification.HasField("payload"):
        notification.payload.Unpack(rejection)

    # Recover the table root from the rejected command's cover
    table_root = b""
    if rejection.HasField("rejected_command"):
        rc = rejection.rejected_command
        if rc.HasField("cover") and rc.cover.HasField("root"):
            table_root = rc.cover.root.value

    # Nothing to release if no funds were reserved for this table
    table_key = table_root.hex()
    reserved_amount = state.table_reservations.get(table_key, 0)
    if reserved_amount == 0:
        return None

    new_reserved = state.reserved_funds - reserved_amount
    new_available = state.bankroll - new_reserved
    return player.FundsReleased(
        amount=poker_types.Currency(amount=reserved_amount, currency_code="CHIPS"),
        table_root=table_root,
        new_available_balance=poker_types.Currency(
            amount=new_available, currency_code="CHIPS"
        ),
        new_reserved_balance=poker_types.Currency(
            amount=new_reserved, currency_code="CHIPS"
        ),
        released_at=now(),
    )

The framework routes rejections to the appropriate handler based on the rejected command’s domain and type.


Different failures require different responses:

Scenario                         Compensation
-------------------------------  ---------------------------------
Player disconnects mid-action    Auto-fold, return to action queue
Insufficient chips for blind     Sit out, notify table
Invalid bet amount               Reject action, prompt retry
Table closed during hand         Refund all pots, end hand
Timer expired                    Auto-check or auto-fold

The aggregate decides the business response. The framework ensures the notification arrives.


When handling a rejection, you can specify additional actions:

Illustrative Example

The following shows the RevocationResponse pattern. Your handlers will use your domain’s specific events and rejection reasons.

illustrative - RevocationResponse options
@rejected(domain="player", command="ReserveFunds")
def handle_reserve_failed(self, notification: Notification):
    rejection = RejectionNotification()
    notification.payload.Unpack(rejection)
    if rejection.rejection_reason == "insufficient_balance":
        # Return the event directly; the framework auto-applies it
        return PlayerSatOut(reason="insufficient_funds")
    else:
        # Delegate to the framework for DLQ/escalation
        return delegate_to_framework(
            reason=rejection.rejection_reason,
            send_to_dead_letter=True,
            escalate=True,  # Alert floor manager
        )

Flag                         Effect
---------------------------  ------------------------------------------------
emit_system_revocation       Emit SagaCompensationFailed event
send_to_dead_letter_queue    Route to DLQ for manual review
escalate                     Trigger configured webhook (floor manager alert)
abort                        Stop saga chain, propagate error
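
To make the flag semantics concrete, here is a small sketch that models the four flags as a dataclass and lists the effects they request. The field names come from the flag table; the `RevocationResponse` class body, the `reason` field, and `planned_actions` are stand-ins invented for illustration and may not match the framework's actual types.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RevocationResponse:
    # Field names follow the flag table; the class itself is a stand-in.
    emit_system_revocation: bool = False
    send_to_dead_letter_queue: bool = False
    escalate: bool = False
    abort: bool = False
    reason: str = ""  # hypothetical free-text field


def planned_actions(response: RevocationResponse) -> List[str]:
    """List the framework-side effects that the set flags ask for."""
    actions = []
    if response.emit_system_revocation:
        actions.append("emit SagaCompensationFailed")
    if response.send_to_dead_letter_queue:
        actions.append("route to DLQ")
    if response.escalate:
        actions.append("trigger webhook")
    if response.abort:
        actions.append("stop saga chain")
    return actions


response = RevocationResponse(
    send_to_dead_letter_queue=True, escalate=True, reason="unknown"
)
print(planned_actions(response))  # ['route to DLQ', 'trigger webhook']
```

The flags are independent, so a handler can combine them, for example parking a message in the DLQ while also alerting a person.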

Complex workflows may require compensating multiple steps:

Illustrative Example

The following shows multi-step compensation patterns. Your implementation will define domain-specific events for each compensating action.

illustrative - multi-step compensation
# Player tried to join table but verification failed
@rejected(domain="verification", command="VerifyPlayer")
def handle_verification_failed(self, notification: Notification):
    # Already reserved their seat and took their buy-in.
    # Need to undo both, so return multiple events as a tuple.
    return (
        SeatReleased(seat=self.state.pending_seat),
        BuyInRefunded(
            player_id=self.state.pending_player,
            amount=self.state.pending_buyin,
        ),
        JoinRejected(
            player_id=self.state.pending_player,
            reason="verification_failed",
        ),
    )

Each compensation event is recorded. The audit trail shows the full sequence: attempt, failure, recovery.
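
The handlers on this page return nothing, a single event, or a tuple of events. A minimal sketch of how those three shapes could be normalized and appended to the stream in order follows; `normalize` and `record` are hypothetical helpers, and events are plain strings here rather than protobuf messages.

```python
from typing import Any, List


def normalize(result: Any) -> List[Any]:
    """Turn a handler's return value into a list of events to record."""
    if result is None:
        return []            # nothing to compensate
    if isinstance(result, tuple):
        return list(result)  # multi-step compensation, order preserved
    return [result]          # single compensation event


def record(log: List[Any], result: Any) -> None:
    """Append each compensation event to the stream in order."""
    log.extend(normalize(result))


log = ["JoinRequested", "VerifyPlayerRejected"]
record(log, ("SeatReleased", "BuyInRefunded", "JoinRejected"))
record(log, None)  # a handler that found nothing to undo
print(log)
# ['JoinRequested', 'VerifyPlayerRejected', 'SeatReleased',
#  'BuyInRefunded', 'JoinRejected']
```

Preserving tuple order matters for the audit trail: the seat release and refund appear before the final `JoinRejected`, matching the order in which the handler listed them.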


Regulated industries require demonstrable fairness:

  • Every player action must be recorded
  • Every rejection must be explained
  • Every compensation must be traceable

When a regulator asks “why did this player lose their bet?”, the event history shows:

  1. The bet was placed
  2. The deduction was attempted
  3. The deduction was rejected (reason: player had folded)
  4. The bet was reverted
  5. The player was notified

No silent failures. No unexplained state changes.


The framework provides two mechanisms for undoing events:

Mechanism    Original Event         Client Code                 Use Case
-----------  ---------------------  --------------------------  ----------------------------
Revocation   Hidden (becomes NoOp)  None needed                 Full undo, clean state
Compensate   Visible                Handler implements inverse  Partial undo, business logic

Revocation hides the original event at read time. No client code required.

Revocation flow
1. Framework writes Revocation { sequences: [5, 6], reason: "timeout" }
2. At read time, events 5 and 6 become NoOp
3. Business logic never sees the original events

Use revocation when:

  • A full undo is needed, with “never happened” semantics
  • No business logic required for the undo
  • Events came from a failed cascade or timeout

Compensate keeps the original event visible and routes to a client handler. The handler emits inverse events.

Compensate flow
1. Framework writes Compensate { sequences: [5], reason: "order_cancelled" }
2. Framework routes to client's compensation handler
3. Handler receives original event, emits inverse (e.g., InventoryReleased)
4. Both original and compensation events visible in stream

Use compensate when:

  • Business logic must decide how to undo
  • Partial undo based on context
  • Explicit audit trail required (both events visible)
  • Third-party notifications or side effects need reversal

illustrative - compensation handler registration
@compensate("InventoryReserved")
def compensate_reservation(self, original_event, reason):
    # Original event remains visible
    # Emit inverse event
    return InventoryReleased(
        sku=original_event.sku,
        qty=original_event.qty,
        reason=reason,
    )

Event Type          At Read Time
------------------  ------------------------
Revocation marker   NoOp
Revoked event       NoOp
Compensate marker   NoOp
Compensated event   Visible (key difference)
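
The read-time rules can be sketched as a projection over the stored stream. This is an illustration under assumptions, not the framework's reader: the `Event`, `Revocation`, and `Compensate` classes and `read_view` are hypothetical, and the sketch assumes markers occupy their own sequence numbers in the same stream.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Event:
    sequence: int
    name: str


@dataclass(frozen=True)
class Revocation:
    sequence: int
    sequences: Tuple[int, ...]  # events to hide at read time
    reason: str


@dataclass(frozen=True)
class Compensate:
    sequence: int
    sequences: Tuple[int, ...]  # events to compensate; they stay visible
    reason: str


NOOP = "NoOp"


def read_view(stream: list) -> List[str]:
    """Project the stored stream into what business logic sees."""
    revoked = {
        seq
        for record in stream
        if isinstance(record, Revocation)
        for seq in record.sequences
    }
    view = []
    for record in stream:
        if isinstance(record, (Revocation, Compensate)):
            view.append(NOOP)         # markers are never visible
        elif record.sequence in revoked:
            view.append(NOOP)         # revoked events are hidden
        else:
            view.append(record.name)  # includes compensated events
    return view


stream = [
    Event(4, "OrderPlaced"),
    Event(5, "InventoryReserved"),
    Event(6, "PaymentRequested"),
    Revocation(7, sequences=(6,), reason="timeout"),
    Compensate(8, sequences=(5,), reason="order_cancelled"),
    Event(9, "InventoryReleased"),
]
print(read_view(stream))
# ['OrderPlaced', 'InventoryReserved', 'NoOp', 'NoOp', 'NoOp', 'InventoryReleased']
```

Nothing is deleted from storage in either case. The two mechanisms differ only in whether the original event survives the projection.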