Mutation Testing: The Deterministic Arbiter of LLM-Generated Tests

· 8 min read
Ben Abbitt
Consultant, Software Architect, AI Wrangler
About This Blog

This blog documents learnings from building Angzarr—a polyglot event sourcing framework. The framework core is written in Rust, so examples here are primarily Rust.

Angzarr doesn't require Rust. Client SDKs exist for Go, Python, Java, C#, and C++. The author—a polyglot developer—doesn't believe Rust is the best language for everything. It is the right choice for this framework's core, and building it has produced these learnings.

The Rust should be readable by most programmers. If you have questions: consult The Rust Book, ask an LLM, or email the author.

Last week I argued that you should build deterministic systems with non-deterministic tools—demand TDD from your LLM, get tests first, then implementation. But there's a problem with that workflow: passing tests aren't proof that tests are good.

Enter mutation testing: a deterministic tool that validates whether your tests actually test anything.
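To make the idea concrete, here is a minimal hand-rolled sketch, not the tooling the post itself uses. The names `is_adult`, `is_adult_mutant`, `weak_suite`, and `strong_suite` are hypothetical and exist only for illustration. A real mutation tool generates mutants like this one automatically and reruns your test suite against each; here the mutant is written by hand so the mechanism is visible.

```rust
/// Original code under test (hypothetical example).
fn is_adult(age: u32) -> bool {
    age >= 18
}

/// A hand-written mutant: `>=` replaced with `>`.
/// A mutation tool would produce this kind of change automatically.
fn is_adult_mutant(age: u32) -> bool {
    age > 18
}

/// A weak suite: it only checks values far from the boundary.
/// Returns true when every check passes for the given implementation.
fn weak_suite(f: fn(u32) -> bool) -> bool {
    f(30) && !f(5)
}

/// A stronger suite: it also pins down the boundary at 17 and 18.
fn strong_suite(f: fn(u32) -> bool) -> bool {
    f(30) && !f(5) && f(18) && !f(17)
}

fn main() {
    // Both suites pass against the original code.
    assert!(weak_suite(is_adult));
    assert!(strong_suite(is_adult));

    // The weak suite also passes against the mutant: the mutant survives,
    // which means the suite never tested the boundary.
    assert!(weak_suite(is_adult_mutant));

    // The strong suite fails against the mutant: the mutant is killed.
    assert!(!strong_suite(is_adult_mutant));
}
```

Both suites are green against the original code, so a pass/fail signal alone cannot tell them apart. Only the surviving mutant exposes the gap in the weak suite, and that check is fully deterministic.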

DDD: Domains Sized to Contain Decisions

· 22 min read
Ben Abbitt
Consultant, Software Architect, AI Wrangler

The uncomfortable truth: most DDD teams draw their bounded contexts too small.

Not too large—too small. They slice by CRUD entity, by database table, by team org chart. The result? Contexts that cannot make decisions autonomously. Every meaningful operation requires cross-context coordination. The architecture devolves into a distributed monolith with extra network hops.

This post argues for a different principle: a bounded context is correctly sized when every decision that changes its invariants can be made entirely within it, without synchronous runtime dependency on another context.

Building Deterministic Systems with Non-Deterministic Tools

· 8 min read
Ben Abbitt
Consultant, Software Architect, AI Wrangler

Large Language Models are probabilistic text generators. Their raw outputs cannot be trusted for correctness. So how do you build reliable software with unreliable assistants?

You don't ask for answers. You ask for tools that produce answers.

Plan, Review, Execute: Getting Better Results from LLMs

· 3 min read
Ben Abbitt
Consultant, Software Architect, AI Wrangler

The most effective LLM workflows share one trait: they force a pause between planning and execution. You wouldn't let a contractor start demolition before approving blueprints. The same applies to AI assistants.

Testcontainers Blur the Lines Between Unit and Integration Tests

· 4 min read
Ben Abbitt
Consultant, Software Architect, AI Wrangler

The old unit/integration distinction assumed "integration" meant "slow, fragile, needs environment setup." Testcontainers changed the economics.

Tests Belong Next to the Code They Test

· 7 min read
Ben Abbitt
Consultant, Software Architect, AI Wrangler

Tests should live next to the code they test—same directory, separate file. Not inline. Not in a parallel tree.

The Container Overlay Pattern: Same Makefile Command, Different Context

· 6 min read
Ben Abbitt
Consultant, Software Architect, AI Wrangler

How we eliminated conditionals from our Makefile while supporting both host and containerized builds with a single command interface.