Eventsourcing: State from Events or Events as State?

Trying to get people to agree on a single unambiguous definition of a software concept is usually impossible. At the very least, I can point out that such ambiguity exists when it comes to eventsourcing.

State from Events

Given a stream of events, I can process it to derive some state. This is a very common practice, because it makes a lot of sense. You’ll encounter this basic concept under different names, such as stream processing, event processing, real-time analytics, complex event processing, streaming analytics, real-time streaming analytics, or projections, depending on the environment you’re in. That’s normal: communities evolve language to fit their needs, and we can’t expect the DDD crowd (mostly people developing backends for line-of-business applications) to synchronize language with, say, data scientists.

An example:

Events:

// (some fields omitted for clarity)
ItemWasAddedToCart {cartid = 1, itemid = 5, timestamp = 2018-01-01}
ItemWasAddedToCart {cartid = 2, itemid = 6, timestamp = 2019-01-01}
CartWasCheckedOut {cartid = 2, timestamp = 2019-01-02}

Queries¹:

IsCartCheckedOut {cartid = 2}
HowManyCartsAreAbandoned {}

To answer the queries, two consumers can listen to the same stream of events. The first consumer projects a map of cartid => boolean, resulting in [ 1 => false, 2 => true ]. The second consumer is a bit more involved. It has a business rule that says “A cart is abandoned if it hasn’t been checked out and the last operation was more than 7 days ago”. The projection is a map of cartid => lastModified, and every time a cart is checked out, we remove it from the map. Our stream projects as [ 1 => 2018-01-01 ]. To answer the query, we count all records whose lastModified is more than 7 days in the past.
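Here is a minimal Python sketch of those two consumers. It assumes events arrive as plain dictionaries with a `type` field and that timestamps are dates. The function names and the `now` parameter are my own illustrative choices, not part of any library.

```python
from datetime import date, timedelta

# The example stream from above (some fields omitted, as in the listing).
events = [
    {"type": "ItemWasAddedToCart", "cartid": 1, "itemid": 5, "timestamp": date(2018, 1, 1)},
    {"type": "ItemWasAddedToCart", "cartid": 2, "itemid": 6, "timestamp": date(2019, 1, 1)},
    {"type": "CartWasCheckedOut", "cartid": 2, "timestamp": date(2019, 1, 2)},
]


def project_checked_out(events):
    """First consumer: cartid => has the cart been checked out?"""
    checked_out = {}
    for event in events:
        if event["type"] == "CartWasCheckedOut":
            checked_out[event["cartid"]] = True
        else:
            # Any other event makes the cart known without checking it out.
            checked_out.setdefault(event["cartid"], False)
    return checked_out


def project_last_modified(events):
    """Second consumer: cartid => lastModified, for carts not yet checked out."""
    last_modified = {}
    for event in events:
        if event["type"] == "CartWasCheckedOut":
            # Checked-out carts can never be abandoned, so drop them.
            last_modified.pop(event["cartid"], None)
        else:
            last_modified[event["cartid"]] = event["timestamp"]
    return last_modified


def how_many_carts_are_abandoned(last_modified, now):
    """Count carts whose last operation was more than 7 days before `now`."""
    cutoff = now - timedelta(days=7)
    return sum(1 for timestamp in last_modified.values() if timestamp < cutoff)


print(project_checked_out(events))    # {1: False, 2: True}
print(project_last_modified(events))  # {1: datetime.date(2018, 1, 1)}
print(how_many_carts_are_abandoned(project_last_modified(events), now=date(2019, 1, 10)))  # 1
```

Note that neither consumer writes anything back to the stream. They only read events and keep whatever state they need to answer their own query.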

Should we call this Eventsourcing? It makes sense, as the state of the system is sourced from events.

Let’s look at the other side.

Events as State

The previous example does not make any assumptions about where events come from. In all kinds of analytics and data processing contexts, this is all there is: data or events are provided as is, and the question of where they came from is irrelevant. I’m pointing this out because it explains why, in that context, you would call the previous example eventsourcing.

So where do the events come from? The producer could be modelled as a state-based system. Every cart is represented by a document (or an equivalent set of records) that has a status, line items, timestamps… Changing the state of such a document is usually governed by constraints, such as “A cart with 0 items cannot be checked out” and “Items cannot be added to a cart that was checked out”. Whenever we update the state, we can emit an event that represents what we did. But it’s important to know that these events can often only be emitted after the constraints have been enforced. We might even store the events in a log, but the source of truth remains the state in the database.
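A state-based producer along these lines could look like the following Python sketch. The `Cart` class, the `ConstraintViolated` exception, and the `emitted` list (a stand-in for a log or message bus) are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


class ConstraintViolated(Exception):
    """Raised when an update would break a business rule."""


@dataclass
class Cart:
    """State-based model: this record is the source of truth."""
    cartid: int
    items: list = field(default_factory=list)
    checked_out: bool = False


emitted = []  # hypothetical stand-in for an event log or message bus


def add_item(cart, itemid):
    # The constraint is checked against the current state of the record.
    if cart.checked_out:
        raise ConstraintViolated("Items cannot be added to a cart that was checked out")
    cart.items.append(itemid)  # the state change is the actual update
    emitted.append({"type": "ItemWasAddedToCart", "cartid": cart.cartid, "itemid": itemid})


def check_out(cart):
    if not cart.items:
        raise ConstraintViolated("A cart with 0 items cannot be checked out")
    cart.checked_out = True
    emitted.append({"type": "CartWasCheckedOut", "cartid": cart.cartid})


cart = Cart(cartid=2)
add_item(cart, 6)
check_out(cart)
print([event["type"] for event in emitted])  # ['ItemWasAddedToCart', 'CartWasCheckedOut']
```

The events here are a by-product. If the `emitted` list were lost, the cart records would still describe the system completely.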

Instead of storing state as documents or records with an optional log, we can get rid of the database and promote the event log to be our single source of truth. I wouldn’t call it a log anymore, but an eventstore. All updates are now expressed only as new events, and no longer as state changes. Before storing a new event, all relevant constraints still need to be enforced. To enforce “Items cannot be added to a cart that was checked out”, we project the state of the cart from the event history.
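The same two operations can be sketched with the event history as the only stored data. This is a simplified Python illustration: the in-memory list standing in for an eventstore, the dictionary-shaped events, and all function names are assumptions of mine, and concerns such as concurrency or persistence are left out.

```python
from functools import reduce


class ConstraintViolated(Exception):
    """Raised when a new event would break a business rule."""


def apply(state, event):
    """Fold one event into the projected cart state."""
    if event["type"] == "ItemWasAddedToCart":
        return {**state, "items": state["items"] + [event["itemid"]]}
    if event["type"] == "CartWasCheckedOut":
        return {**state, "checked_out": True}
    return state


def project_cart(history, cartid):
    """Rebuild one cart's state from the events recorded for it."""
    own_events = (event for event in history if event["cartid"] == cartid)
    return reduce(apply, own_events, {"items": [], "checked_out": False})


def add_item(event_store, cartid, itemid):
    # The constraint is checked against state projected from the history.
    cart = project_cart(event_store, cartid)
    if cart["checked_out"]:
        raise ConstraintViolated("Items cannot be added to a cart that was checked out")
    event_store.append({"type": "ItemWasAddedToCart", "cartid": cartid, "itemid": itemid})


def check_out(event_store, cartid):
    cart = project_cart(event_store, cartid)
    if not cart["items"]:
        raise ConstraintViolated("A cart with 0 items cannot be checked out")
    event_store.append({"type": "CartWasCheckedOut", "cartid": cartid})


event_store = []  # hypothetical in-memory stand-in for an eventstore
add_item(event_store, 2, 6)
check_out(event_store, 2)
try:
    add_item(event_store, 2, 7)
except ConstraintViolated as error:
    print(error)  # Items cannot be added to a cart that was checked out
```

There is no cart record anywhere in this sketch. The only write is appending an event, and every decision starts by folding the existing events into a temporary state.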

Should we call this Eventsourcing? Now the event store is the single source of truth for the system.

State from Events vs Events as State

The distinction is important. “State from Events” assumes an existing event stream, regardless of how it was produced, and projects state from it. No new events are added to the stream². “Events as State” is about events as the single source of truth. In other words, new events are added to the stream, but they’re constrained by business rules, and these rules depend on previous events as their input (as opposed to state as the input).

Defining Eventsourcing

Now how should we define Eventsourcing? Both make sense. But there are already many terms for “State from Events”. If we choose to call that Eventsourcing, what should we call the second type?

That’s why I use the term Eventsourcing only when I specifically talk about a system where state is stored as events and used for decision-making (aka enforcing constraints). We can now try to define Eventsourcing unambiguously³:

A system is eventsourced when

the single source of truth is a persisted history of the system’s events;

and that history is taken into account for enforcing constraints on new events.

I’m pretty sure this definition can be improved. In the meantime, that’s the definition I’m using for my articles on Eventsourcing and Messaging patterns.

Footnotes

¹ Queries can be modelled as domain messages.
² New events can be added to the eventstore, such as Summary Events, but these are derived.
³ Greg Young is often quoted as saying that “Eventsourcing is just a fold over a list of data”, which is correct, but reductionist. (I’m not sure the quote is how he actually phrased it, or what the context was.) In any case, a more accurate phrasing is that “a projection can be implemented as a fold over a stream of events”.