Identifying An Aggregate Is Not Object Oriented Design

published on 08 January 2018 in Domain driven design

It's hard to watch people who say they're doing (or trying to learn) DDD rush to write classes which they consider to be aggregates (aggregate roots). I've written a lot in the past about the difference between the Domain Model and the Persistence Model, so this time I'm trying something new and hopefully better.

Forget about being a developer. As a human who uses computers, there are apps you use every day without knowing anything about how they're built. Why are you using those apps? Because you need or like their features. What is a feature? It's functionality that an app (system) performs and that provides value to you. As a user, you care about behaviour.

Put your developer hat back on. Most developers are 'trained' to think in data/state. Suddenly you care about what an Order/User/FooBar has. It's not about functionality anymore, just state: a data structure with methods. If you want to be successful with DDD, you have to think like a user, i.e. the domain model should be a behavioural model first, with the data model being just an artifact of the former. Simply put, data is the outcome of the domain model and you can store it in many forms.

Back to our aggregate: it groups the behaviour required to make valid domain changes (in a CQRS context). And so we get to the point that an Aggregate represents one domain state change, which from a data point of view can consist of many data changes. Think of creating an Invoice. The outcome is one change of the domain state (we have a new invoice); however, an invoice contains plenty of data, and all of the data values are part of the same change as one operation (basically a Unit of Work).

The role of the Aggregate is to group all required details (small data changes) in order to have one valid business state change (from the domain point of view). The aggregate might be implemented as a class (state and behaviour), but its value in design is the fact that it represents only one business change operation, regardless of how many details need to be changed. That's why the aggregate defines a consistency boundary: every change inside that boundary is part of that unit of work. And it's always about domain state changes, not about the data itself (I'll provide an example later).

Moreover, that's why you don't have multiple aggregates involved in the same unit of work. It doesn't make sense, since each aggregate represents a whole, immediately consistent operation that results in one business state change. That change is represented in DDD as a Domain Event. From a high-level point of view it can end here, but if we go lower, we need to handle how things will be persisted. However, this is an implementation detail, outside of DDD.

From a Persistence point of view, you can store the changes themselves (Event Store) and/or projections, i.e. actual data that can be queried easily. That data model is unrelated to the Domain Model; it exists as an implementation detail to have fast(er) reads.
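As a rough illustration of that split, the Python sketch below keeps a list of stored changes and derives a queryable projection from them. It only shows the idea: the `InvoiceCreated` event, its fields, and the `project_totals_by_customer` function are hypothetical names made up for this example, and a plain in-memory list stands in for a real Event Store.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class InvoiceCreated:
    """Hypothetical domain event: one business state change, many data values."""
    invoice_id: str
    customer: str
    total_cents: int


def project_totals_by_customer(events: List[InvoiceCreated]) -> Dict[str, int]:
    """Build a read model (projection) by folding over the stored changes."""
    totals: Dict[str, int] = {}
    for event in events:
        totals[event.customer] = totals.get(event.customer, 0) + event.total_cents
    return totals


# An in-memory list stands in for the Event Store (an assumption for brevity).
event_store = [
    InvoiceCreated("inv-1", "alice", 1500),
    InvoiceCreated("inv-2", "bob", 700),
    InvoiceCreated("inv-3", "alice", 250),
]

# The projection is shaped for reads and can be rebuilt from the events at any time.
read_model = project_totals_by_customer(event_store)
```

The shape of `read_model` is dictated by what needs to be queried, not by the domain model, which is why it can be thrown away and rebuilt in a different form.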

You may think "ok, behaviour first, but we do need input values or existing data in order to perform a domain operation". That's true; however, that data is transient and always dependent on the functionality that needs it. When modelling some Domain functionality, our focus is on the rules first and then on whatever data is needed (both as input and output).

Example - Placing an order

Always model as a domain operation (functionality), never as a state. So instead of the usual "I have an Order that has order lines bla bla", we think: "a customer places an order, which results in the creation of an order". The one business state change is the fact that, if all goes well, the business will have a new order to fulfill. Before the operation, the business state consists of n orders. After our change, the business state will consist of n+1 orders. How the data is actually stored (Event Store, RDBMS etc.) is not relevant now; we view the business state as an abstraction, a snapshot in time of the business data.

Obviously, we identify concepts beyond the Order itself. Now we're going to need a representation of each concept, a model for each, which means all the details that are relevant for this operation only. Very important: these are NOT classes, just information. The domain model, including aggregates, shouldn't depend on a programming language, stack, framework or programming mindset like OOP. These are implementation details, the programming how. But right now, we need to identify the programming what, which actually is the Domain's how. By understanding how the Domain does its stuff, we identify what abstractions we need to implement.

To keep things short, we've identified that the Order means a collection of Products with their Quantities and Prices, plus a Coupon and maybe a Shipping Address. There is also a rule that we need at least one product (quantity > 0), or else the order doesn't make sense. Our Create Order aggregate consists of a group of these concepts' models plus aggregate consistency rules, while the models consist of business rules and required data (changes of the current business state). We have input values that go through the rules, and once they're valid they become the 'new' business state. The Aggregate makes sure we have all the required values in valid form (information represented as Value Objects) to perform the domain state change (however, the Aggregate contains only some of the domain rules). And these values are the details of the resulting Domain Event.

If you're thinking: "Ok, now show me the code", I need to repeat myself and say that the Aggregate is a high level design construct that doesn't depend on the programming stack (code - and I include here any form of class or function design - is always implementation). An Aggregate is just information about how a domain operation changes the business state.

And that's why the Aggregate is so important. It literally tells us: "Hey, in this operation, we need this data respecting these rules, and all the data together represents one domain state change". Many developers want to jump straight to the coding part, but the problem is that their domain driven design is in fact implementation. They design classes instead of identifying aggregates. And that's because they think state (as in what needs to be persisted) first. Once they switch to domain functionality first, DDD will become easier.
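For readers who still want to see where this ends up once the aggregate has been identified, here is one possible implementation of the Place Order example, sketched in Python. It is only an illustration under stated assumptions: every name in it (`OrderLine`, `OrderPlaced`, `place_order`, the field names) is hypothetical, the coupon and shipping address are reduced to optional strings, and the same aggregate could just as well be implemented differently in any other stack.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class OrderLine:
    """Value Object (hypothetical): a product with its quantity and price."""
    product_id: str
    quantity: int
    unit_price: Decimal

    def __post_init__(self) -> None:
        # Rule that belongs to this concept's model: quantity must be > 0.
        if self.quantity <= 0:
            raise ValueError("quantity must be greater than 0")


@dataclass(frozen=True)
class OrderPlaced:
    """Domain Event (hypothetical): the one business state change, n -> n+1 orders."""
    order_id: str
    lines: Tuple[OrderLine, ...]
    coupon_code: Optional[str]
    shipping_address: Optional[str]


def place_order(
    order_id: str,
    lines: Sequence[OrderLine],
    coupon_code: Optional[str] = None,
    shipping_address: Optional[str] = None,
) -> OrderPlaced:
    """Group all validated values into a single domain state change."""
    # Aggregate consistency rule: an order without products makes no sense.
    if not lines:
        raise ValueError("an order needs at least one product")
    # All values together become the details of the resulting Domain Event.
    return OrderPlaced(order_id, tuple(lines), coupon_code, shipping_address)


# Usage: valid input values go through the rules and become one event.
event = place_order(
    "order-1",
    [OrderLine("sku-42", 2, Decimal("9.99"))],
    coupon_code="WELCOME",
)
```

Note that nothing here says how the order is persisted. The function only takes input values, applies the rules, and returns the Domain Event; storing that event or projecting it into tables remains a separate implementation concern.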