Matthias Noback

Programming guidelines - Part 3: The life and death of objects


In the first part of this series we looked at ways to reduce the complexity of function bodies. The second part covered several strategies for reducing complexity even more, by getting rid of null in our code. In this article we'll zoom out a bit and look at how to properly organize the lifecycle of our objects, from creating them to changing them, letting them pass away and bringing them back from the dead.

About objects

This is a list of technical facts about objects that you may already know:

  • Objects are instances of classes.
  • Objects live in memory.
  • Objects have attributes and methods providing them with data and behavior.
  • Objects hide their data and the implementation of their behavior behind a public interface, which is their boundary.
  • Objects can be interacted with by sending messages across their boundaries.
  • Sending a message to an object means calling one of its methods.
  • The implementation of the methods conforms to the contract of the object.
  • In PHP the contract of an object is defined formally at the language level by the class definition itself (including type-hints for function parameters and return values) and informally using parameter, return value and exception declarations in so-called DocBlocks.

When we say that an object should hide its data and implementation details behind a public interface we often think about object attributes and methods only. We forget about the details of its construction, persistence, reconstitution and death, all of which should be hidden behind a public interface too.

If we don't properly encapsulate object creation, or forget to take care of the other aspects of its lifecycle, the path is littered with:

  • Anemic domain models,
  • Objects with invalid data,
  • Objects with inconsistent data,
  • Objects that get modified in different places.

This, of course, leads to bugs and design problems.

Creating objects

The first thing we can do to design better objects is to encapsulate creation logic. Your motto should be:

Only valid objects can be created.

It should not be possible to create an object and only afterwards initialize it, like this:

    $circle = new Circle();
    $circle->setRadius(10);

A circle always has a radius, so as an object representation it should never be allowed to exist without one. Working with an invalid object would lead to invalid usage, like:

    $circle = new Circle();

    // this should be impossible:
    $circle->calculateArea();

When an object can't exist without some information, you should of course provide that information when constructing it:

    $circle = new Circle(10);

    // now it's safe to do this:
    $circle->calculateArea();

Creating objects in meaningful ways

Looking at the way a Circle can be constructed, we notice that it's quite unclear what 10 means in this context:

$circle = new Circle(10);

It would be better to show clients of the class exactly what kind of data an object should be created with. We can do so by introducing a named constructor:

    class Circle
    {
        public static function fromRadius($radius)
        {
            return new self($radius);
        }

        private function __construct($radius)
        {
            $this->radius = $radius;
        }

        ...
    }

A Circle object can now be instantiated like this:

Circle::fromRadius(10);

It's a good idea to make the constructor itself a private method, forcing clients to create Circle objects in ways that you explicitly support.

Creating objects in different ways

Sometimes it should be possible to create an object in multiple ways, using different types of input data. For example, you could also define a circle by its diameter. In that case you can just add a second named constructor, which encapsulates the logic needed to derive a radius from the given diameter:

    class Circle
    {
        public static function fromDiameter($diameter)
        {
            $radius = $diameter / 2;

            return new self($radius);
        }

        ...
    }
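Putting both named constructors together, a minimal runnable sketch of the class could look as follows. The body of calculateArea() is an assumption made for illustration; the article does not show it.

```php
<?php

class Circle
{
    private $radius;

    // Private: clients have to go through one of the named constructors.
    private function __construct($radius)
    {
        $this->radius = $radius;
    }

    public static function fromRadius($radius)
    {
        return new self($radius);
    }

    public static function fromDiameter($diameter)
    {
        return new self($diameter / 2);
    }

    // Assumed implementation of the method used earlier in this article.
    public function calculateArea()
    {
        return M_PI * $this->radius ** 2;
    }
}

$a = Circle::fromRadius(10);
$b = Circle::fromDiameter(20);

// Loose comparison of two instances of the same class compares their properties.
var_dump($a == $b); // bool(true)
```

Both constructors end up storing a radius, so clients never need to know which representation the class uses internally.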

Another good use case for multiple (named) constructors is when objects allow some data to be optionally provided (or if they know how to recover from missing data). For example:

    class Person
    {
        private $partner;

        public static function married(Person $spouse)
        {
            $person = new self();
            $person->partner = $spouse;

            return $person;
        }

        public static function bachelor()
        {
            $person = new self();
            // $person->partner will be left undefined

            return $person;
        }
    }

The Person class has no constructor. The named constructors each populate the private properties of the instantiated Person "from the outside" before returning it. In fact, the named constructors are inside the same class, so they are not really outside the Person object (read more about it in the PHP documentation about visibility from other objects).

Validating correctness of constructor arguments

When an object accepts constructor arguments, it should take great care in validating their correctness. If it accepted any (type of) argument, invalid objects could be created.

In the case of the circle, it's not enough to verify that the given radius is a float; we also need to make sure that it's larger than 0. We should check this inside the constructor and throw an exception when the provided radius is incorrect. Throwing an exception in the constructor will prevent the object from being created at all:

    class Circle
    {
        public function __construct($radius)
        {
            if (!is_float($radius) || $radius <= 0) {
                throw new \InvalidArgumentException(
                    'Radius should be a float and larger than 0'
                );
            }

            $this->radius = $radius;
        }

        ...
    }
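A short usage sketch of this effect, repeating the constructor so that the snippet runs on its own. Note that is_float() is strict about types: the integer 10 used in the earlier examples would be rejected by this check, so clients would have to pass 10.0.

```php
<?php

class Circle
{
    private $radius;

    public function __construct($radius)
    {
        if (!is_float($radius) || $radius <= 0) {
            throw new \InvalidArgumentException(
                'Radius should be a float and larger than 0'
            );
        }

        $this->radius = $radius;
    }
}

foreach ([10.0, -1.0, 10] as $radius) {
    try {
        // When the constructor throws, no object comes into existence.
        new Circle($radius);
        echo 'created', "\n";
    } catch (\InvalidArgumentException $e) {
        echo 'rejected', "\n";
    }
}
// Prints: created, rejected, rejected (the last one because 10 is an integer)
```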

Changing objects

Sometimes an object needs to undergo some changes. For example, the object may be a domain object representing a certain concept or thing whose current state isn't yet fully reflected by the object's internal data. Since the object boundary (its public interface) protects the internal data, the object has to expose (public) methods to allow its clients to modify that state, e.g.:

    $invoice = new Invoice(...);

    // when a payment was made for this invoice, change its state
    $invoice->processPayment(100);

After the payment was processed, the invoice is still the same invoice, except its state has changed.

Validate method arguments

Just as with constructor arguments, we should thoroughly validate any method argument (like the payment amount). Otherwise we might end up with an object containing invalid or inconsistent data:

    class Invoice
    {
        private $payments = [];

        public function processPayment($amount)
        {
            // we use integers to prevent rounding problems with floats
            if (!is_int($amount) || $amount <= 0) {
                throw new \InvalidArgumentException(
                    'Paid amount should be an integer and more than 0'
                );
            }

            $this->payments[] = $amount;
        }
    }

Of course, all these validations lead to a lot of code duplication, so I again recommend outsourcing these concerns to a dedicated library like beberlei/assert.

Only make consistent changes

State-changing methods should require all the data necessary to process the client's change request. For example, it should be impossible to change the street part of a company's address without also providing the street number:

    class Company
    {
        public function updateAddress($street, $number, ...)
        {
            ...
        }
    }
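As a minimal sketch of this idea, assume for illustration that an address consists of only a street and a number (the article elides the remaining parameters). The constructor, the validation rules and the address() method below are hypothetical. There are deliberately no separate setStreet() or setNumber() methods.

```php
<?php

class Company
{
    private $street;
    private $number;

    public function __construct($street, $number)
    {
        $this->updateAddress($street, $number);
    }

    // Street and number can only change together, so the address never
    // ends up with a new street and a stale number.
    public function updateAddress($street, $number)
    {
        if (!is_string($street) || $street === '') {
            throw new \InvalidArgumentException('Street should be a non-empty string');
        }
        if (!is_string($number) || $number === '') {
            throw new \InvalidArgumentException('Number should be a non-empty string');
        }

        // Assign only after every argument has been validated.
        $this->street = $street;
        $this->number = $number;
    }

    public function address()
    {
        return $this->street . ' ' . $this->number;
    }
}

$company = new Company('Main Street', '12');
$company->updateAddress('Station Road', '4a');

echo $company->address(), "\n"; // Station Road 4a
```

Because the properties are assigned only after all checks have passed, a rejected update leaves the previous address intact.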

Prefer immutable objects

Making changes to objects is nice, but it also introduces a lot of complexity. The main reasons for this:

  • When the state of an object changes it will behave differently the next time a client calls a method on it. It may return different answers, start throwing exceptions, etc.
  • When the same object is used by different clients, changes made by one client will affect other clients, leading to unforeseen and hard to debug situations.
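The second point can be illustrated with a small sketch. The mutable Price class below is hypothetical and exists only to show how a change made by one client leaks to another.

```php
<?php

// Hypothetical mutable object, for illustration only.
class Price
{
    private $amount;

    public function __construct($amount)
    {
        $this->amount = $amount;
    }

    public function applyDiscount($percentage)
    {
        // Mutates the object in place.
        $this->amount = (int) round($this->amount * (100 - $percentage) / 100);
    }

    public function amount()
    {
        return $this->amount;
    }
}

$listPrice = new Price(1000);

// Both clients receive a handle to the very same object.
$catalogPrice = $listPrice;
$cartPrice = $listPrice;

$cartPrice->applyDiscount(10);

// The catalog never asked for a discount, yet it now sees the changed amount.
echo $catalogPrice->amount(), "\n"; // 900
```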

For these reasons, when working with objects, your motto should be:

Objects should not be allowed to change, unless it's their purpose to change.

When you're designing an object, you should try to define its role and determine if change is required for that role. Of course, this has been done by many programmers before us and the outcome is that in general there's only one type of object that actually needs stateful change: entities. Entities are explicitly defined as objects with identity, which are able to undergo changes over time.

All other objects can be created once, and never need to be modified. This is true for all kinds of services; dispatchers, routers, controllers, mailers, validators, etc. It's also true for something we know as value objects.

About value objects

Value objects are immutable objects by definition. As a tactical DDD pattern a value object is supposed to describe one aspect of an entity (like color, length, width, name, age, etc.). Unlike reference objects, value objects don't derive their identity from referring to the same thing in memory, but from the equality of their fields. Value objects don't change; they are simply discarded and replaced by something that describes the entity more accurately. Even when the replacement value is equal to the original value, the fact that the object reference changes is irrelevant. It's still the "same" value.
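A minimal sketch of these properties, using a hypothetical Age value object: it has no setters, "changing" it yields a new instance, and equality is decided by comparing fields rather than references.

```php
<?php

// Hypothetical value object describing one aspect of an entity.
final class Age
{
    private $years;

    private function __construct($years)
    {
        if (!is_int($years) || $years < 0) {
            throw new \InvalidArgumentException('Age should be a non-negative integer');
        }

        $this->years = $years;
    }

    public static function fromYears($years)
    {
        return new self($years);
    }

    // No setter: a "change" produces a replacement object instead.
    public function increment()
    {
        return new self($this->years + 1);
    }

    // Equality is based on the fields, not on the object reference.
    public function equals(Age $other)
    {
        return $this->years === $other->years;
    }
}

$age = Age::fromYears(30);
$older = $age->increment();

var_dump($age->equals(Age::fromYears(30))); // bool(true): same value, different instance
var_dump($age === Age::fromYears(30));      // bool(false): not the same instance
var_dump($age->equals($older));             // bool(false): $age itself was not modified
```

The equals() method can read $other->years because PHP grants objects of the same class access to each other's private members.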

The use of value objects isn't limited to your domain layer. You can implement a value object whenever you'd like to explicitly define a type of value, thereby protecting the value's creation, its consistency and usage. For example:

    class Server
    {
        private $host;

        public function __construct($host, $port)
        {
            \Assert\that($host)->string();
            \Assert\that($port)->integer();

            ...
        }
    }

Value objects as custom types

Value objects can be considered types in their own right. In that way they can extend the available types of a programming language itself. For example, when we want to keep a URL in a variable, we usually use a string variable. A function could accept that string and verify that it indeed contains a URL:

    function getHost($url)
    {
        \Assert\that($url)
            ->string()
            ->url();

        ...
    }

Instead we can enforce that only valid URLs will be provided, by specifying a dedicated type for "URL" arguments:

function getHost(Url $url) { ... }

The newly introduced class Url takes care of validation itself:

    class Url
    {
        private $url;

        private function __construct($url)
        {
            $this->url = $url;
        }

        public static function fromString($url)
        {
            \Assert\that($url)
                ->string()
                ->url();

            return new self($url);
        }
    }

Value objects usually attract behavior that was previously located elsewhere. In this case we can move the logic for determining the host part of a URL to the Url class itself:

    class Url
    {
        ...

        public function getHost()
        {
            ...
        }
    }
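One possible way to fill in the elided parts, using only built-in PHP functions. This is a sketch under assumptions: filter_var() with FILTER_VALIDATE_URL stands in for the assertion library, and parse_url() is just one way to extract the host.

```php
<?php

class Url
{
    private $url;

    private function __construct($url)
    {
        $this->url = $url;
    }

    public static function fromString($url)
    {
        // Stand-in for the \Assert\that($url)->string()->url() check.
        if (!is_string($url) || filter_var($url, FILTER_VALIDATE_URL) === false) {
            throw new \InvalidArgumentException('Not a valid URL');
        }

        return new self($url);
    }

    public function getHost()
    {
        // With PHP_URL_HOST, parse_url() returns only the host component.
        return parse_url($this->url, PHP_URL_HOST);
    }
}

echo Url::fromString('https://example.com/blog?page=2')->getHost(), "\n"; // example.com
```

FILTER_VALIDATE_URL may also accept URLs that have no host part (such as mailto: addresses), in which case getHost() would return null. A stricter fromString() could reject those as well.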

Letting go of objects

In PHP, an object dies when nobody refers to it anymore. So, when a value object has been replaced by another one, the previous object will simply die. The same goes for services: if they are not used anymore, they will depart from us. However, it may be worth it to keep a previously instantiated service around. If you have designed your services well, i.e. as immutable objects, they should be able to do their job over and over again. There's actually no need to instantiate a service object again. The next time it should behave in the exact same way as it did before. This can be turned into a very good motto:

Design services so that they could be running forever.

As previously discussed, the only objects that actually change (and are designed to change too) are entities. They are protected from dying too soon by adding them to an identity map. Later, when all clients are done with the entity, some kind of persistence tool will take the entities and calculate the changes it has to make in the persistent storage (e.g. a MySQL database).
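A minimal sketch of what an identity map does, assuming a hypothetical Customer entity. Real persistence tools implement this in a more elaborate way; the point here is only that the map holds a reference (so the entity stays alive) and hands out the same instance for the same identity.

```php
<?php

// Hypothetical entity with an identity.
class Customer
{
    private $id;
    private $name;

    public function __construct($id, $name)
    {
        $this->id = $id;
        $this->name = $name;
    }

    public function id()
    {
        return $this->id;
    }

    public function name()
    {
        return $this->name;
    }

    public function rename($name)
    {
        $this->name = $name;
    }
}

// Minimal identity map: at most one in-memory object per identity.
class IdentityMap
{
    private $entities = [];

    public function add(Customer $customer)
    {
        $this->entities[$customer->id()] = $customer;
    }

    public function get($id)
    {
        if (!isset($this->entities[$id])) {
            throw new \OutOfBoundsException('No entity with ID ' . $id);
        }

        return $this->entities[$id];
    }
}

$map = new IdentityMap();
$map->add(new Customer(1, 'Alice'));

// Two lookups yield the same instance, so a change made through one
// handle is visible through the other and cannot get lost.
$first = $map->get(1);
$second = $map->get(1);
$first->rename('Alicia');

var_dump($first === $second); // bool(true)
echo $second->name(), "\n";   // Alicia
```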

Bringing objects back from death

Once you have let go of objects, you will often want to revive them after some time. For example, you want to restore them from a database result set, or you want to deserialize them (back) into existence. This poses some new problems. We had previously agreed upon protecting the object's boundaries. We forced the use of named constructors to properly create a valid object. We offered meaningful, consistent methods for changing the state of an object. But for reconstitution or deserialization you have to completely ignore the object boundary and dive right into the bowels of the object (unless you use a technique like event sourcing by the way).

Persistence

When persisting an object including any changes made to it (that is, when the object is an entity), it's usually possible to respect the object boundaries anyway, at least when your persistence library uses reflection to collect or reconstitute internal object state.

Persistence shouldn't be your first thought when designing your objects though. Before you know it you'll comply with all kinds of silly rules, required to make your domain model work with the currently fashionable persistence library. Start with designing the public API of your objects. Make sure you can only create valid objects and only then worry about persisting them. This should require just a small number of changes, for example to support persisting one-to-many associations.
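To make the reflection approach concrete, here is a minimal sketch of how a persistence tool could collect and restore private state without the class exposing any getters or setters. The extractState() and reconstitute() functions are hypothetical helpers, not the API of any particular library.

```php
<?php

class Circle
{
    private $radius;

    private function __construct($radius)
    {
        if (!is_float($radius) || $radius <= 0) {
            throw new \InvalidArgumentException('Radius should be a float and larger than 0');
        }

        $this->radius = $radius;
    }

    public static function fromRadius($radius)
    {
        return new self($radius);
    }
}

// Hypothetical helper: reads all properties declared on the object's class.
function extractState($object)
{
    $state = [];
    foreach ((new \ReflectionObject($object))->getProperties() as $property) {
        if (PHP_VERSION_ID < 80100) {
            $property->setAccessible(true); // only required before PHP 8.1
        }
        $state[$property->getName()] = $property->getValue($object);
    }

    return $state;
}

// Hypothetical helper: rebuilds an object from previously extracted state.
function reconstitute($className, array $state)
{
    $reflection = new \ReflectionClass($className);

    // Creates the instance without running the constructor or its validation.
    $object = $reflection->newInstanceWithoutConstructor();

    foreach ($state as $name => $value) {
        $property = $reflection->getProperty($name);
        if (PHP_VERSION_ID < 80100) {
            $property->setAccessible(true);
        }
        $property->setValue($object, $value);
    }

    return $object;
}

$state = extractState(Circle::fromRadius(2.5));
$restored = reconstitute('Circle', $state);

var_dump($restored == Circle::fromRadius(2.5)); // bool(true)
```

The public interface of Circle did not have to change for this to work, which is what makes this approach compatible with properly encapsulated objects. The trade-off is that the restored state is trusted as-is.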

(De)serialization

Serialization and deserialization (for example to and from a JSON or XML string) are common tasks, often implemented using something like a generic serializer (which uses reflection) or using custom class and instance methods:

    class Country
    {
        private $countryCode;

        public function serialize()
        {
            return [
                'country_code' => $this->countryCode
            ];
        }

        public static function deserialize(array $data)
        {
            $country = new self();
            $country->countryCode = $data['country_code'];

            return $country;
        }
    }

This works very well. serialize() actually normalizes the data which can later be converted to a plain text format like JSON. Whenever the data is reconstituted from plain text, it's being fed to deserialize(), which recreates the object in the same way a persistence tool would: by simply copying values into the object (the "backdoor strategy"). The provided data won't be validated: we expect to receive the correct data structure and (types of) values.

If the data has been provided as input by some external client (or if the data structure might have been modified in the meantime), we need to be more careful. We must be sure that the provided data is valid. If you have designed your objects to be created and modified in fully controlled ways, make sure you only use the official ways to (re)create objects based on raw data.

This should be the responsibility of a dedicated piece of code, like a factory. In the example below, a named constructor on the class itself plays that role:

    class Country
    {
        private $countryCode;

        public static function fromCountryCode($countryCode)
        {
            if (/* $countryCode is not a valid country code */) {
                throw new \InvalidArgumentException(...);
            }

            $country = new self();
            $country->countryCode = $countryCode;

            return $country;
        }

        public function serialize()
        {
            return [
                'country_code' => $this->countryCode
            ];
        }

        public static function deserialize(array $data)
        {
            \Assert\that($data)
                ->arrayKeyExists('country_code');

            return Country::fromCountryCode($data['country_code']);
        }
    }

The deserialize() method now verifies the data structure too and creates the object in the canonical way by calling Country::fromCountryCode(), thereby automatically triggering the right exceptions when something about the data is wrong.
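A runnable sketch of the full round trip through JSON. Two things here are assumptions made for illustration: a valid country code is taken to be two uppercase letters, and plain PHP checks replace the assertion library.

```php
<?php

class Country
{
    private $countryCode;

    private function __construct($countryCode)
    {
        $this->countryCode = $countryCode;
    }

    public static function fromCountryCode($countryCode)
    {
        // Assumption: a valid country code consists of two uppercase letters.
        if (!is_string($countryCode) || preg_match('/^[A-Z]{2}$/', $countryCode) !== 1) {
            throw new \InvalidArgumentException('Invalid country code');
        }

        return new self($countryCode);
    }

    public function serialize()
    {
        return ['country_code' => $this->countryCode];
    }

    public static function deserialize(array $data)
    {
        if (!array_key_exists('country_code', $data)) {
            throw new \InvalidArgumentException('Missing key "country_code"');
        }

        // Recreate the object in the canonical, validated way.
        return self::fromCountryCode($data['country_code']);
    }
}

$json = json_encode(Country::fromCountryCode('NL')->serialize());
echo $json, "\n"; // {"country_code":"NL"}

$country = Country::deserialize(json_decode($json, true));
var_dump($country == Country::fromCountryCode('NL')); // bool(true)

try {
    Country::deserialize(json_decode('{"country_code":"Netherlands"}', true));
} catch (\InvalidArgumentException $e) {
    echo $e->getMessage(), "\n"; // Invalid country code
}
```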

We need a better solution

The above solution is far from ideal: we'll only get an exception for the first problem encountered. Instead, we might want to provide a full list of problems to the client that is sending the data to us. The real problem is that we're using the same approach for converting user input (a JSON plain text string) to an object as for reconstituting an object based on serialized data. We shouldn't allow this: user input should not be used to directly meddle with our (domain) objects. Instead, we should first deserialize to a DTO, then recognize the DTO as a request for change (a command) and finally process it by calling methods on domain objects. I'll cover that in more detail in the next article.
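To give a rough idea of the "full list of problems" part only, here is a sketch of a check on raw input that collects every problem instead of throwing at the first one. The findProblems() function, the name field and the messages are all hypothetical, and this is not necessarily the approach taken in the next article.

```php
<?php

// Hypothetical validator for raw input: returns every problem found,
// instead of stopping at the first one.
function findProblems(array $data)
{
    $problems = [];

    if (!array_key_exists('country_code', $data)) {
        $problems[] = 'country_code is required';
    } elseif (!is_string($data['country_code'])
        || preg_match('/^[A-Z]{2}$/', $data['country_code']) !== 1
    ) {
        $problems[] = 'country_code should consist of two uppercase letters';
    }

    if (!array_key_exists('name', $data)) {
        $problems[] = 'name is required';
    } elseif (!is_string($data['name']) || trim($data['name']) === '') {
        $problems[] = 'name should be a non-empty string';
    }

    return $problems;
}

$input = json_decode('{"country_code": "nl"}', true);

foreach (findProblems($input) as $problem) {
    echo $problem, "\n";
}
// Prints:
// country_code should consist of two uppercase letters
// name is required
```

Only when the list of problems is empty would the input be turned into calls on domain objects, which still perform their own validation.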

Further reading