Lazy Loading Data Objects in PHP 8.4 with Doctrine ORM Example
PHP 8.4 includes a rather technical RFC called “Lazy Objects” which adds lazy loading functionality directly into PHP’s object model.
In this blog post, I am going to explain what lazy loading is and how Doctrine implements it. I will show the approach the ORM took before PHP 8.4 and how the new Lazy Objects RFC improved the implementation.
By the end of the post, you will have some ideas on how to leverage this new feature for performance gains in your own application.
What is Lazy Loading?
Lazy loading is a programming pattern in which code believes it is interacting with an object of a certain type, while that object has not actually been fully initialized yet.
Think of lazy loading as deferring the constructor/factory of an object to a point where it’s actually used.
This allows passing around an object as a placeholder through the code, when you don’t know up front if it’s going to be used at all.
The most common examples are database or cache connection objects, such as the connection in Doctrine DBAL.
It connects to the database through a driver only once, when it is used for the first time, for example in executeStatement. Many other methods also call connect to obtain the connection, which either opens it (if called for the first time) or returns the already open connection.
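To make the pattern concrete, here is a minimal sketch of a lazily connecting wrapper. It is not Doctrine DBAL's actual code: LazyConnection, FakeDriverConnection, the driver factory closure and the opened counter are all hypothetical names I made up for illustration. Only the method names connect and executeStatement mirror the ones mentioned above.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a low-level driver connection.
final class FakeDriverConnection
{
    /** @var list<string> */
    public array $log = [];

    public function exec(string $sql): int
    {
        $this->log[] = $sql;
        return 1; // pretend one row was affected
    }
}

// Simplified illustration of the pattern, not Doctrine DBAL's implementation.
final class LazyConnection
{
    private ?FakeDriverConnection $driverConnection = null;

    /** Counts how often a real connection was opened. */
    public int $opened = 0;

    public function __construct(private \Closure $driverFactory) {}

    private function connect(): FakeDriverConnection
    {
        // Open the connection on first use, reuse it afterwards.
        if ($this->driverConnection === null) {
            $this->opened++;
            $this->driverConnection = ($this->driverFactory)();
        }

        return $this->driverConnection;
    }

    public function executeStatement(string $sql): int
    {
        return $this->connect()->exec($sql);
    }
}

$connection = new LazyConnection(fn () => new FakeDriverConnection());
// Nothing has been opened at this point; the first statement triggers it.
$connection->executeStatement('DELETE FROM posts WHERE id = 1');
$connection->executeStatement('DELETE FROM posts WHERE id = 2');
```

The wrapper can be passed around freely, and the cost of connecting is only paid if some code path actually executes a statement.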
Lazy Loading in Doctrine ORM
Doctrine ORM maps database rows to PHP classes. Database tables have relationships, which Doctrine represents as references between objects.
And this is where lazy loading comes into play. Take a simple example: a Post on a message board references the User who wrote it, and that relationship is modeled as an object reference.
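As a rough sketch, the two data objects could look like the following. The ORM mapping metadata is left out on purpose, and the id and title properties are assumptions of mine; only Post::$author and User::$name are taken from this post.

```php
<?php
declare(strict_types=1);

// Plain data objects without any dependency on an ORM base class.
class User
{
    public function __construct(
        public int $id,
        public string $name,
    ) {}
}

class Post
{
    public function __construct(
        public int $id,
        public string $title,
        public User $author, // a reference to another object, not a raw user id
    ) {}
}

$post = new Post(1, 'Hello world', new User(42, 'Alice'));
echo $post->author->name, PHP_EOL;
```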
When you want to create a list of Post objects from the database, Doctrine doesn’t want to immediately load all associated authors for performance reasons.
Because Doctrine keeps data objects free from direct dependencies on the ORM library, it’s not possible to use an abstract base class that Post and User have to extend from to implement lazy loading.
Lazy Loading with PHP Engine Hacks and Code Generation
Up until PHP 8.4, Doctrine implemented lazy loading with a clever combination of engine-level hacks and code generation:
- Generate a class UserProxy that extends User. Instantiate Post::$author with a UserProxy instead of a User object.
- unset() the declared properties of an object in the constructor of the proxy
- Implement a magic method __get interceptor that gets called when an unset property is accessed, in our example User::$name
- Load all properties of the entity from the database and set the previously unset properties with their actual values.
The generated code for the proxy would roughly look like this (this is simplified for readability):
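The following is a hand-written sketch of those four steps, not the code Doctrine actually generates. The closure-based initializer and the constructor signature are assumptions for illustration; the real proxies handle many more cases.

```php
<?php
declare(strict_types=1);

class User
{
    public int $id;
    public string $name;
}

// Hand-written stand-in for what a code generator would emit.
class UserProxy extends User
{
    private ?\Closure $initializer;

    public function __construct(int $id, \Closure $initializer)
    {
        $this->id = $id; // the identifier is already known
        $this->initializer = $initializer;

        // Unsetting a declared property routes later reads through __get().
        unset($this->name);
    }

    public function __get(string $property): mixed
    {
        if ($this->initializer !== null) {
            $initializer = $this->initializer;
            $this->initializer = null; // make sure we only load once
            $initializer($this);       // sets the previously unset properties
        }

        return $this->$property;
    }
}

$proxy = new UserProxy(42, function (User $user): void {
    // Pretend this value came from a query on the users table.
    $user->name = 'Alice';
});

echo $proxy->name, PHP_EOL;      // first read triggers the initializer
echo get_class($proxy), PHP_EOL; // the proxy class name, not the User class name
```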
There is a lot more nuance to it that I want to spare you from, but if you wish to dive even deeper, feel free to look at LazyGhostTrait in Symfony VarExporter and Doctrine ProxyFactory using it.
There are a few downsides to this approach:
- It requires a lot of complex code to get to work, relying on a few engine level hacks.
- It prevents users from doing certain things with their classes (can’t be final, for example)
- It can lead to weird behavior in meta programming, get_class($proxy) would not return User for example
- It requires a “compile step” and decisions when and where to place the generated proxy code in your production setup.
Lazy Loading with PHP 8.4
Lazy Loading becomes a core level feature in PHP 8.4, and you can create a lazy instance of an object through ReflectionClass::newLazyGhost now.
A lazy ghost is initialized when one of these things happens: lazy property read/write, ReflectionProperty::set(Raw)Value/get(Raw)Value and a few more.
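A minimal sketch of a lazy ghost, assuming PHP 8.4 or newer. The User class and the hard-coded values in the initializer are placeholders; a real initializer would fetch them from somewhere expensive.

```php
<?php
declare(strict_types=1);

class User
{
    public int $id;
    public string $name;
}

$reflector = new ReflectionClass(User::class);
$initializerCalls = 0;

// The initializer receives the ghost itself and has to populate it.
$user = $reflector->newLazyGhost(function (User $user) use (&$initializerCalls): void {
    $initializerCalls++;
    $user->id = 42;
    $user->name = 'Alice';
});

var_dump($reflector->isUninitializedLazyObject($user)); // still lazy, nothing loaded yet
echo $user->name, PHP_EOL;                              // the property read triggers the initializer
var_dump($reflector->isUninitializedLazyObject($user)); // no longer lazy
var_dump(get_class($user) === User::class);             // no proxy subclass involved
```

Note that the object is a real User instance from the start, so there is no generated subclass and nothing that would prevent the class from being final.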
Creating a proxy in Doctrine ORM 3.4 is just a few lines and requires no code-generation anymore:
It creates a lazy ghost with an initializer that calls EntityPersister::loadById to fetch from the database and load all values into the proxy object.
The identifier properties of the object are already set to their known values. This way, accessing them does not trigger the initialization of the object.
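The idea can be sketched roughly like this. This is not Doctrine's ProxyFactory code: createLazyUser and the $loadById closure are hypothetical stand-ins for the factory and the persister call, and the sketch assumes PHP 8.4 or newer. The identifier is set through ReflectionProperty::setRawValueWithoutLazyInitialization, which writes a property without initializing the ghost.

```php
<?php
declare(strict_types=1);

class User
{
    public int $id;
    public string $name;
}

// Hypothetical helper; $loadById stands in for the persister call.
function createLazyUser(int $id, \Closure $loadById): User
{
    $reflector = new ReflectionClass(User::class);

    $proxy = $reflector->newLazyGhost(function (User $user) use ($id, $loadById): void {
        // Copy every loaded column value onto the object.
        foreach ($loadById($id) as $property => $value) {
            $user->$property = $value;
        }
    });

    // Set the known identifier without triggering initialization.
    $reflector->getProperty('id')->setRawValueWithoutLazyInitialization($proxy, $id);

    return $proxy;
}

$queries = 0;
$user = createLazyUser(42, function (int $id) use (&$queries): array {
    $queries++;
    return ['name' => 'Alice']; // pretend this row came from the database
});

echo $user->id, PHP_EOL;   // known identifier, no query needed
echo $user->name, PHP_EOL; // triggers the initializer and the single query
```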
In the following example, the User of the Post is only loaded when User::$name is accessed:
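Here is a small self-contained sketch of that behavior without a database, assuming PHP 8.4 or newer. The $lazyUser factory and the $loadedUserIds log are hypothetical helpers that play the role of the ORM, and the title property is an assumption of mine.

```php
<?php
declare(strict_types=1);

class User
{
    public int $id;
    public string $name;
}

class Post
{
    public function __construct(
        public string $title,
        public User $author,
    ) {}
}

$loadedUserIds = [];

// Hypothetical factory; a real ORM would query the database in the initializer.
$lazyUser = function (int $id) use (&$loadedUserIds): User {
    $reflector = new ReflectionClass(User::class);

    $user = $reflector->newLazyGhost(function (User $user) use ($id, &$loadedUserIds): void {
        $loadedUserIds[] = $id; // record which users were actually loaded
        $user->name = "User #$id";
    });
    $reflector->getProperty('id')->setRawValueWithoutLazyInitialization($user, $id);

    return $user;
};

$posts = [
    new Post('First post', $lazyUser(1)),
    new Post('Second post', $lazyUser(2)),
];

echo $posts[0]->title, PHP_EOL;        // no author has been loaded so far
echo $posts[0]->author->id, PHP_EOL;   // the identifier is known, still no load
echo $posts[1]->author->name, PHP_EOL; // loads only the author of the second post
```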
How cool is that?
The best part is that, as a next step, Doctrine can implement partial objects that load the rest of their properties when they are accessed. This will unlock marking properties as lazy, for example very large binary or content objects (blobs, clobs) that should not be loaded with an entity automatically.