What Is Garbage Collection in PHP And How Do You Make The Most Of It?

Because PHP is an interpreted language with a garbage collector, PHP developers rarely have to think about memory management. Unlike developers working in compiled languages, such as C and C++, we don’t have to give much thought to allocating and freeing memory.

However, it’s helpful to have a broad understanding of how garbage collection works in PHP, and of how you can interact with it, so that you can create high-performing applications.

In this article, we’ll cover two things:

  • The basics of how garbage collection works in PHP
  • Some of the functions available for interacting with it

By the end of the article, you will be better able to understand how garbage collection affects your application.

How Does Garbage Collection Work In PHP?

PHP reclaims memory in three ways:

  • Variable Falls Out Of Scope
  • Reference Counting
  • Garbage Collection

Variable Falls Out Of Scope

If a variable falls out of scope and is not used anywhere else, then it is automatically garbage collected. You can also trigger this yourself, before the variable leaves scope, by calling unset(). In the example below:

  • $foo will be automatically garbage collected as soon as display_var() finishes executing; and
  • $user will be garbage collected because it was removed via unset().

<?php

function display_var() {
    $foo = "bar";
    echo $foo;
}

// $foo exists only while this call is running
display_var();

$user = "Matthew";
unset($user);
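
To make both cases visible, here is a minimal sketch that compares memory_get_usage() readings around them. The helper name build_buffer() and the 4 MB buffer size are hypothetical choices for illustration; exact byte counts vary between PHP versions, so the sketch only compares readings with each other.

```php
<?php

// Hypothetical helper: allocates a large local string and returns its length.
function build_buffer(): int {
    $buffer = str_repeat("x", 4 * 1024 * 1024); // about 4 MB, local to this function
    // $buffer falls out of scope when the function returns
    return strlen($buffer);
}

$start = memory_get_usage();

$length = build_buffer();
$afterCall = memory_get_usage();  // the local buffer has already been released

$data = str_repeat("y", 4 * 1024 * 1024);
$whileHeld = memory_get_usage();  // about 4 MB higher while $data exists

unset($data);
$afterUnset = memory_get_usage(); // lower again once $data is removed

printf("after call:  %+d bytes\n", $afterCall - $start);
printf("while held:  %+d bytes\n", $whileHeld - $start);
printf("after unset: %+d bytes\n", $afterUnset - $start);
```

The first and last readings should sit close to the starting value, while the middle one reflects the buffer that is still referenced.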

Reference Counting

Similar to other languages, such as Python, Perl, and Tcl (version 8), PHP uses reference counting to help determine when variables are eligible to be garbage collected.

Reference counting is where PHP internally keeps track of how many symbols point to a given value. When the number of symbols pointing to a value drops to zero, nothing can reach it any more, and PHP frees it straight away rather than waiting for the end of the current request.

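In the example below, $A is initialized to “value”, and then $B is assigned from $A, so both names refer to the same value.

<?php

$A = "value";
$B = $A;

When $A is first initialized, its value has one reference, from the current scope. When $B is assigned from $A, there are two references to that value. If, at some point, $B is removed, the count drops back to one, and the value stays alive because $A still uses it. Only when $A is removed as well does the count reach zero and the value become eligible to be garbage collected.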

To see how many references are stored for a variable, you can call xdebug_debug_zval() with the variable’s name, assuming that you have Xdebug installed. For a variable $a whose value has a single reference, the output looks like this:

a: (refcount=1, is_ref=0)='value' 
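
If Xdebug is not available, the same behaviour can be observed indirectly. The sketch below is an illustration rather than the article’s own example: it uses an object instead of a string, and a hypothetical class named Tracked whose destructor records when the object is freed.

```php
<?php

// Hypothetical class: its destructor records the moment the object is freed.
final class Tracked {
    public static $log = [];

    public function __destruct() {
        self::$log[] = "destroyed";
    }
}

$a = new Tracked(); // one reference to the object
$b = $a;            // two references to the same object

unset($a);          // one reference left, so the object survives
$afterFirstUnset = count(Tracked::$log);

unset($b);          // no references left, so the destructor runs now
$afterSecondUnset = count(Tracked::$log);

echo "destructor calls after unset(\$a): $afterFirstUnset\n";
echo "destructor calls after unset(\$b): $afterSecondUnset\n";
```

The destructor only runs once the last reference is removed, which is the reference count reaching zero.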

Garbage Collection

If a variable is part of a cyclic reference, e.g., where $A points to $B and $B points back to $A, then its reference count never reaches zero, and it can only be cleaned up by PHP’s garbage collector. By default, the garbage collector is triggered once 10,000 possible cyclic objects or arrays (known as roots) have accumulated in memory and one of them falls out of scope.
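
As a small illustration of a cycle, the sketch below links two objects to each other and then removes both variables. The Node class and its counter are hypothetical names used only for this example, and gc_collect_cycles() is called to force a run, because a script this small never reaches the automatic trigger.

```php
<?php

// Hypothetical class: counts how many instances have been destroyed.
final class Node {
    public $peer;
    public static $destroyed = 0;

    public function __destruct() {
        self::$destroyed++;
    }
}

gc_enable(); // make sure the collector is on for this demonstration

$a = new Node();
$b = new Node();
$a->peer = $b; // $a points to $b
$b->peer = $a; // $b points back to $a

unset($a, $b); // the objects still reference each other
$destroyedBeforeGc = Node::$destroyed;

gc_collect_cycles(); // force a collection run
$destroyedAfterGc = Node::$destroyed;

echo "destroyed before collection: $destroyedBeforeGc\n";
echo "destroyed after collection:  $destroyedAfterGc\n";
```

Reference counting alone cannot free the pair, so nothing is destroyed until the collector runs.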

The collector is enabled by default in every request. In general, this is a good thing. However, each collection run consumes CPU time that would otherwise be available to your application. So, if your application is highly time-sensitive, it may be necessary to disable the collector, if only briefly.

It can be disabled in two ways:

  1. By calling gc_disable().
  2. By setting the zend.enable_gc ini directive to 0.

Additionally, if you call the function gc_collect_cycles(), then garbage collection is triggered even if there are not yet 10,000 possible roots in memory.
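
One way to apply this is to switch the collector off around a time-sensitive section and run a collection afterwards, at a moment of your choosing. The sketch below assumes a hypothetical helper named run_without_gc(); only gc_enabled(), gc_disable(), gc_enable(), and gc_collect_cycles() are PHP’s own functions.

```php
<?php

// Hypothetical helper: runs $criticalSection with the collector switched off,
// then restores the previous state and collects any cycles left behind.
function run_without_gc(callable $criticalSection) {
    $wasEnabled = gc_enabled();
    gc_disable();

    try {
        return $criticalSection();
    } finally {
        if ($wasEnabled) {
            gc_enable();
        }
        gc_collect_cycles(); // collect now, outside the critical section
    }
}

$initiallyEnabled = gc_enabled();

$mode = run_without_gc(function () {
    return gc_enabled() ? "gc on" : "gc off";
});

echo "inside the critical section: $mode\n";
echo "afterwards: ", gc_enabled() ? "gc on" : "gc off", "\n";
```

Restoring the earlier state in a finally block means the collector is switched back on even if the critical section throws an exception.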

Garbage Collection Has Improved in PHP 7.3

Garbage collection has improved notably in the 7.3 release of PHP, after the merge of a PR by Dmitry Stogov and Nikita Popov. The changes, as the benchmarks below attest, show a marked improvement in the performance of PHP’s garbage collector — especially for an application with a large number of objects.

// Very, very, very many objects
GC       |    OLD |   NEW
disabled |  1.32s | 1.50s
enabled  | 12.75s | 2.32s

// Very many objects
GC       |    OLD |   NEW
disabled |  0.87s | 0.87s
enabled  |  1.48s | 0.94s

// Less many objects
GC       |    OLD |   NEW
disabled |  1.65s | 1.62s
enabled  |  1.75s | 1.62s

You can see that, with the collector enabled, execution time drops notably in the first two benchmarks. The gain is much smaller when the application only has a small number of objects, which is understandable.

Garbage Collection Statistics

We now have a basic understanding of what garbage collection is, how it’s implemented in PHP, and when it’s triggered. However, to use it effectively we need more information: specifically, when it is triggered in practice and how efficient each garbage collection run is.

As of PHP 7.3, PHP provides basic garbage collection information in userland PHP, via gc_status(). The function returns an array containing:

  • The number of garbage collection runs
  • The number of objects collected
  • The current garbage collection threshold
  • The number of garbage collection roots
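
As a rough sketch of how these figures can be read, the example below creates some cyclic garbage, forces a collection run, and prints the four values. It assumes PHP 7.3 or later; the loop count of 100 is an arbitrary choice for illustration.

```php
<?php

gc_enable(); // make sure the collector is on for this demonstration

$before = gc_status();

// Create 100 small cycles that reference counting alone cannot free.
for ($i = 0; $i < 100; $i++) {
    $a = new stdClass();
    $b = new stdClass();
    $a->peer = $b;
    $b->peer = $a;
    unset($a, $b);
}

gc_collect_cycles(); // force a run so the counters change

$after = gc_status();

foreach (["runs", "collected", "threshold", "roots"] as $key) {
    printf("%-9s before: %6d  after: %6d\n", $key, $before[$key], $after[$key]);
}
```

Comparing a reading taken before a piece of work with one taken after it shows how much the collector had to do for that work.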

For greater detail, however, Xdebug is required. It supports writing far more comprehensive garbage collection statistics in a human-readable, tabular format to a configurable file and directory. Below, you can see what the file would look like:

Garbage Collection Report
version: 1
creator: xdebug 2.6.0 (PHP 7.2.0)

Collected | Efficiency% | Duration | Memory Before | Memory After | Reduction% | Function
----------+-------------+----------+---------------+--------------+------------+---------
    10000 |    100.00 % |  0.00 ms |       5539880 |       579880 |    79.53 % | bar
    10000 |    100.00 % |  0.00 ms |       5540040 |       580040 |    79.53 % | Garbage::produce
     4001 |     40.01 % |  0.00 ms |       2563048 |       578968 |    77.41 % | gc_collect_cycles

Here is what each table column means:

Column        | Description
--------------+------------
Collected     | The number of items that the garbage collector cleaned up in this run.
Efficiency%   | The collected items as a percentage of the 10,000-item threshold; in the report above, 4001 collected items give 40.01 %.
Duration      | The time taken for the garbage collection run to complete.
Memory Before | The amount of memory in use before the garbage collection run.
Memory After  | The amount of memory in use after the garbage collection run.
Reduction%    | The percentage of memory freed by the garbage collection run.
Function      | The name of the function in which the garbage collection run was triggered.
Class         | (not shown in this example) The name of the class in which the garbage collection run was triggered.

Available Settings

Xdebug provides three settings for these statistics:

  • xdebug.gc_stats_enable: This enables the collection of garbage collection statistics, which is disabled by default. The file name and directory where the statistics are written are configured with xdebug.gc_stats_output_name and xdebug.gc_stats_output_dir respectively.
  • xdebug.gc_stats_output_dir: This sets the directory where the statistics file is written. Note that this setting cannot be set with ini_set().
  • xdebug.gc_stats_output_name: This sets the name of the file that the statistics are written to.
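
Put together, a php.ini fragment using the three settings above might look like the following sketch. It assumes an Xdebug version that supports these setting names is already loaded; the directory and file name pattern shown are example values, not requirements.

```ini
; Collect garbage collection statistics for every request
xdebug.gc_stats_enable = 1

; Example location for the report files; must be set in php.ini, not with ini_set()
xdebug.gc_stats_output_dir = /tmp

; Example file name pattern; %p is replaced with the process ID
xdebug.gc_stats_output_name = gcstats.%p
```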

In Conclusion

While garbage collection is not something that PHP developers have to consider as a matter of routine, it is still worth understanding if we want our applications to perform as well as possible.

If you want to know more about it, make sure you refer to the PHP manual and the other links in the further reading section below.

Further Reading

Benjamin, 29.10.2019