Measuring the DOM Namespace Reconciliation Performance Fix
This is the story of how a function that processes XML/HTML data in the PHP standard library got its seven-league boots. Optimizations in the PHP standard library, like this one in ext/dom, can speed up applications significantly through nothing more than an upgrade to a current PHP version. I don’t want to spill the beans right at the beginning, but the performance improvement from PHP 8.2 to PHP 8.3 in percent is … no, I’ll show you later to keep up the suspense.
Let’s start at the beginning and work our way through the story of this particular bugfix, which began in 2019 when I was working on the RFC “DOM Living Standard”. I stumbled upon this performance problem after a Wikimedia programmer brought it to my attention, and I created a bug report with a “reproduce case”. This resulted in Niels Dossche authoring a performance fix.
What this performance improvement means can be seen in a profiler such as Tideways. With the new #[WithSpan] feature for the Timeline of the Tideways Profiler, it is possible to see the duration of calls to specific functions in your codebase. I modified the code of the reproduce case to include the attribute like this:
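A minimal sketch of what such an annotated function could look like. This is not the original reproduce case: the body of `addParagraphs`, the XHTML namespace, and the round and paragraph counts are illustrative assumptions, and the attribute class name `Tideways\Profiler\WithSpan` is assumed to be the one provided by the Tideways extension. The script also runs without Tideways installed, because PHP only resolves an attribute class when it is instantiated through reflection. Whether this particular loop hits the slow path depends on the PHP version and the document being built.

```php
<?php
declare(strict_types=1);

// Assumption: attribute class provided by the Tideways extension.
use Tideways\Profiler\WithSpan;

const XHTML_NS = 'http://www.w3.org/1999/xhtml';

// Hypothetical stand-in for the function from the reproduce case:
// appends $count namespaced <p> elements to $body.
#[WithSpan]
function addParagraphs(DOMDocument $doc, DOMElement $body, int $count): void
{
    for ($i = 0; $i < $count; $i++) {
        $p = $doc->createElementNS(XHTML_NS, 'p', "Paragraph $i");
        $body->appendChild($p);
    }
}

$doc = new DOMDocument();
$html = $doc->createElementNS(XHTML_NS, 'html');
$doc->appendChild($html);
$body = $doc->createElementNS(XHTML_NS, 'body');
$html->appendChild($body);

// Call the function several times with identical work and record
// the wall-clock duration of each call in milliseconds.
$timings = [];
for ($round = 0; $round < 5; $round++) {
    $start = hrtime(true);
    addParagraphs($doc, $body, 200);
    $timings[] = (hrtime(true) - $start) / 1e6;
}

foreach ($timings as $round => $ms) {
    printf("call %d: %.3f ms\n", $round + 1, $ms);
}
```

The `hrtime()` measurements are only a rough stand-in for the profiler timeline: they make it possible to check, without any extension, whether later calls take longer than earlier ones.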
You can see in the Profiler that the “addParagraphs” function becomes slower with each call, even though it does exactly the same work every time and should accordingly take about the same time.
Furthermore, the “Callgraph” makes it possible to pinpoint exactly where the application is slow: the DOMNode::appendChild function takes 91% of the time.
PHP 8.2 Callgraph
Comparing PHP 8.3 to PHP 8.2 makes the performance improvement visible:
It sounds like magic was involved, but the 94% performance increase was achieved by an optimization in the PHP standard library (in this particular example, in ext/dom). The potential to make many applications faster by upgrading to the latest PHP version, or at least to a more recent one, is sometimes underestimated.