Choosing a PHP Library based on Performance

Sometimes, performance is the primary requirement when you are picking a third-party library to solve a task in your application. 

For CPU-intensive work, there are often similar alternatives that you can choose from:

  • Serialization with Serde, JMS Serializer or others
  • Crawler Detection with jaybizzle/crawler-detect or matomo/device-detector
  • Dates and Times with nesbot/carbon or cakephp/chronos

To find out which one of them is more performant for your use-case, you can set up an experiment with microtime() or hrtime() calls and run the candidates against each other.
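A minimal sketch of such a timing experiment, using only PHP's built-in hrtime(). The compareCandidates() helper and the two stand-in candidates are hypothetical and exist only for illustration; swap in calls to the libraries you actually want to compare.

```php
<?php
declare(strict_types=1);

/**
 * Runs each candidate $iterations times and returns the total
 * wall time per candidate in milliseconds, fastest first.
 *
 * @param array<string, callable(): mixed> $candidates
 * @return array<string, float>
 */
function compareCandidates(array $candidates, int $iterations = 1000): array
{
    $results = [];
    foreach ($candidates as $name => $candidate) {
        $start = hrtime(true); // monotonic clock, in nanoseconds
        for ($i = 0; $i < $iterations; $i++) {
            $candidate();
        }
        $results[$name] = (hrtime(true) - $start) / 1e6;
    }
    asort($results); // sort by elapsed time, keeping the names as keys

    return $results;
}

// Hypothetical stand-ins: replace with calls into the libraries under test.
$payload = ['id' => 1, 'tags' => ['a', 'b']];
$timings = compareCandidates([
    'json_encode' => static fn () => json_encode($payload),
    'serialize'   => static fn () => serialize($payload),
]);

foreach ($timings as $name => $milliseconds) {
    printf("%-12s %.3f ms\n", $name, $milliseconds);
}
```

This yields one number per candidate. It does not show where inside a library the time is spent.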

However, this provides fewer insights than running your tests directly under a profiler such as XHProf or Tideways!

Comparing Performance of PHP Crawler Detection Libraries

For the use-case of crawler detection inside Shopware, I recently wondered which library to use when trying to detect if the current request is made by a crawler.

Naturally, you would say, let’s use PHP’s native get_browser function. But there are also specialized libraries such as jaybizzle/crawler-detect or matomo/device-detector.

Consulting their respective documentation, I came up with the following code snippet to test their speed:
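The original snippet is not reproduced in this text. As a rough sketch of what a file such as crawler.php could look like, the following assumes that both libraries were installed with Composer, that the browscap ini setting is configured for get_browser, and that a hypothetical user-agents.txt holds one user agent per line. The buildDetectors() helper is my own naming, not part of either library, and approaches that are not available are skipped.

```php
<?php
declare(strict_types=1);

// Assumption: Composer dependencies live next to this script.
$autoload = __DIR__ . '/vendor/autoload.php';
if (is_file($autoload)) {
    require $autoload;
}

/**
 * Builds one detector callable per approach that is available.
 *
 * @return array<string, callable(string): bool>
 */
function buildDetectors(): array
{
    $detectors = [];

    // get_browser() needs the browscap ini setting to point to a browscap.ini file.
    if (ini_get('browscap')) {
        $detectors['get_browser'] = static function (string $userAgent): bool {
            $info = get_browser($userAgent, true);
            return is_array($info) && !empty($info['crawler']);
        };
    }

    if (class_exists(\Jaybizzle\CrawlerDetect\CrawlerDetect::class)) {
        $crawlerDetect = new \Jaybizzle\CrawlerDetect\CrawlerDetect();
        $detectors['crawler-detect'] = static fn (string $userAgent): bool
            => $crawlerDetect->isCrawler($userAgent);
    }

    if (class_exists(\DeviceDetector\DeviceDetector::class)) {
        $detectors['device-detector'] = static function (string $userAgent): bool {
            $detector = new \DeviceDetector\DeviceDetector($userAgent);
            $detector->parse();
            return $detector->isBot();
        };
    }

    return $detectors;
}

// Hypothetical input file; falls back to a single sample user agent.
$file = __DIR__ . '/user-agents.txt';
$userAgents = is_file($file)
    ? file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES)
    : ['Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'];

foreach (buildDetectors() as $name => $isCrawler) {
    $hits = 0;
    foreach ($userAgents as $userAgent) {
        $hits += $isCrawler($userAgent) ? 1 : 0;
    }
    printf("%s flagged %d of %d user agents\n", $name, $hits, count($userAgents));
}
```

The script contains no timing code on purpose: when it runs under a profiler, the callgraph attributes the time to each library's functions.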

I searched for different sources of example user agents and compiled a list of roughly 2,700 of them, then generated a Tideways profiling trace from the CLI for the snippet with:

tideways run php crawler.php

Here is the trace.

The result disqualifies PHP’s internal get_browser because it is pretty slow.

Both userland libraries are within 12% of each other, with 767ms vs. 685ms.

That gives me confidence that either one of them is quick enough to be used in a request testing just the one user agent of the current user.

To verify this, I generated another callgraph in which each library tests the same single user agent header, and I was surprised: now matomo/device-detector is 10x slower than jaybizzle/crawler-detect.

Here is the trace.

The reason is that matomo/device-detector initializes a regular expression once, calling AbstractParser::getRegexes, which takes roughly 18ms. The actual test of the user agent against the initialized regex is quite fast. In comparison, jaybizzle/crawler-detect generates its regex ahead of time as a build step.
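To make the difference between a one-time setup cost and the per-call cost concrete, here is a simplified sketch. Both classes are hypothetical and do not mirror the internals of either library: one assembles its pattern on first use, the other ships with a pattern that was generated earlier. It assumes PHP 8.0 or newer for constructor promotion.

```php
<?php
declare(strict_types=1);

/** Hypothetical detector that assembles its pattern on first use. */
final class LazyRegexDetector
{
    private ?string $pattern = null;

    /** @param list<string> $fragments */
    public function __construct(private array $fragments)
    {
    }

    public function isCrawler(string $userAgent): bool
    {
        if ($this->pattern === null) {
            // One-time setup, paid by the first call in every PHP process.
            $quoted = array_map(
                static fn (string $fragment): string => preg_quote($fragment, '/'),
                $this->fragments
            );
            $this->pattern = '/' . implode('|', $quoted) . '/i';
        }

        return preg_match($this->pattern, $userAgent) === 1;
    }
}

/** Hypothetical detector whose pattern was generated ahead of time. */
final class PrebuiltRegexDetector
{
    private const PATTERN = '/googlebot|bingbot|crawler|spider/i';

    public function isCrawler(string $userAgent): bool
    {
        return preg_match(self::PATTERN, $userAgent) === 1;
    }
}

/** Returns the wall time of a single call in microseconds. */
function timeCall(callable $call): float
{
    $start = hrtime(true);
    $call();

    return (hrtime(true) - $start) / 1e3;
}

$userAgent = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';
$lazy = new LazyRegexDetector(['googlebot', 'bingbot', 'crawler', 'spider']);
$prebuilt = new PrebuiltRegexDetector();

printf("lazy, first call:     %.1f µs\n", timeCall(fn () => $lazy->isCrawler($userAgent)));
printf("lazy, second call:    %.1f µs\n", timeCall(fn () => $lazy->isCrawler($userAgent)));
printf("prebuilt, first call: %.1f µs\n", timeCall(fn () => $prebuilt->isCrawler($userAgent)));
```

With only four fragments the setup cost is tiny. The point is the shape of the measurement: in a web request that checks a single user agent, the first call is the only call, so any setup work counts in full.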

As a result, I’d rather pick jaybizzle/crawler-detect for my use-case of detecting a crawler in a single request, as 2ms is acceptable but 20ms is already a bit too much.

This does not disqualify matomo/device-detector in general. It may primarily be intended for background jobs that analyze many user agents at once, so the library might simply not be optimized for my use-case.

Benjamin 29.08.2024