Serializing strings has long been a pain point for developers and, in the context of this article, a bottleneck in URL operations. Recently, with the help of Daniel Lemire, we conducted extensive research to reduce the cost of string serialization in URL parsing operations in Node.js core. This resulted in a series of optimizations that addressed the issue and led to Ada v2.0.0. By implementing these techniques, we were able to improve the performance of URL parsing and formatting, reduce memory usage, and improve overall runtime stability. In this article, we will delve into the challenges we encountered while optimizing these bottlenecks in Node.js core and explain the techniques we used to achieve significant performance improvements.
What is the purpose of serialization?
internalBinding where each subsystem of Node.js registers its own bindings. An example of how a certain function is exposed is available below.
The internals of how internalBinding is created, maintained, and used are outside the scope of this article. For more information, please refer to the GitHub discussion I've created called Communication steps between JS and C++.
Here is a quick overview of the implementation provided by Node.js v19.8.0. The code below is a simplified version of the actual implementation and does not include the base URL as a parameter.
Whenever a user calls new URL inside Node.js, an instance of the following class is created. This class is a wrapper for calling the actual implementation in C++ provided by the Ada URL parser. The following code is available on GitHub.
The parse method takes two parameters: input and the completion callback.
This is mostly done to avoid the overhead of creating a new object for each
For example, the following code is slow due to the serialization cost of objects:
In the example above, the parse function returns a boolean isValid flag and other properties of the parsed URL. However, the parse function returns these properties regardless of the isValid flag. This means that the structure of the rest object is unknown at compile time, and V8 has to do its magic to optimize it with its limited knowledge of the executed code block. This is a very common problem with JIT (just-in-time) compilers.
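To make the pattern concrete, here is a minimal sketch. The names parseToObject and hostOf are hypothetical, and the regular expression is only a stand-in for a real URL parser; this is not the Node.js implementation. It shows a parser that hands back a result object, and a caller that splits it into an isValid flag and a rest object assembled at run time.

```javascript
// Hypothetical stand-in for a parser that returns a result object.
// All properties are returned whether or not the input was valid.
function parseToObject(input) {
  const match = /^([a-z]+:)\/\/([^/?#]*)([^?#]*)/.exec(input);
  return {
    isValid: match !== null,
    protocol: match ? match[1] : '',
    host: match ? match[2] : '',
    pathname: match ? match[3] || '/' : '',
  };
}

// The caller peels off isValid and collects everything else into rest.
// The rest object is a second object, built from whatever keys came back.
function hostOf(input) {
  const { isValid, ...rest } = parseToObject(input);
  return isValid ? rest.host : null;
}
```

Every call to hostOf semantically creates two objects, the result and the rest copy, even when the caller only needs a single string.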
Let's dive into the details of how the parse function is implemented. By default, the URL constructor calls a C++ function called
parse which is
parse function is defined as follows:
Whenever the parse function is called, it needs to be called with an input parameter, which is a string, and a callback function to pass
the serialization problem of objects, and it is also a very common pattern
in Node.js core. Unfortunately, this pattern gives the function a side effect: it has to mutate state through the callback according to the result of the parsing.
function to update the current context of the URL:
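A rough sketch of this callback pattern follows. The names parseWithCallback and UrlContext are made up for illustration, and the JavaScript body with its simplistic regular expression stands in for the C++ parser. The point is only the shape of the interaction: components arrive as positional callback arguments, and the callback exists purely for its side effect.

```javascript
// Hypothetical stand-in for the native parser: instead of returning an
// object, it reports each component as a positional callback argument.
function parseWithCallback(input, onParsed) {
  const match = /^([a-z]+:)\/\/([^/?#]*)([^?#]*)/.exec(input);
  if (match === null) return false;
  onParsed(input, match[1], match[2], match[3] || '/');
  return true;
}

class UrlContext {
  href = '';
  protocol = '';
  host = '';
  pathname = '';

  constructor(input) {
    // The callback's only job is its side effect: mutating this instance.
    const ok = parseWithCallback(input, (href, protocol, host, pathname) => {
      this.href = href;
      this.protocol = protocol;
      this.host = host;
      this.pathname = pathname;
    });
    if (!ok) throw new TypeError('Invalid URL');
  }
}
```

No result object crosses the boundary here, but every component is still passed as its own string.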
This wasn't a problem until now, because the true performance cost of this function lay in the slowness of the URL parser. However, with the Ada URL parser, the bottleneck moved to the serialization of the result.
As you know, a URL contains a lot of properties, and href is the only attribute that contains all of the other properties of the URL, which makes it the identifier of the URL.
As you might notice, origin, protocol, host, hostname and others are all
href. Well, the solution is not as simple as this, because
origin might differ from
URL where the
hostname can be different
pathname values. There are lots of edge cases that need to be handled if we are going to solve this.
With Ada URL Parser v2.0.0, we adopted an approach that is common in the industry for storing URL properties. The idea is to store the href and use offsets to represent the URL properties. This way, we can access the URL properties without knowing the business logic behind "How to parse a URL?".
This solution comes with another advantage on top of solving the serialization
cost. The parsing becomes faster, because we don't need to create multiple
strings for each URL property. We can reserve and allocate a string with a
guessed size, and use the offsets to construct the
href while parsing the URL.
This reduces the memory allocations, and the time spent on parsing the URL.
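The idea of storing one href plus offsets can be sketched as follows. The offset names and values below are hypothetical and were computed by hand for this one example; they are not Ada's exact field layout. Each property is recovered by slicing the href, so the consumer needs no parsing logic.

```javascript
// Hypothetical offsets for illustration; not Ada's exact field layout.
const href = 'https://example.com:8080/foo/bar?baz#quux';
const offsets = {
  protocolEnd: 6,    // index just past "https:"
  hostStart: 8,      // first character of the hostname
  hostEnd: 19,       // index just past the hostname (the ":" before the port)
  pathnameStart: 24, // the "/" that begins the path
  searchStart: 32,   // the "?"
  hashStart: 36,     // the "#"
};

// Every property is a slice of the same string.
function componentsFrom(href, o) {
  return {
    protocol: href.slice(0, o.protocolEnd),
    hostname: href.slice(o.hostStart, o.hostEnd),
    port: href.slice(o.hostEnd + 1, o.pathnameStart),
    pathname: href.slice(o.pathnameStart, o.searchStart),
    search: href.slice(o.searchStart, o.hashStart),
    hash: href.slice(o.hashStart),
  };
}

const components = componentsFrom(href, offsets);
```

For this input, the slices agree with what the built-in URL class reports for the same properties.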
Here's a quick recap from the Ada v2.0 article:
The structure of the URL class stayed the same, but with a few caveats.
On the C++ side, we created a class called BindingData. The BindingData class is initialized and snapshotted at build time and is used to store an AliasedUint32Array with a length of 9 unsigned integers. This property is used to store the offsets of the URL. Due to the single-threaded environment and the non-parallel execution of the URL parser, we ensure that only one Uint32Array is created for parsing URLs throughout the lifecycle of the Node.js application.
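The consequence of reusing a single buffer can be illustrated in plain JavaScript. This is a sketch under assumptions: sharedOffsets and writeOffsets are hypothetical names, and writeOffsets merely imitates a native parser overwriting the buffer on each call.

```javascript
// One buffer for the whole process, allocated once (hypothetical layout).
const sharedOffsets = new Uint32Array(9);

// Stand-in for the native side: overwrites the shared buffer on every call.
function writeOffsets(values) {
  sharedOffsets.fill(0);
  sharedOffsets.set(values);
}

// Because the buffer is reused, a reader must copy what it needs
// before the next parse runs.
writeOffsets([6, 8, 19]);
const first = Array.from(sharedOffsets.subarray(0, 3));

writeOffsets([5, 7, 16]);
const second = Array.from(sharedOffsets.subarray(0, 3));
```

Here first keeps its values only because they were copied out before the second write. This is safe as long as parsing and reading never interleave, which matches the single-threaded, non-parallel setting described above.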
What is AliasedUint32Array?
Referencing from the implementation itself:
AliasedUint32Array is a class
that encapsulates the technique of having a native buffer mapped to a
the monitored API. Thus any VM capabilities to detect the modification are
circumvented. The implementation is available on GitHub.
Here's the implementation of the parse function from
As a result of this optimization, we just need to update the
url_components_buffer_ with the offsets of the URL properties. This
is done by the
parse method returns a string or an undefined value depending
on the success of the parsing function. The string value represents
the href part of the URL. Immediately after successful parsing,
this.#updateContext(href) method is called to access the URL
components (indexes of the URL properties) and update the current
url instance context. As a result, the cost of parsing an invalid
URL has significantly decreased.
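The flow described above can be approximated in JavaScript. Everything here is an assumption made for illustration: MiniURL, componentsBuffer, and the three-slot layout are hypothetical, the parse function is a stand-in built on the global URL class rather than the native binding, and spec edge cases such as an empty query are ignored.

```javascript
// Hypothetical shared buffer; only the first 3 of 9 slots are used here.
const componentsBuffer = new Uint32Array(9);

// Stand-in for the native parse: returns the href on success and
// undefined on failure, writing offsets into the buffer as a side effect.
function parse(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return undefined;
  }
  const href = url.href;
  const hashIndex = href.indexOf('#');
  const hashStart = hashIndex === -1 ? href.length : hashIndex;
  const queryIndex = href.indexOf('?');
  const searchStart =
    queryIndex === -1 || queryIndex > hashStart ? hashStart : queryIndex;
  componentsBuffer[0] = searchStart - url.pathname.length; // pathname start
  componentsBuffer[1] = searchStart;
  componentsBuffer[2] = hashStart;
  return href;
}

class MiniURL {
  #href = '';
  #pathnameStart = 0;
  #searchStart = 0;
  #hashStart = 0;

  constructor(input) {
    const href = parse(input);
    // An invalid URL costs one undefined check and nothing else.
    if (href === undefined) throw new TypeError('Invalid URL');
    this.#updateContext(href);
  }

  #updateContext(href) {
    this.#href = href;
    // Copy the offsets out right away; the next parse overwrites them.
    this.#pathnameStart = componentsBuffer[0];
    this.#searchStart = componentsBuffer[1];
    this.#hashStart = componentsBuffer[2];
  }

  get pathname() {
    return this.#href.slice(this.#pathnameStart, this.#searchStart);
  }
  get search() {
    return this.#href.slice(this.#searchStart, this.#hashStart);
  }
  get hash() {
    return this.#href.slice(this.#hashStart);
  }
}
```

Only one string and a few integers are read per successful parse; the individual properties are sliced out when a getter is called.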
Updating the URL context
The following code runs every time a URL setter is triggered, as well as every time a URL is constructed.
Due to the object destructure of
reducing the performance cost of string serialization.
The usage of indexes as offsets to access the URL properties through a string is not a new idea. As most of you know, it's called lazy loading.
What is lazy loading?
In the context of algorithms, lazy loading is a strategy for optimizing performance by deferring the calculation of values until they are actually needed. This is often used in cases where computing all possible values in advance would be inefficient or impractical. Instead, the algorithm only calculates values as they are requested, often caching the results for future use. This approach can help reduce the amount of computation required, improve memory usage, and speed up the overall execution time of the algorithm.
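As a small, generic illustration of that definition (not code from Node.js), the sketch below defers a computation until the first read and caches the result. LazyValue is a hypothetical helper name, and the slice offsets are hand-computed for this one string.

```javascript
// Hypothetical helper: computes a value on first access, then caches it.
class LazyValue {
  #compute;
  #cached = false;
  #value;

  constructor(compute) {
    this.#compute = compute;
  }

  get value() {
    if (!this.#cached) {
      this.#value = this.#compute();
      this.#cached = true;
    }
    return this.#value;
  }
}

let calls = 0;
const hostname = new LazyValue(() => {
  calls += 1; // counts how often the work is actually done
  return 'https://example.com/foo'.slice(8, 19);
});
```

Nothing is computed when hostname is created. The first read of hostname.value performs the slice, and later reads reuse the cached string.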
Using lazy loading requires us to understand the context in which it is applied. In the case of the URL class, the cost of parsing versus the cost of accessing the URL properties is the main factor that we need to consider. As a result of this experiment, and the optimizations done on both Ada and Node.js, we were able to reduce the performance cost of parsing by a significant amount.
Here's the result of parsing 100,000 URLs in different Node.js versions on an M1 Pro Max. The benchmark code is available on GitHub.
| Runtime | Ada Version | Time (ms/iter) |
| --- | --- | --- |
If you have a passion for performance and Node.js, we are actively looking for contributors for our performance team.