Reducing the cost of string serialization in Node.js core
Serializing strings has been a pain point for developers and, in the context of this article, a bottleneck in URL operations. Recently, with the help of Daniel Lemire, we conducted extensive research into reducing the cost of string serialization in URL parsing operations in Node.js core, resulting in a series of optimizations that addressed the issue and led to Ada v2.0.0. By implementing these techniques, we were able to improve the performance of URL parsing and formatting, as well as reduce memory usage and improve overall runtime stability. In this article, we will delve into the challenges we encountered while optimizing such bottlenecks in Node.js core, and explain the techniques we used to achieve these significant performance improvements.
Quick Recap
What is the purpose of serialization?
Serialization enables us to save the state of an object and recreate it in a new location. In the context of this article, serialization is required to pass data between the C++ and JavaScript layers.
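For instance, here is a trivial, hypothetical illustration of the concept using JSON. This is not how Node.js core moves data between layers; it is only a way to picture what "serialize and recreate" means:

// The state of an object is turned into a transferable representation
// and then rebuilt somewhere else.
const state = { href: 'https://www.yagiz.co/', port: '' };
const wire = JSON.stringify(state); // serialize
const copy = JSON.parse(wire);      // recreate the object in a new location
console.log(copy.href);             // 'https://www.yagiz.co/'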
How does C++ code communicate with JavaScript code in Node.js?
Node.js exposes C++ classes to the JavaScript layer using V8, through an interface called `internalBinding`, where each subsystem of Node.js registers its own bindings. An example implementation of how `node:buffer` registers a certain function is available below.
void Initialize(Local<Object> target,
                Local<Value> unused,
                Local<Context> context,
                void* priv) {
  // Expose the C++ function to JavaScript as `setBufferPrototype`.
  SetMethod(context, target, "setBufferPrototype", SetBufferPrototype);
}

void RegisterExternalReferences(ExternalReferenceRegistry* registry) {
  registry->Register(SetBufferPrototype);
}
NODE_BINDING_CONTEXT_AWARE_INTERNAL(buffer, node::Buffer::Initialize)
NODE_BINDING_EXTERNAL_REFERENCE(buffer, node::Buffer::RegisterExternalReferences)
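On the JavaScript side, internal modules consume such a binding through `internalBinding`, which is only available inside Node.js core, not to user code. The following is a simplified sketch of how the function registered above could be used from an internal module such as `lib/buffer.js`:

// Simplified sketch: obtain the binding registered by the Initialize
// function above and hand the JavaScript prototype over to C++.
const { setBufferPrototype } = internalBinding('buffer');
setBufferPrototype(Buffer.prototype);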
The internals of how `internalBinding` is created, maintained and used are beyond the scope of this article. For more information, please refer to the GitHub discussion I've created, called "Communication steps between JS and C++".
Problem Definition
Here is a quick overview of the implementation provided by Node.js v19.8.0. The code below is a simplified version of the actual implementation and does not include the base URL parameter.
Whenever a user calls `new URL` inside Node.js, the following class is created. This class is a wrapper for calling the actual implementation in C++, provided by the Ada URL parser. The following code is available on GitHub.
const { parse } = internalBinding('url');

class URL {
  #context = new URLContext();

  constructor(input) {
    input = `${input}`;
    if (!parse(input, this.#onParseComplete)) {
      throw new ERR_INVALID_URL(input);
    }
  }
}
The `parse` method takes 2 parameters: the input and the completion callback. This is mostly done to avoid the overhead of creating a new object for each function call.
For example, the following code is slow due to the serialization cost of objects:
const { parse } = internalBinding('url');

const url = 'https://www.yagiz.co';
const { isValid, ...rest } = parse(url);

if (isValid) {
  console.log(`parsed href is ${rest.href}`);
}
In the example above, the `parse` function returns a boolean `isValid` and other properties of the parsed URL. However, the `parse` function returns these properties regardless of the `isValid` flag. This means that the structure of the `rest` object is unknown at compile time, and V8 has to do its magic to optimize it with its limited knowledge of the executed code block. This is a very common problem with JIT (just-in-time) compilers.
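Here is a simplified illustration of the problem. This is not Node.js core code; the hypothetical `parse` below returns objects with different shapes, so the call site that reads properties from the result becomes polymorphic and harder for V8 to optimize:

// The two return paths produce objects with different hidden classes.
function parse(input) {
  if (!input.startsWith('https://')) {
    return { isValid: false };
  }
  return { isValid: true, href: input, protocol: 'https:' };
}

const { isValid, ...rest } = parse('https://www.yagiz.co');
if (isValid) {
  // The shape of `rest` depends on which branch ran above, so V8 cannot
  // specialize this property access for a single object shape.
  console.log(rest.href);
}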
Let's dive into the details of how the `parse` function is implemented. The `URL` constructor by default calls a C++ function called `Parse`, which is defined inside `src/node_url.cc`. The `Parse` function is defined as follows:
void Parse(const FunctionCallbackInfo<Value>& args) {
  CHECK_GE(args.Length(), 2);
  CHECK(args[0]->IsString());    // input
  CHECK(args[1]->IsFunction());  // complete callback

  Local<Function> success_callback_ = args[1].As<Function>();
  Environment* env = Environment::GetCurrent(args);
  HandleScope handle_scope(env->isolate());
  Context::Scope context_scope(env->context());

  Utf8Value input(env->isolate(), args[0]);
  ada::result out = ada::parse(input.ToStringView());

  if (!out) {
    return args.GetReturnValue().Set(false);
  }

  auto argv = GetCallbackArgs(env, out);
  USE(success_callback_->Call(
      env->context(), args.This(), argv.size(), argv.data()));
  args.GetReturnValue().Set(true);
}
Whenever the `Parse` function is called, it needs to be called with an `input` parameter, which is a string, and a callback function to pass the values back to the JavaScript layer. This is a smart way of solving the serialization problem of objects, and it is also a very common pattern in Node.js core. Unfortunately, this pattern gives the function a side effect: it has to invoke the callback with arguments that depend on the result of the parsing.
Here's an example of how the callback arguments are constructed to return the result to the JavaScript layer:
auto GetCallbackArgs(Environment* env, const ada::result& url) {
  Local<Context> context = env->context();
  Isolate* isolate = env->isolate();

  auto js_string = [&](std::string_view sv) {
    return ToV8Value(context, sv, isolate).ToLocalChecked();
  };

  return std::array{
      js_string(url->get_href()),
      js_string(url->get_origin()),
      js_string(url->get_protocol()),
      js_string(url->get_hostname()),
      js_string(url->get_pathname()),
      js_string(url->get_search()),
      js_string(url->get_username()),
      js_string(url->get_password()),
      js_string(url->get_port()),
      js_string(url->get_hash()),
  };
}
In order to process and save this data on the JavaScript layer, preferably in the URL class, the JavaScript layer had to have a heavy function to update the current context of the URL:
#onParseComplete = (href, origin, protocol, hostname, pathname,
                    search, username, password, port, hash) => {
  this.#context.href = href;
  this.#context.origin = origin;
  this.#context.protocol = protocol;
  this.#context.hostname = hostname;
  this.#context.pathname = pathname;
  this.#context.search = search;
  this.#context.username = username;
  this.#context.password = password;
  this.#context.port = port;
  this.#context.hash = hash;
};
This implementation, as you may have realized, is not very efficient. It requires sharing knowledge of the callback function parameters, by position, between JavaScript and C++. On top of being bad practice, there is a lot of room for improvement in terms of performance: the bridge between C++ and JavaScript is not very efficient, leading to bottlenecks when used in hot paths.
This wasn't a problem before, because the true performance cost lay in the URL parser itself, which was slow. With the Ada URL parser, however, the bottleneck moved to the serialization of the result.
As you know, a URL contains a lot of properties, and `href` is the only attribute that contains all of the properties of the URL; it therefore serves as the identifier of the URL.
> new URL('https://www.yagiz.co')
URL {
  href: 'https://www.yagiz.co/',
  origin: 'https://www.yagiz.co',
  protocol: 'https:',
  username: '',
  password: '',
  host: 'www.yagiz.co',
  hostname: 'www.yagiz.co',
  port: '',
  pathname: '/',
  search: '',
  searchParams: URLSearchParams {},
  hash: ''
}
As you might notice, origin, protocol, host, hostname and the others are all substrings of `href`. Well, the solution is not as simple as it sounds, because the `origin` is not always a plain substring of the `href`, and properties such as `hostname` and `pathname` vary from URL to URL. There are lots of edge cases that need to be resolved if we are going to take this approach.
The Solution
With Ada URL Parser v2.0.0, we incorporated an approach that is common in the industry for storing URL properties. The idea is to store the href and use offsets to represent the URL properties. This way, we can access the URL properties without knowing the business logic behind "How do you parse a URL?".
This solution comes with another advantage on top of solving the serialization cost. The parsing becomes faster, because we don't need to create multiple strings for each URL property. We can reserve and allocate a string with a guessed size, and use the offsets to construct the `href` while parsing the URL. This reduces memory allocations and the time spent on parsing the URL.
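To make the idea concrete, here is a hypothetical sketch; the helper and its names are illustrative and not Ada's actual API. The serializer appends every component to a single `href` string and only records where each component starts and ends:

// Illustrative only: build href once and remember offsets instead of
// materializing a separate string per component.
function buildHref(scheme, hostname, pathname) {
  let href = scheme + '://';
  const host_start = href.length;
  href += hostname;
  const host_end = href.length;
  const pathname_start = href.length;
  href += pathname;
  return { href, host_start, host_end, pathname_start };
}

const out = buildHref('https', 'www.yagiz.co', '/');
// The hostname can later be recovered as a substring of href, without ever
// allocating it separately during parsing.
console.log(out.href.slice(out.host_start, out.host_end)); // 'www.yagiz.co'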
URL Components
Here's a quick recap from the Ada v2.0 article:
https://user:pass@example.com:1234/foo/bar?baz#quux
      |     |    |          | ^^^^|       |   |
      |     |    |          | |   |       |   `----- hash_start
      |     |    |          | |   |       `--------- search_start
      |     |    |          | |   `----------------- pathname_start
      |     |    |          | `--------------------- port
      |     |    |          `----------------------- host_end
      |     |    `---------------------------------- host_start
      |     `--------------------------------------- username_end
      `--------------------------------------------- protocol_end
The structure of the URL class stayed the same, but with a few caveats.
On the C++ side, we created a class called `BindingData`.
class BindingData : public SnapshotableObject {
 public:
  // This is a simplified version of the class
  static void Parse(const v8::FunctionCallbackInfo<v8::Value>& args);

  static void Initialize(v8::Local<v8::Object> target,
                         v8::Local<v8::Value> unused,
                         v8::Local<v8::Context> context,
                         void* priv);
  static void RegisterExternalReferences(ExternalReferenceRegistry* registry);

 private:
  static constexpr size_t kURLComponentsLength = 9;
  AliasedUint32Array url_components_buffer_;
};
The `BindingData` class is initialized and snapshotted at build time, and is used to store an `AliasedUint32Array` called `url_components_buffer_` with a length of 9 unsigned integers. This property is used to store the offsets of the URL. Due to the single-threaded environment and the non-parallel execution of the URL parser, we ensure that only one Uint32Array is created for parsing URLs throughout the lifecycle of the Node.js application.
What is AliasedUint32Array?
Referencing the implementation itself: `AliasedUint32Array` is a class that encapsulates the technique of having a native buffer mapped to a JavaScript object. Writes to the native buffer can happen efficiently without going through JavaScript, and the data is then available to users via the exposed JavaScript object. While this technique is computationally efficient, it is effectively a write to JavaScript application state without going through the monitored API, so any VM capabilities to detect the modification are circumvented. The implementation is available on GitHub.
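From the JavaScript side of Node.js core, the aliased buffer simply looks like a `Uint32Array` whose contents the C++ parser overwrites in place. The sketch below is a simplification of how it could be consumed; again, `internalBinding` is only available to internal modules:

// Simplified sketch: the same Uint32Array is reused for every parse call,
// so no per-call result object has to be created and serialized.
const bindingUrl = internalBinding('url');
const components = bindingUrl.urlComponents; // Uint32Array of length 9

bindingUrl.parse('https://www.yagiz.co/');   // C++ writes the offsets in place
console.log(components[0]);                  // e.g. protocol_end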
Here's the implementation of the `Parse` function from the `BindingData` class.
void BindingData::Parse(const FunctionCallbackInfo<Value>& args) {
  CHECK_GE(args.Length(), 1);
  CHECK(args[0]->IsString());  // input
  // args[1] // base url

  BindingData* binding_data = Realm::GetBindingData<BindingData>(args);
  Environment* env = Environment::GetCurrent(args);
  HandleScope handle_scope(env->isolate());
  Context::Scope context_scope(env->context());

  Utf8Value input(env->isolate(), args[0]);
  auto out = ada::parse<ada::url_aggregator>(input.ToStringView());

  if (!out) {
    return args.GetReturnValue().Set(false);
  }

  binding_data->UpdateComponents(out->get_components(), out->type);
  args.GetReturnValue().Set(
      ToV8Value(env->context(), out->get_href(), env->isolate())
          .ToLocalChecked());
}
As a result of this optimization, we just need to update the `url_components_buffer_` with the offsets of the URL properties. This is done by the `binding_data->UpdateComponents` method.
JavaScript Class
Let's dive into the JavaScript implementation of the URL class. Here's the implementation as of Node.js 20 (April 24, 2023).
const bindingUrl = internalBinding('url');

class URL {
  #context = new URLContext();

  constructor(input, base = undefined) {
    input = `${input}`;
    const href = bindingUrl.parse(input, base);
    if (!href) {
      throw new ERR_INVALID_URL(input);
    }
    this.#updateContext(href);
  }
}
The `parse` method returns either a string or a falsy value, depending on the success of the parsing function. The string value represents the `href` part of the URL. Immediately after successful parsing, the `this.#updateContext(href)` method is called to access the URL components (the indexes of the URL properties) and update the current URL instance's context. As a result, the cost of parsing an invalid URL has significantly decreased.
Updating the URL context
The following code is triggered every time a URL setter is invoked, as well as every time a URL is constructed.
#updateContext(href) {
  // Store the serialized href; every other property is derived from it.
  this.#context.href = href;

  const {
    0: protocol_end,
    1: username_end,
    2: host_start,
    3: host_end,
    4: port,
    5: pathname_start,
    6: search_start,
    7: hash_start,
    8: scheme_type,
  } = bindingUrl.urlComponents;

  this.#context.protocol_end = protocol_end;
  this.#context.username_end = username_end;
  this.#context.host_start = host_start;
  this.#context.host_end = host_end;
  this.#context.port = port;
  this.#context.pathname_start = pathname_start;
  this.#context.search_start = search_start;
  this.#context.hash_start = hash_start;
  this.#context.scheme_type = scheme_type;
}
Due to the object destructuring of `bindingUrl.urlComponents`, the barrier between JavaScript and C++ is only crossed once, reducing the performance cost of string serialization.
The usage of indexes as offsets to access the URL properties through a string is not a new idea. As most of you know, it's called lazy loading.
What is lazy loading?
In the context of algorithms, lazy loading is a strategy for optimizing performance by deferring the calculation of values until they are actually needed. This is often used in cases where computing all possible values in advance would be inefficient or impractical. Instead, the algorithm only calculates values as they are requested, often caching the results for future use. This approach can help reduce the amount of computation required, improve memory usage, and speed up the overall execution time of the algorithm.
Using lazy loading forces us to understand the context in which it is applied. In the case of the URL class, the trade-off between the cost of parsing and the cost of accessing the URL properties is the main factor we need to consider. As a result of this experiment, and of optimizations made in both Ada and Node.js, we were able to reduce the performance cost of parsing by a significant amount.
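As a sketch of what this looks like in practice (simplified and hypothetical, not the exact Node.js core implementation), each getter derives its component from `href` and the stored offsets only when it is actually accessed:

// Illustrative lazy getters built on top of href + offsets.
class LazyURL {
  #context;

  constructor(context) {
    // context: { href, protocol_end, host_start, host_end } as produced by
    // a parser; the values below are hard-coded for illustration.
    this.#context = context;
  }

  get protocol() {
    // protocol_end points just past the ':' of the scheme.
    return this.#context.href.slice(0, this.#context.protocol_end);
  }

  get hostname() {
    return this.#context.href.slice(this.#context.host_start,
                                    this.#context.host_end);
  }
}

const url = new LazyURL({
  href: 'https://www.yagiz.co/',
  protocol_end: 6,
  host_start: 8,
  host_end: 20,
});
console.log(url.protocol); // 'https:'
console.log(url.hostname); // 'www.yagiz.co'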
Here's the result of parsing 100,000 URLs in different Node.js versions on an M1 Pro Max. The benchmark code is available on GitHub.
| Runtime        | Ada Version | Time (ms/iter) |
| -------------- | ----------- | -------------- |
| Node.js 20     | 2.0.0       | 38.97          |
| Node.js 19.8.1 | 1.0.4       | 79.29          |
| Node.js 19.7.0 | 1.0.1       | 105.59         |
| Node.js 19.6.1 | -           | 140.42         |
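For reference, a minimal benchmark along these lines could look like the following sketch. This is not the exact code from the linked repository; the URL list and the measurement approach are made up for illustration:

// Hypothetical benchmark sketch: parse 100,000 URLs and report ms per iteration.
const urls = Array.from({ length: 100_000 }, (_, i) => `https://example.com/path/${i}?q=${i}`);

function iteration() {
  for (const input of urls) {
    new URL(input);
  }
}

iteration(); // warm up
const start = process.hrtime.bigint();
iteration();
const end = process.hrtime.bigint();
console.log(`time: ${Number(end - start) / 1e6} ms/iter`);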
If you have a passion for performance and Node.js, we are actively looking for contributors for our performance team.