Reducing the cost of string serialization in Node.js core
Serializing strings has been a pain point for developers and, in the context of this article, a bottleneck in URL operations. Recently, with the help of Daniel Lemire, we conducted extensive research into reducing the cost of string serialization in URL parsing operations in Node.js core, resulting in a series of optimizations that addressed the issue and led to Ada v2.0.0. By implementing these techniques, we were able to improve the performance of URL parsing and formatting, as well as reduce memory usage and improve overall runtime stability. In this article, we will delve into the challenges we encountered while optimizing such bottlenecks in Node.js core, and explain the techniques we used to achieve these significant performance improvements.
Quick Recap
What is the purpose of serialization?
Serialization enables us to save the state of an object and recreate it in a new location. In the context of this article, serialization is required to pass data between the C++ and JavaScript layers.
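For instance, here is a trivial, hypothetical illustration of the concept using JSON. This is not how Node.js core moves data between layers; it is only a way to picture what "serialize and recreate" means:

// The state of an object is turned into a transferable representation
// and then rebuilt somewhere else.
const state = { href: 'https://www.yagiz.co/', port: '' };
const wire = JSON.stringify(state); // serialize
const copy = JSON.parse(wire);      // recreate the object in a new location
console.log(copy.href);             // 'https://www.yagiz.co/'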
How does C++ code communicate with JavaScript code in Node.js?
Node.js exposes C++ classes to the JavaScript layer using V8, through an interface called `internalBinding`, where each subsystem of Node.js registers its own bindings. An example implementation of how `node:buffer` registers a certain function is available below.
void Initialize(Local<Object> target,
                Local<Value> unused,
                Local<Context> context,
                void* priv) {
  // Expose the C++ function to JavaScript as `setBufferPrototype`.
  SetMethod(context, target, "setBufferPrototype", SetBufferPrototype);
}

void RegisterExternalReferences(ExternalReferenceRegistry* registry) {
  registry->Register(SetBufferPrototype);
}
NODE_BINDING_CONTEXT_AWARE_INTERNAL(buffer, node::Buffer::Initialize)
NODE_BINDING_EXTERNAL_REFERENCE(buffer, node::Buffer::RegisterExternalReferences)
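On the JavaScript side, internal modules consume such a binding through `internalBinding`, which is only available inside Node.js core, not to user code. The following is a simplified sketch of how the function registered above could be used from an internal module such as `lib/buffer.js`:

// Simplified sketch: obtain the binding registered by the Initialize
// function above and hand the JavaScript prototype over to C++.
const { setBufferPrototype } = internalBinding('buffer');
setBufferPrototype(Buffer.prototype);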
The internals of how `internalBinding` is created, maintained and used are beyond the scope of this article. For more information, please refer to the GitHub discussion I've created, called "Communication steps between JS and C++".
Problem Definition
Here is a quick overview of the implementation provided by Node.js v19.8.0. The code below is a simplified version of the actual implementation and does not include the base URL parameter.
Whenever a user calls `new URL` inside Node.js, the following class is created. This class is a wrapper for calling the actual implementation in C++, provided by the Ada URL parser. The following code is available on GitHub.
const { parse } = internalBinding('url');

class URL {
  #context = new URLContext();

  constructor(input) {
    input = `${input}`;
    if (!parse(input, this.#onParseComplete)) {
      throw new ERR_INVALID_URL(input);
    }
  }
}
The `parse` method takes 2 parameters: the input and the completion callback. This is mostly done to avoid the overhead of creating a new object for each function call.
For example, the following code is slow due to the serialization cost of objects:
const { parse } = internalBinding('url');

const url = 'https://www.yagiz.co';
const { isValid, ...rest } = parse(url);

if (isValid) {
  console.log(`parsed href is ${rest.href}`);
}
In the example above, the `parse` function returns a boolean `isValid` and other properties of the parsed URL. However, the `parse` function returns these properties regardless of the `isValid` flag. This means that the structure of the `rest` object is unknown at compile time, and V8 has to do its magic to optimize it with its limited knowledge of the executed code block. This is a very common problem with JIT (just-in-time) compilers.
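Here is a simplified illustration of the problem. This is not Node.js core code; the hypothetical `parse` below returns objects with different shapes, so the call site that reads properties from the result becomes polymorphic and harder for V8 to optimize:

// The two return paths produce objects with different hidden classes.
function parse(input) {
  if (!input.startsWith('https://')) {
    return { isValid: false };
  }
  return { isValid: true, href: input, protocol: 'https:' };
}

const { isValid, ...rest } = parse('https://www.yagiz.co');
if (isValid) {
  // The shape of `rest` depends on which branch ran above, so V8 cannot
  // specialize this property access for a single object shape.
  console.log(rest.href);
}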
Let's dive into the details of how the `parse` function is implemented. The `URL` constructor by default calls a C++ function called `Parse`, which is defined inside `src/node_url.cc`. The `Parse` function is defined as follows:
void Parse(const FunctionCallbackInfo<Value>& args) {
  CHECK_GE(args.Length(), 2);
  CHECK(args[0]->IsString());    // input
  CHECK(args[1]->IsFunction());  // complete callback

  Local<Function> success_callback_ = args[1].As<Function>();
  Environment* env = Environment::GetCurrent(args);
  HandleScope handle_scope(env->isolate());
  Context::Scope context_scope(env->context());

  Utf8Value input(env->isolate(), args[0]);
  ada::result out = ada::parse(input.ToStringView());

  if (!out) {
    return args.GetReturnValue().Set(false);
  }

  auto argv = GetCallbackArgs(env, out);
  USE(success_callback_->Call(
      env->context(), args.This(), argv.size(), argv.data()));
  args.GetReturnValue().Set(true);
}
Whenever the `Parse` function is called, it needs to be called with an `input` parameter, which is a string, and a callback function to pass the values back to the JavaScript layer. This is a smart way of solving the serialization problem of objects, and it is also a very common pattern in Node.js core. Unfortunately, this pattern gives the function a side effect: it has to invoke the callback with arguments that depend on the result of the parsing.
Here's an example of how the callback arguments are constructed to return the result to the JavaScript layer:
auto GetCallbackArgs(Environment* env, const ada::result& url) {
  Local<Context> context = env->context();
  Isolate* isolate = env->isolate();

  auto js_string = [&](std::string_view sv) {
    return ToV8Value(context, sv, isolate).ToLocalChecked();
  };

  return std::array{
      js_string(url->get_href()),
      js_string(url->get_origin()),
      js_string(url->get_protocol()),
      js_string(url->get_hostname()),
      js_string(url->get_pathname()),
      js_string(url->get_search()),
      js_string(url->get_username()),
      js_string(url->get_password()),
      js_string(url->get_port()),
      js_string(url->get_hash()),
  };
}
In order to process and save this data on the JavaScript layer, preferably in the URL class, the JavaScript layer had to have a heavy function to update the current context of the URL:
#onParseComplete = (href, origin, protocol, hostname, pathname,
                    search, username, password, port, hash) => {
  this.#context.href = href;
  this.#context.origin = origin;
  this.#context.protocol = protocol;
  this.#context.hostname = hostname;
  this.#context.pathname = pathname;
  this.#context.search = search;
  this.#context.username = username;
  this.#context.password = password;
  this.#context.port = port;
  this.#context.hash = hash;
};
This implementation, as you may have realized, is not very efficient. It requires sharing knowledge of the callback function parameters, by position, between JavaScript and C++. On top of being bad practice, there is a lot of room for improvement in terms of performance: the bridge between C++ and JavaScript is not very efficient, leading to bottlenecks when used in hot paths.
This wasn't a problem before, because the true performance cost lay in the URL parser itself, which was slow. With the Ada URL parser, however, the bottleneck moved to the serialization of the result.
As you know, a URL contains a lot of properties, and `href` is the only attribute that contains all of the properties of the URL; it therefore serves as the identifier of the URL.
> new URL('https://www.yagiz.co')
URL {
  href: 'https://www.yagiz.co/',
  origin: 'https://www.yagiz.co',
  protocol: 'https:',
  username: '',
  password: '',
  host: 'www.yagiz.co',
  hostname: 'www.yagiz.co',
  port: '',
  pathname: '/',
  search: '',
  searchParams: URLSearchParams {},
  hash: ''
}
As you might notice, origin, protocol, host, hostname and the others are all substrings of `href`. Well, the solution is not as simple as it sounds, because the `origin` is not always a plain substring of the `href`, and properties such as `hostname` and `pathname` vary from URL to URL. There are lots of edge cases that need to be resolved if we are going to take this approach.
The Solution
With Ada URL Parser v2.0.0, we incorporated an approach that is common in the industry for storing URL properties. The idea is to store the href and use offsets to represent the URL properties. This way, we can access the URL properties without knowing the business logic behind "How do you parse a URL?".
This solution comes with another advantage on top of solving the serialization cost. The parsing becomes faster, because we don't need to create multiple strings for each URL property. We can reserve and allocate a string with a guessed size, and use the offsets to construct the `href` while parsing the URL. This reduces memory allocations and the time spent on parsing the URL.
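To make the idea concrete, here is a hypothetical sketch; the helper and its names are illustrative and not Ada's actual API. The serializer appends every component to a single `href` string and only records where each component starts and ends:

// Illustrative only: build href once and remember offsets instead of
// materializing a separate string per component.
function buildHref(scheme, hostname, pathname) {
  let href = scheme + '://';
  const host_start = href.length;
  href += hostname;
  const host_end = href.length;
  const pathname_start = href.length;
  href += pathname;
  return { href, host_start, host_end, pathname_start };
}

const out = buildHref('https', 'www.yagiz.co', '/');
// The hostname can later be recovered as a substring of href, without ever
// allocating it separately during parsing.
console.log(out.href.slice(out.host_start, out.host_end)); // 'www.yagiz.co'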
URL Components
Here's a quick recap from the Ada v2.0 article:
https://user:pass@example.com:1234/foo/bar?baz#quux
      |     |    |          | ^^^^|       |   |
      |     |    |          | |   |       |   `----- hash_start
      |     |    |          | |   |       `--------- search_start
      |     |    |          | |   `----------------- pathname_start
      |     |    |          | `--------------------- port
      |     |    |          `----------------------- host_end
      |     |    `---------------------------------- host_start
      |     `--------------------------------------- username_end
      `--------------------------------------------- protocol_end
The structure of the URL class stayed the same, but with a few caveats.
On the C++ side, we created a class called `BindingData`.
class BindingData : public SnapshotableObject {
 public:
  // This is a simplified version of the class
  static void Parse(const v8::FunctionCallbackInfo<v8::Value>& args);

  static void Initialize(v8::Local<v8::Object> target,
                         v8::Local<v8::Value> unused,
                         v8::Local<v8::Context> context,
                         void* priv);
  static void RegisterExternalReferences(ExternalReferenceRegistry* registry);

 private:
  static constexpr size_t kURLComponentsLength = 9;
  AliasedUint32Array url_components_buffer_;
};
The `BindingData` class is initialized and snapshotted at build time, and is used to store an `AliasedUint32Array` called `url_components_buffer_` with a length of 9 unsigned integers. This property is used to store the offsets of the URL. Due to the single-threaded environment and the non-parallel execution of the URL parser, we ensure that only one Uint32Array is created for parsing URLs throughout the lifecycle of the Node.js application.
What is AliasedUint32Array?
Referencing the implementation itself: `AliasedUint32Array` is a class that encapsulates the technique of having a native buffer mapped to a JavaScript object. Writes to the native buffer can happen efficiently without going through JavaScript, and the data is then available to users via the exposed JavaScript object. While this technique is computationally efficient, it is effectively a write to JavaScript application state without going through the monitored API, so any VM capabilities to detect the modification are circumvented. The implementation is available on GitHub.
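From the JavaScript side of Node.js core, the aliased buffer simply looks like a `Uint32Array` whose contents the C++ parser overwrites in place. The sketch below is a simplification of how it could be consumed; again, `internalBinding` is only available to internal modules:

// Simplified sketch: the same Uint32Array is reused for every parse call,
// so no per-call result object has to be created and serialized.
const bindingUrl = internalBinding('url');
const components = bindingUrl.urlComponents; // Uint32Array of length 9

bindingUrl.parse('https://www.yagiz.co/');   // C++ writes the offsets in place
console.log(components[0]);                  // e.g. protocol_end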
Here's the implementation of the `Parse` function from the `BindingData` class.
void BindingData::Parse(const FunctionCallbackInfo<Value>& args) {
  CHECK_GE(args.Length(), 1);
  CHECK(args[0]->IsString());  // input
  // args[1] // base url

  BindingData* binding_data = Realm::GetBindingData<BindingData>(args);
  Environment* env = Environment::GetCurrent(args);
  HandleScope handle_scope(env->isolate());
  Context::Scope context_scope(env->context());

  Utf8Value input(env->isolate(), args[0]);
  auto out = ada::parse<ada::url_aggregator>(input.ToStringView());

  if (!out) {
    return args.GetReturnValue().Set(false);
  }

  binding_data->UpdateComponents(out->get_components(), out->type);
  args.GetReturnValue().Set(
      ToV8Value(env->context(), out->get_href(), env->isolate())
          .ToLocalChecked());
}
As a result of this optimization, we just need to update the `url_components_buffer_` with the offsets of the URL properties. This is done by the `binding_data->UpdateComponents` method.
JavaScript Class
Let's dive into the JavaScript implementation of the URL class. Here's the implementation as of Node.js 20 (April 24, 2023).
const bindingUrl = internalBinding('url');

class URL {
  #context = new URLContext();

  constructor(input, base = undefined) {
    input = `${input}`;
    const href = bindingUrl.parse(input, base);
    if (!href) {
      throw new ERR_INVALID_URL(input);
    }
    this.#updateContext(href);
  }
}
The `parse` method returns either a string or a falsy value, depending on the success of the parsing function. The string value represents the `href` part of the URL. Immediately after successful parsing, the `this.#updateContext(href)` method is called to access the URL components (the indexes of the URL properties) and update the current URL instance's context. As a result, the cost of parsing an invalid URL has significantly decreased.
Updating the URL context
The following code is triggered every time a URL setter is invoked, as well as every time a URL is constructed.
#updateContext(href) {
  // Store the serialized href; every other property is derived from it.
  this.#context.href = href;

  const {
    0: protocol_end,
    1: username_end,
    2: host_start,
    3: host_end,
    4: port,
    5: pathname_start,
    6: search_start,
    7: hash_start,
    8: scheme_type,
  } = bindingUrl.urlComponents;

  this.#context.protocol_end = protocol_end;
  this.#context.username_end = username_end;
  this.#context.host_start = host_start;
  this.#context.host_end = host_end;
  this.#context.port = port;
  this.#context.pathname_start = pathname_start;
  this.#context.search_start = search_start;
  this.#context.hash_start = hash_start;
  this.#context.scheme_type = scheme_type;
}
Due to the object destructuring of `bindingUrl.urlComponents`, the barrier between JavaScript and C++ is only crossed once, reducing the performance cost of string serialization.
The usage of indexes as offsets to access the URL properties through a string is not a new idea. As most of you know, it's called lazy loading.
What is lazy loading?
In the context of algorithms, lazy loading is a strategy for optimizing performance by deferring the calculation of values until they are actually needed. This is often used in cases where computing all possible values in advance would be inefficient or impractical. Instead, the algorithm only calculates values as they are requested, often caching the results for future use. This approach can help reduce the amount of computation required, improve memory usage, and speed up the overall execution time of the algorithm.
Using lazy loading forces us to understand the context in which it is applied. In the case of the URL class, the trade-off between the cost of parsing and the cost of accessing the URL properties is the main factor we need to consider. As a result of this experiment, and of optimizations made in both Ada and Node.js, we were able to reduce the performance cost of parsing by a significant amount.
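As a sketch of what this looks like in practice (simplified and hypothetical, not the exact Node.js core implementation), each getter derives its component from `href` and the stored offsets only when it is actually accessed:

// Illustrative lazy getters built on top of href + offsets.
class LazyURL {
  #context;

  constructor(context) {
    // context: { href, protocol_end, host_start, host_end } as produced by
    // a parser; the values below are hard-coded for illustration.
    this.#context = context;
  }

  get protocol() {
    // protocol_end points just past the ':' of the scheme.
    return this.#context.href.slice(0, this.#context.protocol_end);
  }

  get hostname() {
    return this.#context.href.slice(this.#context.host_start,
                                    this.#context.host_end);
  }
}

const url = new LazyURL({
  href: 'https://www.yagiz.co/',
  protocol_end: 6,
  host_start: 8,
  host_end: 20,
});
console.log(url.protocol); // 'https:'
console.log(url.hostname); // 'www.yagiz.co'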
Here's the result of parsing 100,000 URLs in different Node.js versions on an M1 Pro Max. The benchmark code is available on GitHub.
| Runtime        | Ada Version | Time (ms/iter) |
| -------------- | ----------- | -------------- |
| Node.js 20     | 2.0.0       | 38.97          |
| Node.js 19.8.1 | 1.0.4       | 79.29          |
| Node.js 19.7.0 | 1.0.1       | 105.59         |
| Node.js 19.6.1 | -           | 140.42         |
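For reference, a minimal benchmark along these lines could look like the following sketch. This is not the exact code from the linked repository; the URL list and the measurement approach are made up for illustration:

// Hypothetical benchmark sketch: parse 100,000 URLs and report ms per iteration.
const urls = Array.from({ length: 100_000 }, (_, i) => `https://example.com/path/${i}?q=${i}`);

function iteration() {
  for (const input of urls) {
    new URL(input);
  }
}

iteration(); // warm up
const start = process.hrtime.bigint();
iteration();
const end = process.hrtime.bigint();
console.log(`time: ${Number(end - start) / 1e6} ms/iter`);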
If you have a passion for performance and Node.js, we are actively looking for contributors for our performance team.