NgRx is 40x faster than your code – find out why

When I started using @ngrx/store to hold collections of information, I usually put the data into the store as a JavaScript array. It seemed to be the simplest and most appropriate data structure for the information. However, when @ngrx/entity came out, I saw that it used a different pattern – instead of using the array directly, it converts the array to two data structures; an array of ids and an object map keyed by those ids. Why did they do this? And is there a lesson we can learn for our own code?

Array.prototype.find()

The standard way of dealing with a collection of data in JavaScript is to use an array, which works fine for many purposes, but it’s not great for searching for a value within the array. There is the find method on the Array prototype, but it’s a brute force search; when the data isn’t sorted by what we’re searching for, the only thing the search can do it check each value to look for the first match. If the first value matches – great! If the last value matches, then the find algorithm has to look at every single value in the array (the same is true if there’s no match – it has to check every value of the array to seek a match). For a small array, or for an array that isn’t searched often, that’s not a problem. But it can be slow on a big array or for frequent searches.

Imagine we have an array with 100 values, and we need to check that array 100 times for matches. Using find, we would do 100 brute-force searches, each of which would take an average of 50.5 comparisons before finding a match (assuming there’s always a match) – that’s 5,050 comparisons.

Entity objects

We can reduce the number of comparisons (and the total number of operations) by using an Entity Object instead. This object will have the same number of properties as the original array, but the keys will be the ids of the objects instead of an index and the values will be the thing we’re searching through. So, instead of [{id: 'abc', value: 1}, {id: 'def', value: 2}], we get {abc: {id: 'abc', value: 1}, def: {id: 'def', value: 2}}. Why does this matter? Because getting a value from an object through the key is a lightning-fast operation, taking the same amount of time no matter the size of the data structure.

In the case we imagined above, the total number of operations would be 200: 100 operations to create the object and another 100 operations to get the values out. 200 is a lot fewer than 5,050.

The Entity Object approach scales really well. If the array were 1000 elements checked 1000 times, the array approach would balloon to 500,500 comparisons, but the object approach would grow to only 2000 operations. This happens because the brute-force find approach grows in effort with the size of the array (known a O(n) in computer science), whereas the object property approach is constant independent of object size (known as O(1)).

There are some situations where this wouldn’t be a great idea – this approach involves making a copy of the initial data structure and keeping it in memory. If that copy is never used, then the initial effort to build the array is wasted. Also, if the device is memory-constrained and speed doesn’t matter much, then adding a copy of the data structure might be optimizing in the wrong direction.

Seeing it in action

The NgRx team uses this for the entity library in order to support large apps with large arrays with the best performance possible. So, how do you make use of the same approach?

Let’s say you have an array of countries and an array of Olympic athletes. Your backend provides both of these arrays, and you want to show the athletes and each of their countries in a table. To reduce the size of data sent over the network, the array of athletes contains the id of the country, not the country itself. So, for display, your job is to find the relevant country for each athlete for display.

export interface Country {
  id: number;
  name: string;
}

export interface Athlete {
  id: number;
  name: string;
  countryId: number;
}

export interface AthleteDisplay {
  name: string;
  countryName: string;
}

export countries: Country[] = [
  { id: 1, name: 'USA' },
  { id: 2, name: 'Canada' },
  { id: 3, name: 'Russia' },
  { id: 4, name: 'France' },
];

export athletes: Athlete[] = [
  { id: 1, name: 'Jane Alexander', countryId: 1 },
  { id: 2, name: 'Henri Lamarque', countryId: 4 },
  { id: 3, name: 'Bobby Thomas', countryId: 2 },
  { id: 4, name: 'Lyudmila Belyaev', countryId: 3 },
  { id: 5, name: 'Marie Combe', countryId: 4 },
];

So, with those Athletes and those Countries, how do we create the five relevant AthleteDisplays to show in a table? Here’s a typical approach:

const athleteDisplays: AthleteDisplay[] = athletes.map(athlete => {
  const country = countries.find(country => country.id === athlete.countryId);
  return {
    name: athlete.name,
    countryName: country ? country.name : ''
  }
});

With this approach, we run through one to four of the countries for each of the five athletes; in this case, we will check 14 countries overall.

Here’s an alternate approach to reduce the number of operations:

const countryDictionary = keyBy(countries, 'id');
const athleteDisplays: AthleteDisplay[] = athletes.map(athlete => {
  return {
    countryName: countryDictionary[athlete.countryId].name,
    name: athlete.name
  };
});

In this version, we’re using keyBy from lodash to create a Dictionary<Country> (the same thing as a Record<string, Country>) that will map every id to a Country. With this dictionary in hand, we can now get the country from the dictionary using property bracket notation, which is a single operation for each display.

The total count of operations for this approach is 9 (5 conversions internal to keyBy and 4 property access calls). This looks like it might be better, but is it?

I’ve created a small sample script on GitHub to test the performance of these two scripts, and here’s a typical result:
Array.find() time 0.089s
entity object time 0.479s

Clearly, using the Entity Object costs some time. So, is it worth it?

Yes, if your array is large enough. Below is a table of the output from the same application, associating 1000 athletes from a different number countries. In the first run, there is one country, then 10, then 100, etc., all the way up to 100000 possible countries (that would be quite the Olympics!).

The table shows a typical run of the amount of time (in seconds) to build the display array using Array.find() vs. the total time using the Entity Object approach. It also breaks the Entity Object runs down to see how much time is necessary to build the map vs. running the search.

(index)countriesArray.find()total time w/ entitybuilding the entity mapsearching with entity
010.2880.2260.0230.203
1100.5700.4860.0360.450
21001.5460.1760.0240.151
310000.8840.3780.0610.317
41000010.9260.7650.6790.086
5100000275.3716.5276.3810.146

As you can see, there is little benefit to using the Entity Object approach for a small array (in some runs, it’s even a negative benefit for the small arrays), but the benefit becomes clear as the arrays get larger.

For 1 and 10 elements, using the simple approach takes roughly the same amount of time. For 100, 1000, and 10000 elements, the entity approach is starting to be a bit faster; by the 10000-element run, the total running time for the entity approach is roughly 14 times less than for the simple approach. For the 100000 element run, the simple approach is about 40 times slower.

The lesson is that the entity approach scales significantly better than the simple approach: it’s a O(1) comparison instead of a O(n) comparison, and that pays huge dividends for large data sets. The Entity Object approach scales much better than the simple Array.find()-based approach.

What about that id array?

At the top, I mentioned that there were two data structures. I’ve discussed the object keyed by the ids fully, but what about the array of ids? What was that for?

The id array maintains the ordered nature of the array. The object is great at fast access, but it doesn’t maintain the order of the elements; the id array does that for us. For example, if you want to get the 5th element of the original array, you can get the id of the 5th element from the id array and then use the object to get the full element quickly.

Conclusion

The @ngrx team didn’t invent this pattern (it was part of the Redux world and many other libraries earlier), but they knew what they were doing when they adopted the Entity Object approach; although it doesn’t matter much for small arrays, it doesn’t really hurt. In contrast, the performance improvement for searching large arrays is huge.

If your code has a lot of large arrays that you need to search through, adopting this approach (with or without @ngrx/entity) can be a significant performance win.

Note: This article originally appeared on JavaScript January