
In Node.js v14.3.0, I discovered (while doing some coding work with very large arrays) that subclassing an array can cause .slice() to slow down by a factor of about 20x. While I could imagine that there might be some compiler optimizations around a non-subclassed array, what I do not understand at all is how .slice() could be more than 2x slower than just manually copying elements from one array to another. That makes no sense to me. Does anyone have any ideas? Is this a bug, or is there some aspect of this that would/could explain it?

For the test, I created a 100,000,000-element array filled with increasing numbers. I made one copy of the array with .slice(), and I made another copy manually by iterating over the array and assigning each value into a new array. I then ran those two tests for both Array and my own empty subclass ArraySub. Here are the numbers:

Running with Array(100,000,000)
sliceTest: 436.766ms
copyTest: 4.821s

Running with ArraySub(100,000,000)
sliceTest: 11.298s
copyTest: 4.845s

The manual copy is about the same both ways. The .slice() copy is 26x slower on the sub-class and more than 2x slower than the manual copy. Why would that be?

And, here's the code:

// empty subclass for testing purposes
class ArraySub extends Array {

}

function test(num, cls) {
    let name = cls === Array ? "Array" : "ArraySub";
    console.log(`--------------------------------\nRunning with ${name}(${num})`);
    // create array filled with unique numbers
    let source = new cls(num);
    for (let i = 0; i < num; i++) {
        source[i] = i;
    }

    // now make a copy with .slice()
    console.time("sliceTest");
    let copy = source.slice();
    console.timeEnd("sliceTest");

    console.time("copyTest");
    // these next 4 lines are a lot faster than source.slice()
    const manualCopy = new cls(num);
    for (let [i, item] of source.entries()) {
        manualCopy[i] = item;
    }
    console.timeEnd("copyTest");
}

[Array, ArraySub].forEach(cls => {
    test(100_000_000, cls);
});

FYI, there's a similar result in this jsperf.com test when run in the Chrome browser. Running the jsperf in Firefox shows a similar trend, but not as large a difference as in Chrome.

  • My guess is that the inner loop of slice() is checking for getter/setter methods. Commented Jun 4, 2020 at 0:18
  • @Barmar - Is there a reason, .slice() would have to do more work than the manual assignment loop? Commented Jun 4, 2020 at 0:27
  • There's probably an optimization being done for the array indexing that isn't happening inside slice. Commented Jun 4, 2020 at 0:29
  • The V8 code here makes reference to FastSlice(), but is beyond what I can follow. Commented Jun 4, 2020 at 0:55
  • 1
    Perhaps the subclass elements are being treated as individual object key/value entries while the array entries are being treated as one piece of contiguous memory. Commented Jun 4, 2020 at 1:10

1 Answer


V8 developer here. What you're seeing is fairly typical:

The built-in .slice() function for regular arrays is heavily optimized, taking all sorts of shortcuts and specializations (it even goes as far as using memcpy for arrays containing only numbers, hence copying more than one element at a time using your CPU's vector registers!). That makes it the fastest option.

Calling Array.prototype.slice on a custom object (like a subclassed array, or just let obj = {length: 100_000_000, foo: "bar", ...}) doesn't fit the restrictions of the fast path, so it's handled by a generic implementation of the .slice builtin, which is much slower, but can handle anything you throw at it. This is not JavaScript code, so it doesn't collect type feedback and can't get optimized dynamically. The upside is that it gives you the same performance every time, no matter what. This performance is not actually bad, it just pales in comparison to the optimizations you get with the alternatives.
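A quick illustration of that generic path (a hedged sketch; `MyArr` is a hypothetical subclass used only for demonstration): it accepts any array-like object, and per the spec's ArraySpeciesCreate step, calling .slice() on a subclass produces an instance of that subclass, which is one concrete way a subclass differs observably from a plain array.

```javascript
// The generic path accepts any array-like object, not just real Arrays:
const arrayLike = { length: 3, 0: "a", 1: "b", 2: "c" };
const copy = Array.prototype.slice.call(arrayLike);
console.log(Array.isArray(copy), copy); // true [ 'a', 'b', 'c' ]

// A subclass also observably differs from a plain Array: per the spec's
// ArraySpeciesCreate, .slice() on a subclass returns a subclass instance.
class MyArr extends Array {}
const sub = MyArr.from([1, 2, 3]);
console.log(sub.slice() instanceof MyArr); // true
```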

Your own implementation, like all JavaScript functions, gets the benefit of dynamic optimization, so while it naturally can't have any fancy shortcuts built into it right away, it can adapt to the situation at hand (like the type of object it's operating on). That explains why it's faster than the generic builtin, and also why it provides consistent performance in both of your test cases. That said, if your scenario were more complicated, you could probably pollute this function's type feedback to the point where it becomes slower than the generic builtin.

With the [i, item] of source.entries() approach you're coming close to the spec behavior of .slice() very concisely at the cost of some overhead; a plain old for (let i = 0; i < source.length; i++) {...} loop would be about twice as fast, even if you add an if (i in source) check to reflect .slice()'s "HasElement" check on every iteration.
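The plain-loop variant described above might look like the following sketch (`manualSlice` is a made-up name for illustration): a classic indexed for loop with an `i in source` check mirroring .slice()'s per-element "HasElement" test, so holes in sparse arrays stay holes.

```javascript
// Indexed copy loop with a hole check, approximating spec .slice() behavior.
function manualSlice(source) {
  const len = source.length;
  const result = new source.constructor(len);
  for (let i = 0; i < len; i++) {
    if (i in source) {      // skip holes, as the spec's "HasElement" check does
      result[i] = source[i];
    }
  }
  return result;
}

const copied = manualSlice([1, , 3]); // middle element is a hole
console.log(copied.length, 1 in copied); // 3 false
```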


More generally: you'll probably see the same general pattern for many other JS builtins -- it's a natural consequence of running on an optimizing engine for a dynamic language. As much as we'd love to just make everything fast, there are two reasons why that won't happen:

(1) Implementing fast paths comes at a cost: it takes more engineering time to develop (and debug) them; it takes more time to update them when the JS spec changes; it creates an amount of code complexity that quickly becomes unmanageable leading to further development slowdown and/or functionality bugs and/or security bugs; it takes more binary size to ship them to our users and more memory to load such binaries; it takes more CPU time to decide which path to take before any of the actual work can start; etc. Since none of those resources are infinite, we'll always have to choose where to spend them, and where not.

(2) Speed is fundamentally at odds with flexibility. Fast paths are fast because they get to make restrictive assumptions. Extending fast paths as much as possible so that they apply to as many cases as possible is part of what we do, but it'll always be easy for user code to construct a situation that makes it impossible to take the shortcuts that make a fast path fast.
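As one concrete example of how easily user code can invalidate a fast path's assumptions (a hedged illustration, not how any particular engine decides): reading a hole in a sparse array falls through to the prototype chain, so an accessor installed on Array.prototype can be observed in the middle of a copy, which is exactly why spec-faithful code has to perform per-element checks.

```javascript
// Install a getter for index 0 on Array.prototype (configurable so we
// can clean up afterwards).
Object.defineProperty(Array.prototype, 0, {
  get() { return "came from the prototype"; },
  configurable: true,
});

const sparse = [, "b"];      // index 0 is a hole, not an own property
console.log(sparse[0]);      // "came from the prototype"

delete Array.prototype[0];   // clean up
console.log(sparse[0]);      // undefined
```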


11 Comments

  • What this also shows is how good the dynamic optimizations are.
  • Thx for responding. Why does the fact that you subclassed the array disqualify it from the built-in .slice() optimizations? It's the exact same array data structure. Nothing has changed there at all. In my real world example, there are just a few extra methods in the prototype chain, and in my test example above, there aren't even any extra methods in the prototype chain. Wouldn't it be doable to extend the optimization to a subclass like this? Or is it just not considered worth investigating? Or some edge case trouble?
  • I'm rapidly coming to the conclusion that I'm better off modifying the built-in Array prototype to add methods rather than subclassing, which until a 20x speed difference showed itself seemed like a bad thing to do. This definitely creates a quandary. Can't use typical OO design principles without losing meaningful performance. The fast path performance is appreciated, but seeing it disappear just because you used a basic OO subclass that shouldn't really interfere seems problematic too.
  • FYI, changing the manual copy for loop to for (let i = 0; i < len; i++) boosted its speed another 30%, making it that much faster than .slice() on the Array subclass. So, I guess either plain old for loops are meaningfully faster than iterators or I again stumbled into a fast path optimization.
  • @jmrk - I only noticed the performance difference because it was relevant in real world code. I'm doing some processing of very large arrays in node.js and the 20x difference changed a 10 second operation into a 200 second (over 3 minutes) operation. I noticed. I'd be curious what relevant assumptions might be invalidated by a subclass? Once the .slice() method call has been received, everything done in service of that request can be internal to the implementation. FYI, I never received notification of your comments so just now saw them - sorry for the delay.
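The workaround floated in the comments above (adding methods directly to Array.prototype instead of subclassing) could be sketched as follows. This is a hedged illustration: `sum` is a hypothetical example method, instances stay plain Arrays (so built-ins like .slice() keep their fast paths), and mutating built-in prototypes carries its own well-known collision risks.

```javascript
// Attach a non-enumerable helper to Array.prototype instead of subclassing.
Object.defineProperty(Array.prototype, "sum", {
  value() {
    let total = 0;
    for (let i = 0; i < this.length; i++) total += this[i];
    return total;
  },
  writable: true,
  configurable: true,
  enumerable: false, // keep it out of for...in loops and Object.keys
});

console.log([1, 2, 3].sum()); // 6
console.log(Array.isArray([1, 2, 3])); // true - still a plain Array
```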
