Conquering JavaScript and Beating the Natives

January 30, 2016

It is reasonable to assume that native code that powers modern web browsers is heavily optimized by really smart people.

But in many cases, we can improve the performance of our web apps by writing code in JavaScript that performs even better. This article explains one such case in detail, and offers some key takeaways for creating screamingly fast and responsive interfaces for your users.

Pardon the provocative title; I generally advocate pacifism. But the war against page bloat and janky web animations is one worth waging.

Performance is Paramount

A primary goal of every serious web developer should be to make their sites and services as performant as possible. Good User eXperiences (UX) depend on it.

Within the RAIL performance model, at least 3 of the 4 performance categories are affected by JavaScript performance. So it is well worth considering whether you can squeeze more speed from your code, particularly if you are not meeting the minimum requirement for user responsiveness.

The dangers of premature optimization are well documented. Indeed, it is essential that you are making improvements where they will be most felt. Generally this lies in smarter management of the DOM, network services, caching, and other big-return performance enhancements. We're talking 10x or 100x improvements here.

Improving on the performance of native code is a micro-optimization. We are taking something that is already very fast, and making it a bit faster. It is definitely not, as they say, low-hanging fruit.

But sometimes it is worth pursuing the micro; the 20% improvement; the fruit that sits safely towards the top of the tree. 

When It's Time to Micro-Optimize

Recently, while optimizing Zousan, my implementation of the JavaScript Promise, I found myself looking for any bit of performance I could squeeze out of JavaScript. I wanted to create the very fastest Promise implementation around. And with a lot of smart people working on other implementations, that wasn't going to be easy.

It wasn't just due to some kind of competitive fervor. I am building a JavaScript foundation atop these Promises and I want to have no hesitation whatsoever in using them. Regardless of whether I'm deep in an animation loop, performing network communications in a game, playing music, or some other performance-sensitive task, I want to know that returning a Promise rather than a callback will have absolutely no adverse effect on performance. Under any circumstances. Ever.

This requires blistering speed from the Promise implementation.

I had taken care of the (more) obvious performance changes, such as minimizing the code path, lazy variable creation, converting any array forEach() calls to for-loops, etc.* The code was already very fast – on par with some of the faster Promise libraries out there. But to be the fastest, it was time for micro-optimization!

*Note: These changes could be called a micro-optimization as well. But what I did next was even MORE micro! ;-)
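
As a simple illustration (this is not the actual Zousan internals - the callbacks and value names here are hypothetical), converting a forEach() call into a plain for-loop looks something like this:

// Before: forEach() invokes a callback function for every element
callbacks.forEach(function(cb) { cb(value); });

// After: a plain for-loop avoids the per-element function call overhead
for(var i = 0; i < callbacks.length; i++)
	callbacks[i](value);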

Pursuing A Faster FIFO Queue

A FIFO Queue (for those who may not know) is a First In First Out queue, like the line at your local grocery store: the order you get into line is the order you are served. This is how queues usually work, in fact, in contrast to a LIFO queue (Last In First Out), which is often referred to as a "stack". LIFO is more like how your sock drawer works. And if you're not careful, your refrigerator.

But internal Promise management relies on a FIFO queue because many tasks need to be scheduled to happen asynchronously, but in a specific order. The details here are not important - just the fact that a very fast FIFO queue was required.

A Simple JavaScript FIFO Queue

So how would you manage a FIFO queue in JavaScript? You can simply use an array, and place new items at the end of it, while taking items from the front. To place an item at the end of an array, you can use push():

array.push(item)

An item can be removed from the front via shift():

item = array.shift()

So a complete FIFO Queue could be easily written as:

function FIFOQueue()
{
	// Our internal queue array. Push items to end, grab from front.
	var q = []; 
	
	return {
		add: function(item)
		{
			// this appends the item to the end of the array
			q.push(item);
		},
	
		take: function()
		{
			// this removes and returns the first item in the array
			return q.shift();
		}
	}	
}

You can use this queue as such:

var myq = FIFOQueue();

myq.add(10);
myq.add(20);
myq.add(30);
myq.add(40);

console.log("first in was " + myq.take())  // 10
console.log("The next was " + myq.take()) // 20

JavaScript engines in modern browsers are fairly "well-oiled" at this point, and this runs pretty fast. Very fast in fact. 

When Fast is Not Fast Enough

Let's simulate some traffic on this queue - we will add and take items until a million have gone through the queue. On the machine I am writing this on, it takes about 150ms (using Safari 9.0.3). Try it yourself:

FIFO Queue test
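
The exact test harness is not shown here, but a minimal sketch of that kind of benchmark (with arbitrary batch sizes) might look like this:

// A minimal benchmark sketch - the real test harness is not shown in this
// article, so the batching below is illustrative only
function timeQueue(makeQueue, total)
{
	var queue = makeQueue(),
		t0 = performance.now(),
		added = 0,
		taken = 0;

	while(taken < total)
	{
		// add a small batch, then drain part of it, to simulate mixed traffic
		for(var i = 0; i < 10 && added < total; i++)
			queue.add(added++);

		for(var j = 0; j < 5 && taken < added; j++)
		{
			queue.take();
			taken++;
		}
	}

	return performance.now() - t0;
}

console.log("Simple queue: " + timeQueue(FIFOQueue, 1000000) + "ms");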

Now, a million is a fairly large number in the physical world. But for a computer simply pushing and popping from a queue, one million is rather small. So for this to take 150ms on modern hardware is somewhat slow. (Recall that for a smooth 60fps animation, our per-frame budget is only about 10ms once the browser has taken its share of each 16ms frame.) What gives?

Mind the Gap

I'm not privy to the details of the JavaScriptCore/Nitro engine which powers Safari. But I do know how arrays are often stored internally. They are written in memory such that a reference to an item can be calculated directly from the index into the array.*

An item at array[743], for example, can be found in memory by taking the array's base address and adding 743 multiplied by the size of each item. This provides very fast access to any item in the array, regardless of how large it grows.
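
For example, assuming a flat array of 8-byte slots starting at a made-up base address:

// Illustrative only - the slot size and base address are made up
var base = 0x10000,		// where the array data begins in memory
	slotSize = 8;		// bytes per item

var address = base + (743 * slotSize);	// 0x10000 + 5944 = 0x11738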

But what happens when you delete an item in such an array? To maintain the index/position correlation, every item that appears after the deleted item must be moved up to fill the gap. The closer to the "front" of the array that an item is deleted, the more items must be moved.
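
Conceptually (and greatly simplified - real engines are far more sophisticated), deleting an item from a flat array forces work like this:

// A conceptual sketch of deleting arr[index] from a flat array
function naiveDelete(arr, index)
{
	for(var i = index + 1; i < arr.length; i++)
		arr[i - 1] = arr[i];	// every item after the gap moves up one slot

	arr.length = arr.length - 1;	// drop the now-duplicated last slot
}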

The worst case scenario is when the very first item is deleted, as in the simple FIFO queue we implemented above. I suspected this was exactly why these JavaScript engines were slow.

*Note: JavaScript arrays may instead be implemented as associative arrays and referenced via a hash table. When that is the case, the engine must re-index rather than move items when an item is deleted, but this produces the same effect and shares the same solution.

A Fast FIFOQueue

In our FIFO queue, we are always removing the first item, and so the "gap" is always at the start. Knowing this, we can employ a "cheat" so that we do not need to delete the first item and incur that slow move operation.

We simply track how many items have been deleted rather than actually deleting them. Each time a value is taken from this "fast" FIFO queue, the deleted items counter is increased, and that becomes the index to the start of the queue. Internally, the deleted values will remain, but the behavior is identical to the implementation above.

Doing this, however, will cause the array to keep growing as items are added and removed, since we are never really deleting them - which amounts to a memory leak. To solve this, we let the array grow until a certain number of items have been "deleted", then actually delete them internally and let the memory move happen - just much less often.

So our FIFOQueue now looks like this:

function FastFIFOQueue()
{
	var q = [],				// Our internal queue array. Push items to end
		start = 0,			// Our deleted items counter - index to front of queue
		CHUNK_SIZE = 1024	// The number of allowed deleted items before internal delete
	
	return {
		add: function(item)
		{
			q.push(item)
		},
	
		take: function()
		{
			if(q.length - start > 0)	// Be sure the queue isn't empty
			{
				var item = q[start]		// grab the first item in the queue
				start += 1				// increment the start
				if(start == CHUNK_SIZE)	// Check if we reached max deleted size
				{
					q.splice(0,CHUNK_SIZE)	// Actually remove the deleted items
					start = 0				// Reset deleted items counter to 0
				}

				return item
			}
		}
	}	
}
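
Since the interface hasn't changed, it is a drop-in replacement for the simple queue:

var myq = FastFIFOQueue();

myq.add(10);
myq.add(20);

console.log(myq.take())	// 10
console.log(myq.take())	// 20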

Now let's try the same test as before with the new queue:

Fast FIFO Queue Test
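
Or, using the same (hypothetical) timeQueue() sketch from above:

console.log("Fast queue: " + timeQueue(FastFIFOQueue, 1000000) + "ms");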

This test completes on my machine in 25ms - a 6x improvement!

Are Browser Developers Incompetent?

You may be wondering, how can a few lines of JavaScript outperform the same operations within the native browser engine? How can highly optimized C++ code be 500% slower than its JavaScript equivalent? Are the browser developers idiots!?

Of course not. Some of them have decades of experience optimizing interpreted and JIT-compiled languages like JavaScript. We can outperform them because of prioritization and, more importantly, fitness to purpose.

Custom Built Always Wins

The JavaScript array implementation is general purpose. Using it as a FIFO queue is merely one of an infinite number of uses - and so it does not make sense for the native implementation to optimize for FIFO queues.

Also, the browser developer team has a finite amount of time, and so must prioritize what they spend it on. The Zousan promise implementation blows away the native browser promise implementation because it was important enough to me to spend the time to optimize it. It clearly was not for the browser developers.

When you fully understand the technology and your problem domain, you can almost always write code that outperforms existing libraries - even native code.

Which brings us to the key takeaway: you can almost always improve on the performance of an existing library or native implementation when you custom-code it for your own specific purpose.

Of course, this doesn't mean you always should. You need to weigh that performance gain against the time to develop/debug/maintain your custom code.

But sometimes performance is of the utmost importance and even squeezing out a millisecond or two is worth the effort, such as:

  • When developing games in HTML/JavaScript
  • When developing mobile apps in HTML/JavaScript
  • When your code is used within an animation frame render
  • When working on server code that will be used often by your site or services
  • When you are developing a library that may be used in any of the above circumstances

You should never settle for janky web interfaces, heavy mobile apps, or slow web services - and neither should your users. Always consider the performance characteristics of your decisions and don't be afraid to reinvent a faster wheel on occasion.


If you or your organization could use a somewhat fanatical developer on your current or next project, let's talk. I am passionate about the user experience, easy to work with, and I get stuff done. Let's build something great!


Glenn is the founder of bluejava K.K., and has been providing consulting services and developing web/mobile applications for over 15 years. Glenn lives with his wife and two children in Fujisawa, Japan.