TL;DR: Don’t do weird stuff to JavaScript arrays.
*ahem*
What are arrays?
In the standard definition of Array data structures in Computer Science, arrays are a collection of like elements where any member may be accessed by an index. That definition doesn’t sound very different from an Hash Table or it’s ilk, but the key distinctions here are that the elements of the array must take up the same size in memory, and that the array indexes need to be numeric so that the logical address of any individual element can be derived from that number.
Arrays are typically created from contiguous memory. This makes for a simple calculation, like if we know that an Array lives at memory address 1024 and that each element in the array holds a 32-bit reference to an object then we can look up any individual item with a simple computation:
element_address = (1024 + (32 * index))You want the element at index 14? Here ya go, it’s the 32bits starting at address 1472!
What makes arrays important?
It’s important that indexes be numeric because it means we can compute the memory address of any element in the array in constant time. This makes for the fastest random access of any data structure – even in the worst case scenario. Hash Tables are great and all, but there is overhead associated with their hashing and collision handling.
Check out the Big-O Algorithm Complexity Cheat Sheet for a nice visual comparison.
So, back to our original question…
When is an array, not an array?
JavaScript is notoriously loosey goosey, and it’s array types are no exception.
Sure, we can do all the normal array type operations. But we can also do weird stuff.
Here’s a simple array definition, but how much memory do you think is ultimately allocated by this operation?
1 |
var myArray = [0,1,2]; |
Similar code in a language like C would have declared an a array with 3 x sizeof(int). Trying to set a value outside of that memory block (say at [100000]) would go badly in most languages, but JavaScript doesn’t mind at all. In fact, JavaScript is extremely lenient in what it will take. Let’s continue the example…
2 3 4 |
myArray[100000] = 100000; // JS is cool with this myArray[-1] = -1.33333; // and this myArray['mmm...need more coffee'] = 'go get it, lazy bones!'; // this too |
The above examples show several crimes against nature. Exceeding the bounds of original declaration, negative indexes, non-numeric indexes, setting different value types…heinous crimes indeed. These operations are incompatible with the definition above.
Given the information above about the key defining features of array data structures…how can JavaScript Arrays possibly be arrays? The answer, my friend, is ~blowing in the wind~ that there is a difference between array data structures and array types.
JavaScript arrays are always array types, but they aren’t always array data structures.
JavaScript doesn’t have an implementation, it has many. V8, TraceMonkey, and Chakra are all examples of popular JavaScript engines. This makes it hard to really say what “JavaScript does” when it comes to technical implementations. Even if we managed to find the relevant spot(s) in code for a particular engine, it could all change tomorrow. That said, things are pretty stable so a radical re-working of array implementations in V8 (for example) are unlikely. Code diving is difficult given the size of the engines, but lucky for us some really smart people have opted to guide us through the most remarkable bits.
In V8 (and most other implementations) The underlying data structures used to represent your JavaScript array in memory can change based on your usage. If you start out with an array of integers, then V8 will represent that with a dynamic array of integers. If you exceed the bounds of your array, then V8 will either allocate you a new array that is big enough to fit your new indexes or in some cases might even dynamically convert your array to a new data structure that is more efficient for dealing with sparse values. If you add a value to your array that does not match with the data type that V8 started you with then it will allocate you a new array with a more general type. In the code above we started with integers, but the type was changed once we added a decimal, and would have changed again when we added the string except that the index I used was also a string so once again, V8 would have dynamically converted the underlying data structure to accommodate.
The underlying data structures used to represent your JavaScript array in memory can change based on your usage
How does JavaScript get away with calling their arrays…arrays? Formal definitions don’t always line up cleanly with the “real” world and there is a distinction between Array Data Structures and Array Types. Array types are sometimes implemented with other data structures, like in the cases listed above, and that is okay. JavaScript is not the only language that is loosey goosey, at least it’s indexes start with zero!
So, what have we learned?
Don’t do weird stuff to JavaScript arrays. JS is lenient and complicated, and there are performance implications in addition to the normal readability nightmares caused by the examples shown above. Read this article, and watch the video if you want to learn more.
Thanks for reading, let me know what you think in the comments!
Like this post? It was a result of research for an upcoming podcast on data structures, so subscribe to Coding Blocks if you are interested in topics like this.