Vector with Constant-Time Remove - still a Vector?
- by Darrel Hoffman
One of the drawbacks of most common implementations of the Vector class (or ArrayList, etc. Basically any array-backed expandable list class) is that their remove() operation generally operates in linear time - if you remove an element, you must shift all elements after it one space back to keep the data contiguous. But what if you're using a Vector just to be a list-of-things, where the order of the things is irrelevant? In this case removal can be accomplished in a few simple steps:
Swap element to be removed with the last element in the array
Reduce size field by 1. (No need to re-allocate the array, as the deleted item is now beyond the size field and thus not "in" the list any more. The next add() will just overwrite it.)
(optional) Delete last element to free up its memory. (Not needed in garbage-collected languages.)
This is clearly now a constant-time operation, since only performs a single swap regardless of size. The downside is of course that it changes the order of the data, but if you don't care about the order, that's not a problem.
Could this still be called a Vector? Or is it something different? It has some things in common with "Set" classes in that the order is irrelevant, but unlike a Set it can store duplicate values. (Also most Set classes I know of are backed by a tree or hash map rather than an array.) It also bears some similarity to Heap classes, although without the log(N) percolate steps since we don't care about the order.