Sorting Speed Limits
Steven J. Zeil
OK, we’ve seen an algorithm that sorts in $O(n^{2})$ time.
There is actually an algorithm called Shell’s sort that uses insertion sorting as an internal component and that yields a family of sorting algorithms that run in some rather odd times: $O(n^{5/4})$, $O(n^{3/2})$, etc.
Those are faster than an ordinary insertion sort, so we might ask…
Just how fast can a sorting algorithm get?
1 Counting Comparisons
Consider how many comparisons are needed to determine the proper order for a set of elements:
- $n=2$ needs 1 comparison.
- $n=3$ needs between 2 and 3 comparisons.
    - E.g., if $a < b$, then ask whether $c < a$. If so, done. If not, is $c > b$?
- $n=k+1$ needs $\log k$ more comparisons than $n=k$: figure out the order for the first $k$ elements, then use $\log k$ additional steps to figure out where the $(k+1)$st element goes.
Why $\log k$?
Because we can use binary search to find the place where a given element should be inserted.
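As a sketch of that idea, here is a binary insertion sort in Python (the function name and the use of the standard `bisect` module are my own choices, not something from these notes). Binary search keeps each insertion down to about $\log k$ comparisons, although shifting elements into place still makes the overall running time $O(n^2)$:

```python
import bisect

def binary_insertion_sort(items):
    """Sort by inserting each element into an already-sorted prefix.

    bisect_left finds the insertion point with ~log k comparisons when
    the sorted prefix holds k elements, so the total comparison count
    is O(n log n) even though element shifts keep the time at O(n^2).
    """
    result = []
    for x in items:
        pos = bisect.bisect_left(result, x)  # binary search: ~log k comparisons
        result.insert(pos, x)                # shifting: up to k element moves
    return result
```

So the comparison count can already be brought down to $O(n \log n)$; it is the data movement, not the comparisons, that keeps this particular algorithm quadratic.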
2 Speed Limit
So, it looks like we need $\sum_{i=1}^{n} \log i$ comparisons.
- We could prove this formally, with more work — more than is really justified for this course.
What does this mean about performance?
\[ \begin{eqnarray*} \sum_{i=1}^{n} \log(i) & \geq & \sum_{i=n/2}^{n} \log(i) \\ & \geq & \sum_{i=n/2}^{n} \log(n/2) \\ & \geq & (n/2) \log(n/2) \\ & = & (n/2) (\log(n) - \log(2)) \\ & = & \Omega(n \log(n)) \end{eqnarray*} \]
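A quick numerical sanity check of this chain of inequalities (a sketch only; base-2 logarithms and $n = 1000$ are chosen here purely for illustration):

```python
import math

def log_sum(n):
    # sum_{i=1}^{n} log2(i), i.e. log2(n!)
    return sum(math.log2(i) for i in range(1, n + 1))

n = 1000
exact = log_sum(n)
lower = (n / 2) * math.log2(n / 2)  # the (n/2) log(n/2) bound from the derivation
# The truncated sum really is a lower bound, and the full sum is
# sandwiched between (n/2) log(n/2) and n log n:
assert lower <= exact <= n * math.log2(n)
```

The sum is squeezed between two multiples of $n \log n$, which is exactly what the $\Omega(n \log n)$ conclusion says.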
So, we conclude that
No sorting algorithm that works by pairwise comparison (i.e., comparing elements two at a time) can be faster than $O(n \log n)$ in the worst or average case.
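One way to see this limit in action is to count comparisons directly. The sketch below (the wrapper class and the particular bounds checked are my own illustration, not part of the notes) wraps values in a class whose `__lt__` increments a counter, then sorts with Python's built-in `sorted`, which is a pairwise-comparison sort:

```python
import math
import random

class Counted:
    """Wrapper that counts every pairwise comparison made while sorting."""
    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons = Counted.comparisons + 1
        return self.value < other.value

n = 1024
data = [Counted(v) for v in random.sample(range(n), n)]
Counted.comparisons = 0
sorted(data)  # the built-in sort only ever compares elements two at a time
# Any comparison sort needs at least n - 1 comparisons just to confirm
# sorted order, and O(n log n) of them on average; check both loosely:
assert n - 1 <= Counted.comparisons <= 2 * n * math.log2(n)
```

On a random permutation of 1024 elements the count lands in the $n \log n$ range, consistent with the bound derived above.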
3 But Can We Even Go That Fast?
$O(n \log(n))$ is a pretty good speed in practice, often not noticeably slower than $O(n)$. After all, we usually need $O(n)$ time just to read or fetch the array of data that we want to sort.
But can we actually find algorithms that achieve this $O(n \log(n))$ optimum speed?
Our next lessons will show that we can.