You have a sequence of numbers. You want to rearrange them so they are in non-decreasing order.
More precisely: given an array $A[1..n]$, find a permutation $\pi$ of the indices $\{1, \dots, n\}$ such that $A[\pi(1)] \le A[\pi(2)] \le \cdots \le A[\pi(n)]$.
A permutation is just a reordering. For example, for input $[3, 1, 2]$ the output is $[1, 2, 3]$.
Each element has a key (what you sort by) and possibly a value (data attached to it). You sort by key, you carry the value along for the ride.
The algorithm sorts the array using only the original array itself, with at most a constant amount of extra memory (a few variables, not another array of size $n$). No auxiliary array proportional to $n$ is needed.
Why does this matter? Memory. If your array is 4GB, you can't afford a second 4GB copy just to sort it.
If two elements have equal keys, they appear in the output in the same relative order they had in the input.
Why does this matter? Consider sorting a list of students first by grade, then by name. If the name sort is stable, students with the same name will remain in grade order from the first sort. If it is not stable, the grade ordering is destroyed.
Making any algorithm stable: replace each key $k$ with the pair $(k, i)$, where $i$ is the original index. Compare pairs lexicographically: first by $k$, then by $i$. This breaks all ties by original position, guaranteeing stability. Cost: slightly larger keys, same asymptotic complexity.
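The decoration trick can be sketched in a few lines of Python. (Python's built-in sort is already stable, so this is purely illustrative of the technique; the function name is mine.)

```python
def stable_by_decoration(items, key):
    """Force stability by tie-breaking on the original index."""
    # Decorate: attach the original index to each key.
    decorated = [((key(x), i), x) for i, x in enumerate(items)]
    # Sort lexicographically by (key, original index): ties are now impossible.
    decorated.sort(key=lambda pair: pair[0])
    # Undecorate: drop the (key, index) pairs.
    return [x for _, x in decorated]

records = [("alice", 2), ("bob", 1), ("carol", 2)]
print(stable_by_decoration(records, key=lambda r: r[1]))
# [('bob', 1), ('alice', 2), ('carol', 2)]
```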
An array is a fixed-size block of memory. Element $A[i]$ lives at a fixed memory address computable from the base address and $i$, so every access is $O(1)$ regardless of $n$.
Arrays cannot grow. To resize, you allocate a new bigger array and copy everything over: $\Theta(n)$ work. This is fine if you do it rarely.
Notation: $A[i..j]$ is the subarray $A[i], A[i+1], \dots, A[j]$.
The idea behind incremental sorting is simple: maintain a sorted prefix and extend it one element at a time.
Formally, at step $i$ you assume $A[1..i-1]$ is sorted and you extend the sorting to $A[1..i]$.
Both algorithms below are in-place.
At step $i$, the first $i-1$ positions already contain their final values (the $i-1$ smallest elements in sorted order). Find the minimum of the remaining unsorted portion $A[i..n]$ and swap it into position $i$.
```
SelectionSort(A[1..n]):
    for i = 1 to n-1:
        m = MinIndex(A, i)   // find index of min in A[i..n]
        Swap(A, i, m)

MinIndex(A[1..n], i):
    for j = i+1 to n:
        if A[j] < A[i]:
            i = j
    return i

Swap(A[1..n], i, j):
    tmp = A[i]
    A[i] = A[j]
    A[j] = tmp
```
Array: $[5_a, 3, 4, 5_b, 6, 1]$ (subscripts distinguish equal keys; an illustrative input). Step 1 swaps the minimum $1$ into position 1: $[1, 3, 4, 5_b, 6, 5_a]$.
Final: $[1, 3, 4, 5_b, 5_a, 6]$.
The swap in step 1 of the example moved $5_a$ from position 1 all the way to position 6, jumping over the other $5_b$ at position 4. Two elements with equal keys have swapped their relative order. There is no easy fix that keeps the algorithm in-place with $O(1)$ extra memory: stability is sacrificed by design.
At step $i$, MinIndex scans positions $i+1$ through $n$: that is $n - i$ comparisons. The swap is $O(1)$.
Crucially, both loops always run to completion regardless of the input. The array could already be sorted; it doesn't matter. The inner loop always scans the full remaining portion.
Total comparisons:

$$\sum_{i=1}^{n-1} (n - i) = (n-1) + (n-2) + \cdots + 1 = \frac{n(n-1)}{2} = \Theta(n^2)$$
Best case, average case, worst case: all $\Theta(n^2)$.
The input order is completely irrelevant to the running time.
At step $i$, assume $A[1..i-1]$ is sorted. Take element $A[i]$ and walk it leftward, swapping it with its left neighbor as long as it is smaller. It bubbles into its correct position within the sorted prefix.
This is exactly how most people sort a hand of playing cards: you pick up one card at a time and slide it left until it fits.
```
InsertionSort(A[1..n]):
    for i = 2 to n:
        j = i
        while j > 1 and A[j] < A[j-1]:
            Swap(A, j, j-1)
            j = j - 1
```
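A direct Python rendering of the pseudocode (0-indexed, in-place; the function name is mine):

```python
def insertion_sort(a):
    """In-place insertion sort, mirroring the pseudocode above."""
    for i in range(1, len(a)):
        j = i
        # Walk a[i] leftward while it is strictly smaller than its neighbor.
        # Strict < keeps the sort stable: equal elements never swap.
        while j > 0 and a[j] < a[j - 1]:
            a[j], a[j - 1] = a[j - 1], a[j]
            j -= 1
    return a
```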
Array: $[3, 1, 4, 2]$ (an illustrative input). Step $i = 2$: the $1$ walks left, giving $[1, 3, 4, 2]$. Step $i = 3$: the $4$ stays put. Step $i = 4$: the $2$ walks left past $4$ and $3$, giving $[1, 2, 3, 4]$.
The while loop condition uses strict inequality: A[j] < A[j-1]. When two elements are equal, the condition is false and the loop stops. Equal elements are never swapped past each other. Their original relative order is preserved.
This is not accidental — it is an intentional design choice. If you changed < to <=, InsertionSort would become unstable.
The worst case is when the array is sorted in reverse order (largest to smallest). Then at step $i$, the element $A[i]$ is smaller than all of $A[1..i-1]$, so the while loop runs $i-1$ times (it travels all the way to position 1). Total: $\sum_{i=2}^{n} (i-1) = \frac{n(n-1)}{2} = \Theta(n^2)$.
The best case is when the array is already sorted. At step $i$, $A[i] \ge A[i-1]$, so the while loop condition is false immediately and the loop body never executes.
Each step then costs $\Theta(1)$, for a total of $\Theta(n)$. This is linear. SelectionSort never achieves this; it always does $\Theta(n^2)$ work.
We assume that at step $i$, the while loop runs about $(i-1)/2$ times on average (the element lands somewhere in the middle of $A[1..i-1]$). Total: roughly $\sum_{i=2}^{n} \frac{i-1}{2} = \frac{n(n-1)}{4}$ comparisons.
The constant factor is halved compared to the worst case, but the growth rate is still $\Theta(n^2)$.
The best case generalizes beyond perfectly sorted arrays.
Suppose the array is sorted except for $k$ misplaced elements. The outer for loop always runs $n-1$ times, contributing $\Theta(n)$. The while loop only does significant work on the misplaced elements. Each one travels at most $n$ positions leftward.
Total cost: $O(n + kn)$.
When $k = O(1)$ (constant number of misplaced elements), this is $O(n)$: linear.
This is important in practice. Databases, log files, and sensor streams often produce nearly-sorted data. InsertionSort handles these efficiently while MergeSort or QuickSort would still pay $\Theta(n \log n)$.
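The effect is easy to observe by counting swaps (while-loop iterations). This sketch, with names of my choosing, instruments insertion sort and runs it on sorted, nearly-sorted, and reverse-sorted inputs:

```python
def insertion_sort_count(a):
    """Insertion sort that also returns the number of swaps performed."""
    swaps = 0
    for i in range(1, len(a)):
        j = i
        while j > 0 and a[j] < a[j - 1]:
            a[j], a[j - 1] = a[j - 1], a[j]
            j -= 1
            swaps += 1
    return a, swaps

already_sorted = list(range(1000))
nearly = list(range(1000))
nearly[10], nearly[900] = nearly[900], nearly[10]   # two misplaced elements
reverse = list(range(1000, 0, -1))

print(insertion_sort_count(already_sorted)[1])  # 0
print(insertion_sort_count(nearly)[1])          # small: two long walks
print(insertion_sort_count(reverse)[1])         # 499500 = n(n-1)/2
```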
A general algorithmic strategy: divide the problem into smaller subproblems, solve each recursively, and combine the partial solutions into a solution for the whole.
The two algorithms below both split the array in two and recurse, but they differ in where the work happens: MergeSort does its work while combining (the merge), QuickSort while dividing (the partition).
Base case: a subarray of length 1 is already sorted (nothing to do).
```
MergeSort(A[1..n], p, r):
    if p < r:
        q = floor((p + r) / 2)
        MergeSort(A, p, q)
        MergeSort(A, q+1, r)
        Merge(A, p, q, r)
```
First call: MergeSort(A, 1, n).
This is the heart of the algorithm. You have two sorted subarrays $A[p..q]$ and $A[q+1..r]$ and you need to produce a single sorted subarray in $A[p..r]$.
The approach: use two pointers, one for each half. At each step, compare the two pointed-to elements and take the smaller one into a temporary array $B$. When one half is exhausted, dump the remainder of the other half.
```
Merge(A[1..n], p, q, r):
    B = new array of size r - p + 1
    i = p       // pointer into left half A[p..q]
    j = q + 1   // pointer into right half A[q+1..r]
    k = 1       // pointer into B
    while i <= q and j <= r:
        if A[i] <= A[j]:
            B[k] = A[i]; i++
        else:
            B[k] = A[j]; j++
        k++
    // drain whichever half still has elements
    while i <= q:
        B[k] = A[i]; k++; i++
    while j <= r:
        B[k] = A[j]; k++; j++
    // copy B back into A[p..r]
    for k = 1 to r - p + 1:
        A[p + k - 1] = B[k]
```
Each iteration of each while loop increments exactly one of $i$, $j$, or $k$. The pointers $i$ and $j$ together move from $p$ and $q+1$ all the way to $q+1$ and $r+1$ respectively: exactly $r - p + 1$ total increments across all three while loops. The final copy loop runs $r - p + 1$ times. Total: $\Theta(r - p + 1)$, linear in the size of the subarray.
The merge condition is A[i] <= A[j] (with $\le$, not $<$). When two elements are equal, the one from the left half is taken first. Since the left half corresponds to earlier positions in the original array, relative order is preserved.
The merge step allocates a temporary array of size $r - p + 1$. At the top level, this is $\Theta(n)$ extra memory. It is possible to merge in-place, but the resulting algorithm is significantly more complex and slower in practice.
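A compact Python sketch of the whole algorithm. For simplicity it returns a new sorted list rather than sorting in place; the structure (recursive halving plus a two-pointer merge) matches the pseudocode above, and the names are mine.

```python
def merge_sort(a):
    """Top-down mergesort; returns a new sorted list."""
    if len(a) <= 1:
        return a                        # base case: length 0 or 1 is sorted
    mid = len(a) // 2
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    # Merge the two sorted halves with two pointers.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:         # <= keeps the merge stable
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])                # drain whichever half remains
    out.extend(right[j:])
    return out
```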
The recurrence:

$$T(n) = 2\,T(n/2) + \Theta(n)$$
Two recursive calls each on half the array, plus $\Theta(n)$ for the merge.
Solving with the Master Theorem: this is the form $T(n) = a\,T(n/b) + f(n)$ with $a = 2$, $b = 2$, $f(n) = \Theta(n)$. We compare $f(n)$ with $n^{\log_b a} = n^{\log_2 2} = n$. Since $f(n) = \Theta(n^{\log_b a})$, we are in case 2 of the Master Theorem: $T(n) = \Theta(n \log n)$.
Intuition without the Master Theorem: at each level of recursion, the total work for all merges at that level is $\Theta(n)$ (every element participates in exactly one merge per level). There are $\Theta(\log n)$ levels (the recursion halves the subarray size each time). Total: $\Theta(n \log n)$.
Best case = average case = worst case = $\Theta(n \log n)$. The initial order of the array is completely irrelevant.
```
QuickSort(A[1..n], p, r):
    if p < r:
        q = Partition(A, p, r)
        QuickSort(A, p, q - 1)
        QuickSort(A, q + 1, r)
```
First call: QuickSort(A, 1, n).
```
Partition(A[1..n], p, r) -> int:
    x = A[r]        // pivot = last element
    i = p - 1
    for j = p to r - 1:
        if A[j] <= x:
            Swap(A, i + 1, j)
            i = i + 1
    Swap(A, i + 1, r)   // place pivot in its final position
    return i + 1
```
What this maintains: at all times, the subarray $A[p..r]$ is divided into three regions (plus the pivot):
- $A[p..i]$: elements known to be $\le x$;
- $A[i+1..j-1]$: elements known to be $> x$;
- $A[j..r-1]$: not yet examined ($A[r]$ holds the pivot $x$).
As $j$ scans from $p$ to $r-1$: if $A[j] \le x$, swap it into the $\le x$ region (swap with $A[i+1]$) and advance $i$. If $A[j] > x$, do nothing; the $> x$ region expands naturally.
After the loop, swap $A[i+1]$ with $A[r]$ to place the pivot. Now everything to the left of position $i+1$ is $\le x$ and everything to the right is $> x$, and $A[i+1] = x$ is in its final position.
Array: $[2, 8, 7, 1, 3, 5, 6, 4]$ (an illustrative input), pivot $x = 4$. After Partition: $[2, 1, 3, 4, 7, 5, 6, 8]$, returned index $4$.
Verify: $A[1..3] = [2, 1, 3]$ are all $\le 4$. $A[4] = 4$ is the pivot in its final position. $A[5..8] = [7, 5, 6, 8]$ are all $> 4$. ✓
Partition always costs $\Theta(n)$: it scans every element exactly once. It is in-place (only swaps) but not stable (swaps disrupt the relative order of equal elements).
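The Lomuto partition and the recursion translate directly to Python (0-indexed, in-place; the function names are mine):

```python
def partition(a, p, r):
    """Lomuto partition of a[p..r] around pivot a[r]; returns pivot index."""
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]     # place pivot in its final position
    return i + 1

def quicksort(a, p=0, r=None):
    """In-place quicksort with the last element as pivot."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)
    return a
```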
The worst case for QuickSort is when every partition splits the array as unequally as possible: one subarray has size 0 and the other has size $n-1$.
When does this happen? When the pivot is always the smallest or largest element in the subarray. For example, if the array is already sorted (or reverse-sorted) and you always pick the last element as pivot, every partition is maximally unbalanced.
Recurrence:

$$T(n) = T(n-1) + T(0) + \Theta(n)$$
($T(0) = O(1)$; the empty subarray costs nothing.)
Unrolling the recurrence:

$$T(n) = \Theta(n) + \Theta(n-1) + \cdots + \Theta(1) = \Theta\!\left(\sum_{i=1}^{n} i\right) = \Theta(n^2)$$
Worst-case QuickSort is as slow as SelectionSort. The recursion tree is a completely unbalanced chain of depth $n$.
The best case is when every partition splits the array exactly in half: two subarrays of size roughly $n/2$.
Recurrence:

$$T(n) = 2\,T(n/2) + \Theta(n)$$
This is identical to MergeSort's recurrence. By the Master Theorem: $T(n) = \Theta(n \log n)$.
The recursion tree is perfectly balanced, with $\Theta(\log n)$ levels, $\Theta(n)$ work per level.
In the general case, the partition leaves $q$ elements to the left of the pivot and $n - q - 1$ to the right, creating subarrays of size $q$ and $n - q - 1$:

$$T(n) = T(q) + T(n - q - 1) + n$$

(writing the partition cost as exactly $n$ for simplicity).
The value of $q$ changes at every recursive call; it depends on the pivot value relative to the subarray. Under the assumption that all $n$ possible split positions are equally likely (each with probability $1/n$), the average cost is:

$$T(n) = \frac{1}{n} \sum_{q=0}^{n-1} \big( T(q) + T(n - q - 1) \big) + n$$
Since $\sum_{q=0}^{n-1} T(n - q - 1) = \sum_{q=0}^{n-1} T(q)$ (the same terms, just listed in reverse order):

$$T(n) = \frac{2}{n} \sum_{i=0}^{n-1} T(i) + n$$
Multiply both sides by $n$:
$$n \cdot T(n) = 2\sum_{i=0}^{n-1} T(i) + n^2 \tag{1}$$
Replace $n$ with $n-1$:
$$(n-1) \cdot T(n-1) = 2\sum_{i=0}^{n-2} T(i) + (n-1)^2 \tag{2}$$
Subtract (2) from (1). The sums almost cancel: only $2\,T(n-1)$ remains from the sum in (1), and $n^2 - (n-1)^2 = 2n - 1$:

$$n \cdot T(n) = (n+1) \cdot T(n-1) + 2n - 1$$
Divide both sides by $n(n+1)$:

$$\frac{T(n)}{n+1} = \frac{T(n-1)}{n} + \frac{2n-1}{n(n+1)} \le \frac{T(n-1)}{n} + \frac{2}{n}$$
Now define $S(n) = \frac{T(n)}{n+1}$. The inequality becomes:

$$S(n) \le S(n-1) + \frac{2}{n}$$
Apply this telescopically $n-1$ times:

$$S(n) \le S(1) + \sum_{i=2}^{n} \frac{2}{i}$$
The harmonic sum $H_n = \sum_{i=1}^{n} \frac{1}{i} = \ln n + O(1)$. So:

$$S(n) \le S(1) + 2 H_n = O(\log n)$$
Multiplying back:

$$T(n) = (n+1) \cdot S(n) = O(n \log n)$$
The average case is $O(n \log n)$.
The constant hidden in the $O(n \log n)$ is about $1.39$ (the expected comparison count is $2n \ln n \approx 1.39\, n \log_2 n$), roughly $1.39$ times the constant for MergeSort. In practice, QuickSort's cache-friendliness and low overhead make it faster than MergeSort despite the slightly larger constant.
The average-case analysis assumes all split positions are equally likely. With the deterministic pivot (always A[r]), this assumption is about the input. Nearly-sorted arrays violate it badly — they trigger the worst case.
Fix: choose the pivot uniformly at random from $A[p..r]$.
```
RandomizedPartition(A[1..n], p, r) -> int:
    i = Random(p, r)   // uniform random index in [p, r]
    Swap(A, i, r)
    return Partition(A, p, r)
```
With a random pivot, each of the $n$ possible splits occurs with probability $1/n$ regardless of the input order. The "equally likely" assumption is now a property of the algorithm, not a hope about the input. The expected running time is $O(n \log n)$ for every possible input.
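A self-contained Python sketch of randomized QuickSort (Lomuto partition inlined so the block stands alone; names are mine):

```python
import random

def randomized_quicksort(a, p=0, r=None):
    """In-place quicksort with a uniformly random pivot."""
    if r is None:
        r = len(a) - 1
    if p < r:
        k = random.randint(p, r)        # uniform random index in [p, r]
        a[k], a[r] = a[r], a[k]         # move the random pivot to the end
        # Standard Lomuto partition around a[r].
        x, i = a[r], p - 1
        for j in range(p, r):
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[r] = a[r], a[i + 1]
        q = i + 1
        randomized_quicksort(a, p, q - 1)
        randomized_quicksort(a, q + 1, r)
    return a
```

Already-sorted input, the worst case for the deterministic pivot, is now no worse in expectation than any other order.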
Python's sorted() and list.sort() use PowerSort (since Python 3.11), a refined version of TimSort (Python 2.3 to 3.10).
TimSort is a hybrid: it exploits structure already present in the data.
The algorithm identifies runs — maximal already-sorted (or reverse-sorted, which it flips) contiguous subarrays. Short runs are extended using InsertionSort up to a minimum size (typically 32–64 elements). Then runs are merged using a MergeSort-style merge procedure. Runs of similar size are merged preferentially to maintain balance.
Why this is clever: real-world data often has natural order already embedded in it. TimSort finds and exploits this structure instead of ignoring it.
Complexity: best case $O(n)$ (already sorted), worst and average case $O(n \log n)$. Stable. Not in-place.
PowerSort improves the run-merging strategy using a near-optimal merge tree, reducing the number of comparisons in practice.
| Algorithm | Best | Average | Worst | In-Place | Stable |
|---|---|---|---|---|---|
| SelectionSort | $\Theta(n^2)$ | $\Theta(n^2)$ | $\Theta(n^2)$ | Yes | No |
| InsertionSort | $\Theta(n)$ | $\Theta(n^2)$ | $\Theta(n^2)$ | Yes | Yes |
| MergeSort | $\Theta(n \log n)$ | $\Theta(n \log n)$ | $\Theta(n \log n)$ | No | Yes |
| QuickSort | $\Theta(n \log n)$ | $\Theta(n \log n)$ | $\Theta(n^2)$ | Yes | No |
| TimSort | $\Theta(n)$ | $\Theta(n \log n)$ | $\Theta(n \log n)$ | No | Yes |
No single algorithm dominates on all axes. This motivates the next question.
Can any comparison-based algorithm sort faster than $\Theta(n \log n)$ in the worst case?
No. This is a theorem, not an empirical observation.
Any comparison sort can be modeled as a decision tree. Each internal node represents a comparison $A[i]$ vs $A[j]$. The left branch is taken if $A[i] \le A[j]$, the right branch if $A[i] > A[j]$. Each leaf represents a specific output permutation.
A run of the algorithm on a given input traces a root-to-leaf path. The number of comparisons for that input equals the depth of the leaf. The worst-case number of comparisons equals the height of the tree.
Every comparison sort has its own decision tree, but all correct sorts share a key property: the tree must have at least $n!$ leaves, one for each possible permutation of $n$ elements (any permutation could be the correct answer for some input, so the algorithm must be able to output all of them).
Lemma. A binary tree with $\ell$ leaves where every internal node has exactly two children has height at least $\log_2 \ell$.
Proof by induction on $\ell$.
Base case ($\ell = 1$): the tree is a single node (the leaf). Height $0 = \log_2 1$. ✓
Inductive step: assume the result holds for all trees with fewer than $\ell$ leaves. Let $T$ be a tree with $\ell \ge 2$ leaves. It has a root with two subtrees $T_L$ (left) and $T_R$ (right), with $\ell_L$ and $\ell_R$ leaves, where $\ell_L + \ell_R = \ell$ and without loss of generality $\ell_L \ge \ell_R$, so $\ell_L \ge \ell/2$.
By the inductive hypothesis (since $\ell_L < \ell$):

$$\text{height}(T_L) \ge \log_2 \ell_L \ge \log_2 (\ell/2) = \log_2 \ell - 1$$
So $\text{height}(T) \ge 1 + \text{height}(T_L) \ge \log_2 \ell$. ✓
Theorem. Any comparison sort requires $\Omega(n \log n)$ comparisons in the worst case.
Proof. A correct comparison sort's decision tree must have at least $n!$ leaves (it must be able to output every permutation). By the lemma above, its height is at least $\log_2(n!)$.
By Stirling's approximation, $\log_2(n!) = \Theta(n \log n)$. (More precisely: $n! \ge (n/2)^{n/2}$, so $\log_2(n!) \ge \frac{n}{2} \log_2 \frac{n}{2} = \Omega(n \log n)$.)
The worst-case number of comparisons $=$ height of decision tree $\ge \log_2(n!) = \Omega(n \log n)$. ✓
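To make the bound concrete, a quick numeric check (standard library only): the minimum worst-case comparison count $\lceil \log_2 n! \rceil$ for a few values of $n$.

```python
import math

# The decision tree has n! leaves, so its height (worst-case number of
# comparisons) is at least log2(n!).
for n in [4, 8, 16]:
    lower_bound = math.ceil(math.log2(math.factorial(n)))
    print(n, lower_bound)   # 4 -> 5, 8 -> 16, 16 -> 45
```

Note how much smaller this is than $n^2$: sorting 16 elements needs at least 45 comparisons, while a quadratic sort may use up to 120.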
Consequence: MergeSort is asymptotically optimal among comparison sorts. Its worst-case cost matches the lower bound exactly.
Important caveat: this lower bound applies only to algorithms that determine order exclusively through pairwise comparisons. If you know something extra about the keys — like they are integers in a bounded range — you can exploit that structure and break the $\Omega(n \log n)$ barrier.
The bound comes from modeling sorts as decision trees, where each node is a comparison. If instead your algorithm uses arithmetic on the keys (like treating a key as an array index), it is not a comparison sort and the lower bound does not apply.
The tradeoff: you need assumptions about the structure of the keys.
All keys are integers in a known range $[a, b]$, with $k = b - a + 1$ possible values.
Instead of comparing elements, count how many times each possible key value appears. Then reconstruct the sorted array directly from the counts.
```
CountingSort(A[1..n]):
    a = min(A), b = max(A), k = b - a + 1
    B = new array of size k, initialized to 0
    // Count occurrences
    for i = 1 to n:
        B[A[i] - a + 1] += 1
    // Reconstruct A from counts
    j = 1
    for i = 1 to k:
        while B[i] > 0:
            A[j] = i + a - 1
            B[i] -= 1
            j += 1
```
The index shift maps the key range $[a, b]$ to $[1, k]$, so it works even if $a$ is negative.
This version is not stable and destroys associated data — it just reconstructs values, not the original elements.
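The simple version above, rendered in Python (0-indexed; the function name is mine):

```python
def counting_sort_values(a):
    """Value-only counting sort: rebuilds the array from key counts.
    Not stable; handles negative keys via the index shift."""
    if not a:
        return a
    lo, hi = min(a), max(a)
    counts = [0] * (hi - lo + 1)
    for x in a:
        counts[x - lo] += 1             # count occurrences of each key
    out_idx = 0
    for v, c in enumerate(counts):      # rebuild a from the counts
        for _ in range(c):
            a[out_idx] = v + lo
            out_idx += 1
    return a
```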
To sort elements that carry data alongside their key, we need to know where each element goes, not just how many there are. We use prefix sums.
After counting, transform $B$ so that $B[i]$ holds the number of elements with key $\le i + a - 1$ (a prefix sum). This gives us the last position where an element with that key should land in the output.
To preserve stability, iterate from right to left during placement. This ensures that among elements with the same key, the one appearing later in $A$ (higher index) is placed later in $C$ (higher index), preserving relative order.
```
CountingSort(A[1..n]):
    a = minkey(A), b = maxkey(A), k = b - a + 1
    B = new array of size k, initialized to 0
    C = new array of size n
    // Count
    for i = 1 to n:
        B[A[i].key - a + 1] += 1
    // Prefix sums: B[i] = number of keys <= i + a - 1
    for i = 2 to k:
        B[i] = B[i] + B[i-1]
    // Place elements (right to left for stability)
    for i = n downto 1:
        C[B[A[i].key - a + 1]] = A[i]
        B[A[i].key - a + 1] -= 1
    // Copy back
    for i = 1 to n:
        A[i] = C[i]
```
Suppose two elements $A[i]$ and $A[j]$ have the same key, with $i < j$ (so $A[i]$ appeared earlier). After prefix sums, both map to the same slot counter in $B$. The right-to-left pass processes $A[j]$ first (it has the higher index). It takes the current value of that counter and then decrements it. When $A[i]$ is processed later, it gets the decremented value: a lower index in $C$. Lower index in $C$ means earlier in the output. So $A[i]$ (earlier in input) ends up earlier in output. ✓
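The stable version in Python, sorting arbitrary records by an integer key (0-indexed; names are mine):

```python
def counting_sort_stable(items, key):
    """Stable counting sort of records by an integer key, using prefix
    sums and a right-to-left placement pass."""
    if not items:
        return items
    keys = [key(x) for x in items]
    lo, hi = min(keys), max(keys)
    counts = [0] * (hi - lo + 1)
    for k in keys:
        counts[k - lo] += 1
    # Prefix sums: counts[i] = number of elements with key <= lo + i.
    for i in range(1, len(counts)):
        counts[i] += counts[i - 1]
    out = [None] * len(items)
    # Right to left preserves the relative order of equal keys.
    for x in reversed(items):
        counts[key(x) - lo] -= 1
        out[counts[key(x) - lo]] = x
    return out
```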
Computing min and max: $\Theta(n)$. Count loop: $\Theta(n)$. Prefix sum loop: $\Theta(k)$. Placement loop: $\Theta(n)$. Copy loop: $\Theta(n)$.
Total: $\Theta(n + k)$ for all cases (best, average, worst).
Critical observation: if $k = O(n)$ (the range is proportional to the number of elements), this is $\Theta(n)$: genuinely linear. But if $k \gg n$, CountingSort is wasteful. Sorting the two-element array $[1, 10^9]$ gives $n = 2$ but $k = 10^9$: catastrophically bad.
CountingSort is stable (in the general version) and not in-place (needs arrays $B$ of size $k$ and $C$ of size $n$).
Keys are composed of digits or characters — integers, strings, etc. The radix is the number of distinct digit values (10 for decimal, 26 for lowercase letters, 256 for bytes, etc.).
Sort digit by digit, starting from the least significant digit (rightmost), using a stable sort at each digit level.
```
RadixSort(A[1..n]):
    d = max number of digits in any key
    for i = 1 to d:
        stable sort A on digit i   // digit 1 = rightmost
```
This is the non-obvious part of the algorithm.
Invariant: after processing digits $1$ through $i$, the array is sorted with respect to its last $i$ digits.
Base case ($i = 0$): trivially true.
Inductive step: assume after $i-1$ passes, the array is sorted by its last $i-1$ digits. Now we sort by digit $i$ using a stable sort. After this sort:
- elements that differ in digit $i$ are ordered by digit $i$;
- elements that agree on digit $i$ keep their previous relative order, which (by the inductive hypothesis) orders them by their last $i-1$ digits.
So after $i$ passes, the array is sorted by its last $i$ digits. ✓
After all $d$ passes, it is sorted by all digits = fully sorted.
Why not most-significant-digit first? If you sort by the most significant digit first, elements that share that digit need to be sorted among themselves by the remaining digits — you'd need to recursively sort each bucket. That is essentially a different (and more complex) algorithm.
Consider keys $213$ and $243$ (an illustrative pair). After sorting by the last digit, they might be in either order (both end in 3; their relative order depends on what came before). After sorting by the middle digit, $1 < 4$, so $213$ should stay before $243$. A stable sort guarantees that elements with the same digit in the current position are not reordered, so the ordering from previous passes is preserved. A non-stable sort would randomly scramble the previous work.
Using CountingSort as the per-digit stable sort, each of the $d$ passes costs $\Theta(n + k)$, where $k$ is the number of possible digit values (the radix).
Total: $\Theta(d(n + k))$.
If $k = O(n)$ and $d$ is constant (key length doesn't grow with $n$), this is $\Theta(n)$: linear.
RadixSort is stable and not in-place.
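A self-contained Python sketch of LSD RadixSort for non-negative integers, with a stable counting sort inlined for each digit pass (names are mine):

```python
def radix_sort(a, base=10):
    """Least-significant-digit radix sort of non-negative integers,
    using a stable counting sort per digit."""
    if not a:
        return a
    max_key = max(a)
    exp = 1                               # digit weight: 1, base, base^2, ...
    while max_key // exp > 0:
        # Stable counting sort on digit (x // exp) % base.
        counts = [0] * base
        for x in a:
            counts[(x // exp) % base] += 1
        for i in range(1, base):
            counts[i] += counts[i - 1]    # prefix sums
        out = [0] * len(a)
        for x in reversed(a):             # right to left keeps it stable
            d = (x // exp) % base
            counts[d] -= 1
            out[counts[d]] = x
        a = out
        exp *= base
    return a
```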
| Algorithm | Best | Average | Worst | In-Place | Stable |
|---|---|---|---|---|---|
| SelectionSort | $\Theta(n^2)$ | $\Theta(n^2)$ | $\Theta(n^2)$ | Yes | No |
| InsertionSort | $\Theta(n)$ | $\Theta(n^2)$ | $\Theta(n^2)$ | Yes | Yes |
| MergeSort | $\Theta(n \log n)$ | $\Theta(n \log n)$ | $\Theta(n \log n)$ | No | Yes |
| QuickSort | $\Theta(n \log n)$ | $\Theta(n \log n)$ | $\Theta(n^2)$ | Yes | No |
| TimSort | $\Theta(n)$ | $\Theta(n \log n)$ | $\Theta(n \log n)$ | No | Yes |
| CountingSort | $\Theta(n + k)$ | $\Theta(n + k)$ | $\Theta(n + k)$ | No | Yes |
| RadixSort | $\Theta(d(n + k))$ | $\Theta(d(n + k))$ | $\Theta(d(n + k))$ | No | Yes |
Where: $n$ = number of elements, $k$ = range of key values (or digit values), $d$ = maximum number of digits in a key.
The incremental algorithms (SelectionSort, InsertionSort) are simple and quadratic — they don't exploit structure. The divide-and-conquer algorithms (MergeSort, QuickSort) break through to by decomposing the problem recursively. The lower bound theorem proves this is optimal for any comparison sort — no pairwise-comparison algorithm can do better. CountingSort and RadixSort escape this bound by exploiting key structure (bounded integers, digit decomposition), achieving linear time under the right conditions. Each improvement comes not from coding tricks but from a deeper understanding of what information the algorithm has access to and how to use it.