1

 Linear-Time Sorting Algorithms

2

 Insertion sort:
 Easy to code
 Fast on small inputs (less than ~50 elements)
 Fast on nearly-sorted inputs
 O(n^2) worst case
 O(n^2) average case (over equally-likely inputs)
 O(n^2) reverse-sorted case
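The bullets above can be sketched in Python (an illustrative implementation, not from the slides):

```python
def insertion_sort(a):
    """Sort the list a in place: O(n^2) worst case, fast on nearly-sorted input."""
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Shift elements larger than key one slot right, then drop key into the gap.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a
```

On a nearly-sorted input the inner while loop rarely executes, which is why the running time approaches O(n) in that case.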

3

 Merge sort:
 Divide-and-conquer:
 Split array in half
 Recursively sort subarrays
 Lineartime merge step
 O(n lg n) worst case
 Doesn’t sort in place
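A minimal Python sketch of the divide-and-conquer outline above (illustrative; note it returns a new list rather than sorting in place):

```python
def merge_sort(a):
    """Return a sorted copy of a: O(n lg n) worst case, not in place."""
    if len(a) <= 1:
        return a[:]
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    # Linear-time merge of the two sorted halves.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out
```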

4

 Heap sort:
 Uses the very useful heap data structure
 Complete binary tree
 Heap property: parent’s key ≥ children’s keys
 O(n lg n) worst case
 Sorts in place
 Fair amount of shuffling memory around
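The outline above can be sketched in Python (illustrative; the sift-down helper is one standard way to restore the heap property):

```python
def heap_sort(a):
    """In-place heap sort: build a max-heap, then repeatedly move the max to the end."""
    n = len(a)

    def sift_down(i, size):
        # Restore the heap property (parent >= children) in the subtree rooted at i.
        while True:
            left, right, largest = 2 * i + 1, 2 * i + 2, i
            if left < size and a[left] > a[largest]:
                largest = left
            if right < size and a[right] > a[largest]:
                largest = right
            if largest == i:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest

    for i in range(n // 2 - 1, -1, -1):  # build the max-heap bottom-up
        sift_down(i, n)
    for end in range(n - 1, 0, -1):      # extract the max n-1 times
        a[0], a[end] = a[end], a[0]      # this swapping is the "shuffling memory around"
        sift_down(0, end)
    return a
```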

5

 Quick sort:
 Divide-and-conquer:
 Partition array into two subarrays, recursively sort
 All of first subarray < all of second subarray
 No merge step needed!
 O(n lg n) average case
 Fast in practice
 O(n^{2}) worst case
 Naïve implementation: worst case on sorted input
 Address this with randomized quicksort
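A sketch of randomized quicksort in Python (illustrative; this uses a Lomuto-style partition, one common choice):

```python
import random

def quicksort(a, lo=0, hi=None):
    """Randomized quicksort: a random pivot avoids the O(n^2) sorted-input case."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return a
    # Pick a random pivot and move it to the end, then partition so that
    # a[lo..p-1] <= a[p] <= a[p+1..hi]; no merge step is needed afterwards.
    r = random.randint(lo, hi)
    a[r], a[hi] = a[hi], a[r]
    pivot, p = a[hi], lo
    for i in range(lo, hi):
        if a[i] <= pivot:
            a[i], a[p] = a[p], a[i]
            p += 1
    a[p], a[hi] = a[hi], a[p]
    quicksort(a, lo, p - 1)
    quicksort(a, p + 1, hi)
    return a
```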

6

 We will provide a lower bound, then beat it
 How do you suppose we’ll beat it?
 First, an observation: all of the sorting algorithms so far are comparison sorts
 The only operation used to gain ordering information about a sequence is the pairwise comparison of two elements
 Theorem: all comparison sorts are Ω(n lg n)
 A comparison sort must do Ω(n) comparisons (why?)
 What about the gap between Ω(n) and Ω(n lg n)?

7

 Decision trees provide an abstraction of comparison sorts
 A decision tree represents the comparisons made by a comparison sort; everything else is ignored
 (Draw examples on board)
 What do the leaves represent?
 How many leaves must there be?

8

 Decision trees can model comparison sorts. For a given algorithm:
 One tree for each n
 Tree paths are all possible execution traces
 What’s the longest path in a decision tree for insertion sort? For merge sort?
 What is the asymptotic height of any decision tree for sorting n
elements?
 Answer: Ω(n lg n) (now let’s prove it…)

9

 Theorem: any decision tree that sorts n elements has height Ω(n lg n)
 What’s the minimum # of leaves?
 What’s the maximum # of leaves of a binary tree of height h?
 Clearly the minimum # of leaves is less than or equal to the maximum #
of leaves

10

 So we have…
n! ≤ 2^h
 Taking logarithms:
lg(n!) ≤ h
 Stirling’s approximation tells us:
n! > (n/e)^n
 Thus:
h ≥ lg(n!) > lg (n/e)^n

11

 So we have
h > lg (n/e)^n = n lg(n/e) = n lg n - n lg e = Ω(n lg n)
 Thus the minimum height of a decision tree is Ω(n lg n)

12

 Thus the time to comparison sort n elements is Ω(n lg n)
 Corollary: Heapsort and Mergesort are asymptotically optimal comparison
sorts
 But the name of this lecture is “Sorting in linear time”!
 How can we do better than Ω(n lg n)?

13

 Counting sort
 No comparisons between elements!
 But… it depends on an assumption about the numbers being sorted
 We assume the numbers are in the range 1..k
 The algorithm:
 Input: A[1..n], where A[j] ∈ {1, 2, 3, …, k}
 Output: B[1..n], sorted (notice: not sorting in place)
 Also: Array C[1..k] for auxiliary storage

14

 CountingSort(A, B, k)
   for i = 1 to k
     C[i] = 0
   for j = 1 to n
     C[A[j]] += 1
   for i = 2 to k
     C[i] = C[i] + C[i-1]
   for j = n downto 1
     B[C[A[j]]] = A[j]
     C[A[j]] -= 1
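The pseudocode translates to Python roughly as follows (0-based output array; values assumed to lie in 1..k as on the slide):

```python
def counting_sort(A, k):
    """Stable counting sort of A, whose values lie in 1..k."""
    n = len(A)
    C = [0] * (k + 1)               # C[i] = count of value i (index 0 unused)
    for x in A:
        C[x] += 1
    for i in range(2, k + 1):       # prefix sums: C[i] = # of elements <= i
        C[i] += C[i - 1]
    B = [0] * n
    for j in range(n - 1, -1, -1):  # right-to-left pass keeps equal keys in order
        B[C[A[j]] - 1] = A[j]       # -1 adjusts 1-based counts to 0-based slots
        C[A[j]] -= 1
    return B
```

The right-to-left final pass is what makes the sort stable, which matters for radix sort later.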

16

 Total time: O(n + k)
 Usually, k = O(n)
 Thus counting sort runs in O(n) time
 But sorting is Ω(n lg n)!
 No contradiction: this is not a comparison sort (in fact, there are no comparisons at all!)
 Notice that this algorithm is stable

17

 Cool! Why don’t we always use counting sort?
 Because it depends on the range k of the elements
 Could we use counting sort to sort 32-bit integers? Why or why not?
 Answer: no, k is too large (2^32 = 4,294,967,296)

18

 How did IBM get rich originally?
 Answer: punched card readers for census tabulation in the early 1900s.
 In particular, a card sorter that could sort cards into different bins
 Each column can be punched in 12 places
 Decimal digits use 10 places
 Problem: only one column can be sorted on at a time

19

 Intuitively, you might sort on the most significant digit, then the second-most significant digit, etc.
 Problem: lots of intermediate piles of cards (read: scratch arrays) to keep track of
 Key idea: sort the least significant digit first
 RadixSort(A, d)
   for i = 1 to d
     StableSort(A) on digit i
 Example: Fig 9.3
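A runnable sketch of LSD radix sort (illustrative; assumes non-negative integers and uses a stable counting-sort pass per base-k digit):

```python
def radix_sort(A, d, k=10):
    """Sort non-negative integers of at most d base-k digits, least significant digit first."""
    for p in range(d):
        digit = lambda x: (x // k**p) % k   # extract the p-th base-k digit
        C = [0] * k
        for x in A:
            C[digit(x)] += 1
        for i in range(1, k):               # prefix sums
            C[i] += C[i - 1]
        B = [0] * len(A)
        for x in reversed(A):               # right-to-left keeps each pass stable
            C[digit(x)] -= 1
            B[C[digit(x)]] = x
        A = B
    return A
```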

20

 Can we prove it will work?
 Sketch of an inductive argument (induction on the number of passes):
 Assume the lower-order digits {j : j < i} are sorted
 Show that sorting the next digit i leaves the array correctly sorted
 If two digits at position i are different, ordering the numbers by that digit is correct (lower-order digits are irrelevant)
 If they are the same, the numbers are already sorted on the lower-order digits; since we use a stable sort, they stay in the right order

21

 What sort will we use to sort on digits?
 Counting sort is obvious choice:
 Sort n numbers on digits that range from 1..k
 Time: O(n + k)
 Each pass over n numbers with d digits takes time O(n+k), so total time
O(dn+dk)
 When d is constant and k=O(n), takes O(n) time
 How many bits in a computer word?

22

 Problem: sort 1 million 64-bit numbers
 Treat as four-digit radix-2^16 numbers
 Can sort in just four passes with radix sort!
 Compares well with a typical O(n lg n) comparison sort, which requires approximately lg n ≈ 20 operations per number being sorted
 So why would we ever use anything but radix sort?
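The four-pass scheme might look like this in Python (a sketch assuming unsigned 64-bit values; the function name is mine):

```python
def sort_u64(A):
    """Sort unsigned 64-bit integers as four base-2^16 digits: four stable counting passes."""
    MASK = (1 << 16) - 1
    for shift in (0, 16, 32, 48):           # least significant 16-bit digit first
        C = [0] * (1 << 16)                 # k = 2^16 counters per pass
        for x in A:
            C[(x >> shift) & MASK] += 1
        for i in range(1, 1 << 16):         # prefix sums
            C[i] += C[i - 1]
        B = [0] * len(A)
        for x in reversed(A):               # right-to-left keeps the pass stable
            d = (x >> shift) & MASK
            C[d] -= 1
            B[C[d]] = x
        A = B
    return A
```

One answer to the question above: each pass allocates a counter array of size 2^16 and a scratch array of size n, so the constant factors and extra memory can outweigh the asymptotic win.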

23

 In general, radix sort based on counting sort is
 Fast
 Asymptotically fast (i.e., O(n))
 Simple to code
 A good choice
 To think about: can radix sort be used on floating-point numbers?

24

