1

 Solving Recurrences Continued
 The Master Theorem
 Introduction to heapsort

2

 MergeSort(A, left, right) {
 if (left < right) {
 mid = floor((left + right) / 2);
 MergeSort(A, left, mid);
 MergeSort(A, mid+1, right);
 Merge(A, left, mid, right);
 }
 }
 // Merge() takes two sorted subarrays of A and
 // merges them into a single sorted subarray of A.
 // Code for this is in the book (a sketch follows below).
 // It requires O(n) time, and *does* require
 // allocating O(n) extra space.
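
 A possible C sketch of Merge() (my illustration; the book's version differs in details): copy the two sorted halves into a temporary buffer, then merge them back into A[left..right]. This is where the O(n) time and the O(n) extra space come from.

 #include <stdlib.h>

 void Merge(int A[], int left, int mid, int right) {
     int n = right - left + 1;
     int *tmp = malloc(n * sizeof(int));        // the O(n) extra space
     int i = left, j = mid + 1, k = 0;
     while (i <= mid && j <= right)             // take the smaller head element
         tmp[k++] = (A[i] <= A[j]) ? A[i++] : A[j++];
     while (i <= mid)   tmp[k++] = A[i++];      // copy any leftovers
     while (j <= right) tmp[k++] = A[j++];
     for (k = 0; k < n; k++)                    // copy the merged result back
         A[left + k] = tmp[k];
     free(tmp);
 }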

3

 So T(n) = Θ(1) when n = 1,
 and T(n) = 2T(n/2) + Θ(n) when n > 1
 This expression is a recurrence

4

 Substitution method
 Iteration method
 Master method

5

 The substitution method
 A.k.a. the “making a good guess method”
 Guess the form of the answer, then use induction to find the constants
and show that solution works
 Run an example: merge sort
 T(n) = 2T(n/2) + cn
 We guess that the answer is O(n lg n)
 Prove it by induction (a sketch follows below)
 Can similarly show T(n) = Ω(n lg n), thus Θ(n lg n)
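
 A sketch of the inductive step for that guess (an illustration of the method, not the full in-class proof): assume T(m) ≤ dm lg m for all m < n and some constant d > 0. Then
  T(n) = 2T(n/2) + cn
       ≤ 2·d(n/2) lg(n/2) + cn
       = dn lg n - dn + cn
       ≤ dn lg n            whenever d ≥ c
 so T(n) = O(n lg n); the base case is handled by choosing d (and the starting n) large enough.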

6

 The “iteration method”
 Expand the recurrence
 Work some algebra to express as a summation
 Evaluate the summation
 We showed several examples and were in the middle of this one:

7

 T(n) = aT(n/b) + cn
      = a(aT(n/b/b) + cn/b) + cn
      = a^{2}T(n/b^{2}) + cn(a/b) + cn
      = a^{2}T(n/b^{2}) + cn(a/b + 1)
      = a^{2}(aT(n/b^{2}/b) + cn/b^{2}) + cn(a/b + 1)
      = a^{3}T(n/b^{3}) + cn(a^{2}/b^{2}) + cn(a/b + 1)
      = a^{3}T(n/b^{3}) + cn(a^{2}/b^{2} + a/b + 1)
      = …
      = a^{k}T(n/b^{k}) + cn(a^{k-1}/b^{k-1} + a^{k-2}/b^{k-2} + … + a^{2}/b^{2} + a/b + 1)

8

 So we have
  T(n) = a^{k}T(n/b^{k}) + cn(a^{k-1}/b^{k-1} + ... + a^{2}/b^{2} + a/b + 1)
 For k = log_{b} n
  n = b^{k}
  T(n) = a^{k}T(1) + cn(a^{k-1}/b^{k-1} + ... + a^{2}/b^{2} + a/b + 1)
       = a^{k}c + cn(a^{k-1}/b^{k-1} + ... + a^{2}/b^{2} + a/b + 1)
       = ca^{k} + cn(a^{k-1}/b^{k-1} + ... + a^{2}/b^{2} + a/b + 1)
       = cn(a^{k}/b^{k}) + cn(a^{k-1}/b^{k-1} + ... + a^{2}/b^{2} + a/b + 1)
       = cn(a^{k}/b^{k} + a^{k-1}/b^{k-1} + ... + a^{2}/b^{2} + a/b + 1)

9

 So with k = log_{b} n
  T(n) = cn(a^{k}/b^{k} + ... + a^{2}/b^{2} + a/b + 1)
 What if a = b?
  T(n) = cn(k + 1)
       = cn(log_{b} n + 1)
       = Θ(n log n)
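
 Note this is exactly merge sort: a = b = 2 and T(n) = 2T(n/2) + cn, so the iteration method gives cn(lg n + 1) = Θ(n lg n), matching the substitution-method answer above.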

13

 So with k = log_{b} n
  T(n) = cn(a^{k}/b^{k} + ... + a^{2}/b^{2} + a/b + 1)
 What if a < b?
  Recall that (x^{k} + x^{k-1} + … + x + 1) = (x^{k+1} - 1)/(x - 1)
  With x = a/b < 1, this sum is bounded by the constant 1/(1 - a/b), i.e. Θ(1)
 So:
  T(n) = cn · Θ(1) = Θ(n)

20

 So with k = log_{b} n
  T(n) = cn(a^{k}/b^{k} + ... + a^{2}/b^{2} + a/b + 1)
 What if a > b?
  Now the largest term dominates the geometric sum, so
  T(n) = cn · Θ(a^{k}/b^{k})
       = cn · Θ(a^{log_{b} n} / b^{log_{b} n}) = cn · Θ(a^{log_{b} n} / n)
  recall the logarithm fact: a^{log_{b} n} = n^{log_{b} a}
       = cn · Θ(n^{log_{b} a} / n) = Θ(cn · n^{log_{b} a} / n)
       = Θ(n^{log_{b} a})

22

 Given: a divide and conquer algorithm
 An algorithm that divides the problem of size n into a subproblems,
each of size n/b
 Let the cost of each stage (i.e., the work to divide the problem +
combine solved subproblems) be described by the function f(n)
 Then, the Master Theorem gives us a cookbook for the algorithm’s running
time:

23

 if T(n) = aT(n/b) + f(n) then
  Case 1: if f(n) = O(n^{log_{b} a - ε}) for some constant ε > 0,
          then T(n) = Θ(n^{log_{b} a})
  Case 2: if f(n) = Θ(n^{log_{b} a}),
          then T(n) = Θ(n^{log_{b} a} lg n)
  Case 3: if f(n) = Ω(n^{log_{b} a + ε}) for some constant ε > 0, and
          a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n,
          then T(n) = Θ(f(n))

24

 T(n) = 9T(n/3) + n
  a = 9, b = 3, f(n) = n
  n^{log_{b} a} = n^{log_{3} 9} = Θ(n^{2})
  Since f(n) = O(n^{log_{3} 9 - ε}), where ε = 1, case 1 applies:
  Thus the solution is T(n) = Θ(n^{2})
 (a second example follows below)
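
 A second quick application (my added example, not on the original slide), using a recurrence form that reappears in the Heapify() analysis later:
  T(n) = T(2n/3) + 1
  a = 1, b = 3/2, f(n) = 1
  n^{log_{b} a} = n^{log_{3/2} 1} = n^{0} = 1
  Since f(n) = Θ(n^{log_{b} a}) = Θ(1), case 2 applies:
  Thus the solution is T(n) = Θ(n^{log_{b} a} lg n) = Θ(lg n)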

25

 So far we’ve talked about two algorithms to sort an array of numbers
 What is the advantage of merge sort?
 What is the advantage of insertion sort?
 Next on the agenda: Heapsort
 Combines advantages of both previous algorithms

26

 A heap can be seen as a complete binary tree:
 What makes a binary tree complete?
 Is the example above complete?

27

 A heap can be seen as a complete binary tree:
 The book calls them “nearly complete” binary trees; can think of
unfilled slots as null pointers

28

 In practice, heaps are usually implemented as arrays:

29

 To represent a complete binary tree as an array:
 The root node is A[1]
 Node i is A[i]
 The parent of node i is A[i/2] (note: integer divide)
 The left child of node i is A[2i]
 The right child of node i is A[2i + 1]

30

 So…
  Parent(i) { return ⌊i/2⌋; }
  Left(i)   { return 2*i; }
  Right(i)  { return 2*i + 1; }
 An aside: How would you implement this most efficiently?
 (one possibility is sketched below)
 Another aside: Really?
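
 One plausible answer to the first aside (an assumption about what it is after, not stated on the slide): with 1-based indices these become single shift/or operations in C:

  int Parent(int i) { return i >> 1; }        // ⌊i/2⌋
  int Left(int i)   { return i << 1; }        // 2*i
  int Right(int i)  { return (i << 1) | 1; }  // 2*i + 1

 The second aside presumably notes that a decent compiler already emits these shifts for the /2 and *2 versions, so hand-optimizing buys little.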

31

 Heaps also satisfy the heap property:
 A[Parent(i)] ≥ A[i] for all nodes i > 1
 In other words, the value of a node is at most the value of its parent
 Where is the largest element in a heap stored?
 Definitions:
 The height of a node in the tree = the number of edges on the longest
downward path to a leaf
 The height of a tree = the height of its root

32

 What is the height of an n-element heap? Why?
 This is nice: basic heap operations take at most time proportional to
the height of the heap

33

 Heapify(): maintain the heap property
 Given: a node i in the heap with children l and r
 Given: two subtrees rooted at l and r, assumed to be heaps
 Problem: The subtree rooted at i may violate the heap property (How?)
 Action: let the value of the parent node “float down” so subtree at i
satisfies the heap property
 What do you suppose will be the basic operation between i, l, and r?

34

 Heapify(A, i)
 {
     l = Left(i); r = Right(i);
     // find the largest of A[i] and its two children
     if (l <= heap_size(A) && A[l] > A[i])
         largest = l;
     else
         largest = i;
     if (r <= heap_size(A) && A[r] > A[largest])
         largest = r;
     // if a child is larger, swap it up and keep floating down
     if (largest != i) {
         Swap(A, i, largest);
         Heapify(A, largest);
     }
 }

35

 (Figure slides: worked example of Heapify() floating a value down the heap.)

44

 Aside from the recursive call, what is the running time of Heapify()?
 How many times can Heapify() recursively call itself?
 What is the worst-case running time of Heapify() on a heap of size n?

45

 Fixing up relationships between i, l, and r takes Θ(1) time
 If the heap at i has n elements, how many elements can the subtrees at l or r have?
 Answer: 2n/3 (worst case: bottom row 1/2 full)
 So time taken by Heapify() is given by
  T(n) ≤ T(2n/3) + Θ(1)
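
 Where the 2n/3 bound comes from (a quick check of the worst case named above): if the bottom row is exactly half full, that half lies entirely in the left subtree, so the left subtree is complete with height h and the right subtree is complete with height h-1:
  n = 1 + (2^{h+1} - 1) + (2^{h} - 1) = 3·2^{h} - 1,   so   2^{h} = (n + 1)/3
  left subtree size = 2^{h+1} - 1 = 2(n + 1)/3 - 1 = (2n - 1)/3 ≤ 2n/3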

46

 So we have
  T(n) ≤ T(2n/3) + Θ(1)
 By case 2 of the Master Theorem,
  T(n) = O(lg n)
 Thus, Heapify() takes logarithmic time

47

 We can build a heap in a bottom-up manner by running Heapify() on successive subarrays
 Fact: for an array of length n, all elements in range A[⌊n/2⌋ + 1 .. n] are already heaps (Why?)
 So:
 Walk backwards through the array from n/2 to 1, calling Heapify() on
each node.
 Order of processing guarantees that the children of node i are heaps
when i is processed

48

 // given an unsorted array A, make A a heap
 BuildHeap(A)
 {
 heap_size(A) = length(A);
 for (i = ⌊length(A)/2⌋ downto 1)
 Heapify(A, i);
 }

49

 Work through example
A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}
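
 Working this through by hand (a check, not shown on the slide), the array after BuildHeap() is
  A = {16, 14, 10, 8, 7, 9, 3, 2, 4, 1}
 The maximum, 16, has floated up to the root A[1], and every node is ≥ its children.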

50

 Each call to Heapify() takes O(lg n) time
 There are O(n) such calls (specifically, ⌊n/2⌋)
 Thus the running time is O(n lg n)
 Is this a correct asymptotic upper bound?
 Is this an asymptotically tight bound?
 A tighter bound is O(n)
 How can this be? Is there a flaw
in the above reasoning?

51

 To Heapify() a subtree takes O(h) time where h is the height of the
subtree
 h = O(lg m), m = # nodes in subtree
 The height of most subtrees is small
 Fact: an n-element heap has at most ⌈n/2^{h+1}⌉ nodes of height h
 CLR 7.3 uses this fact to prove that BuildHeap() takes O(n) time (sketched below)
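
 The shape of that argument (a sketch following the book's approach), using the fact that Σ_{h≥0} h/2^{h} = 2:
  total cost ≤ Σ_{h=0..⌊lg n⌋} ⌈n/2^{h+1}⌉ · O(h)
             = O( n · Σ_{h=0..⌊lg n⌋} h/2^{h} )
             ≤ O( n · Σ_{h≥0} h/2^{h} )
             = O(n)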

52

 Given BuildHeap(), an in-place sorting algorithm is easily constructed:
 Maximum element is at A[1]
 Discard by swapping with element at A[n]
 Decrement heap_size[A]
 A[n] now contains correct value
 Restore heap property at A[1] by calling Heapify()
 Repeat, always swapping A[1] for A[heap_size(A)]

53

 Heapsort(A)
 {
 BuildHeap(A);
 for (i = length(A) downto 2)
 {
 Swap(A[1], A[i]);       // move the current maximum into its final slot
 heap_size(A) -= 1;      // shrink the heap; A[i..n] is now sorted
 Heapify(A, 1);          // restore the heap property at the root
 }
 }
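
 For reference, a minimal runnable C version of the pseudocode above (my sketch, with assumptions the slides leave open: the array is used 1-based by ignoring slot 0, and the heap size is passed explicitly rather than stored with A):

 #include <stdio.h>

 void Swap(int A[], int i, int j) {
     int t = A[i]; A[i] = A[j]; A[j] = t;
 }

 // Float A[i] down until the subtree rooted at i is a heap again
 void Heapify(int A[], int heap_size, int i) {
     int l = 2*i, r = 2*i + 1, largest = i;
     if (l <= heap_size && A[l] > A[largest]) largest = l;
     if (r <= heap_size && A[r] > A[largest]) largest = r;
     if (largest != i) {
         Swap(A, i, largest);
         Heapify(A, heap_size, largest);
     }
 }

 // Turn an arbitrary array A[1..n] into a heap, bottom-up
 void BuildHeap(int A[], int n) {
     for (int i = n / 2; i >= 1; i--)
         Heapify(A, n, i);
 }

 void Heapsort(int A[], int n) {
     BuildHeap(A, n);
     for (int i = n; i >= 2; i--) {
         Swap(A, 1, i);          // move the current maximum to its final slot
         Heapify(A, i - 1, 1);   // restore the heap property on what remains
     }
 }

 int main(void) {
     // the example array from the BuildHeap() slide, stored from index 1
     int A[] = {0, 4, 1, 3, 2, 16, 9, 10, 14, 8, 7};
     int n = 10;
     Heapsort(A, n);
     for (int i = 1; i <= n; i++)
         printf("%d ", A[i]);    // prints: 1 2 3 4 7 8 9 10 14 16
     printf("\n");
     return 0;
 }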

54

 The call to BuildHeap() takes O(n) time
 Each of the n - 1 calls to Heapify() takes O(lg n) time
 Thus the total time taken by HeapSort()
   = O(n) + (n - 1) · O(lg n)
   = O(n) + O(n lg n)
   = O(n lg n)

55

 Heapsort is a nice algorithm, but in practice Quicksort (coming up)
usually wins
 But the heap data structure is incredibly useful for implementing priority
queues
 A data structure for maintaining a set S of elements, each with an
associated value or key
 Supports the operations Insert(), Maximum(), and ExtractMax()
 What might a priority queue be useful for?

56

 Insert(S, x) inserts the element x into set S
 Maximum(S) returns the element of S with the maximum key
 ExtractMax(S) removes and returns the element of S with the maximum key
 How could we implement these operations using a heap? (a sketch follows below)
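
 One way to answer that question (a hedged sketch reusing Heapify() and Swap() from the runnable example above; the details are my assumptions, not from the slides):

 // Max-heap priority queue stored in A[1..*heap_size]
 int Maximum(int A[]) {
     return A[1];                        // largest key is always at the root: O(1)
 }

 int ExtractMax(int A[], int *heap_size) {
     int max = A[1];
     A[1] = A[*heap_size];               // move the last element to the root
     (*heap_size)--;
     Heapify(A, *heap_size, 1);          // float it down: O(lg n)
     return max;
 }

 // assumes the array has spare capacity for one more element
 void Insert(int A[], int *heap_size, int key) {
     int i = ++(*heap_size);
     A[i] = key;
     while (i > 1 && A[i/2] < A[i]) {    // float the new key up: O(lg n)
         Swap(A, i/2, i);
         i = i/2;
     }
 }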
