Notes
Slide Show
Outline
1
CS 332: Algorithms
  • Solving Recurrences Continued
  • The Master Theorem
  • Introduction to heapsort
2
Review: Merge Sort
  • MergeSort(A, left, right) {
  • if (left < right) {
  • mid = floor((left + right) / 2);
  • MergeSort(A, left, mid);
  • MergeSort(A, mid+1, right);
  • Merge(A, left, mid, right);
  • }
  • }
  • // Merge() takes two sorted subarrays of A and
  • // merges them into a single sorted subarray of A.
  • // Code for this is in the book.  It requires O(n)
  • // time, and *does* require allocating O(n) space
3
Review: Analysis of Merge Sort
  • Statement Effort





  • So T(n) = Q(1) when n = 1, and              2T(n/2) + Q(n) when n > 1
  • This expression is a recurrence
4
Review: Solving Recurrences
  • Substitution method
  • Iteration method
  • Master method
5
Review: Solving Recurrences
  • The substitution method
    • A.k.a. the “making a good guess method”
    • Guess the form of the answer, then use induction to find the constants and show that solution works
    • Run an example: merge sort
      • T(n) = 2T(n/2) + cn
      • We guess that the answer is O(n lg n)
      • Prove it by induction
    • Can similarly show T(n) = Ω(n lg n), thus Θ(n lg n)
6
Review: Solving Recurrences
  • The “iteration method”
    • Expand the recurrence
    • Work some algebra to express as a summation
    • Evaluate the summation
  • We showed several examples, were in the middle of:
7
"T(n)"
  • T(n) =
  • aT(n/b) + cn
  • a(aT(n/b/b) + cn/b) + cn
  • a2T(n/b2) + cna/b + cn
  • a2T(n/b2) + cn(a/b + 1)
  • a2(aT(n/b2/b) + cn/b2) + cn(a/b + 1)
  • a3T(n/b3) + cn(a2/b2) + cn(a/b + 1)
  • a3T(n/b3) + cn(a2/b2 + a/b + 1)
  • …
  • akT(n/bk) + cn(ak-1/bk-1 + ak-2/bk-2 + … + a2/b2 + a/b + 1)
8
"So we have"
  • So we have
    • T(n) = akT(n/bk) + cn(ak-1/bk-1 + ... + a2/b2 + a/b + 1)
  • For k = logb n
    • n = bk
    • T(n) = akT(1) + cn(ak-1/bk-1 + ... + a2/b2 + a/b + 1)
    • = akc + cn(ak-1/bk-1 + ... + a2/b2 + a/b + 1)
    • = cak + cn(ak-1/bk-1 + ... + a2/b2 + a/b + 1)
    • = cnak /bk + cn(ak-1/bk-1 + ... + a2/b2 + a/b + 1)
    • = cn(ak/bk + ... + a2/b2 + a/b + 1)
9
"So with k = logb..."
  • So with k = logb n
    • T(n) = cn(ak/bk + ... + a2/b2 + a/b + 1)
  • What if a = b?
    • T(n) = cn(k + 1)
    • = cn(logb n + 1)
    • = Q(n log n)
10
"So with k = logb..."
  • So with k = logb n
    • T(n) = cn(ak/bk + ... + a2/b2 + a/b + 1)
  • What if a < b?
11
"So with k = logb..."
  • So with k = logb n
    • T(n) = cn(ak/bk + ... + a2/b2 + a/b + 1)
  • What if a < b?
    • Recall that S(xk + xk-1 + … + x + 1) = (xk+1 -1)/(x-1)
12
"So with k = logb..."
  • So with k = logb n
    • T(n) = cn(ak/bk + ... + a2/b2 + a/b + 1)
  • What if a < b?
    • Recall that (xk + xk-1 + … + x + 1) = (xk+1 -1)/(x-1)
    • So:
13
"So with k = logb..."
  • So with k = logb n
    • T(n) = cn(ak/bk + ... + a2/b2 + a/b + 1)
  • What if a < b?
    • Recall that S(xk + xk-1 + … + x + 1) = (xk+1 -1)/(x-1)
    • So:



    • T(n) = cn ·Q(1) = Q(n)
14
"So with k = logb..."
  • So with k = logb n
    • T(n) = cn(ak/bk + ... + a2/b2 + a/b + 1)
  • What if a > b?


15
"So with k = logb..."
  • So with k = logb n
    • T(n) = cn(ak/bk + ... + a2/b2 + a/b + 1)
  • What if a > b?
16
"So with k = logb..."
  • So with k = logb n
    • T(n) = cn(ak/bk + ... + a2/b2 + a/b + 1)
  • What if a > b?


    • T(n) = cn · Q(ak / bk)
17
"So with k = logb..."
  • So with k = logb n
    • T(n) = cn(ak/bk + ... + a2/b2 + a/b + 1)
  • What if a > b?


    • T(n) = cn · Q(ak / bk)
    • = cn · Q(alog n / blog n) = cn · Q(alog n / n)
18
"So with k = logb..."
  • So with k = logb n
    • T(n) = cn(ak/bk + ... + a2/b2 + a/b + 1)
  • What if a > b?


    • T(n) = cn · Q(ak / bk)
    • = cn · Q(alog n / blog n) = cn · Q(alog n / n)
    • recall logarithm fact: alog n = nlog a
19
"So with k = logb..."
  • So with k = logb n
    • T(n) = cn(ak/bk + ... + a2/b2 + a/b + 1)
  • What if a > b?


    • T(n) = cn · Q(ak / bk)
    • = cn · Q(alog n / blog n) = cn · Q(alog n / n)
    • recall logarithm fact: alog n = nlog a
    • = cn · Q(nlog a / n) = Q(cn · nlog a / n)
20
"So with k = logb..."
  • So with k = logb n
    • T(n) = cn(ak/bk + ... + a2/b2 + a/b + 1)
  • What if a > b?


    • T(n) = cn · Q(ak / bk)
    • = cn · Q(alog n / blog n) = cn · Q(alog n / n)
    • recall logarithm fact: alog n = nlog a
    • = cn · Q(nlog a / n) = Q(cn · nlog a / n)
    • = Q(nlog a )
21
"So…"
  • So…
22
The Master Theorem
  • Given: a divide and conquer algorithm
    • An algorithm that divides the problem of size n into a subproblems, each of size n/b
    • Let the cost of each stage (i.e., the work to divide the problem + combine solved subproblems) be described by the function f(n)
  • Then, the Master Theorem gives us a cookbook for the algorithm’s running time:
23
The Master Theorem
  • if  T(n) = aT(n/b) + f(n) then



24
Using The Master Method
  • T(n) = 9T(n/3) + n
    • a=9, b=3, f(n) = n
    • nlogb a = nlog3 9 = Q(n2)
    • Since f(n) = O(nlog3 9 - e), where e=1, case 1 applies:



    • Thus the solution is T(n) = Q(n2)
25
Sorting Revisited
  • So far we’ve talked about two algorithms to sort an array of numbers
    • What is the advantage of merge sort?
    • What is the advantage of insertion sort?
  • Next on the agenda: Heapsort
    • Combines advantages of both previous algorithms
26
Heaps
  • A heap can be seen as a complete binary tree:







    • What makes a binary tree complete?
    • Is the example above complete?
27
Heaps
  • A heap can be seen as a complete binary tree:







    • The book calls them “nearly complete” binary trees; can think of unfilled slots as null pointers
28
Heaps
  • In practice, heaps are usually implemented as arrays:






29
Heaps
  • To represent a complete binary tree as an array:
    • The root node is A[1]
    • Node i is A[i]
    • The parent of node i is A[i/2] (note: integer divide)
    • The left child of node i is A[2i]
    • The right child of node i is A[2i + 1]
30
Referencing Heap Elements
  • So…
    • Parent(i) { return ëi/2û; }
    • Left(i) { return 2*i; }
    • right(i) { return 2*i + 1; }
  • An aside: How would you implement this
    most efficiently?
  • Another aside: Really?
31
The Heap Property
  • Heaps also satisfy the heap property:
  • A[Parent(i)] ³ A[i] for all nodes i > 1
    • In other words, the value of a node is at most the value of its parent
    • Where is the largest element in a heap stored?
  • Definitions:
    • The height of a node in the tree = the number of edges on the longest downward path to a leaf
    • The height of a tree = the height of its root
32
Heap Height
  • What is the height of an n-element heap? Why?
  • This is nice: basic heap operations take at most time proportional to the height of the heap


33
Heap Operations: Heapify()
  • Heapify(): maintain the heap property
    • Given: a node i in the heap with children l and r
    • Given: two subtrees rooted at l and r, assumed to be heaps
    • Problem: The subtree rooted at i may violate the heap property (How?)
    • Action: let the value of the parent node “float down” so subtree at i satisfies the heap property
      • What do you suppose will be the basic operation between i, l, and r?
34
Heap Operations: Heapify()
  • Heapify(A, i)
  • {
  • l = Left(i); r = Right(i);
  • if (l <= heap_size(A) && A[l] > A[i])
  • largest = l;
  • else
  • largest = i;
  • if (r <= heap_size(A) && A[r] > A[largest])
  • largest = r;
  • if (largest != i)
  • Swap(A, i, largest);
  • Heapify(A, largest);
  • }
35
Heapify() Example
36
Heapify() Example
37
Heapify() Example
38
Heapify() Example
39
Heapify() Example
40
Heapify() Example
41
Heapify() Example
42
Heapify() Example
43
Heapify() Example
44
Analyzing Heapify(): Informal
  • Aside from the recursive call, what is the running time of Heapify()?
  • How many times can Heapify() recursively call itself?
  • What is the worst-case running time of Heapify() on a heap of size n?
45
Analyzing Heapify(): Formal
  • Fixing up relationships between i, l, and r takes Q(1) time
  • If the heap at i has n elements, how many elements can the subtrees at l or r have?
    • Draw it
  • Answer: 2n/3 (worst case: bottom row 1/2 full)
  • So time taken by Heapify() is given by
  • T(n) £ T(2n/3) + Q(1)
46
Analyzing Heapify(): Formal
  • So we have
  • T(n) £ T(2n/3) + Q(1)
  • By case 2 of the Master Theorem,
  • T(n) = O(lg n)
  • Thus, Heapify() takes linear time


47
Heap Operations: BuildHeap()
  • We can build a heap in a bottom-up manner by running Heapify() on successive subarrays
    • Fact: for array of length n, all elements in range
      A[ën/2û + 1 .. n] are heaps (Why?)
    • So:
      • Walk backwards through the array from n/2 to 1, calling Heapify() on each node.
      • Order of processing guarantees that the children of node i are heaps when i is processed
48
BuildHeap()
  • // given an unsorted array A, make A a heap
  • BuildHeap(A)
  • {
  • heap_size(A) = length(A);
  • for (i = ëlength[A]/2û  downto 1)
  • Heapify(A, i);
  • }
49
BuildHeap() Example
  • Work through example
    A = {4, 1, 3, 2, 16, 9, 10, 14, 8, 7}
50
Analyzing BuildHeap()
  • Each call to Heapify() takes O(lg n) time
  • There are O(n) such calls (specifically, ën/2û)
  • Thus the running time is O(n lg n)
    • Is this a correct asymptotic upper bound?
    • Is this an asymptotically tight bound?
  • A tighter bound is O(n)
    • How can this be?  Is there a flaw in the above reasoning?
51
Analyzing BuildHeap(): Tight
  • To Heapify() a subtree takes O(h) time where h is the height of the subtree
    • h = O(lg m), m = # nodes in subtree
    • The height of most subtrees is small
  • Fact: an n-element heap has at most én/2h+1ù nodes of height h
  • CLR 7.3 uses this fact to prove that BuildHeap() takes O(n) time


52
Heapsort
  • Given BuildHeap(),  an in-place sorting algorithm is easily constructed:
    • Maximum element is at A[1]
    • Discard by swapping with element at A[n]
      • Decrement heap_size[A]
      • A[n] now contains correct value
    • Restore heap property at A[1] by calling Heapify()
    • Repeat, always swapping A[1] for A[heap_size(A)]
53
Heapsort
  • Heapsort(A)
  • {
  • BuildHeap(A);
  • for (i = length(A) downto 2)
  • {
  • Swap(A[1], A[i]);
  • heap_size(A) -= 1;
  • Heapify(A, 1);
  • }
  • }
54
Analyzing Heapsort
  • The call to BuildHeap() takes O(n) time
  • Each of the n - 1 calls to Heapify() takes O(lg n) time
  • Thus the total time taken by HeapSort()
    = O(n) + (n - 1) O(lg n)
    = O(n) + O(n lg n)
    = O(n lg n)
55
Priority Queues
  • Heapsort is a nice algorithm, but in practice Quicksort (coming up) usually wins
  • But the heap data structure is incredibly useful for implementing priority queues
    • A data structure for maintaining a set S of elements, each with an associated value or key
    • Supports the operations Insert(), Maximum(), and ExtractMax()
    • What might a priority queue be useful for?


56
Priority Queue Operations
  • Insert(S, x) inserts the element x into set S
  • Maximum(S) returns the element of S with the maximum key
  • ExtractMax(S) removes and returns the element of S with the maximum key
  • How could we implement these operations using a heap?