
<h2>Exam 1 Comments</h2>

<h3>Order Notation</h3>

For each of the questions below, fill in the missing symbol with one of
these choices:
<ul>
<li> <b>=</b> &mdash; the sets are equal
<li> <b>&sub;</b> &mdash; the left set is
a strict subset (cannot be equal) of the right set
<li> <b>&sup;</b> &mdash; the left set
is a strict superset (cannot be equal) of the right set
<li> <b>&sube;</b> &mdash; the left set is
a subset (can be equal) of the right set
<li> <b>&supe;</b> &mdash; the left set
is a superset (can be equal) of the right set
<li> <b>&ne;</b> &mdash; the sets are not equal, but there is no subset
or superset relationship
</ul>
You should select the strongest possible choice (for example, if two
sets are equal and you select subset, that is a correct statement, but
not worth full credit).
<p>
For each answer, provide a short justification of your answer.  Your
justification should follow from the definitions of the order notations.
<p>
<b>1.</b> (Average 4.49 out of 5 points) <font face="Times"><em>O</em> (<em>n</em>) <b>&ne;</b> &Theta; (<em>n</em><sup>2</sup>)</font><br>
<blockquote>
The sets are unequal (in fact they are disjoint, and have no members in
common).  The set &Theta;(<em>n</em><sup>2</sup>) is all functions that
grow as fast as <em>n</em><sup>2</sup>; the set <em>O</em>(<Em>n</em>)
is the set of all functions that grow no faster than <em>n</em>.  Since
<em>n</em><sup>2</sup> grows faster than <em>n</em>, there is no overlap
between the sets.
</blockquote>

<b>2.</b> (4.11 / 5) <font face="Times">&Theta; (<em>n</em>) <b>&sub;</b> <em>O</em> (2<em>n</em>)</font><br>
<blockquote>
First, note that the constant factor 2 has no effect on the growth rate, so
<em>O</em>(2<em>n</em>) is equivalent to <em>O</em>(<em>n</em>).  The
set &Theta;(<em>n</em>) is the set of all functions that grow as fast as
<em>n</em>; this is a proper subset of <em>O</em>(<em>n</em>) which is
the set of all functions that grow no faster than <em>n</em>.  There are
functions in <em>O</em>(<em>n</em>) that are not in &Theta;(<em>n</em>)
such as constant functions or log <em>n</em>.
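<p>
A quick numerical check can make the strictness concrete.  The sketch
below (the helper name <tt>ratios</tt> is ours, not part of any course code)
computes log <em>n</em> / <em>n</em> for growing <em>n</em>.  The ratio heads
toward 0, which is what we would expect of a function that is in
<em>O</em>(<em>n</em>) but not in &Theta;(<em>n</em>).  A table of values is
only an illustration, not a proof.

```python
import math

def ratios(f, g, ns):
    """Return f(n) / g(n) for each n in ns."""
    return [f(n) / g(n) for n in ns]

ns = [10, 1000, 100000, 10000000]
log_over_n = ratios(math.log, lambda n: n, ns)

# Each ratio is smaller than the one before it.
for n, r in zip(ns, log_over_n):
    print(n, r)
```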
</blockquote>

<b>3.</b> (4.14 / 5) <font face="Times">&empty;
<b>=</b> <em>O</em> (<em>n</em>!) 
&cap;
&Omega;(<em>n</em><sup><em>n</em></sup>)</font><br>
<blockquote>
The intersection of <em>O</em>(<em>n</em>!) and
&Omega;(<em>n</em><sup><em>n</em></sup>) is empty, since
<em>n</em><sup><em>n</em></sup> grows faster than <em>n</em>! (see <a
href="/cs216/ps/ps1/">Problem Set 1</a>).
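<p>
To see why no function can sit in both sets: it would have to stay below
some constant multiple of <em>n</em>! while staying above some constant
multiple of <em>n</em><sup><em>n</em></sup>, which would force
<em>n</em>! / <em>n</em><sup><em>n</em></sup> to stay above a fixed positive
constant.  The sketch below (the function name is our own) suggests the
opposite: the ratio shrinks rapidly.

```python
import math

def factorial_over_power(n):
    """Return n! / n**n, the ratio of the two bounding functions."""
    return math.factorial(n) / n ** n

values = [factorial_over_power(n) for n in (5, 10, 20, 40)]

# The ratio drops by many orders of magnitude as n doubles.
print(values)
```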
</blockquote>

<b>4.</b> (4.82 / 5) <font face="Times"><em>O</em> (1) <b>&sup;</b> &Theta; (1)</font><br>
<blockquote>
This one is tricky, and we accepted <b>=</b>, <b>&supe;</b> or <b>&sup;</b>
as a full credit answer (with a good
explanation).  Certainly we know every element of &Theta;(1) is also in
<em>O</em>(1) from the definitions of &Theta; and <em>O</em>.  The
tougher question is whether there is any function in
<em>O</em>(1) that is not in &Theta;(1).  From the informal definitions,
this asks whether there is any function that produces a positive value as
output and grows slower than a constant.  Intuitively, it would seem
that functions like 1/<em>n</em> have this property: as
<em>n</em> increases, the value of the function decreases
toward 0.  To answer it more convincingly, we need to consider the
definitions of <em>O</em> and &Theta;.
<p>

If <em>f</em> is in <em>O</em> (1), then we know there are constants
<em>c</em> &gt; 0 and <em>n</em><sub>0</sub> &ge; 0 such that
<em>f</em>(<em>n</em>) &lt; <em>c</em> for all <em>n</em> &ge;
<em>n</em><sub>0</sub>.  (Since <em>g(n)</em> = 1, the <em>g(n)</em>
term goes away.)  For <em>O</em>(1) to be equal to &Theta;(1), we would
need to show that this also implies <em>f</em> is in &Omega;(1).  This
requires that we can find constants <em>c</em> &gt; 0 and
<em>n</em><sub>0</sub> &ge; 0 such that <em>f</em>(<em>n</em>) &gt;
<em>c</em> for all <em>n</em> &ge; <em>n</em><sub>0</sub>.  That is,
<em>f</em> is in &Omega;(1) if and only if 1 is in
<em>O</em>(<em>f</em>).  Consider <em>f(n)</em> = 1/<em>n</em>, which is
in <em>O</em>(1) since 1/<em>n</em> &lt; 2 for all <em>n</em> &ge; 1.  It
is not in &Omega;(1), however: 1 cannot be in <em>O</em>(1/<em>n</em>),
since for any choice of <em>c</em>, every <em>n</em> &gt; <em>c</em> gives
1 &gt; <em>c</em> * 1 / <em>n</em>, so no constant <em>c</em> works for all
sufficiently large <em>n</em>.  Hence, we know <em>O</em>(1) is a strict
superset of &Theta;(1).
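<p>
The argument can be tried out numerically.  In this sketch (the function
name <tt>first_n_below</tt> is our own), we pick a candidate lower-bound
constant <em>c</em> and search for the first <em>n</em> where
1/<em>n</em> falls below it.  Since 1/<em>n</em> keeps decreasing, it stays
below <em>c</em> from that point on, so that <em>c</em> cannot serve as the
&Omega;(1) constant.

```python
def first_n_below(c):
    """Smallest positive integer n with 1/n < c, for a constant c > 0."""
    n = 1
    while 1 / n >= c:
        n += 1
    return n

# However small the candidate constant, 1/n eventually drops below it.
for c in (0.5, 0.01, 0.001):
    n = first_n_below(c)
    print(c, n, 1 / n < c)
```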

</blockquote>

<h3>Lists</h3>

<b>5.</b> (6.98 / 10) Complete the definition of the <tt>ListNode</tt> method
<tt>reverse</tt>, which is called by the <tt>LinkedList</tt> method
<tt>reverse</tt> to produce a reversed copy of the self list as its output (the same
elements as in self, but in reverse order).  For example,
<pre>
   l = LinkedList.LinkedList ().append(1).append(2).append(3)
   r = l.reverse ()
</pre>
should make <tt>r</tt> the list <tt>[3, 2, 1]</tt> and leave <tt>l</tt>
as the list <tt>[1, 2, 3]</tt>.
<p>

The rest of the code is taken from the <tt>LinkedList.py</tt>
implementation of an immutable list datatype from Problem Set 2.
<p>
<blockquote>
The easiest way to think of reverse is simply adding the first element
to the end of the list resulting from reversing the rest of the
elements:

<pre>
    def reverse (self):
        head = ListNode (self.__info)
        if self.__next == None:
            return head
        else:
<b>            return self.__next.reverse().append (self.__info)</b>
</pre>

There are more complex iterative ways (that are more efficient) to do
this by switching the <tt>__next</tt> pointers down the list, but they
are much tougher than the simple recursive solution.
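<p>
For comparison, here is a minimal sketch of a linear-time iterative
reverse.  It is not the PS2 code: <tt>Node</tt>, <tt>reverse_iterative</tt>
and <tt>to_list</tt> are hypothetical stand-ins, and instead of switching
the <tt>__next</tt> pointers in place it builds new nodes, so the original
list is left unchanged as the question requires.

```python
class Node:
    """Hypothetical stand-in for the course's ListNode (not the PS2 code)."""
    def __init__(self, info, next=None):
        self.info = info
        self.next = next

def reverse_iterative(node):
    """Return the head of a new reversed list; input nodes are not modified."""
    reversed_head = None
    while node is not None:
        # Prepend a copy of the current element to the result built so far.
        reversed_head = Node(node.info, reversed_head)
        node = node.next
    return reversed_head

def to_list(node):
    """Collect the elements of a linked list into a Python list for display."""
    items = []
    while node is not None:
        items.append(node.info)
        node = node.next
    return items

l = Node(1, Node(2, Node(3)))
r = reverse_iterative(l)
print(to_list(r), to_list(l))
```

Each element is visited once and prepending is constant time, so this
version does work proportional to the length of the list.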

</blockquote>

<b>6.</b> (8.46 / 10) What is the asymptotic running time of your <tt>reverse</tt>
implementation?  Explain your answer convincingly, and be sure to define
any variables you use and state any assumptions you make clearly.
<blockquote>
Our reverse implementation has running time in &Theta;(<em>n</em><sup>2</sup>) where
<em>n</em> is the number of elements in the input list.  (Note, it is
possible to implement reverse with running time in &Theta;(<em>n</em>)
by adjusting the <tt>__next</tt> pointers directly.)
<p>
There will be <em>n</em> calls to <tt>reverse</tt>, one for each element in the
list.  Every call except the last (the base case, for the list of length 1)
also makes a call to our <tt>append</tt> method, so there are <em>n</em> - 1
calls to <tt>append</tt>.  The provided <tt>append</tt>
implementation has running time in &Theta;(<em>m</em>) where <em>m</em>
is the size of the input to <tt>append</tt> (see <a
href="/cs216/ps/ps2/">Problem Set 2</a>).  The average length of the
input list to append is <em>n</em> / 2, so the average running time of
the append call is in &Theta;(<em>n</em> / 2) = &Theta;(<em>n</em>).  
<p>
We are making <em>n</em> calls to reverse, each of which involves
running time &Theta;(<em>n</em>) from the call to append, so the total
running time will be in &Theta;(<em>n</em><sup>2</sup>).
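<p>
The total can also be counted directly.  This sketch (the function
<tt>append_steps</tt> is ours) mirrors the recursion and assumes that
<tt>append</tt> on a list of <em>m</em> elements visits <em>m</em> nodes.
The count comes out to <em>n</em>(<em>n</em> - 1)/2, which is in
&Theta;(<em>n</em><sup>2</sup>).

```python
def append_steps(n):
    """Total nodes visited by append while reversing a list of n elements.

    Assumes append on a list of m elements visits m nodes (the Theta(m)
    cost stated above).
    """
    if n <= 1:
        return 0  # base case: a one-element list needs no append
    # Reverse the remaining n - 1 elements, then append to that result.
    return append_steps(n - 1) + (n - 1)

for n in (10, 100, 500):
    print(n, append_steps(n), n * (n - 1) // 2)
```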
</blockquote>

<h3>Matching</h3>


<b>7.</b> (9.2 / 10) Prove the greedy partnering algorithm shown is not
optimal by showing an input for which it would not produce the correct
result.
<blockquote>
We just need to find an input list where the best match for the first
student does not produce the optimal global match.  Here's a simple
example: 
<ul>
<li> students = { "Alice", "Bob", "Colleen" }
<li> The goodness scores of (Alice, Bob) = 10, (Alice, Colleen) = 20,
and (Bob, Colleen) = 110.  (Bob and Colleen are in the same section and
different years; Alice and Bob are in different years and different
sections; Alice and Colleen are in different majors, the same year, and
different sections.)
<li> The greedy algorithm will first find the best partner for Alice,
which is Colleen.  Then, there is no partner left for Bob, so the total
goodness score is 20.  If we matched Bob with Colleen, and left Alice
unpaired, the goodness score is 110.
</ul>
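<p>
The counterexample can be checked with a small sketch.  The greedy
procedure here is a hypothetical simplification written for this example,
not the <tt>assignPartners</tt> code from the exam: it takes students in
list order and pairs each with the best remaining partner.  As in the
totals above, an unpaired student is assumed to contribute nothing to the
score.

```python
# Scores from the example above; pairs are unordered, so use frozenset keys.
scores = {
    frozenset(["Alice", "Bob"]): 10,
    frozenset(["Alice", "Colleen"]): 20,
    frozenset(["Bob", "Colleen"]): 110,
}
students = ["Alice", "Bob", "Colleen"]

def greedy_total(students, scores):
    """Pair each unpaired student, in list order, with the best remaining partner."""
    unpaired = list(students)
    total = 0
    while len(unpaired) >= 2:
        first = unpaired.pop(0)
        best = max(unpaired, key=lambda other: scores[frozenset([first, other])])
        unpaired.remove(best)
        total += scores[frozenset([first, best])]
    return total

def best_total(scores):
    """With three students only one pair can form, so the optimum is the best pair."""
    return max(scores.values())

print(greedy_total(students, scores), best_total(scores))
```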
</blockquote>

<b>8.</b> (4.3 / 5) What is the asymptotic running time of <tt>assignPartners</tt>?
Be sure to define any variables you use in your answer and state your
assumptions about Python operations clearly.
<blockquote>
&Theta;(<em>n</em><sup>2</sup>) where <em>n</em> is the number of
students.  We have two nested for loops, each of which iterates through
the students.  So, there are &Theta;(<em>n</em><sup>2</sup>) evaluations of the
inner for loop (there are not exactly <em>n</em><sup>2</sup> iterations,
since we skip the inner loop if the student already has a partner; in
the worst case, however, no students will be assigned partners if the
best match for each student is with None).
<p>
The body of the inner for loop evaluates <tt>goodnessScore</tt> twice.
This involves indexing into <tt>records</tt> to find each student's
record, and doing a dictionary lookup on the field we are comparing, as
well as the <tt>find</tt> call in the notpartners list.  We need to
assume all these operations have running times in <em>O</em>(1) for the
overall running time to be in &Theta;(<em>n</em><sup>2</sup>).  The one
we are most concerned about is <tt>find</tt>, which searches the
notpartners string for each student to see if it contains the other
student.  This is likely to have running time in <em>O</em>(<em>s</em>)
where <em>s</em> is the length of the notpartners string.  If we assume
those lengths are small and fixed (which is in fact the case for CS216
students), then this is still constant time as <em>n</em> grows.  If we
assume people have a fixed fraction of the rest of the world they
wouldn't want to partner with, then we expect the lengths of the notpartners
strings to grow in proportion to <em>n</em>.  This would make the overall
running time of <tt>assignPartners</tt> in
<em>O</em>(<em>n</em><sup>3</sup>).
</blockquote>

<b>9.</b> (6.37 / 10) Define the <tt>allPossiblePartnerAssignments</tt>
procedure Zulma needs.  
<blockquote>
The easiest way to do this was to realize that we can find all possible
partner assignments by arranging the students in all possible orders,
and then just having them partner with the adjacent student.  The
subtlety (which no one got quite right) with this approach is
dealing with the possibility that someone's best match is with None
(recall that <tt>goodnessScore</tt> returns <tt>-1</tt> if either
partner is <tt>None</tt>, but can return a more negative score if
one partner is on the other's notpartners list).  So, we need to consider
all possible orderings of the students with the list extended with
enough <tt>None</tt> values so each student could potentially be
partnered with <tt>None</tt>.  We do this by appending enough
<tt>None</tt>s to the list before calling <tt>allLists</tt>.  We use the
<tt>allLists</tt> procedure from 
<a href="http://www.cs.virginia.edu/cs216/classes/section3.html">Section
3</a> to produce all possible lists.
<pre>

def allPossiblePartnerAssignments (students):
   s = students[:] # we use a copy to avoid modifying input list
   for n in range (len (students)):
      s.append (None) # one None per student, so anyone can be unpaired
   for ordering in allLists (s):
      assignments = { }
      # s has even length, so step by 2 to pair positions (0,1), (2,3), ...
      # (stepping by 1 would overwrite each student's partner)
      for i in range (0, len (ordering), 2):
         first, second = ordering[i], ordering[i + 1]
         if first is not None: # avoid adding partners for None
            assignments[first] = second
         if second is not None:
            assignments[second] = first
      yield assignments
</pre>

(Note: I applied the same rule to myself as you had on this exam of not
actually trying the code in an interpreter.  So, there is probably at
least one bug in it.  The first person to find a non-trivial bug in this
and explain how to fix it gets 10 bonus points.)
</blockquote>

<b>10.</b> (7.63 / 10) Explain why Zulma's partner assignment algorithm would
not run fast enough to be used to assign partners for PS4.  (Note: a
good answer would include an explanation of the running time of
<tt>assignPartners</tt>.  Assume you have a correct and optimally
efficient <tt>allPossiblePartnerAssignments</tt> implementation
regardless of your answer to question 9.  You should be able to answer
question 10 well, even if you could not answer question 9.) 
<blockquote>
The algorithm we used in question 9 involves creating all possible
permutations of a list of length 2<em>n</em>, where <em>n</em> is the
number of students (recall we added <em>n</em> <tt>None</tt>s to the
list we passed to <tt>allLists</tt>).  Hence, there are (2<em>n</em>)!
possible orderings.  For each ordering, we loop through all the elements
(up to 2<em>n</em> of them), so the total work of
<tt>allPossiblePartnerAssignments</tt> over all orderings is about
2<em>n</em> * (2<em>n</em>)!.  For <em>n</em> = 96 (the value considered
below), regardless of any other work, this
exceeds 10<sup>357</sup>.  This far exceeds the number of 
atoms in the universe, so is most definitely not something we could
compute in time for PS4.
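<p>
Python's arbitrary-precision integers make it easy to check the size of
this number.  The sketch assumes <em>n</em> = 96, the value implied by the
96! mentioned later in these comments; the variable names are ours.

```python
import math

n = 96                      # assumed class size (from the 96! in the text)
orderings = math.factorial(2 * n)
work = 2 * n * orderings    # loop over up to 2n elements for each ordering

print(len(str(work)))       # number of decimal digits in the step count
```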

<P>
This is not the best possible algorithm, however, especially because of
how we dealt with the None matches.  We could improve it by recognizing
that each None is identical, so all the orderings that differ only by
swapping Nones with other Nones are identical.  So, perhaps we
could approach <em>O</em>(<em>n</em>!).  We know the brute force
algorithm can't do better than this, however: we need to try all
possible assignments of students, so we need to try all possible
orderings.  But even 96! is way beyond the realm of anything tractable
for computing.
<p>
I was surprised by the number of people who answered this by analyzing
the complexity to be in something like
<em>O</em>(<em>n</em><sup>3</sup>), and then arguing that performing
100<sup>3</sup> iterations would be intractable.  You should have a
better sense of the power of the computers you use.  When they say
a 2GHz processor, that means it can perform 2 billion operations per
second (what an operation is is a bit complex, but you can think of any
simple calculation as a few operations).  Hence, numbers like
100<sup>3</sup> are not a problem at all.  It's only 1 million, and the
computers in lab can do around two thousand million operations in one
second.
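<p>
A back-of-the-envelope calculation shows the gap between the two cases.
This sketch optimistically assumes one operation per step at the 2 billion
operations per second quoted above; the constant and variable names are
ours.

```python
import math

OPS_PER_SECOND = 2 * 10 ** 9          # the 2GHz figure from the text
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

# Time for 100^3 steps: a small fraction of a second.
cubic_seconds = 100 ** 3 / OPS_PER_SECOND

# Time for 96! steps, in whole years (integer arithmetic keeps it exact).
factorial_years = math.factorial(96) // (OPS_PER_SECOND * SECONDS_PER_YEAR)

print(cubic_seconds)
print(len(str(factorial_years)))      # digits in the number of years
```

The cubic case finishes in well under a millisecond, while the factorial
case needs a number of years with more than a hundred digits.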
</blockquote>

<b>11.</b> (average 0.98 bonus points) Suggest a better
algorithm to use to assign partners for future problem sets.  
<blockquote>
We'll discuss this in a future class.
</blockquote>
