Comments on Solutions to Problem Set 3, Fall 2005:
--------------------------------------------------

2a) Write an algorithm to find a vertex v in a directed graph G where
every other node in G is reachable from v.

Solution: The two DFS functions in the book are easily modified to
determine if one call to dfs_recurs() visits all vertices in the graph.
The "driver function" loops over all vertices, calling dfs_recurs() on
each vertex in turn. When a call returns, the driver checks whether all
vertices were visited. (E.g., it could keep a counter of nodes visited,
or it could check the visit[] array to see if every entry is set to true.
In either case, these must be reset before the driver's next call to
dfs_recurs().)

--------------

2b) Describe how you would update the DFS implementation for an
undirected graph so that it determined if the graph was acyclic.

Solution: For an undirected graph, in dfs_recurs() when you are
processing the node named "start" and you see an adjacent node v that
*has* already been visited, this edge start->v goes back up the DFS tree.
This could be a back edge representing a cycle, *but* it might be the
tree edge you just descended to get to start. In other words, v might be
start's parent in the DFS tree -- this case doesn't indicate a cycle.

How to distinguish these two cases? You'll want to modify the code in the
while-loop in dfs_recurs() so that it looks like this:

     if (!visit[v])
          dfs_recurs(adj, v)
     else if (!isParent(start, v, ...))
          // here you know start-v is part of a cycle

What's the implementation of isParent()?
Option 1: You can store each node's parent in a parent[] array.
Option 2: Alternatively, you could add an additional argument to
dfs_recurs, which will contain the parent of the node stored in start.
Either way, you set the parent value right before you do the recursive
call above -- right before that call you know that v's parent in the tree
is start.

--------------

2c) Explain why an undirected graph G cannot have any cross or descendant
edges.
Answer: Anything that would be a cross edge in a digraph would become a
tree edge if the graph were undirected. The direction of the edge in the
digraph prevented you from moving to that node, but later you were able
to follow it back to a visited node in another sub-tree.

Anything that would be a descendant edge (AKA a forward edge) in a
digraph would become a back edge if the graph were undirected. The
direction prevented you from seeing the previously visited node that's an
ancestor in the tree.

A picture's worth a thousand words, so, well, *you* can draw the
following picture! This is the simplest graph that illustrates this.
Given four nodes {A, B, C, D} and these *directed* edges:
     A->B, A->C, A->D, B->C, D->B.
Draw this! When you do DFS from A (taking neighbors in alphabetical
order), you'll see that A->C is a descendant edge, and that D->B is a
cross edge. Now draw that graph again but as an *undirected* graph, and
do DFS again. You'll see that A-C is now a back edge, and that D-B
becomes a tree edge.

--------------

2d) Given any undirected graph G, the eccentricity of node v is the
largest of the shortest possible distances from v to any other node in
the graph. The minimum eccentricity in the graph is the graph radius for
G. All the nodes in G that have eccentricity equal to the graph radius
form a set called the graph center of G. Describe (using pseudo-code) an
efficient algorithm to find the graph center of a graph G, and describe
its complexity.

Prof. Horton's comments:
------------------------
For any graph problem, you should first think if DFS or BFS could be
used. Here the problem is about distances (in terms of number of edges),
so you should immediately try to use BFS.

If you call BFS on a given node v, it will find the shortest paths to all
other nodes, with cost BigTheta(n+m). You can easily find the max of
these, at cost BigTheta(n), and store that value as the eccentricity of v.
After doing this for all nodes, at cost BigTheta(n(n+m)), you find the
smallest value and then find the nodes with that value, both at cost
BigTheta(n).

Overall cost: BigTheta(n(n+m))

A solution from one of our TAs:
-------------------------------
Note: the word "gets" means the assignment-operator.

# Get the eccentricity of v, a vertex in G.vertices. This method
# has time complexity BigTheta(n+m) and space
# complexity BigTheta(n)
function eccentricity_of(G, v)
    returns the eccentricity of v as an integer
    arguments
        G, an undirected graph
        v, a vertex in G.vertices
    variables
        distance_to, an array of integers indexed by G.vertices
        q, a queue of vertices
        eccentricity, an integer
        i, a vertex in G.vertices
        j, a vertex in G.vertices
begin
    # Initialize variables. This has time complexity
    # BigTheta(n)
    for each i in G.vertices, distance_to[i] gets unknown
    distance_to[v] gets 0
    enqueue v in q

    # Use our queue and distance_to array to visit each vertex
    # in the graph once in BFS order. When we visit a node,
    # record its distance from v. This has time complexity
    # BigTheta(n + m)
    while q is not empty
        i gets dequeue from q
        for each j in i.adjacentVertices
            if distance_to[j] is unknown
                distance_to[j] gets distance_to[i] + 1
                enqueue j in q
            end if
        end for
    end while

    # The eccentricity is the largest of the distances (this
    # assumes G is connected, so no distance is still unknown).
    # This has time complexity BigTheta(n)
    eccentricity gets 0
    for each i in G.vertices
        eccentricity gets max of eccentricity and distance_to[i]
    end for
    return eccentricity
end

# This function has space complexity BigTheta(n) and time
# complexity BigTheta(n^2 + (n)(m)).
function graph_centre(G)
    returns the set of vertices that are the centre of G
    arguments
        G, an undirected graph
    variables
        radius, an integer
        centre, a set of vertices
        v, a vertex in G.vertices
        e, an integer
begin
    radius gets infinite
    centre gets { }

    # Get the eccentricity of each vertex in turn, keeping track
    # of the set of vertices that match the best-yet-seen
    # eccentricity.
    # This loop executes n times and
    # calls eccentricity_of each time, so it has time complexity
    # BigTheta(n^2 + (n)(m)).
    for each v in G.vertices
        e gets eccentricity_of(G, v)
        if e < radius then
            radius gets e
            centre gets { v }
        else if e equals radius then
            centre gets centre union { v }
        end if
    end for
    return centre
end

===========================================================================

2e) A bipartite graph is a graph where the vertices may be divided into
two subsets such that there is no edge between any two vertices in the
same subset. Write an algorithm to determine if a graph is bipartite, and
give its worst-case complexity.

Prof. Horton's comments:
------------------------
I always think it's important to draw pictures and to think of simple
cases. Let's think of simple graphs and then work up to more complicated
ones. A graph with no edges is bipartite (duh!), and if it's a "chain",
e.g. like A-B-C-D-E-F, then that's bipartite -- every-other-node is in
the same subset, e.g. {A,C,E} and {B,D,F}. Now consider an arbitrary
tree -- that's just like the chain: every-other-level in the tree belongs
to the same subset.

Now, what if it's more complicated than a tree, i.e. it has cycles? Draw
this, and you'll see there are two cases. If the back-edge connects v
back to a previously found node that was put in the same set, then it's
not bipartite. If it connects v back to one that was in a different set,
that cycle doesn't prevent it from being bipartite.

When you think cycles, you know you can use DFS or BFS to find them. In
DFS, of course, cycles are represented by back-edges. If you number the
levels of the tree alternately as you search (0, 1, 0, 1), then a
back-edge back to a node with the same level value as the current node
shows the graph is not bipartite. If you use BFS, recall the non-tree
edges either go back to a visited node on the previous level (OK for
bipartite) or to a node on the same level (not OK for bipartite).
So just modify one of these algorithms very slightly and, as they say in
Britain, Bob's your uncle!

Cost: BigTheta(n+m)

BTW, we'll talk about graph coloring soon. The bipartite problem is the
same as producing a valid two-coloring for a graph.

A solution from one of our TAs:
-------------------------------
# This function determines whether a graph is bipartite by attempting
# to colour its vertices using two colours so that no two adjacent nodes
# are marked with the same colour. You can think of the colours as
# denoting each of the two sets; the constraint that no two adjacent
# vertices have the same colour is equivalent to the statement that
# no two adjacent nodes are in the same subset.
#
# This algorithm has time complexity BigTheta(n + m)
# and space complexity BigTheta(n).
function is_bipartite(G)
    returns whether G is bipartite, as a boolean
    arguments
        G, an undirected graph
    variables
        c, one of { unknown, red, blue }
        colour_of, an array of { unknown, red, blue } indexed by G.vertices
        v, a vertex in G.vertices
        q, a queue of vertices
        i, a vertex in G.vertices
        j, a vertex in G.vertices
begin
    # Start by marking each vertex as uncoloured. This has time
    # complexity BigTheta(n)
    for each v in G.vertices, colour_of[v] gets unknown

    # Traverse the graph using breadth-first traversal from the first
    # vertex. If the graph is not connected, keep starting breadth-first
    # traversals from the nodes the previous traversals have not
    # visited until the entire graph is covered. This has time
    # complexity BigTheta(n + m).
    for each v in G.vertices
        if colour_of[v] is unknown
            # Start by marking the first visited node in each connected
            # region of the graph an arbitrary colour.
            colour_of[v] gets red
            enqueue v in q
            while q is not empty
                i gets dequeue from q
                # Traverse the set of neighbors to i. If they're already
                # marked with a colour it had better be the right one; if
                # not, mark them with the colour they must be.
                if colour_of[i] equals blue, c gets red, else c gets blue
                for each j in i.adjacentVertices
                    if colour_of[j] is unknown
                        colour_of[j] gets c
                        enqueue j in q
                    else if colour_of[j] is not c
                        return false
                    end if
                end for
            end while
        end if
    end for
    return true
end

=========================================================================

2f) For a given undirected graph G, prove that the depth of a DFS tree
cannot be smaller than the depth of the BFS tree.

We can prove this by contradiction. Suppose that the height of the tree
representing a Depth First Search traversal of G could be smaller than
the height of the tree representing a Breadth First Search traversal of
G. Then there is at least one vertex s in G such that the height of the
DFS tree for traversing G from s is less than that of the BFS tree for
traversing G from s.

Let v be a vertex on the deepest level of that BFS tree. The level of v
in the DFS tree is at most the height of the DFS tree, so the level of v
in the DFS tree is less than its level in the BFS tree. The level of v in
each tree is the distance between s and v along the path followed by that
tree's traversal. So the distance from s to v along the path followed by
the depth first search is less than the distance from s to v along the
path followed by the breadth first search.

But we know that breadth-first search always produces the shortest path
between s and each node it visits. (Note: this is the Big Thing about
BFS!) This is a contradiction; therefore our hypothesis is false, and for
a given undirected graph G the depth of a DFS tree cannot be smaller than
the depth of the BFS tree, QED.

=======================================================================

2g) Exhaustive search can be used to find Hamilton cycles in a graph.
However, the solution space is very large, and so you decide to modify
the code to prune based on a 'cutInTwo' method that decides whether the
path traced so far divides the graph so that further exploration of this
branch of the search tree will not yield a Hamilton cycle:

     if (!visit[v] && !cutInTwo(adj, visit))
          exh_search_recurse(adj, v)

2g.i) Describe a strategy for the cutInTwo() function. You can answer
this by writing a few sentences describing an approach that solves this
problem.

cutInTwo can be written as a traversal of the graph that starts from any
as-yet-unvisited vertex and avoids every visited node as well as the node
that might be visited next (i.e. v in the code above). If the graph is
cut in two, then no matter which vertex we start from there will be at
least one unvisited vertex that the traversal cannot reach; if it is not
cut in two, there will be none.

So this is relatively simple: somehow do a BFS or DFS on the subgraph you
get when you remove the set of visited nodes and the node that might be
visited next. A very simple approach is to create another visited-array
to use with this DFS, and instead of initializing it to all false, set
visited2[i] to true for those nodes.

2g.ii) In the worst case the time complexity of this algorithm for
cutInTwo is BigTheta(n1 + m1), where n1 is the number of not yet visited
vertices and m1 is the number of edges that do not connect two visited
vertices. Note that n1 < n and m1 < m. But the question asked for an
upper bound, so we can just say BigO(n+m).

2g.iii) In problems where state-space search could be exponential, it
almost always pays to do as much as possible right before exploring from
a given node to determine if that path will lead to a dead-end. Even if
each pruning test seems to be expensive, anything to cut back the
exhaustive search will pay off. (And note this pruning test is *not*
expensive, since one DFS call is linear, BigTheta(n+m)).
The size of the exhaustive search tree will be roughly x^n, where x is
the average number of edges connected to any given vertex and n is the
number of as-yet-unvisited vertices. Eliminating a sub-tree high up in
the complete exhaustive search may save a very large number of recursive
calls.

Determining the exact effect would require a priori knowledge of the
graph's structure, or simulation. One can imagine (and draw) a graph
structured so that a connecting path joins nodes very far away from the
start node in the tree. In this case, the pruning test will not prevent a
recursive visit until well down the tree.

In summary: exhaustive search is truly expensive. One pruning test is
cheap, so adding it does not make things worse by any means. Though the
problem might still be exponential, there is potential to reduce the
number of visits by a lot (in practice).
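
A sketch of the cutInTwo() strategy from 2g.i:
The strategy described in 2g.i can be sketched in Python as below. This
is only an illustration under stated assumptions, not the course's
reference code: the graph is assumed to be a list of adjacency lists
indexed by vertex number, the name cut_in_two and its extra parameter v
(the node that might be visited next) are hypothetical, and the call
shown in the problem statement passes only adj and visit.

```python
def cut_in_two(adj, visit, v):
    """Return True if removing the visited nodes and v disconnects the rest.

    adj   -- list of adjacency lists (assumed representation)
    visit -- list of booleans, True for nodes already on the path
    v     -- hypothetical extra argument: the node about to be visited
    """
    # Second visited-array: pre-mark the path's nodes and v as "visited"
    # so the traversal below never walks through them.
    visited2 = list(visit)
    visited2[v] = True

    # Start from any remaining vertex; if none remain, nothing can be cut.
    start = next((u for u in range(len(adj)) if not visited2[u]), None)
    if start is None:
        return False

    # Iterative DFS over the remaining subgraph.
    stack = [start]
    visited2[start] = True
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if not visited2[w]:
                visited2[w] = True
                stack.append(w)

    # Any vertex still unmarked was unreachable, so the graph is cut in two.
    return not all(visited2)


# Path 0-1-2-3-4: stepping onto the middle node separates {0,1} from {3,4}.
path = [[1], [0, 2], [1, 3], [2, 4], [3]]
print(cut_in_two(path, [False] * 5, 2))
```

The single traversal touches each remaining vertex and edge at most once,
which matches the linear cost discussed in 2g.ii.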