# Graphs:

• The graph ADT comprises a set of vertices (or nodes) together with a set of pairs of vertices, the edges. A vertex can be any arbitrary abstract object.
• Graphs are an extremely powerful concept. Linked lists, trees, state transition diagrams, etc. are all just cases of graphs.

## Terminology:

• Direction:
• Directed graph (or digraph) = a graph where the edges have a direction associated with them. I.e., each edge is now an ordered pair of vertices.
• In conceptual terms, direction implies that relations between vertices are not symmetric. For example, a graph of friends would be undirected, since X being friends with Y implies that Y is friends with X. And on the other hand, a graph of Twitter followers would be directed, since X being a follower of Y does not imply that Y is a follower of X.
• Directed acyclic graph (or DAG) = a digraph with no (directed) cycles.
https://www.wikiwand.com/en/Directed_acyclic_graph
• Many important applications, e.g., systems of events or partial events, scheduling, causal structures.
• A rooted tree, as in the ADT, is a DAG.
• Workflowy is essentially a directed acyclic graph!
• Topological sort (or topological ordering):
• A topological sort of a digraph is just the linear ordering of its vertices such that for every directed edge ab, a comes somewhere before b.
• Any DAG has at least one topological ordering.
• The canonical case is scheduling: the vertices of the graph may represent tasks to be performed, and the edges may represent constraints that one task must be performed before another (prerequisites). Thus, a topological ordering is just a valid sequence for the tasks.
• A reverse postorder traversal provides a topological ordering.
• Multiple edges & loops:
• Multiple (or parallel) edges = 2 or more edges that connect the same 2 vertices.
• Loop = an edge that connects a vertex to itself.
• Multigraph = a graph that can have multiple edges and (depending on the convention) sometimes loops.
• Simple graph = a graph that cannot have multiple edges or loops.
• Connectivity:
• Connected graph = graph with no vertices by themselves; i.e., there is some path between every pair of vertices / every pair of the graph's vertices is connected.
• A connected component is a maximal subgraph that is (1) connected and (2) not connected to any vertices in the rest of the supergraph.
• Subgraph of a graph G = graph whose vertices are a subset of G's vertices and whose edges are a subset of G's edges. Note that this is subset, not proper subset.
• Tracking the connected components of a changing graph is a straightforward partitioning problem, which can be naturally solved with the application of disjoint-set (or union-find) data structures.
https://www.wikiwand.com/en/Disjoint-set_data_structure
• Maximal such subgraph because otherwise there would be "overlapping" connected components. This constrains it such that each vertex and each edge belongs to exactly one connected component. Mathematically, connected components partition their supergraph.
• A digraph is strongly connected if every pair of vertices (x and y) has a directed path both from x to y and from y to x. Otherwise, it is merely (weakly) connected.
• The strong components of a digraph are just the maximal strongly connected subgraphs.
• Kosaraju's algorithm and Tarjan's strongly connected components algorithm are 2 popular ways to compute a digraph's strongly connected components.
• A vertex cut (or separating set) = the set of vertices whose removal would make the graph disconnected.
Note that this is not the same as a cut.
• Easy to compute connectivity: if the number of nodes visited through either BFS or DFS equals the number of vertices, then the graph is connected.
• Weighted graphs have weights (or costs or distances) associated with their edges. So 2 paths can have the same number of edges (same path length), but have different total weights (or distances) because their edges' weights differ.
• Cut:
• A partition of a graph's vertices into two disjoint subsets.
• Also defines a cut-set, the set of edges that have 1 endpoint in each subset of the partition. I.e., the edges that cross the cut (also called a crossing edge). Sometimes a cut is identified as its cut-set.
• The size / weight of a cut is the (total) weight of the cut-set, the edges crossing the cut. If unweighted, the size is the number of such crossing edges.
• Degree of a node is the number of edges connected to it. If directed graph, there is an indegree and an outdegree.
• A Hamiltonian path is a path that visits each vertex exactly once. Similarly, a Hamiltonian cycle (or circuit) is a Hamiltonian path that is a cycle. Determining whether such a path exists in a graph is known as the NP-complete Hamiltonian path problem.
• Coloring:
• A graph coloring is just assigning colors to the nodes in a graph.
• A legal coloring is when no two adjacent / connected nodes have the same color.
• Finding the fewest number of colors we can use for a legal coloring (the chromatic number) is an NP problem.

## Three primary implementation methods / data structures:

• https://www.wikiwand.com/en/Graph_(abstract_data_type) and http://www.cs.rochester.edu/~nelson/courses/csc_173/graphs/implementation.html
• Each vertex is associated with an unordered list that contains its neighbors.
• More efficient than the adjacency matrix in returning a given vertex's neighbors, O(1); less efficient in testing whether two vertices are neighbors, O(|V|), because in the worst case the whole list of neighbors has to be scanned.
|V| is the number of vertices.
• Slow in removing a vertex / edge, because must find all vertices / edges.
• To represent weights, each node in the neighbor list could also contain the weight.
• In practice the simplest way is this; in Python, just use defaultdicts:
• Graph is a dict of nodes where each node is mapped to a set of its neighbors.
• Weights is a dict where each origin and dest-node is a tuple that is mapped to the weight val. (Or could do it where the set of neighbors is a tuple of (neighbor, weight)).
• But then the problem is I could have two edges between the same nodes but with different weights. *Could* be desirable though.
• Code:
From https://leetcode.com/problems/evaluate-division/description/
• # initialize graph where each node is mapped to a set of its neighbors
• # and each node 2-tuple is mapped to its result
• G, W = defaultdict(set), defaultdict(float)
• for (a, b), val in zip(equations, values):
• G[a], G[b] = G[a].union({b}), G[b].union({a})
• W[a, b], W[b, a] = val, 1.0 / val
• A matrix where each non-diagonal entry Aij is the number of edges from vertex i to vertex j, and the diagonal entry Aii, depending on the convention, is either once or twice the number of edges (loops) from vertex i to itself.
Undirected graphs often use the latter convention of counting loops twice, whereas directed graphs typically use the former convention.
• Sometimes the matrix value is just a boolean, if there can be no parallel edges (i.e., the graph is a simple graph, not a multigraph).
• In a graph without loops, the diagonal in the matrix will have all zero entries.
• Adjacency matrices are space-efficient, because each matrix entry only requires one bit. However, for a sparse graph (few edges), adjacency lists use less space because they do not represent nonexistent edges.
• More efficient than the adjacency list in determining whether two vertices are neighbors, O(1); less efficient in getting all of a given vertex's neighbors because the entire row must be scanned, O(|V|).
• Slow to add or remove vertices, because the matrix must be resized / copied.
• To represent weights, the matrix value could be the weight of that edge.
• The third (less common but very simple) implementation method is just objects and pointers:
• Naturally, Nodes / Vertex[es] can be represented as objects, with pointers to neighbor nodes. If a list of those pointers is kept, it becomes very similar to an adjacency list (http://stackoverflow.com/questions/5886274/comparing-object-graph-representation-to-adjacency-list-and-matrix-representatio).
• Also, Edges are another natural object that can be used.

## Graph traversals:

• http://www.wikiwand.com/en/Graph_traversal and for implementations in Python http://eddmann.com/posts/depth-first-search-and-breadth-first-search-in-python/
• Depth-first search (DFS):
• Depth-first search is like walking through a corn maze. You explore one path, hit a dead end, and go back and try a different one.
• DFS visits the child nodes before visiting the sibling nodes; i.e., it traverses the depth of any particular path before exploring its breadth.
• Pro: usually less memory than BFS and easily implemented with recursion.
• Con: doesn't necessarily get shortest path.
• Pseudo-pseudocode:
• Visit node r and then iterate through each of r's unvisited neighbors.
• When visiting a neighbor n, immediately visit all of his neighbors. And so on.
Thus depth-first. So neighbor n and its children are exhaustively searched before r moves on to its other adjacent nodes.
• If no neighbors, i.e., a dead end, backtrack (pop the stack) until reach unvisited node.
• Python code:
• graph = {'A': set(['B', 'C']),
• 'B': set(['A', 'D', 'E']),
• 'C': set(['A', 'F']),
• 'D': set(['B']),
• 'E': set(['B', 'F']),
• 'F': set(['C', 'E'])}
• Iterative:
• def dfs(graph, start):
• visited, stack = set(), [start]
• while stack:
• vertex = stack.pop()
• if vertex not in visited:
• stack.extend(graph[vertex] - visited)
• return visited
• Recursive:
• def dfs(graph, start, visited=None):
• if visited is None:
• visited = set()
• for next in graph[start] - visited:
• dfs(graph, next, visited)
• return visited
• Easy to iteratively implement with a stack. Just make sure to "mark" visited nodes to prevent infinite loops.
• Also easy to recursively implement (also via a stack, i.e., the program's call stack).
• Breadth-first search is like throwing a stone in the center of a pond. The nodes you explore "ripple out" from the starting point.
• BFS explores the neighbor nodes first, before moving to the next "level."
• Pro: will always find the shortest path.
• Con: usually requires more memory than DFS.
• Pseudo-pseudocode:
• Visit node r and then iterate through each of r's unvisited neighbors.
• When visiting a neighbor n, add its neighbors to queue, but don't visit them yet.
• Visit all of node r's neighbors before visiting any of r's "(grand)children."
• Python code:
• graph = {'A': set(['B', 'C']),
• 'B': set(['A', 'D', 'E']),
• 'C': set(['A', 'F']),
• 'D': set(['B']),
• 'E': set(['B', 'F']),
• 'F': set(['C', 'E'])}
• Iterative:
• def bfs(graph, start):
• visited, queue = set(), [start]
• while queue:
• vertex = queue.pop(0)
• if vertex not in visited:
• queue.extend(graph[vertex] - visited)
• return visited
• Implement iteratively with a queue and mark visited as always.
• "Remember that breadth-first uses a queue and depth-first uses a stack (could be the call stack or an actual stack object). That's not just a clue about implementation, it also helps with figuring out the differences in behavior. Those differences come from whether we visit nodes in the order we see them (first in, first out) or we visit the last-seen node first (last in, first out)."
• DFS is the easiest if want to visit every node.
• But BFS can get shortest path(s) first.
• Both traversals (for the entire graph) take time linear to the size of the graph (number of vertices + number of edges), O(|V| + |E|) and space linear to the number of vertices.
• In other words: runtime is O(N + M) where M is the number of edges (we check each neighbor twice per edge).
• Note that just as trees are a special case of graphs, tree traversals are a special case of graph traversals.

## Two common graph problems:

• Shortest-path tree (or single-source shortest paths (SSSP) problem):
• A shortest-path tree rooted at vertex v is a spanning tree T of the graph G such that the path distance from v to any other vertex u in the tree is the shortest path from v to u in G. I.e., the shortest-path tree rooted at v is the tree that comprises the shortest paths from v to every node.
• A spanning tree of G is a tree (a connected undirected graph with no cycles) T such that (1) every vertex in G belongs to it and (2) every edge in T belongs to G.
• Must be a spanning tree because otherwise it wouldn't actually be a tree of paths from the root node to all nodes.
• Simpler, related version is finding the shortest path from one node to another. But finding the shortest path from the former to all nodes takes the same amount of time.
• In an unweighted graph or a graph in which all edge weights are 1, the shortest-path tree is isomorphic to the BFS tree.
• Be careful to remember that edge lengths or distances are the same as edge weights. So if the graph is a weighted one, the path distance from vertex a to b is actually the sum of the edge weights of that path.
• If the weights are 1 or nonexistent, then the path distance is just the number of edges from 1 node to another.
• Algorithms:
• Dijkstra's algorithm:
https://www.wikiwand.com/en/Dijkstra's_algorithm
• Fastest shortest-path tree / single-source shortest paths algorithm for [di]graphs with nonnegative edge weights.
• Pseudo-pseudocode:
http://www.egr.unlv.edu/~larmore/Courses/CSC269/pathing
• The distance of node Y is the distance from the root node to Y. First assign some initial distance values and then seek to greedily improve them step by step.
• 1) Assign to every node an initial tentative distance value: set it to 0 for our root node and to infinity (float('inf')) for all other nodes.
• 2) Assign the root node as current. Create an unvisited node set and put all nodes in it.
• 3) For the current node c, consider all of its unvisited neighbors (e.g., d) and calculate their tentative distances: tentativeDist(d) = tentativeDist(c) + c.getEdgeLength(d). Compare the newly calculated tentative distance to the current assigned value and assign the smaller one. For example, if the current node A is marked with a distance of 6, and the edge connecting it with a neighbor B has length 2, then the distance to B (through A) will be 6 + 2 = 8. If B was previously marked with a distance greater than 8 then change it to 8. Otherwise, keep the current value.
• This process is called relaxation; we are relaxing the edges by trying to find if the tentative distances of c's neighbors could be improved by diverting the path through c.
• 4) When we are done considering / updating all of the neighbors of the current node, mark the current node as visited and remove it from the unvisited set. A visited node will never be checked again and it is now in the shortest-path tree. Its tentative distance value is the final distance value. I.e., it is now the actual distance or cost of the shortest path from the source root node to the current just-visited node.
• 5) The algorithm is done if the destination node has been marked visited (for the simple case where we are calculating shortest path between 2 nodes) or if the unvisited set is empty or if the smallest tentative distance among the nodes in the unvisited set is infinity (occurs when there is no connection between the initial node and remaining unvisited nodes).
• 6) Select the unvisited node that is marked with the smallest tentative distance, and set it as the new current node, then go back to step 3.
• Implementation:
• "The simplest implementation of the Dijkstra's algorithm stores vertices of set Q in an ordinary linked list or array, and operation Extract-Min(Q) is simply a linear search through all vertices in Q. In this case, the running time is O(V^2).
• For sparse graphs, that is, graphs with much less than V^2 edges, Dijkstra's algorithm can be implemented more efficiently by storing the graph in the form of adjacency lists and using a binary heap or Fibonacci heap as a priority queue to implement the Extract-Min function. With a binary heap, the algorithm requires O((E+V)lgV) time, and the Fibonacci heap improves this to O(E + VlgV)."
• Pretty understandable Python implementation: http://stackoverflow.com/questions/22897209/dijkstras-algorithm-in-python
• Specifically the prof answer with helper functions.
• More complicated but better: https://gist.github.com/econchick/4666413
• Need to have a predecessor or previous-like reference for each node to actually build a tree. Otherwise, just returns each node and its distance from the root node. Or could build path as go through algorithm, as in the gist implementation above.
• Re: predecessors, getting a predecessor subgraph is a pretty nice way to form the actual (shortest-path) tree:
• The edges in such a subgraph are (preds[v],v), where preds is the array of predecessors for each vertex.
• In the algorithm, whenever relaxation happens, just add the correct predecessor. E.g., if vertex u's distance value can be improved by going through w, then preds[u] = w.
• Then can form shortest-path tree easily at the end.
• https://www.cs.auckland.ac.nz/software/AlgAnim/dijkstra.html
• With a min-priority queue to get the unvisited node with the smallest tentative distance, can run in O( |E| + |V| log |V| ).
• Not capable of handling negative weights (Bellman-Ford can, however).
• Known as a greedy algorithm because at each stage, it greedily selects the unvisited minimum-weight node.
• Supposed to be for directed graphs.
• Dijkstra's is a special case of A*.
• A* (A star) is an extended version of Dijkstra's that uses heuristics to guide its search and be a best-first search.
http://www.wikiwand.com/en/A*_search_algorithm
• Dijkstra's is just A* where the heuristic / evaluation function for each node is null.
• Bellman-Ford algorithm:
http://www.wikiwand.com/en/Bellman%E2%80%93Ford_algorithm
• Slower than Dijkstra's but does handle negative edge weights. And can report the existence of negative cycles.
A negative cycle means that there is no shortest or cheapest path, because any presumably shortest path could be made shorter by going around the cycle again.
• Similar to Dijkstra's in that it depends on relaxation: each node's distance starts out as an overestimate and is iteratively improved through edge relaxation (diverting the old path through a new node to improve the distance).
• Different than Dijkstra's in that at each step the relaxation process is done on all edges at once, instead of just on the outgoing edges from some current node. Thus, this relaxation is done |V|-1 times.
• Implementation:
• "Simply put, the algorithm initializes the distance to the source to 0 and all other nodes to infinity. Then for all edges, if the distance to the destination can be shortened by taking the edge, the distance is updated to the new lower value. At each iteration i that the edges are scanned, the algorithm finds all shortest paths of at most length i edges. Since the longest possible path without a cycle can be |V|-1 edges, the edges must be scanned |V|-1 times to ensure the shortest path has been found for all nodes. A final scan of all the edges is performed and if any distance is updated, then a path of length |V| edges has been found which can only occur if at least one negative cycle exists in the graph."
• Minimum spanning tree (MST):
https://www.wikiwand.com/en/Minimum_spanning_tree and http://algs4.cs.princeton.edu/43mst/
• A minimum spanning tree is a spanning tree (a subgraph that is a tree and that connects all the vertices) that has the lowest total weight.
• Properties:
• If all edge weights are the same, then every spanning tree is a MST.
• On the contrary, if all edge weights are distinct, then there is 1 unique MST.
• There are |V| - 1 edges in a spanning tree.
• If unconnected graph, can still use MST algorithms to find the MST for each connected component, together forming a minimum spanning forest.
• If weights are positive (don't have to be), then the MST is just the subgraph with minimal weight that connects all the vertices. I.e., don't have to say it's a tree (which has no cycles), since those subgraphs which have cycles are necessarily greater weight.
• Algorithms:
• Note that:
• Cuts are very important, specifically, the cut property (the basis of the algorithms): Given any cut in an edge-weighted graph (with distinct weights), the crossing edge of minimum weight is in the MST of the graph.
http://algs4.cs.princeton.edu/43mst/
• Simple greedy algorithm:
• The following method colors black all edges in the the MST of any connected edge-weighted graph with V vertices: Starting with all edges colored gray, find a cut with no black edges, color its minimum-weight edge black, and continue until V-1 edges have been colored black.
http://algs4.cs.princeton.edu/43mst/
• Prim's algorithm:
• Prim's algorithm works by attaching a new edge to a single growing tree at each step: Start with any vertex as a single-vertex tree; then add V-1 edges to it, always taking next (coloring black) the minimum-weight edge that connects a vertex on the tree to a vertex not yet on the tree (a crossing edge for the cut defined by tree vertices).
• This is a relatively simple method, but the crucial step is efficiently detecting the minimum-weight crossing edge to be added at each step:
• Lazy way:
• Just maintain a priority queue (PQ) of the crossing edges (priority is obviously by minimum weight). Each time a crossing edge is added to the MST, a vertex is also added. The new crossing edges to be added to the PQ are just the edges from the new vertex to all non-MST vertices. However, there might be old edges in the PQ which are now ineligible (i.e., they are non-crossing and now connect two MST vertices). This way is lazy since it leaves these edges in the PQ.
• Eager way:
• The first improvement is to delete ineligible edges from the PQ, such that only crossing edges from the growing MST to the rest of the graph are present. But we can do even better. Since (1) we are only concerned with the minimal crossing edge and since (2) when we add a new edge / vertex v to the MST, the only possible improvement for each non-tree vertex w is that adding v brings w closer to the tree, we really just need to keep track of the minimum-weight edge and check whether the addition of v to the tree necessitates that we update that minimum (because of an edge v-w that has lower weight), which we can do as we process each edge. In other words, we maintain on the priority queue just one edge for each non-tree vertex: the shortest edge that connects it to the tree.
• Linear time for a graph is O(N + M) where N is the number of nodes and M is the number of edges.
• E.g., it's the runtime for BFS *and* DFS.
• Edge cases in graph problems are nodes with no edges, cycles, and loops (single-node cycles).
• DAGs and topological sorts are important concepts to know:
• Topological sort is relatively easy to implement, just remember the definition (for vertices u, v where u has a directed edge to v, u has to come before v in the topological ordering) and use a stack.
• Start at a given vertex, add to visited set, recursively (depth-first) visit and explore unvisited neighbors, when have fully explored v's neighbors add v to stack, and pop all at end for topologically sorted ordering.
• Also think of it as a modification of DFS:
• "We can modify DFS to find Topological Sorting of a graph. In DFS, we start from a vertex, we first print it and then recursively call DFS for its adjacent vertices. In topological sorting, we use a temporary stack. We don’t print the vertex immediately, we first recursively call topological sorting for all its adjacent vertices, then push it to a stack. Finally, print contents of stack. Note that a vertex is pushed to stack only when all of its adjacent vertices (and their adjacent vertices and so on) are already in stack."
darkmode