Intro

This document will list and explain popular Graph algorithms that are commonly used in ComputerScience and SoftwareDevelopment. // I’m sure they’re popular with DataScience and MachineLearning too.

Depth First Search

DFS is a Trees traversal algo that starts at the Root node and goes as Deep as possible on each branch before back tracking and switching to the next branch.

  • All children of a node are searched before moving to next sibling.
  • Utilizes a Stack (Stacks) to keep track of vertices to process.

How it works

  1. Start at root node.
  2. Travel down its children. // just pick one, often left to right…
  3. When a vertex is visited, add it to the visited stack.
  4. Make recursive call to DFS function and push to the stack
  5. When dead end, pop the vertex off the stack, this will then process the next vertex top of the stack (the one preceding it).

Pseudo code:

Algorithm Depth_First_Search(Graph G)
// Graph G = {V,E}
		// initialize visited to false for all vertices
		for each vertex v in V
				if v has not been visited
						dfs_helper(v)
						
function dfs_helper(Vertex v)
		// here we do "something" when visiting the node
		visit node v 
		for each vertex w in V adjacent to v
				if w has not been visited
						dfs_helper(w)

Super simplified:

def DFS(node):
	# mark nodes as visited, do something
	for each neighbor of node:
		if neighbor is not visited:
			DFS(neighbor)
	return

Info

Typical Results

  • List of vertices in order visited
  • List of vertices in order of “dead-ends” (when popped from stack)

DFS Tree

The resulting tree from a DFS traversal is called a DFStree, it contains all the edges used to visit the vertices. // Unused edges from the original tree are called “Back edges”.

Common problems

You’ll often encounter the following and similar problems when discussing DFS:

  • Find a spanning tree of a graph
  • Find a path between two vertices v and u
  • Find a path out of a maze
  • Determine if a graph has a cycle
  • Find all the connected components of a graph
  • Search the state-space of problems for a solution (AI)

Efficiency

The basic operation in DFS is the loop checking if the neighbor’s been visited. This loop is recursive, whether it’s DFS or a helper function, and will examine ALL the neighbors of every neighbor… The efficiency relies on the DataStructure used to implement the tree. So the efficiency must be:

Backtracking

A common pattern that many, myself included, struggle to fully grasp at times. #BackTracking is an extension of DFS but instead of traversing an existing structure like a graph or tree, here you need to build the structure as you explore options.

This is relevant for problems involving decisions that dynamically generate structure of the tree. // Think of combinatorial problems.

Scenario:

  1. You don’t have enough info to make informed decision at each step/node.
  2. Each decision will create a new set of choices.
  3. Some sequences of choices (potentially more than one) may be the best solution.

IDEA

  • Construct solutions one component at a time
  • If a partial solution can be developed further without violating constraints → Choose first legitimate option for the next component.
  • If there is no option for the next component → Backtrack to replace the last component of partial solution.

How it works

  1. At each node make a decision on how to proceed based on the problem. Each decision forms a branch.
  2. After each decision, you either get a valid solution or a dead-end.
    1. If dead-end, backtrack to previous decision to undo it and try another path.

Solutions as Trees

The solutions can be organized in a tree.

  • Each node is a state at one stage in the sequence of solutions.
  • The root represents the initials state before the search for a solution begins.
  • Nodes at the first level represent the first choice, second level are second choice…so on so forth.

Examples

// Try LeetCode problem 17: Letter Combinations of phone pad

  • n-Queens Problem
  • Hamiltonian Cycles

Breadth First Search

BFS also starts at the Root BUT explores all the vertices at the current depth level before going deeper or switching to another branch. BFS is great for finding the Shortest path in Unweighted graphs, e.g. shortest path in a grid.

How it works

  1. Start at the root node
  2. visit all neighbors of the same distance from the node, i,e, same depth level. Immediate neighbors first then their neighbors and so on so forth…
  3. Add visited vertices to a queue. Queues
  4. While the queue is not empty
    1. visit all adjacent nodes to the queue’s head.
    2. If not visited, visit and add to queue
    3. remove head from queue

Pseudo code:

Algorithm Breadth_First_Search(Graph G)
	// Graph G = {V,E}
	initialize visited to false for all vertices
	for each vertex v in V
		if v has not been visited
			bfs_helper(v)
								
function bfs_helper(Vertex v)
	visit node v
	initialize a queue Q
	add v to Q
	while Q is not empty
		for each w adjacent to Q.head
			if w has not been visited
				visit node w
				add w to Q
		Q.dequeue()

Or:

function BFS(start_node)
	mark start_node visited
	create queue and enqueue start_node

	while queue not empty
		current = dequeue queue // to get head of queue
		for each neighbor of current
			if neighbor not visited
				mark as visited
				enqueue neighbor

Info

Unlike DFS, BFS is NOT recursive. It uses First-In-First-Out ( FIFO) data structures like Queues to determine the next node to visit. This is embodied by the while loop

The edges from the original tree but not included in the BFStree are called “Cross Edges”.

Common problems

Frankly you can use BFS for anything that calls for DFS, but the prior is best for problems that require finding the shortest path.

Efficiency

Same efficiency as DFS:

  • Adjacency matrix:
  • Adjacency list:

Priority Queue (Heap)

Heaps are common implementations of priority queues and are almost synonymous with “top-K values” or “least-k values” type problems.

Heaps are trees where the elements (nodes) are organized in a specific manner.

  • Min Heap: smallest value is the root and each parent is smaller than both of its children.
  • Max Heap: Largest value is the root and each parent is larger than both of its children.

Heap problem twist

To find K largest, we need to use a Min Heap, whereas with K smallest we need a Max Heap! Example: To find the three smallest values in [3,6,1,5,9,12,8] we need to build a Max Heap with size 3. Go through and add first three values, on 4th value check if current is smaller than previous largest value (the root).

  • If it is, replace root.
  • Otherwise, you’ve found k-smallest values