Intro
This document will list and explain popular Graph algorithms that are commonly used in computerscience and softwaredevelopment.
// I’m sure they’re popular with datascience and machinelearning too.
Depth First Search
dfs is a graphs (including Trees) traversal algorithm that starts at the starting/root node and goes as Deep as possible on each branch before back tracking and switching to the next branch.
- All children of a node are searched before moving to next sibling.
- Uses Stacks to keep track of vertices to process
- Can use a boolean array with (size
|V|) to check visited nodes, toggle the boolean at index correlating with a node
How it works
- Start at root node.
- Travel down its children // just pick one, often left to right…
- When a vertex is visited, add it to the
visitedstack. - Make recursive call to DFS function or
pushto the stack, depending on recursive or iterative implementation - When dead end,
popthe vertex off the stack, this will then process the next vertex top of the stack (the one preceding it).
Algorithm
- Init a
visitedarray to track what’s been visited
// If iterative then also init a Stack and add the starting vertex - if
vnot visited, mark as visited - for each neighbor of v, check if visited and call
dfsrecursively or push to stack
Pseudo code (Wikiepedia):
Algorithm Depth_First_Search(Graph G)
// Graph G = {V,E}
// initialize visited to false for all vertices
for each vertex v in V
if v has not been visited
dfs_helper(v)
function dfs_helper(Vertex v)
// here we do "something" when visiting the node
visit node v
for each vertex w in V adjacent to v
if w has not been visited
dfs_helper(w)
Super simplified:
def DFS(node):
visited # array of visited nodes
# mark nodes as visited, do something
visited add node
for each neighbor of node:
if neighbor is not visited:
DFS(neighbor)
returnFav Implementations
Recursive
visited = False * len(g)
def dfs(g: Graph, v: Vertex):
if g is None:
return None
visit v # visited[v.val] = True to mark in array
for w in g.neighbors(v):
if w not visited: # visited[w.val] == False
dfs(g, w)
returnIterative
def dfs_iter(g: Graph, v: Vertex):
visited = False * len(g)
stack = deque() # new Stack with v
stack.push(v)
while stack:
cur = stack.pop()
if v not visited:
visit cur
for w in neighbors of cur:
if w not visited:
stack.push(w)
Typical Results
- List of vertices in order visited
- List of vertices in order of “dead-ends” (when popped from stack)
DFS Tree
The resulting tree from a DFS traversal is called a dfstree, it contains all the edges used to visit the vertices.
// Unused edges from the original tree are called “Back edges”.
Common problems
You’ll often encounter the following and similar problems when discussing DFS:
- Find a spanning tree of a graph
- Find a path between two vertices v and u
- Find a path out of a maze
- Determine if a graph has a cycle
- Find all the connected components of a graph
- Search the state-space of problems for a solution (AI)
Efficiency
The basic operation in DFS is the loop checking if the neighbor’s been visited.
This loop is recursive, whether it’s by calling the function or using a stack, and will examine ALL the neighbors of every neighbor…
The efficiency relies on the datastructure used to implement the tree.
So the efficiency must be:
- for Adjacency Matrix
- for Adjacency Lists
Backtracking
A common pattern that many, myself included, struggle to fully grasp at times.
backtracking is an extension of dfs but instead of traversing an existing structure like a graph or tree, here you need to build the structure as you explore options.
This is relevant for problems involving decisions that dynamically generate structure of the tree. // Think of combinatorial problems.
Scenario:
- You don’t have enough info to make informed decision at each step/node.
- Each decision will create a new set of choices.
- Some sequences of choices (potentially more than one) may be the best solution.
IDEA
- Construct solutions one component at a time
- If a partial solution can be developed further without violating constraints → Choose first legitimate option for the next component.
- If there is no option for the next component → Backtrack to replace the last component of partial solution.
How it works
- At each node make a decision on how to proceed based on the problem.
Each decision forms a branch. - After each decision, you either get a valid solution or a dead-end.
- If dead-end, backtrack to previous decision to undo it and try another path.
Solutions as Trees
The solutions can be organized in a tree.
- Each node is a state at one stage in the sequence of solutions.
- The root represents the initials state before the search for a solution begins.
- Nodes at the first level represent the first choice, second level are second choice…so on so forth.
Examples
// Try LeetCode problem 17: Letter Combinations of phone pad
- n-Queens Problem
- Hamiltonian Cycles
Breadth First Search
bfs also starts at the Root BUT explores all the vertices at the current depth level before going deeper or switching to another branch.
BFS is great for finding the Shortest path in Unweighted graphs, e.g. shortest path in a grid.
How it works
- Start at the root node
- visit all neighbors of the same distance from the node, i,e, same depth level
Immediate neighbors first then their neighbors and so on so forth… - Add visited vertices to a queue. Queues
- While the queue is not empty
- visit all adjacent nodes to the queue’s head
- If not visited, visit and add to queue
- remove head from queue
Algorithm
- Init the visited array to track what’s visited
- Init the Queue and add the starting vertex
v - while Queue is not null nor empty:
curris dequeued from Queue- if
currnot visited then visit it- for each neighbor of
curr, check if visited or not, and add to Queue accordingly
Pseudo code:
- for each neighbor of
Algorithm Breadth_First_Search(Graph G)
// Graph G = {V,E}
initialize visited to false for all vertices
for each vertex v in V
if v has not been visited
bfs_helper(v)
function bfs_helper(Vertex v)
visit node v
initialize a queue Q
add v to Q
while Q is not empty
for each w adjacent to Q.head
if w has not been visited
visit node w
add w to Q
Q.dequeue()
Iterative Queue implementation
def BFS(start_node)
mark start_node visited
create queue and enqueue start_node
while queue not empty
curr = dequeue queue # to get head of queue
if curr not visited:
for each neighbor of curr:
if neighbor not visited:
if not visited:
enqueue neighborInfo
When Doing BFS on a TREE
The number of nodes per level at any given time is the size of the queue. This is super handy for problems requiring Level Order Traversal like this Binary Tree Level Order Traversal
The edges from the original tree but not included in the bfstree are called “Cross Edges”.
Common problems
Frankly you can use BFS for anything that calls for DFS, but the prior is best for problems that require finding the shortest path.
Efficiency
Same efficiency as DFS:
- Adjacency matrix:
- Adjacency list:
Level Order Traversal
levelordertraversal is a technique to traverse trees , visiting the nodes level-by-level starting at the root. This algorithm also uses a Queue to process nodes, however since trees don’t have cycles, no need to use a visited set.
Unlike BFS, this technique is trees-specific.
Algorithm
- Start with the root node, enqueue it.
- While the queue is not empty:
- Dequeue a node, process it.
- Enqueue its children (left to right, for binary trees) to the queue.
- Repeat until the queue is empty
BFS vs Level Order
| Category | BFS | Level Order Traversal |
|---|---|---|
| Domain | Graphs (directed, undirected, cyclic, etc.) | Trees (hierarchical, acyclic, single root) |
| Cycle Handling | Needs a visited set to avoid cycles | Trees don’t have cycles, no need for visited set |
| Starting Point | Can start at any vertex in the graph | Starts at Tree root node |
| Output Interpretation | Traversal order (flat list of nodes). | Explicit level-wise grouping (e.g., Level 0: [1], Level 1: [2,3]). |
| Purpose | General graph exploration (e.g., shortest path, connected components). | Tree-specific tasks (e.g., height calculation, level-wise insertion). |
Priority Queue (Heap)
heaps are common implementations of priority queues and are almost synonymous with “top-K values” or “least-k values” type problems.
Heaps are trees where the elements (nodes) are organized in a specific manner.
- Min Heap: smallest value is the root and each parent is smaller than both of its children.
- Max Heap: Largest value is the root and each parent is larger than both of its children.
Heap problem twist
To find K largest, we need to use a Min Heap, whereas with K smallest we need a Max Heap!
Example:
To find the three smallest values in [3,6,1,5,9,12,8] we need to build a Max Heap with size 3.
Go through and add first three values, on 4th value check if current is smaller than previous largest value (the root).
- If it is, replace root.
- Otherwise, you’ve found k-smallest values