Data Structures in C
It's impossible to name all data structures (there are infinitely many variants and specialized forms), but below is a comprehensive overview of many commonly known or fundamental data structures. They are often grouped by the type of operations they support and how they store data. Each bullet below is itself a broad category or a well-defined data structure.
A contiguous block of memory storing elements of the same type or of different types (generic Dynamic Array). Access by index in O(1).
A linked list is a fundamental data structure where each element (node) holds some data and a reference (pointer) to one or more other nodes. Unlike arrays (which store elements contiguously in memory), a linked list's elements can be scattered throughout memory, with each node "linking" to the next node in the sequence. This structure enables flexible insertion and deletion operations without shifting large blocks of memory.
In the simplest form (a singly linked list), each node has:
Data: The value or payload the node holds (e.g., an integer, string, or any other structure). Pointer to the next node: A reference (or pointer) to the next node in the sequence. A standard singly linked list has:
A special pointer called the head that points to the first node. The last node's pointer to "next" is typically NULL, indicating the end of the list.
head -> [ data | next ] -> [ data | next ] -> ... -> NULL
Typically O(n) because you usually have to traverse the list from the beginning until you find the target or reach the end.
Like a linked list except that each node has pointers to both next and previous nodes.
Follows LIFO (Last-In, First-Out) semantics. Operations: push, pop, peek. Implemented via arrays or linked lists.
Follows FIFO (First-In, First-Out) semantics. Operations: enqueue, dequeue. Often implemented with arrays or linked lists.
Allows insertion and removal at both ends. Can simulate stacks and queues, and is typically implemented via a linked list or a specialized array structure.
Conceptually can be seen as an array of characters (plus a terminator in many languages). Specialized operations like concatenation, substring, etc.
Below is a list of commonly used C string functions from the <string.h> library:
- Description: Calculates the length of a string (number of characters) excluding the null terminator (\0).
- Syntax:
size_t strlen(const char *str);
- Example:
char str[] = "Hello"; size_t len = strlen(str); // len = 5
- Description: Copies the contents of one string (including the null terminator) to another
- Syntax:
char *strcpy(char *dest, const char *src);
- Example:
char src[] = "Hello"; char dest[10]; strcpy(dest, src); //dest = "Hello"
- Description: Copies up to n characters from one string to another. If src is shorter than n, the remaining characters in dest are padded with null bytes; if src is n characters or longer, dest is not null-terminated.
- Syntax:
char *strncpy(char *dest, const char *src, size_t n);
- Example:
char src[] = "Hello"; char dest[10]; strncpy(dest, src, 3); dest[3] = '\0'; //dest = "Hel" (terminator added manually)
- Description: Appends (concatenates) one string to the end of another
- Syntax:
char *strcat(char *dest, const char *src);
- Example:
char dest[20] = "Hello"; char src[] = " World"; strcat(dest, src); //dest = "Hello World"
- Description: Appends up to n characters from one string to the end of another, then adds a null terminator
- Syntax:
char *strncat(char *dest, const char *src, size_t n);
- Example:
char dest[20] = "Hello"; char src[] = " World"; strncat(dest, src, 3); //dest = "Hello Wo"
- Description: Compares two strings lexicographically. Returns 0 if the strings are equal. A negative value if str1 is less than str2. A positive value if str1 is greater than str2.
- Syntax:
int strcmp(const char *str1, const char *str2);
- Example:
char str1[] = "Hello"; char str2[] = "World"; int result = strcmp(str1, str2); //result < 0
- Description: Compares up to n characters of two strings lexicographically.
- Syntax:
int strncmp(const char *str1, const char *str2, size_t n);
- Example:
char str1[] = "Hello"; char str2[] = "Heaven"; int result = strncmp(str1, str2, 2); //result = 0 (first 2 characters are equal)
- Description: Finds the first occurrence of a character in a string. Returns a pointer to the character or NULL if not found
- Syntax:
char *strchr(const char *str, int c);
- Example:
char str[] = "Hello"; char *ptr = strchr(str, 'e'); //ptr points to 'e'
- Description: Finds the last occurrence of a character in a string. Returns a pointer to the character or NULL if not found
- Syntax:
char *strrchr(const char *str, int c);
- Example:
char str[] = "Hello"; char *ptr = strrchr(str, 'l'); //ptr points to the second 'l'
- Description: Finds the first occurrence of a substring in a string. Returns a pointer to the beginning of the substring or NULL if not found.
- Syntax:
char *strstr(const char *haystack, const char *needle);
- Example:
char str[] = "Hello World"; char *ptr = strstr(str, "World"); //ptr points to "World"
- Description: Splits a string into tokens based on a set of delimiters. Modifies the original string by inserting null terminators.
- Syntax:
char *strtok(char *str, const char *delim);
- Example:
char str[] = "Hello,World,How,Are,You"; char *token = strtok(str, ","); while (token != NULL){ printf("%s\n", token); token = strtok(NULL, ","); }
A layered, probabilistic data structure that allows O(log n) average insertion/search/deletion. Conceptually built on top of a linked list with additional "express" links to skip over nodes.
insert(SkipList *sl, void *data)
How the insert algorithm works:
- First we build an update[] list; for each level it records the last node whose forward reference must change when the new node is spliced in. The goal of the algorithm is to build this update list and then use it.
- Starting at the top level, skip forward while the next node's data is less than the insertion data, stopping right before the possible insertion point. Then move down a level and continue skipping from that same node (current is carried across levels). Carrying current down like this is what gives the expected O(log n) performance.
- At each level, after skipping, record the last node reached: at level 2, update[1] = current; at level 1, update[0] = current; and so on.
- Once update[] is filled in, check whether the skip list already contains the insertion data by advancing one last time: current = current->forward[0]. Current could be NULL at this point, or it might hold the exact data we are trying to insert; if it matches, return. Otherwise we no longer need current, because update[] remembers how far we skipped at each level.
- Flip a coin repeatedly to decide how tall the new node's "tower" will be. If the chosen level exceeds sl->level, add the new levels to update[] and raise sl->level.
- Create the new node, then use update[] to insert it at each level, updating the forward references.
Stores key-value pairs for average O(1) lookup and insertion, but can degrade to O(n) in the worst case. Common collision handling methods: chaining (linked lists) or open addressing (linear probing, quadratic probing, etc.).
Each node has up to 2 children (left, right). Used for hierarchical data, expression parsing, etc.
A binary tree enforcing ordering constraints: all keys in the left subtree < node's key, and all keys in the right subtree > node's key.
Allows search, insertion, and deletion in O(h) time, where h is tree height.
Red-Black trees and AVL trees are both self-balancing binary search trees, guaranteeing O(log n) time complexity for insertions, deletions, and lookups. However, they balance themselves in different ways and have slightly different performance characteristics, which can make Red-Black trees preferable in some scenarios:
Red-Black trees typically perform at most two rotations to rebalance after an insertion or deletion. AVL trees, being more rigidly balanced (height-balance), can require several rotations in the worst case for a single insertion or deletion.
If your application involves many insertions and deletions, the fewer rotations in a Red-Black tree can lead to better overall performance.
Easier (in Practice) to Implement
While both data structures can be implemented with standard balancing logic, many programmers find the Red-Black insertion and deletion rules to be a bit more straightforward compared to tracking balance factors and performing the "rotation cascades" typical of AVL trees.
This practical simplicity and conventional use is one reason libraries like the C++ std::map and std::set or Java's TreeMap are traditionally implemented as Red-Black trees.
Less Strict Height Balancing
An AVL tree maintains a very strict balance condition (the difference in heights of left and right subtrees for any node is at most 1), so it tends to have a smaller height, which is good for lookups but can cause more rotations to maintain that property.
A Red-Black tree's balance criterion is looser (color-based rules rather than exact height checks), resulting in slightly taller trees on average but fewer, simpler rebalancing operations.
AVL: Usually has faster lookups if your dataset is fairly static (because its height is more tightly controlled), but insertions and deletions can be more expensive due to extra rotations.
Red-Black: Offers a good balance of performance across all operations (lookups, insertions, and deletions), making it a common default choice when you need a general-purpose self-balancing BST.
If search performance is the absolute priority (e.g., you have very few inserts/deletions but lots of lookups), then AVL might offer a slight edge because it's more strictly balanced, ensuring a shorter height.
If the rotation cost is not significant in your application and you want the best possible search times, you might prefer an AVL tree. In practice, however, Red-Black trees are often used as the standard go-to self-balancing BST in many libraries because they provide efficient and predictable performance over a wide range of operations and use cases.
- A node is either red or black
- Root is always black
- New insertions are always red (and then are recolored depending on the circumstances)
- Every path from root to leaf has the same number of black nodes.
- No path can have two consecutive red nodes
- Nulls are black
- If we have a black aunt, we rotate (BAR). After a rotation the three nodes we are working on end up as:

        black
       /     \
     red      red

- If we have a red aunt, we color flip. After a color flip the three nodes we are working on end up as:

        red
       /   \
    black   black
https://www.youtube.com/watch?v=nMExd4DthdA&list=PLOXdJ6q8iu4MneI9gySCHiyzAQcveqkIO
Even if a search for a key fails (the key doesn't exist), splaying brings the node closest to that key to the root. This is beneficial if future searches are near that same key or if you end up inserting that key next (the place it needs to go is now near the root). Similarly, after you insert a node or delete a node, you splay so that frequent or recent accesses stay near the top.
- Rotate Left
- Rotate Right
- Rotate Right Right
- Rotate Left Left
- Rotate Right Left
- Rotate Left Right
A Treap is a binary search tree on keys and a heap on random priorities. After each standard BST insertion, rotations ensure that a node's parent always has a larger or equal priority. Because priorities are random, the tree is balanced in expectation with O(log n) average operation time.
BST property (by key):
For any node, all keys in its left subtree are smaller than the node's key, and all keys in its right subtree are larger.
Heap property (by priority):
Each node also has a priority (often a random number). For a max-heap property, for example, a node's priority is greater than or equal to the priorities of its children.
In practice, the priority is usually chosen randomly when a node is inserted. When you insert a node, you:
First insert it into the BST by key (as if you were inserting into a regular BST). Then "rotate" the node up or down to fix any violation of the heap property (i.e., making sure each node's priority remains higher or equal to its children if using a max-heap convention).
To delete a node: "rotate" the node down until it becomes a leaf (fixing the heap property by rotating it towards a direction that maintains the priority structure). Then remove it from the tree.
By assigning random priorities to each node and imposing the heap property, the tree's shape tends to be balanced on average. Specifically, the expected height of a treap is O(log n) with high probability, leading to average O(log n) times for operations such as search, insert, and delete.
A treap is a BST with respect to the key and a heap with respect to the priority. The priorities are typically random, which ensures a high probability of balanced trees without complex balancing steps used by other self-balancing BSTs (like AVL or Red-Black Trees). All main operations (search, insert, delete) can be performed in O(log n) expected time.
Treap deletion rules:
- Find the node via standard BST search (compare keys, go left/right)
- If the node has:
- 0 children (leaf): just remove it
- 1 child: remove the node and link the child up
- 2 children: rotate the node down (left or right) until it becomes a case of 0 or 1 child, then remove it. The rotation direction depends on the priorities of the child subtrees. If left->priority > right->priority, rotate right. Else, rotate left. This pushes the node to be deleted down while keeping the heap property intact at each step.
A B-tree is a self-balancing tree data structure commonly used in databases and file systems to store and manage large volumes of data efficiently. Unlike traditional binary search trees, which have at most two children per node, B-trees can have many children per node (often called "branches" or "subtrees"). This multi-way branching design enables B-trees to maintain shallow height even with large data sets, leading to fewer disk or I/O operations, an especially important consideration in database and storage systems.
- Multi-way branching:
- Each node can have multiple keys (or values) and can point to multiple children.
- The number of keys and children in a node is constrained by a parameter typically called the order or minimum degree of the B-tree.
- Height is kept small:
- Because each node can hold multiple keys, the tree's height grows more slowly than a regular binary tree.
- This reduces the number of disk accesses needed to find or insert data.
- Balanced structure:
- B-trees enforce balance by ensuring that each non-root node has at least a certain minimum number of keys and children, and up to a defined maximum.
- As new keys are inserted or old keys are deleted, nodes split or merge to maintain balance, preventing the tree from becoming too tall or skewed.
- Efficient insertion, deletion, and search:
- All of these operations can be performed in O(log n) time.
- B-trees are designed for scenarios where slow disk access (or other forms of high-latency storage access) dominates performance.
- By minimizing the number of disk reads/writes, B-trees achieve better overall performance.
- Common uses:
- Database indexing (e.g., many database management systems use B-trees or variants like B+ trees to store index data).
- File systems (often used to store directory information, file metadata, and block indices).
- Keyâvalue stores and other high-performance storage engines.
- Variants of B-trees:
- B+ tree: A popular variant where all values (records) are stored in the leaf nodes, and internal nodes store only keys used for navigation. This design can provide better range query performance.
- B-tree*, B#-tree, etc.: Variants focusing on different performance characteristics or implementation details.
Overall, the B-treeâs design is tailored for environments where data must be read from and written to large, slow storage blocks (such as disk pages). By storing multiple keys per node and keeping the tree height small, B-trees reduce expensive I/O operations, making them ideal for large-scale data storage systems.
- Traversal Logic
- When searching for a key, if you reach a leaf node and still haven't found the key, you know the key is not in the tree.
- Conversely, if a node is not a leaf (i.e., an internal node), you know there are child pointers you can (and should) follow to continue your search.
- Insertion and Split Rules
- Insertion behaves differently depending on whether the node is a leaf or internal.
- In a leaf node, you can directly insert the new key (if there is room).
- In an internal node, you typically descend into a child. If that child is full, you split it first, then move down to the correct child.
- Knowing if a node is a leaf is critical for deciding these steps.
- Deletion Logic
- When deleting a key, if it is found in a leaf, you simply remove it.
- If found in an internal node, you have to swap with a predecessor (or successor) key or merge children.
- The deletion algorithm for B-Trees is more involved, but it always checks whether a node is leaf to decide which case applies.
- Implementation Simplification
- Marking a node as leaf = true or false makes the code simpler: you don't have to guess if the children[] array is valid or not.
- A leaf node has no valid children, whereas an internal node must keep track of them.
- This single boolean flag allows quick checks anywhere in the code to determine how to handle that node.
So, the leaf flag makes it efficient and clear to handle B-Tree nodes differently when they do or do not contain child pointers.
When you insert, you start at the root and move down through the children until you reach the appropriate leaf. If the root is full from the start, you'll eventually have to split it anyway. It's easier and cleaner to split before descending. After the root is split, the newly created root will have fewer keys, and you can keep descending to the correct leaf child, splitting child nodes as needed along the way.
We use a linear scan for the key:
for(i=0; i<cur->nkeys && tree->cmp(key, cur->keys[i])>0; i++);
The loop advances i until one of the two conditions is met:
- We have checked all cur->nkeys
- We encounter a key that is greater than or equal to key.
Next we check if we encountered the key
if (i<cur->nkeys && tree->cmp(key, cur->keys[i])==0){return cur->keys[i];}
else if (cur->leaf){return NULL;}
else {cur=cur->children[i];}
The final statement is of incredible importance:
- It shows how the B-Tree is structured.
- If we stop on the i'th key in the key list, then we also descend into the i'th child.
- To understand further, you would need to understand how the insert function works for B-Tree
- Remember B-Tree's have nodes that have between t-1 and 2t-1 keys, and the number of children = (number of keys)+1
- If a node has n keys, they act as "dividers" splitting the key space into n+1 regions. Each gap corresponds to one child pointer
- When we split the full child, we are never actually zero'ing out the key locations that move to new nodes, we are merely keeping track of nkeys of the old child. This is because we essentially copy the right half of the full child into a new child, then we promote the middle key, and keep the rest (to the left) in the old child. That's why we don't have to erase anything.
A B+ tree is a specialized data structure often used in databases and filesystems to maintain sorted data in a way that supports efficient insertion, deletion, and lookup operations (especially for range queries). It is an extension of the B-tree data structure, with some key differences that make it particularly well-suited for disk-based or other secondary-storage-based indexing.
- Balanced Tree Structure
- Like B-trees, B+ trees are balanced, meaning all leaves appear at the same level (the height of every leaf is the same).
- This ensures that searches, insertions, and deletions take O(log n) time in terms of the number of elements n.
- Nodes Have a Range of Children
- B+ trees, like B-trees, have an order (often denoted as m), which specifies the maximum number of children a node can have.
- Each non-leaf (internal) node can have between ⌈m/2⌉ and m children (except possibly the root, which can have fewer).
- Each leaf node can also hold a certain range of record pointers or data entries.
- Separation of Internal Keys and Data
- Internal nodes (also referred to as "index nodes") only store keys (up to m-1) that act as separators (guides) to direct the search.
- Leaf nodes store the actual data values (in the form of record pointers or references to the data).
- This is the key difference from a standard B-tree, where both internal nodes and leaves can store actual data.
- Linked Leaves
- In a B+ tree, all leaf nodes are linked together in a linked list (often referred to as a horizontal link or leaf chain).
- This allows efficient range queries, because once a search identifies the starting leaf, the linked list can be traversed to retrieve subsequent values without having to climb back up into the internal tree structure.
- High Fan-Out
- Because nodes can hold multiple keys and children, B+ trees have a high fan-out.
- This reduces the height of the tree and typically optimizes for disk-based block I/O by minimizing the number of pages/blocks that need to be read.
Letâs consider a B+ tree of order m. The structure follows certain rules:
- Root Node
- Special cases apply to the root node. The root can have fewer than ⌈m/2⌉ children if it is not full or if the tree has few elements. However, beyond a minimal threshold, it follows the same constraints as other internal nodes.
- Internal Nodes
- Each internal node contains up to m-1 keys and up to m child pointers.
- Each key in an internal node serves as a separator defining ranges for the child pointers. For example, if an internal node has keys [k1, k2, ..., kn], the pointers between and around these keys direct the search for values less than k1, between k1 and k2, etc.
- Leaf Nodes
- All actual data records or pointers to data are stored in the leaf nodes.
- Each leaf node can hold between ⌈m/2⌉ and m entries (depending on the exact variant of the B+ tree).
- Each leaf node has a pointer to the next leaf node (and optionally a pointer to the previous leaf node for a doubly linked structure). This is crucial for efficient range queries.
- Search (Lookup)
- Start from the root node.
- Compare the search key with the keys in the current node to find the correct child pointer to follow.
- Move down the tree level by level until you reach a leaf node.
- Within the leaf node, search through its entries for the specific key (if it exists).
- Return the record pointer or an indication that the key was not found. Time Complexity: O(log n) due to the balanced tree property.
- Insertion
- Search for the correct leaf node where the new key should be inserted.
- Insert the key into the leaf node (keeping the leaf nodeâs keys in sorted order).
- If the leaf node does not overflow (i.e., does not exceed m entries), the insertion is complete.
- If the leaf node overflows:
- Split the leaf into two nodes, typically around the median key.
- Promote the middle key (or a separator key) to the parent internal node.
- If the parent internal node also overflows, recursively split and promote further up.
- If the split reaches the root (and the root overflows), create a new root node and increase the tree height by one.
Time Complexity: O(log n)
- Deletion
- Search the leaf node containing the key to be deleted.
- Remove the key from the leaf node.
- If the leaf node still has enough entries (at least ⌈m/2⌉), no further action is required.
- If it underflows (i.e., has fewer entries than ⌈m/2⌉):
- Attempt to "borrow" an entry from a sibling node if the sibling has extra.
- If borrowing is not possible, merge the leaf with a sibling, effectively reducing the number of leaf nodes by one.
- Update or remove the corresponding separator key in the parent internal node.
- If the parent underflows, recursively handle it in the same manner (borrow from or merge with sibling).
- If merging/splitting occurs at the root and results in underflow, the root can be adjusted.
Time Complexity: O(log n), similar to insertion.
NOTE: Works with GCC but not with the Clang compiler. This is definitely a bad bug, but I can't really fix the compiler. I will try to look for a workaround in the code, but no promises. Just use GCC for now.
A specialized tree-based structure (often represented implicitly via an array) where the parent's key is either always larger (max-heap) or smaller (min-heap) than its children. Supports efficient retrieval of min/max in O(1) and insertion/deletion in O(log n).
Min-heaps are specialized data structures that always give you quick access to the smallest (or "minimum") element in a collection. Because of this guaranteed access to the minimum, min-heaps are commonly used in any scenario where you repeatedly need to extract the smallest item or reorder items based on priority (where "priority" is usually defined in ascending order). Below are some of the most common applications:
A priority queue is an abstract data type where each element has a "priority." A min-heap efficiently implements a min-priority queue, meaning you can always remove (pop) the smallest element first. Common usage examples include job scheduling, where tasks with the highest priority (often the lowest priority number, meaning most urgent) are processed first.
In Dijkstra's algorithm for shortest paths in a weighted graph, you repeatedly pick the vertex with the smallest tentative distance and relax (update) its neighbors. A min-priority queue (min-heap) is typically used to retrieve the next closest vertex efficiently. Similarly, Prim's algorithm for minimum spanning trees also uses a min-heap to select the next edge of minimum weight.
Huffman's algorithm builds an optimal prefix code (often used in data compression). It repeatedly merges the two smallest-frequency nodes (from a priority queue) into a combined node. A min-heap makes it efficient to find and remove the smallest two frequencies each time.
Order Statistics
If you only need to track and repeatedly extract the smallest k elements in a streaming fashion, a min-heap can help. (Or in some cases, people use max-heaps to keep track of the largest elements, but the heap approach is similar.)
A classic solution for finding a running median uses a max-heap for the lower half of numbers and a min-heap for the upper half. Balancing these two heaps lets you query the median in O(1) time and insert new elements in O(log n) time.
In operating systems or real-time systems, tasks/jobs often have priorities, and you may need to pick the highest-priority (lowest numeric value) job next. A min-heap is well-suited for this type of scheduling queue.
When you have multiple sorted lists and want to merge them into one sorted list (for example in external sorting or in certain types of database queries), you can keep one element from each list in a min-heap. You always extract (pop) the smallest of these, then insert the next element from the list that was just taken. In general, whenever you need quick extraction of the smallest element (in approximately O(log n) time), or you need to insert new elements while preserving a structure where the smallest can be found/removed efficiently, a min-heap is a perfect fit.
A tree specialized for storing strings by character. Each edge typically represents one character. Enables fast prefix lookups.
Specialized tree for storing information about intervals, segments (e.g., sums over array ranges). Allows O(log n) queries and updates on intervals.
A segment tree is a specialized binary tree data structure that stores information about intervals (or segments) of an array. It allows you to efficiently query and update information over a range of indices, making it very useful for problems that require repeated range queries and updates.
- The array is split into two halves recursively until each segment represents a single element.
- Each node in the tree covers a specific segment of the array, typically identified by a range [L, R]
- Each node stores some information about the segment it represents. Common examples include:
- Sum of the elements in the segment
- Minimum or Maximum value in the segment
- Greatest Common Divisor (GCD) of the segment
- These values in the parent node can be computed by combining values from its children.
- Once built, a segment tree can answer queries like
- "What is the sum of the elements in the range [i, j]?" or
- "What is the minimum value in [i, j]?"
- in O(log n) time, where n is the size of the array.
In addition to queries, segment trees handle updates (like changing the value of a single element) in O(log n) time as well. After updating a leaf node, you only need to recalculate the information up the tree path to the root.
A segment tree typically requires about 4n space in the worst case (where n is the size of the array), which is still considered O(n).
A set of vertices (nodes) connected by edges (links). Can be directed, undirected, weighted, or unweighted.
In essence, the core logic of BFS, DFS, or Dijkstra's remains the same regardless of whether the graph is directed or undirected, weighted or unweighted. However, there are small but important adjustments in how you apply each algorithm to different graph types:
- Purpose: Typically used for unweighted graphs (or graphs where every edge has the same cost) to find the shortest path in terms of the number of edges, or to explore all connected vertices from a starting point.
- In an undirected graph, you follow edges in both directions automatically.
- In a directed graph, you only follow edges in the forward direction (i.e., if there's an edge u→v, but not v→u, you only enqueue v when you're at u).
- BFS inherently doesn't handle varying weights.
- If all edges have the same weight (e.g., weight = 1), BFS still finds the shortest path in terms of ânumber of edgesâ (or total uniform cost).
- If the graph has different weights, BFS is no longer suitable for shortest paths, and you'd typically use Dijkstra or another weighted shortest-path algorithm.
Summary: The BFS procedure itself does not change much; just respect the direction of edges if the graph is directed. If the graph is weighted with different costs, BFS won't give the true minimal-cost path.
- Purpose: Depth-first search is used mostly to explore or check connectivity, detect cycles, or do topological sorts (in directed acyclic graphs), etc. It doesn't care about edge weights when simply exploring.
- In an undirected graph, you can move both ways between connected vertices.
- In a directed graph, you only traverse in the allowed direction.
- DFS makes no use of weights in the standard algorithm.
- You can ignore weights or treat them as present but irrelevant.
Summary: DFS is unaffected by weights. The only difference between undirected and directed is whether you consider edges in both directions (undirected) or in the specified direction only (directed).
- Purpose: Finds the shortest path in a graph where edges have nonnegative weights.
- Conceptually, Dijkstra's logic is the same:
- Keep track of a "distance" array.
- Use a priority queue (min-heap) to pick the next closest vertex.
- Update (relax) distances to neighbors.
- In an undirected graph, each undirected edge (u,v) is stored internally as two directed edges u→v and v→u.
- The algorithm still relaxes edges from each node to its neighbors.
- In a directed graph, edges are only in one direction.
- You only relax edges u→v if they exist.
- Weighted (nonnegative): Dijkstra is the standard choice to get the minimal total cost path.
- Unweighted: Dijkstra still "works" but is overkill; BFS is typically simpler and more efficient if all edges have the same cost (like weight = 1).
- If there are negative edge weights, you need a different algorithm (e.g., Bellman-Ford).
Summary: The Dijkstra procedure is the same. The difference is in how edges are stored (two directions for undirected vs. one for directed) and whether each edge has a real cost or a uniform cost.
- The fundamental BFS/DFS logic doesn't really change; you just decide whether an edge (u,v) exists in one direction or both.
- Weights do not affect BFS or DFS unless all weights are the same (in which case BFS can still find the shortest path in "hops") or you are ignoring weights entirely (typical for DFS).
- Dijkstra is always the same "relaxation + priority queue" approach.
- The only variation is in how you build your adjacency list or matrix for directed vs. undirected edges.
- So the algorithms themselves are conceptually the same across the different graph "types." The real difference is whether or not you:
- Consider edges in both directions (undirected) vs. one direction (directed).
- Use or ignore edge weights (for BFS/DFS, you ignore them; for Dijkstra, you use them if they are nonnegative).