Data Structures: Graph
Have you ever wondered how does connections work on LinkedIn or any other social media? How does GPS navigation systems or Google Maps Platform work on suggesting us the routes? How do we get recommendations on various e-commerce websites, under the category of ‘Recommended for you’?
And the answer is Graph!! Graph is a non-linear data structure, in simple words ‘A graph is a set of vertices and a collection of edges that each connect a pair of vertices.’
Before moving any further let’s take a look at some basic graph terminologies:
- Vertex: Each node is represented as a vertex of a graph. In the following example the black dots are the vertices of the graph.
- Edge: Edge is the line which connects two vertices of a graph, it can be thought of as a path from one vertex to another vertex ( more on that later). In the following example the black lines connecting the vertices are the edges of the graph.
- Adjacency: Two node or vertices are the adjacent if they are connected to each other through an edge.
- Path: Path is a sequence of vertices connected by edges. It similar to a map in real life the vertices are the locations and the edges are the roads connecting the locations, and there may be various paths from one place to the other. This concept leads to one of the interesting graph processing algorithm on finding the shortest path from one vertex to another.
- Cycle: A path is said to be a cycle, if the first and last vertices of the path are the same.
Now, let’s take a second look at the examples with which we begun this article, to get some intuition out of it by referencing some of the graph terminologies which we have seen.
- Google Maps: Google uses graph data structure for representing the road networks connecting various places, where the intersection of two roads are considered to be a vertex and the road connecting two vertices is considered to be an edge, and route suggestion is based on the shortest path between two vertices.
- Social networks: A user’s profile can be can be considered as a vertex or any events such as stories, groups which has properties that store data is a vertex. And every connection or relationship is an edge.
- Recommender systems on e-commerce sites: We can wrap this concept into a graph where each node is a product and each edge is a relation between two products. Each node in the graph has a weight that will be used to compute the graph edge weight. To find recommendations for a product, some odd nearest products are searched, based on the user’s data and the suggestions are returned back to the user.
I believe we have got some descent intuition on graph, but there may be some lingering thought on the back of your mind, how can this be represented as a data structure, so that we could get these awesome functionalities into action. So let’s build a basic undirected graph.(spoiler alert! I’m going to use Java, as I’m more comfortable with it, but I will be providing the API for it so you could have fun building your own!)
Graph API:
Before diving into the code, if you had taken a keen look you could have noticed that one of the parameters is of type LinkedList, this is one of the ways to represent a graph. The two ways which are most commonly used to represent a graph are 1) adjacency matrix and 2) adjacency list. Let’s take a look at these two ways of representing a graph with the help of an example. Following is an example of an undirected graph:
Adjacency Matrix:
Adjacency matrix is a 2-D array of size V x V where V is the number of vertices in a graph. Let the matrix be mat[][], if mat[i][j] = 1 then it indicates there is an edge from vertex i to vertex j. For example, in the above table mat[0][1] is marked as 1, as there is an edge which runs between vertex ‘0’ and vertex ‘1’ in the graph(the image of undirected graph).
Adjacency List:
- adjacency list of vertex ‘0’ : 1 -> 4
- adjacency list of vertex ‘1’ : 0 -> 2 -> 3 -> 4
- adjacency list of vertex ‘2’ : 1 -> 3
- adjacency list of vertex ‘3’ : 1 -> 2 -> 4
- adjacency list of vertex ‘4’ : 0 -> 1 -> 3
Adjacency list is nothing but an array of lists. The size of the array is equal to the number of vertices. Let it be represented by ‘adj[]’, adj[i] has a list of vertices which are adjacent to the vertex ‘i’. For example, in the above representation you could see that vertex ‘2’ has a list consisting vertices of ‘1’ and ‘3’ because in the graph vertices ‘1’ and ‘3’ are adjacent to vertex ‘2’.
Note: Usually an adjacency list is preferred over an adjacency matrix because:
- Space complexity of adjacency list is of O(V + E) whereas adjacency matrix is of O(V²).
- Algorithms based on iterating over vertices adjacent to V, so most of the operations such as adding or removing vertex are much efficient using an adjacency list.
- Real-world graphs tend to be sparse -huge number of vertices with small average vertex degree.
Now there may be question, how can we represent a vertex which is not a number? The answer to it is pretty simple use a symbol table(HashMap) to create mapping of the elements to numbers.
Implementation:
Client code:
That’s it we have successfully built a simple undirected graph from scratch, you could build upon this by adding some more simple features for some typical graph processing problems. Like,
- Compute the degree of a vertex, which is nothing but the number of edges meeting at that vertex or the number of vertices which are adjacent to it.
- Count the number of self loops, a self loop is where an edge is to be found connecting a vertex to itself.
- Find a path(usually the shortest path) between two vertices, it can be solved by classical graph processing algorithms called DFS(Depth First Search) and BFS(Breadth First Search).
So, we have come to the end of this post and thank you so much 😊 if you have made this far into this post. I will see you in the next post with some graph processing algorithms!! Always grateful, never complacent!
References:
- Algorithms book by Robert Sedgewick
- https://leapgraph.com/graph-data-structures-applications
- https://www.geeksforgeeks.org/graph-and-its-representations/
- https://towardsdatascience.com/graph-based-recommendation-engine-for-amazon-products-1a373e639263