-
Notifications
You must be signed in to change notification settings - Fork 3
The algorithm
The algorithm works on dynamic networks and aims to identify connected communities and track them through time (that is, as the network dynamics change).
The best way to illustrate this idea is with an example network. The image below depicts a 25 vertex network, with 18 edges:
By visually inspecting the network, it appears that there seem to be two communities, where vertices are generally interconnected. There are different terms that can be used to formalize this interconnectivity. For example, if a set of vertices is fully interconnected, it is called a clique. The group of vertices at the top of the image form a clique, as there are three edges (i.e., it would not be possible to draw another edge between these vertices). Cliques are a relatively conservative description of a community. For example, the square in the bottom left is not a clique (although each triangle is).
In order to relax that constraint a bit, we use the idea of a k-plex
. A k-plex is a set of n
vertices where each vertex is connected to n - k
other vertices in the set. This allows a broader definition of communities, where all vertices are almost fully interconnected. (For k = 1
, k-plexes are the same as cliques.) But if we use k = 2
, suddenly the square in the bottom can be identified as a community, as each of the four vertices (n = 4
) is connected to at least 2 other vertices in the set (n - k = 4 - 2 = 2
).
The algorithm initially looks through each time point and identifies all communities by finding all k-plexes. The network has parameters for k
, determining how interconnected a k-plex must be, and m
, the minimum size for a k-plex. By default, k = 2
, so the algorithm searches for 2-plexes, and m = 4
, so the algorithm searches for 2-plexes with 4 or more vertices.
After processing the above network, it would produce two communities as shown in the colored image below:
Once communities have been identified at each time step, the algorithm then stitches them together over time, allowing a community to be tracked as it evolves. This step is accomplished by comparing all communities identified at each adjacent time point, and merging the labels together for those communities that have n - k
vertices in common in both time steps. This process of merging together communities over time is akin to how communities are constructed within a time step. As a result, the algorithm can identify communities from time step to time step as long as they have n - k
vertices in common.
For example, below is the subsequent time step for the previously introduced graph. You can see that the community in the bottom left has grown and has received the same label, as indicated by the color, while the community in the top center has disappeared as it is no longer sufficiently connected to be identified.
This process of merging communities over time steps is robust to communities splitting or merging together as well, which can be seen in the third time step, where the community in the bottom left has now split into two distinct communities, but retains the same common label given their common origin in earlier time steps.