Skip to content

An overview of algorithms for community detection in networks.

Notifications You must be signed in to change notification settings

44rdvark/Networks

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

24 Commits
 
 
 
 
 
 
 
 

Repository files navigation

An overview of algorithms for community detection in networks

My BSc thesis in Computer Science - IT Analyst written at Jagiellonian University in 2019.

The paper outlines common properties of networks, formalizes the concept of community detection by introducing a measure known as modularity and provides descriptions of the most popular modularity maximization algorithms, and their pseudocodes. Finally, it includes multiple tests and their analysis, showcasing how different algorithms compare in terms of execution time and the quality of partitions detected. The paper also exemplifies the existence of a resolution limit - the fact that modularity maximization algorithms fail to detect communities smaller than a certain scale. Lastly, it includes a thorough analysis of how the vertex ordering of the network impacts the execution time of the Blondel et al. algorithm and provides a useful heuristic for improving it.

Contents

This project consists of:

  1. The thesis itself
  2. The implementations of algorithms described in it, which include:
  • the Blondel et al. algorithm
  • the Clauset-Newman-Moore algorithm
  • the Girvan-Newman algorithm
  • the hierarchical clustering algorithm
  1. The tests allowing for comparison of the algorithms' execution time and accuracy in finding proper community decomposition of networks. Those include:

Running the project

In order to test the algorithms, use Python 3 to run the scripts located in subdirectories of networks/test directory. They do not take any arguments but some require the networkx library to load contents of .gml files with network data. The descriptions of what the scripts do is provided in comments to respective test files.

The algorithms can also be run directly by executing corresponding functions in each of the four files in networks/algorithms directory. They each take two arguments - a list of integers from 0 to n representing nodes of the network and a list of pairs of integers from 0 to n representing its edges. Each of the algorithms returns two values: a list of lists representing communities found by the algorithm and the modularity value of the community partition found.

About

An overview of algorithms for community detection in networks.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages