uwescience/GossipMap

Name: GossipMap

Owner: UW eScience Institute

Description: GossipMap: distributed parallel community detection algorithm

Created: 2015-09-03 00:12:11.0

Updated: 2018-04-19 14:46:16.0

Pushed: 2015-09-03 07:31:13.0

Homepage: null

Size: 156

Language: C++

README

GossipMap: A Distributed Community Detection Algorithm

GossipMap is a distributed parallel community detection algorithm that optimizes a flow-based, information-theoretic objective function called the map equation. GossipMap is released under the GNU General Public License; see LICENSE.txt for details.

How to compile

GossipMap is implemented in C++ and uses GraphLab PowerGraph for distributed-memory parallelism, so you must install GraphLab PowerGraph v2.2 before using GossipMap. GraphLab PowerGraph is available at https://github.com/dato-code/PowerGraph.

You can compile GossipMap by following the instructions in the 'Writing Your Own Apps' section of the GraphLab PowerGraph README. Below is a version of those instructions adapted for GossipMap:

  1. Create a sub-directory in the apps/ directory of the GraphLab installation, such as apps/GossipMap.
  2. Copy GossipMap.cpp and CMakeLists.txt from the GossipMap directory to apps/GossipMap.
  3. Run 'make' in the apps/ directory to compile GossipMap.
  4. If GossipMap does not show up, run 'touch apps/CMakeLists.txt' and rerun 'make'.
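
The copy steps above can be scripted. The following is a minimal sketch, not part of the GossipMap distribution: the function name `stage_gossipmap` and both directory arguments are hypothetical, and it assumes GossipMap.cpp and CMakeLists.txt sit at the top of the GossipMap directory. It only covers steps 1 and 2; 'make' is left to be run by hand.

```shell
#!/bin/sh
# Hypothetical helper: stage the GossipMap sources inside a GraphLab
# PowerGraph tree (steps 1 and 2 of the list above).
#   $1 = root of the GraphLab PowerGraph installation (assumed layout: $1/apps)
#   $2 = directory holding GossipMap.cpp and CMakeLists.txt
stage_gossipmap() {
    gm_graphlab_dir=$1
    gm_source_dir=$2
    gm_app_dir="$gm_graphlab_dir/apps/GossipMap"

    # Step 1: create the application sub-directory.
    mkdir -p "$gm_app_dir" || return 1

    # Step 2: copy the source file and its CMake description.
    cp "$gm_source_dir/GossipMap.cpp" \
       "$gm_source_dir/CMakeLists.txt" \
       "$gm_app_dir/" || return 1

    echo "Staged GossipMap in $gm_app_dir"
}

# Example call (both paths are placeholders for your own checkouts):
#   stage_gossipmap "$HOME/PowerGraph" "$HOME/GossipMap"
# Steps 3 and 4 are then done manually: run 'make' in the apps/ directory,
# and if GossipMap does not show up, 'touch apps/CMakeLists.txt' and rerun.
```

Keeping the build itself out of the function avoids guessing at how a particular PowerGraph installation was configured.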
How to run GossipMap
  1. On a single machine, you can run GossipMap without the “mpiexec” command. Running './GossipMap --help', or './GossipMap' without any arguments, prints the argument list with the pre-selected values.

    • [Usage] >./GossipMap --graph --thresh --maxiter --maxspiter --trials <# trials> --mode <1 or 2> --outmode <1 or 2> --ncpus
    • [e.g.] >./GossipMap --graph ~/graph-data/web-Stanford.txt --thresh 0.001 --maxiter 10 --maxspiter 3 --trials 1 --mode 1 --outmode 2 --ncpus 8
  2. On multiple machines, you can run GossipMap with the “mpiexec” command. GraphLab PowerGraph and MPI must be installed on all of the machines.

    • [Usage] > mpiexec -f machines /path/to/GossipMap --graph --thresh --maxiter --maxspiter --trials <# trials> --mode <1 or 2> --outmode <1 or 2> --ncpus
    • 'machines' is a file that contains the hostnames of the machines used for running GossipMap.
    • The command starts one process on each machine listed in 'machines' unless specified otherwise.
    • To specify the number of MPI processes, add the '-n' option, for example “mpiexec -n 4 -f machines …”
    • For better performance, we recommend that the value given to '-n' be less than or equal to the number of machines.
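
As an illustration of the multi-machine case, the sketch below writes a 'machines' file and assembles a matching mpiexec command. The hostnames (node01 to node04) are made up, the one-hostname-per-line layout is an assumption about the host file format, and the argument values are simply those of the single-machine example above. The command is only printed, not executed.

```shell
#!/bin/sh
# Hypothetical hostnames; replace them with the nodes of your own cluster.
# Assumption: the host file lists one hostname per line.
printf '%s\n' node01 node02 node03 node04 > machines

# Count the non-empty lines so that -n equals the number of machines,
# which stays within the recommendation above (-n <= number of machines).
nprocs=$(grep -c . machines)

# Argument values are copied from the single-machine example;
# /path/to/GossipMap is the placeholder used in the usage line.
cmd="mpiexec -n $nprocs -f machines /path/to/GossipMap --graph ~/graph-data/web-Stanford.txt --thresh 0.001 --maxiter 10 --maxspiter 3 --trials 1 --mode 1 --outmode 2 --ncpus 8"

# Dry run: show the command instead of launching it.
echo "$cmd"
```

Printing the command first makes it easy to check the process count and paths before starting a long run on a shared cluster.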

The arguments for GossipMap are as follows:

There are also some arguments related to GraphLab options.

Reference

If you would like to cite this application in a document, please use the following bibliographic information:

Seung-Hee Bae and Bill Howe, “GossipMap: A Distributed Community Detection Algorithm for Billion-Edge Directed Graphs,” In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC'15), 2015 [accepted]

Contact Information

GossipMap is developed by Seung-Hee Bae and Bill Howe at the University of Washington. If you want to contact us about GossipMap, you can contact us at:

Copyright © since 2014, Seung-Hee Bae, Bill Howe, Database Group at the University of Washington

