Becksteinlab/Parallel-analysis-in-the-MDAnalysis-Library

Name: Parallel-analysis-in-the-MDAnalysis-Library

Owner: Becksteinlab

Description: Benchmarking MDAnalysis with Dask (and MPI). Supplementary Information for SciPy 2017 paper.

Created: 2017-02-27 20:18:31.0

Updated: 2017-08-31 20:56:36.0

Pushed: 2017-10-12 00:56:08.0

Homepage: http://conference.scipy.org/proceedings/scipy2017/mahzad_khoslessan.html

Size: 80

Language: Python

README

Parallel analysis in the MDAnalysis Library

We present a benchmark suite for evaluating the performance of parallel map-reduce-type analysis, and we use it to investigate the performance of MDAnalysis with the Dask library for task-graph-based computing (Khoshlessan et al., 2017).
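The map-reduce pattern mentioned above can be illustrated with a minimal sketch. It is not the benchmark code from this repository: it uses Python's standard-library thread pool as a stand-in for Dask, and the names `make_blocks`, `analyze_block`, `map_reduce`, and `toy_per_frame` are hypothetical. The idea is to split the frame indices of a trajectory into contiguous blocks, analyze each block independently (map), and concatenate the per-block results in frame order (reduce).

```python
from concurrent.futures import ThreadPoolExecutor


def make_blocks(n_frames, n_blocks):
    """Split frame indices 0..n_frames-1 into contiguous (start, stop) ranges."""
    base, extra = divmod(n_frames, n_blocks)
    blocks = []
    start = 0
    for i in range(n_blocks):
        # The first `extra` blocks take one additional frame each.
        stop = start + base + (1 if i < extra else 0)
        blocks.append((start, stop))
        start = stop
    return blocks


def analyze_block(block, per_frame):
    """Map step: apply `per_frame` to every frame index in one block."""
    start, stop = block
    return [per_frame(i) for i in range(start, stop)]


def map_reduce(n_frames, n_blocks, per_frame):
    """Run the map step on each block in a thread pool, then reduce."""
    blocks = make_blocks(n_frames, n_blocks)
    with ThreadPoolExecutor(max_workers=n_blocks) as pool:
        partial = list(pool.map(lambda b: analyze_block(b, per_frame), blocks))
    # Reduce step: concatenate the per-block results in frame order.
    return [value for part in partial for value in part]


def toy_per_frame(i):
    """Hypothetical placeholder for a real per-frame calculation."""
    return 0.5 * i


result = map_reduce(n_frames=4187, n_blocks=4, per_frame=toy_per_frame)
print(len(result))
```

In a real benchmark each block would open the trajectory and read its own range of frames, so the cost of trajectory I/O is part of the map step.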

A range of commonly used MD file formats (CHARMM/NAMD DCD, Gromacs XTC, Amber NetCDF) and different trajectory sizes are tested on different high-performance computing (HPC) resources. Benchmarks are performed both on a single node and across multiple nodes.

For space reasons, not all data could be shown in the SciPy 2017 conference proceedings paper. For a full analysis see the Technical Report (Khoshlessan and Beckstein, 2017). The report is available on figshare at DOI 10.6084/m9.figshare.4695742.

Supplementary information for SciPy 2017 paper

This repository should be considered part of the Supplementary Information for the SciPy 2017 Proceedings paper (Khoshlessan et al., 2017).

Benchmarking code

The repository contains the code to benchmark parallelization of MDAnalysis.

Data files

The data files consist of a topology file adk4AKE.psf (in CHARMM PSF format; N = 3341 atoms) and a trajectory 1ake_007-nowater-core-dt240ps.dcd (DCD format) of length 1.004 µs with 4187 frames; both are freely available under the CC-BY license from figshare at DOI 10.6084/m9.figshare.5108170

Files in XTC and NetCDF formats are generated from the DCD.
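One way such a conversion could be done with MDAnalysis itself is sketched below. This is an illustration under assumptions, not necessarily how the files for the paper were generated: the function names `converted_names` and `convert_trajectory` are hypothetical, the output file names are simply derived from the DCD name, and the frame-writing call assumes a recent MDAnalysis release in which `Writer.write` accepts an AtomGroup.

```python
from pathlib import Path


def converted_names(dcd_path):
    """Return output paths for the XTC and NetCDF copies of a DCD file."""
    p = Path(dcd_path)
    return p.with_suffix(".xtc"), p.with_suffix(".ncdf")


def convert_trajectory(topology, dcd_path):
    """Write XTC and NetCDF copies of a DCD trajectory (requires MDAnalysis)."""
    # Imported lazily so that converted_names() works without MDAnalysis.
    import MDAnalysis as mda

    u = mda.Universe(topology, dcd_path)
    outputs = converted_names(dcd_path)
    for out in outputs:
        # The writer format is chosen from the file extension.
        with mda.Writer(str(out), n_atoms=u.atoms.n_atoms) as writer:
            for ts in u.trajectory:
                writer.write(u.atoms)
    return outputs
```

With the data files downloaded into the working directory, the call would be `convert_trajectory("adk4AKE.psf", "1ake_007-nowater-core-dt240ps.dcd")`.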

Tested libraries

Comments and Questions

Please raise issues in the issue tracker or ask on the MDAnalysis developer mailing list.

References

M. Khoshlessan, I. Paraskevakos, S. Jha, and O. Beckstein (2017). Parallel analysis in MDAnalysis using the Dask parallel computing library. In S. Benthall and S. Rostrup, editors, Proceedings of the 16th Python in Science Conference, Austin, TX, 2017. SciPy.

Khoshlessan, Mahzad; Beckstein, Oliver (2017): Parallel analysis in the MDAnalysis Library: Benchmark of Trajectory File Formats. Technical report, Arizona State University, Tempe, AZ, 2017. figshare. doi:10.6084/m9.figshare.4695742


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.