spectra-cluster/spectra-cluster

Name: spectra-cluster

Owner: spectra-cluster

Description: An open-source library for clustering MS spectra.

Created: 2015-03-18 14:29:35.0

Updated: 2016-11-14 11:00:41.0

Pushed: 2017-12-20 08:59:23.0

Homepage:

Size: 35871

Language: Java

README

spectra-cluster - An MS/MS spectrum clustering Java API

https://spectra-cluster.github.io provides a complete overview of all the spectrum clustering tools we provide and of the spectra-cluster algorithm.

Introduction

The spectra-cluster Java API is the central collection of algorithms used to develop and run the PRIDE Cluster project. The library was built to quickly test different combinations of clustering approaches and contains a variety of implementations, for example of similarity metrics for MS/MS spectrum clustering.

It is currently used in two applications:

spectra-cluster is an open-source (Apache 2 licensed) library. It offers the following features out of the box:

Changelog

1.0.10
1.0.9
1.0.8

Getting started

Installation

You need Maven installed to build and use the spectra-cluster library.

Add the following snippets to your Maven pom file:

<!-- spectra-cluster dependency -->
<dependency>
    <groupId>uk.ac.ebi.pride.spectracluster</groupId>
    <artifactId>spectra-cluster</artifactId>
    <version>${current.version}</version>
</dependency>

<!-- EBI repo -->
<repository>
    <id>nexus-ebi-repo</id>
    <url>http://www.ebi.ac.uk/intact/maven/nexus/content/repositories/ebi-repo</url>
</repository>

<!-- EBI SNAPSHOT repo -->
<snapshotRepository>
    <id>nexus-ebi-repo-snapshots</id>
    <url>http://www.ebi.ac.uk/intact/maven/nexus/content/repositories/ebi-repo-snapshots</url>
</snapshotRepository>

Running the library

The clustering itself is performed by a clustering engine. The following examples use the implementations used for PRIDE Cluster.

float WINDOW_SIZE = 4.0F;
float FRAGMENT_TOLERANCE = 0.5F;
double CLUSTERING_PRECISION = 0.01;

// This creates an incremental clustering engine that
// uses the CombinedFisherIntensityTest with a fragment
// ion tolerance of 0.5 m/z as the similarity metric. The
// ClusterComparator is only used for sorting the clusters
// during the clustering process. The WINDOW_SIZE of 4.0 m/z
// means that as soon as a new cluster is added, any cluster
// whose average precursor m/z is more than 4.0 m/z below that
// of the newly added cluster is automatically returned during
// the clustering process. The CLUSTERING_PRECISION is the defined
// accuracy for the clustering process (benchmarked on the
// PRIDE Cluster test dataset). Finally, the FractionTICPeakFunction
// is a peak filter function that is applied to every spectrum
// before comparison (in this case all peaks that represent
// 50% of the total ion current, but a minimum of 20 peaks).
// For consensus spectrum building, the complete unfiltered
// spectrum is used.

IIncrementalClusteringEngine clusteringEngine = new GreedyIncrementalClusteringEngine(
        new CombinedFisherIntensityTest(FRAGMENT_TOLERANCE),
        ClusterComparator.INSTANCE,
        WINDOW_SIZE,
        CLUSTERING_PRECISION,
        new FractionTICPeakFunction(0.5f, 20));

// During clustering the clusters must be sorted
// according to precursor m/z. Otherwise an
// exception is thrown.
for (ICluster clusterToAdd : clusterIterable) {
    // Clusters are simply added through the 'addClusterIncremental'
    // function. Clusters that have a lower precursor m/z
    // than the added cluster (based on the set window size)
    // are returned.
    Collection<ICluster> removedClusters = clusteringEngine.addClusterIncremental(clusterToAdd);

    if (!removedClusters.isEmpty()) {
        // use some method to save the removed and thereby
        // "final" clusters
        writeOutClusters(removedClusters);
    }
}

// After all spectra were clustered, save the
// remaining clusters still stored in the clustering
// engine.
Collection<ICluster> clusters = clusteringEngine.getClusters();
writeOutClusters(clusters);
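
The window behaviour described in the comments above can be illustrated without the library. The following is a minimal sketch under stated assumptions: `SlidingWindowSketch` and its methods are hypothetical names and not part of the spectra-cluster API, clusters are reduced to their precursor m/z value, and no similarity comparison or merging takes place. How the real engine treats values exactly on the window boundary is not stated here, so the strict comparison below is an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical illustration only; not part of the spectra-cluster API. */
class SlidingWindowSketch {
    private final double windowSize;
    // Precursor m/z values still inside the window, in ascending order.
    private final Deque<Double> active = new ArrayDeque<>();

    SlidingWindowSketch(double windowSize) {
        this.windowSize = windowSize;
    }

    /** Adds one precursor m/z and returns the values that dropped out of the window. */
    List<Double> add(double precursorMz) {
        // Mirrors the requirement that input is sorted by precursor m/z.
        if (!active.isEmpty() && precursorMz < active.peekLast()) {
            throw new IllegalArgumentException("input must be sorted by precursor m/z");
        }
        List<Double> removed = new ArrayList<>();
        // Evict everything more than windowSize below the newly added value
        // (assumption: values exactly on the boundary stay in the window).
        while (!active.isEmpty() && active.peekFirst() < precursorMz - windowSize) {
            removed.add(active.pollFirst());
        }
        active.addLast(precursorMz);
        return removed;
    }

    /** Returns whatever is still held once the input is exhausted. */
    List<Double> remaining() {
        return new ArrayList<>(active);
    }

    public static void main(String[] args) {
        SlidingWindowSketch window = new SlidingWindowSketch(4.0);
        for (double mz : new double[] {400.0, 401.5, 403.0, 405.0, 410.0}) {
            System.out.println(mz + " -> finalized " + window.add(mz));
        }
        System.out.println("left over: " + window.remaining());
    }
}
```

Because the input is sorted, a value that has left the window can never be matched by a later one, which is why the returned clusters can be written out immediately and only the final window content has to be saved at the end.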

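The peak filter mentioned in the example above (peaks that represent a fraction of the total ion current, with a minimum number of peaks) can be sketched as follows. This is one possible reading of that description, not the library's FractionTICPeakFunction: `TicFractionFilterSketch`, its `filter` method, and the representation of a peak as a `double[]{mz, intensity}` pair are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for illustration; not the library's FractionTICPeakFunction. */
class TicFractionFilterSketch {

    /** Each peak is a double[]{mz, intensity}. Returns the retained peaks sorted by m/z. */
    static List<double[]> filter(List<double[]> peaks, double fraction, int minPeaks) {
        // Total ion current = sum of all peak intensities.
        double totalIonCurrent = 0.0;
        for (double[] peak : peaks) {
            totalIonCurrent += peak[1];
        }

        // Visit peaks from most to least intense.
        List<double[]> byIntensity = new ArrayList<>(peaks);
        byIntensity.sort(Comparator.comparingDouble((double[] p) -> p[1]).reversed());

        List<double[]> kept = new ArrayList<>();
        double retained = 0.0;
        for (double[] peak : byIntensity) {
            boolean enoughIntensity = retained >= fraction * totalIonCurrent;
            boolean enoughPeaks = kept.size() >= minPeaks;
            // Stop only when both the intensity fraction and the minimum count are met.
            if (enoughIntensity && enoughPeaks) {
                break;
            }
            kept.add(peak);
            retained += peak[1];
        }

        kept.sort(Comparator.comparingDouble((double[] p) -> p[0]));
        return kept;
    }

    public static void main(String[] args) {
        List<double[]> peaks = Arrays.asList(
                new double[] {100.0, 50.0},
                new double[] {150.0, 30.0},
                new double[] {200.0, 15.0},
                new double[] {250.0, 5.0});
        for (double[] peak : filter(peaks, 0.5, 2)) {
            System.out.println(peak[0] + " m/z, intensity " + peak[1]);
        }
    }
}
```

In this toy spectrum the most intense peak alone already carries half of the total ion current, so the second peak is kept only because of the minimum of two peaks.
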
Getting help

If you have questions or need additional help, please contact the PRIDE help desk at the EBI.

email: pride-support@ebi.ac.uk

Feedback

Please give us your feedback, including error reports, suggestions for improvements, and new feature requests. You can do so by opening a new issue in our issues section.

How to cite

Please cite this library using one of the following publications:

Contribute

We welcome all contributions submitted as pull requests.

License

This project is available under the Apache 2 open source software (OSS) license.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.