Name: biojava-spark
Owner: BioJava
Description: :collision: Algorithms that are built around BioJava and run on Apache Spark
Created: 2016-04-29 18:06:39.0
Updated: 2017-09-05 13:33:48.0
Pushed: 2016-08-18 22:46:10.0
Size: 67803
Language: Java
Algorithms that are built around BioJava and are running on Apache Spark
https://github.com/rcsb/mmtf-spark
Download the full archive and extract it:
http://mmtf.rcsb.org/v1.0/hadoopfiles/full.tar
tar -xvf full.tar
Alternatively, download a reduced version that contains only C-alpha, phosphate, and ligand atoms (~800 MB download):
http://mmtf.rcsb.org/v1.0/hadoopfiles/reduced.tar
tar -xvf reduced.tar
Add the biojava-spark dependency to your Maven pom.xml:
<dependency>
    <groupId>org.biojava</groupId>
    <artifactId>biojava-spark</artifactId>
    <version>0.2.1</version>
</dependency>
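// Simple quality filtering by maximum resolution (in Å) and maximum R-free
float maxResolution = 3.0f;
float maxRfree = 0.3f;
StructureDataRDD structureData = new StructureDataRDD("/path/to/file")
        .filterResolution(maxResolution)
        .filterRfree(maxRfree);

// Count the atoms of each element in the filtered structures
Map<String, Long> elementCountMap = BiojavaSparkUtils.findAtoms(structureData).countByElement();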
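The filtering and element counting above can be pictured without Spark. The following is only a conceptual sketch in plain Java, not the biojava-spark implementation: the `Entry` class, the `filterQuality` and `countByElement` helpers, and all data values are hypothetical, and treating the limits as inclusive is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class QualityFilterSketch {

    // Hypothetical stand-in for one structure entry; not a biojava-spark type.
    static final class Entry {
        final String id;
        final float resolution;
        final float rFree;
        final List<String> elements;

        Entry(String id, float resolution, float rFree, List<String> elements) {
            this.id = id;
            this.resolution = resolution;
            this.rFree = rFree;
            this.elements = elements;
        }
    }

    // Keep entries whose resolution and R-free are both at or below the limits.
    // Assumption: the library's handling of the boundary values may differ.
    static List<Entry> filterQuality(List<Entry> entries, float maxResolution, float maxRfree) {
        return entries.stream()
                .filter(e -> e.resolution <= maxResolution && e.rFree <= maxRfree)
                .collect(Collectors.toList());
    }

    // Count how often each element symbol occurs across all entries.
    static Map<String, Long> countByElement(List<Entry> entries) {
        return entries.stream()
                .flatMap(e -> e.elements.stream())
                .collect(Collectors.groupingBy(el -> el, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Made-up entries for illustration only.
        List<Entry> entries = Arrays.asList(
                new Entry("A", 1.8f, 0.22f, Arrays.asList("C", "C", "N", "O")),
                new Entry("B", 3.5f, 0.25f, Arrays.asList("C", "S")),
                new Entry("C", 2.4f, 0.35f, Arrays.asList("C", "O")),
                new Entry("D", 2.9f, 0.28f, Arrays.asList("C", "N")));

        List<Entry> kept = filterQuality(entries, 3.0f, 0.3f);
        System.out.println(countByElement(kept));
    }
}
```

In this sketch, entry B fails the resolution limit and entry C fails the R-free limit, so only A and D contribute to the element counts.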
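// Contact distance cutoff. The original snippet does not define this value;
// 5.0 is only an illustrative placeholder.
double cutoff = 5.0;

// Mean CA-CA distance over the contacts found among PRO and LYS residues
Double mean = BiojavaSparkUtils.findContacts(structureData,
        new AtomSelectObject()
                .groupNameList(new String[] {"PRO", "LYS"})
                .elementNameList(new String[] {"C"})
                .atomNameList(new String[] {"CA"}),
        cutoff)
        .getDistanceDistOfAtomInts("CA", "CA")
        .mean();
System.out.println("\nMean PRO-LYS CA-CA distance: " + mean);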
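To make the idea of a mean contact distance concrete, here is a minimal plain-Java sketch that averages the distances of all point pairs lying within a cutoff. It does not use biojava-spark: the class name, the `distance` and `meanContactDistance` helpers, and the coordinates are all hypothetical, and the way the library pairs atoms or treats the cutoff boundary may differ.

```java
import java.util.Arrays;
import java.util.List;

public class ContactDistanceSketch {

    // Euclidean distance between two 3D points given as {x, y, z}.
    static double distance(double[] a, double[] b) {
        double dx = a[0] - b[0];
        double dy = a[1] - b[1];
        double dz = a[2] - b[2];
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    // Mean distance over all pairs (one point from each list) within the cutoff.
    // Returns NaN when no pair lies within the cutoff.
    static double meanContactDistance(List<double[]> first, List<double[]> second, double cutoff) {
        double sum = 0.0;
        long count = 0;
        for (double[] a : first) {
            for (double[] b : second) {
                double d = distance(a, b);
                if (d <= cutoff) { // assumption: the cutoff is inclusive
                    sum += d;
                    count++;
                }
            }
        }
        return count == 0 ? Double.NaN : sum / count;
    }

    public static void main(String[] args) {
        // Made-up coordinates standing in for CA atoms of two residue types.
        List<double[]> pro = Arrays.asList(new double[] {0, 0, 0}, new double[] {10, 0, 0});
        List<double[]> lys = Arrays.asList(new double[] {3, 4, 0}, new double[] {10, 0, 6});

        double mean = meanContactDistance(pro, lys, 7.0);
        System.out.println("Mean contact distance: " + mean);
    }
}
```

With these made-up points, two of the four pairs fall within the cutoff of 7.0 (at distances 5 and 6), so the mean is 5.5. The brute-force double loop is adequate for a sketch; a real data set would call for a spatial index or a distributed computation such as the one the library provides.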