Chris Mattmann

Login: chrismattmann

Company: NASA

Location: Pasadena, CA

Bio: rocket scientist guy @NASAJPL adjunct prof @USCDataScience frmr Board of Directors @apache #opensource #apache #bigdata #fighton and all that jazz

Blog: http://irds.usc.edu/

Member of

  1. ESIP Federation
  2. NASA

Repositories

3d-greenland
data and code for 3d models
599-Mime-Diversity-Analysis
null
ace
Automated Concept Extraction from Search Engines
agdc
Repository for Australian Geoscience Data Cube (AGDC) code
amdmimescraper
null
apachestuff
null
apple
Automatically precondition, convert and publish remote sensing data to the ESGF.
AutoAck
IRC bot created to respond to messages in the JPL XDATA team chatroom
autopsy
Autopsy® is a digital forensics platform and graphical interface to The Sleuth Kit® and other digital forensics tools. It can be used by law enforcement, military, and corporate examiners to investigate what happened on a computer. You can even use it to recover photos from your camera's memory card. Installers can be found at: http://www.sf.net/projects/autopsy/files/autopsy
bash-httpd
bash-httpd is a web server written in bash, the GNU bourne shell replacement.
bigtranslate
An Apache OODT, Apache Tika, and Apache Solr based system to automatically take large TSV file datasets, and to translate them from one language to another. Built and inspired by the DARPA XDATA Employment dataset.
breeze
Breeze is a numerical processing library for Scala.
CNTK
Computational Network Toolkit (CNTK)
ComputeFeatures
compute different kinds of features and generate matching graph
csci572-search-engine
Course project of CSCI 572
ctakesparser-utils
null
d3kit-timeline
A simple timeline component whose labels do not overlap.
darpa_open_catalog
Meta information for the DARPA open catalog project.
datavis-hackathon
null
deeplearning-udacity
Chris's assignments from DeepLearning class on udacity.
dig-elasticsearch
Code to process datasets for Elasticsearch
dig-extract
python-based repository for DIG extractors
disco
Data Intensive Software Connectors
earthcube
null
easybuild-easyconfigs
A collection of easyconfig files that describe which software to build using which build options with EasyBuild.
edward
A library for probabilistic modeling, inference, and criticism
elasticsearch
Open Source, Distributed, RESTful Search Engine
ENVIJava
null
esgf.github.io
ESGF Web Site
etllib
This is the ETL lib package. It provides an API to munge and prepare JSON, TSV and other data using Apache Tika and JSON parsing/loading for ETL via Apache OODT (or other libs) into Apache Solr.
facetview
Port of FacetView (okfn/facetview) working towards Solr back end.
geo-bigdata.github.io
Big Data in the Geosciences Workshop Homepage (IEEE Big Data Conference 2015)
GeographicDR
null
GeoParsingNSF
Supports content-based geoparsing (geographic entity extraction) for any unstructured text. Resolved entities are specified in WGS84. Built on Apache Tika, Lucene, and datasets from GeoNames.org; platform independent.
geothon
Series of Geo Python code examples
geotopicparser-utils
null
girder
A data management platform for the web
glide
Grid-based Lightweight Infrastructure for Data-intensive Environments
go-tika
Go package for using Apache Tika
grobid
A machine learning software for extracting information from scholarly documents
grobidparser-resources
null
hive-probabilistic-utils
Probabilistic data structures and algorithms for hive
HyspIRI
null
imagecat
ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika and Apache OODT to ingest 10s of millions of files (images, but could be extended to other files) in place, and to extract metadata and OCR information from those files/images using Tika and Tesseract OCR.
image_solr
null
joshua
Joshua Statistical Machine Translation Toolkit
joshua-decoder.github.com
null
keras
Deep Learning library for Python. Convnets, recurrent neural networks, and more. Runs on Theano and TensorFlow.
Khooshe
Big Data-Points Visualization Tool
labkey-client
null
labkey-dumper
null
libnd4j
The C++ engine that powers the scientific computing library ND4J - n-dimensional arrays for Java
loaded-language-linter
A small Node.JS library to detect loaded language.
lucene-geo-gazetteer
Uses Apache Lucene, OpenNLP and geonames and extracts locations from text and geocodes them.
lucene-lda
Using latent Dirichlet allocation (LDA) in Apache Lucene
luke
This is mavenised Luke: Lucene Toolbox Project
marve
For extracting measurements and related entities from text
memex-autonomy
null
memex-explorer
Viewers for statistics and dashboarding of Domain Search Engine data
memex-weapons
null
myhicode
various examples of hi in various programming languages
NLTKRest
This is a REST Server endpoint built using Flask and Python.
notifico
My personal http://cia.vc replacement project. Now used by over 3000 projects.
NSFDataVizHackathon-2014
null
nutch
Mirror of Apache Nutch
nutchpy
For interacting with nutch via Python
nutch-python
Nutch-Python is a Python binding to the Apache Nutch™ REST services allowing Nutch to be called natively in the Python community.
nutch-selenium
null
NYU-BusTracker-Android
Android application used to track the NYU bus system.
oodt
Mirror of Apache OODT
oodt-pushpull-plugins
null
Open-Source-Catalog
contains the NASA open source software catalog for automatic deployment to code.nasa.gov
open-source-notes
null
open-speech-recording
Web application to record speech for an open data set
operation-sandberg
null
parser-indexer-py
Python tools for parsing documents and building the inverted index with enriched metadata
politics-hacking
Scripts to process & analyze web data regarding politics.
program-index
A list of memex-related tools and their repository URLs
project-open-data.github.io
Open Data Policy - Managing Information as an Asset
pysauce
Python port to OpenSauce
python-tika
Python wrapper for Apache Tika, made to be easy_installed
ScalableLSH
An implementation of LSH that reads features from disk
SEN-and-TurboSoft
Shared scripts and code from the Sediment Experimentalist Network.
shangridocs
Document exploration tool
shangridocs-newskin
null
SMQTK
Python toolkit for pluggable algorithms and data structures for multimedia-based machine learning.
snorkel
A training data creation and management system focused on information extraction
solrcene
Spatial Branch of Apache Solr
spamscope
Fast Advanced Spam Analysis Tool
tensorflow
Computation using data flow graphs for scalable machine learning
Text.jl
Numerous tools for text processing
thefuck
Magnificent app which corrects your previous console command.
tika
Mirror of Apache Tika
tika-python
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
tika-similarity
Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.
topic_space
Topic modeling web application
tpipe
Python library to search radio interferometry data for dispersed (fast) transients
trec-dd.org
web site for TREC Dynamic Domain
trec-dd-polar
A dataset downloaded from the deep and scientific web across three major Polar data centers for use in research.
turbosoft
Turbosoft
twitter-researcher
Identifying and Analyzing Researchers on Twitter
TwoRavens
A web application for data exploration, statistical analysis, model construction and meta analysis tools, that integrates with Zelig and Dataverse.
videocat
null
whimsy
Apache Whimsy
whimsy-agenda
React implementation of the ASF Board agenda tool
wiseowl
This is a fact-based question answering system using Apache Solr as the backend search engine, Wikipedia dumps as the information source, and Apache Velocity, HTML, and CSS for the web interface design. The project also uses Linux bash scripts to perform functions such as start, stop, training, and indexing.
xdata_meta
Meta information about the XData project

Commits To

Repository                    Most Recent Commit       # Commits
nasa-jpl-memex/topic_space    2015-07-23 00:24:26.0    6
kristw/d3kit-timeline         2018-02-23 23:35:06.0    1


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.