Name: termite-data-server
Owner: UW Interactive Data Lab
Description: Data Server for Topic Models
Created: 2014-01-03 05:21:34.0
Updated: 2018-01-14 07:31:34.0
Pushed: 2014-10-17 01:03:50.0
Size: 26814
Language: Python
Termite is a visual analysis tool for exploring the output of statistical topic models.
This repository contains:
The web server includes various interactive visualizations:
This software is distributed under the BSD-3 license.
The Termite Data Server is developed and maintained by Jason Chuang with contributions from:
Termite relies on the following software. We thank their respective authors for developing and distributing these tools.
Currently, this data server can import topic models from:
We are in the process of adding support for:
The data server can be deployed on various platforms supported by web2py. However, the copy included in the repository is customized for Apple's OSX.
At the time of writing, the following three tools need to be installed when this repository is first cloned. Execute the following commands at the root of the repository.
setup_corenlp.sh
setup_mallet.sh
-C utils/corenlp
To launch this data server, execute the following command. A dialogue box will appear. Click on “start server” to proceed.
art_server.sh
Several demos are included in this repository.
Executing the following command will download the 20newsgroups dataset (18828 documents), build an LDA topic model with 20 latent topics using MALLET, and launch the web server.
mo.py 20newsgroups
Executing the following command will download the InfoVis dataset (449 documents with metadata), build an LDA topic model with 20 latent topics using MALLET, and launch the web server.
mo.py infovis
To build an example topic model on the InfoVis dataset using Gensim:
mo.py infovis gensim
More generally, to build a topic model on dataset [dataset] using tool [tool]:
mo.py [dataset] [tool]
To see more demo options:
mo.py --help
The resulting topic model(s) will be available at:
://127.0.0.1:8075/
This is an active research project. While we would like to support as many users as possible, we are constrained by available resources. Below are the system requirements, known issues, and the API format for developing additional visualizations and incorporating additional models into the data server.
The web server is based on the web2py framework. While web2py is designed to work on Windows, Mac, and most Unix platforms, we have only tested the system on OSX. The framework will not work under Cygwin on Windows.
A primary goal of developing this data server is to provide a common API (application programming interface), so that multiple topic model visualizations can interact with any number of topic modeling tools, and with other visualizations.
All API calls to this web server are in the following format.
:// [server] / [dataset] / [model] / [attribute]
The string [server] is the base portion of the URL, such as http://localhost:8080 when running on a local machine. As multiple projects can be hosted on the same server, [dataset] is a string matching [A-Za-z0-9_]+ that uniquely identifies a project. A web-based visualization can access the content of a topic model by specifying the remaining URL [model]/[attribute], such as lda/TermTopicMatrix to retrieve the term-topic matrix, or treetm/TermTopicConstraints to send user-defined constraints to the server.
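The URL scheme described above can be sketched as a small helper that joins the four components and checks the dataset identifier against the stated pattern. This is an illustrative sketch only, not part of the Termite code base: the function name `build_api_url` and the dataset identifier `infovis` are assumptions, while the server address and the `lda/TermTopicMatrix` path come from the description above.

```python
import re

# Dataset identifiers must match [A-Za-z0-9_]+, as stated in the API description.
DATASET_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def build_api_url(server, dataset, model, attribute):
    """Join the components into [server]/[dataset]/[model]/[attribute].

    Hypothetical helper for illustration; it only builds the URL string
    and does not contact the server.
    """
    if not DATASET_PATTERN.fullmatch(dataset):
        raise ValueError("dataset must match [A-Za-z0-9_]+: %r" % dataset)
    # Strip a trailing slash from the base so the result has no double slash.
    return "/".join([server.rstrip("/"), dataset, model, attribute])


if __name__ == "__main__":
    # "infovis" is an assumed dataset identifier, used here only as an example.
    print(build_api_url("http://localhost:8080", "infovis", "lda", "TermTopicMatrix"))
```

Validating the dataset string before building the URL keeps malformed project names from reaching the server; the model and attribute names are passed through unchanged because the text places no constraint on them.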
Copyright (c) 2013, Leland Stanford Junior University
Copyright (c) 2014, University of Washington
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL