Name: inspire-magpie
Owner: inspirehep
Description: Wrapper around magpie for InspireHEP
Created: 2016-04-04 15:12:27.0
Updated: 2017-11-17 17:30:36.0
Pushed: 2016-05-26 17:32:20.0
Size: 320233
Language: Python
A wrapper around magpie for Inspire that provides trained models and functions to learn from the High Energy Physics corpus.
git clone https://github.com/inspirehep/inspire-magpie.git
cd inspire-magpie
pip install .
A Flask-based UI and REST API are included; you can run them with:
python wsgi.py
Access the UI on http://localhost:5051 and the REST interface under http://localhost:5051/api.
curl -i -X POST -H 'Content-Type: application/json' -d '{"corpus": "keywords", "positive": ["lhc"]}' http://localhost:5051/api/word2vec
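The same POST request can be built from Python with only the standard library. This is a minimal sketch: `build_word2vec_request` and `query_word2vec` are hypothetical helper names, not part of inspire-magpie, and the assumption that the server replies with a JSON body is not confirmed by the text above.

```python
import json
import urllib.request

# Endpoint taken from the text above; adjust if the server runs elsewhere.
API_URL = "http://localhost:5051/api/word2vec"


def build_word2vec_request(positive, corpus="keywords", url=API_URL):
    """Build (but do not send) a JSON POST request for the word2vec endpoint.

    Hypothetical helper for illustration; not part of inspire-magpie.
    """
    payload = json.dumps({"corpus": corpus, "positive": list(positive)})
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_word2vec(positive, corpus="keywords", url=API_URL):
    """Send the request and decode the reply.

    Assumption: the server answers with a JSON body.
    """
    request = build_word2vec_request(positive, corpus, url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server started as described above, `query_word2vec(["lhc"])` would send the same payload as the command-line call.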
For training, the API provides two functions: train() and batch_train(). The latter performs out-of-core training, but both take the same parameters:
from inspire_magpie.api import batch_train
batch_train('/path/to/the/training/set', test_dir='if/you/have/a/test/set', nn='cnn', nb_epochs=5, batch_size=64, persist=True, no_of_labels=10000, verbose=1)
test_dir - the path to the test set (optional)
nn - the NN model to use for training. Currently supported: cnn and rnn
nb_epochs - how many times the training set is fed to the NN
batch_size - size of the batches used during training
persist - whether to save the final model to disk after training (in the log directory)
no_of_labels - number of labels to train the model on. It determines whether we train keyword extraction (10k labels), experiment prediction (500 labels) or category assignment (14 labels).
verbose - the same values as in Keras. 1 is the most verbose, with a progress bar
Other configuration variables can be found in the config file.
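The label counts quoted for no_of_labels can be kept in one place so that the task selects the right value. The sketch below is illustrative only: `LABELS_PER_TASK`, `training_kwargs` and the task names "keywords", "experiments" and "categories" are hypothetical, chosen here for readability; only the parameter names and the label counts come from the description above.

```python
# Label counts quoted in the parameter description above.
# The task names used as keys are assumptions made for this example.
LABELS_PER_TASK = {
    "keywords": 10000,    # keyword extraction
    "experiments": 500,   # experiment prediction
    "categories": 14,     # category assignment
}


def training_kwargs(task, nn="cnn", nb_epochs=5, batch_size=64):
    """Assemble keyword arguments for train() or batch_train().

    Hypothetical helper; not part of inspire-magpie.
    """
    if task not in LABELS_PER_TASK:
        raise ValueError("unknown task: %r" % (task,))
    if nn not in ("cnn", "rnn"):
        raise ValueError("unsupported model: %r" % (nn,))
    return {
        "nn": nn,
        "nb_epochs": nb_epochs,
        "batch_size": batch_size,
        "persist": True,
        "no_of_labels": LABELS_PER_TASK[task],
        "verbose": 1,
    }


# Usage, assuming inspire-magpie is installed (the path is a placeholder):
#   from inspire_magpie.api import batch_train
#   batch_train('/path/to/the/training/set', **training_kwargs("experiments"))
```

Since train() and batch_train() take the same parameters, the same dictionary can be passed to either function.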