Yoctol/word2vec-api

Name: word2vec-api

Owner: YOCTOL INFO INC.

Description: Simple web service providing a word embedding model

Forked from: 3Top/word2vec-api

Created: 2017-01-15 08:08:10

Updated: 2017-01-15 08:08:13

Pushed: 2016-12-22 19:43:20

Homepage: http://www.3top.com

Size: 22 KB

Language: Python

GitHub Committers

User | Most Recent Commit | # Commits

Other Committers

User | Email | Most Recent Commit | # Commits

README

word2vec-api

Simple web service providing a word embedding API. The methods are based on the Gensim Word2Vec implementation. Models are passed as parameters and must be in Word2Vec text or binary format.
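For context, here is a minimal sketch of what serving such a model involves on the Gensim side. The file path and query word are placeholders, and this is an illustration rather than this repository's actual code:

```python
from gensim.models import KeyedVectors

# Load a model in Word2Vec binary format; pass binary=False for the text format.
# The path is a placeholder for whichever model file the service is started with.
vectors = KeyedVectors.load_word2vec_format("/path/to/model.bin", binary=True)

# The HTTP methods map onto lookups like these.
vector = vectors["restaurant"]                      # raw embedding for one word
neighbours = vectors.most_similar("restaurant", topn=5)
print(vector.shape, neighbours)
```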

Note: The “model” method returns a base64 encoding of the vector. “model_word_set” returns a base64 encoded pickle of the model's vocabulary.
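The exact routes and response framing are not spelled out here, so the client-side sketch below is only illustrative: it assumes the endpoints live under `/word2vec/`, that the response body is the bare base64 string, and that vectors are float32.

```python
import base64
import pickle

import numpy as np
import requests

BASE = "http://localhost:5000/word2vec"  # assumed host, port, and route prefix

# "model": assume the body is the base64 encoding of the raw vector bytes.
resp = requests.get(f"{BASE}/model", params={"word": "restaurant"})
vector = np.frombuffer(base64.b64decode(resp.text), dtype=np.float32)  # dtype assumed

# "model_word_set": assume a base64-encoded pickle of the vocabulary set.
resp = requests.get(f"{BASE}/model_word_set")
vocab = pickle.loads(base64.b64decode(resp.text))

print(vector.shape, len(vocab))
```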

Where to get a pretrained model

If you do not have domain-specific data to train on, it can be convenient to use a pretrained model. Please feel free to submit additions to this list through a pull request.

| Model file | Number of dimensions | Corpus (size) | Vocabulary size | Author | Architecture | Training Algorithm | Context window - size | Web page |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Google News | 300 | Google News (100B) | 3M | Google | word2vec | negative sampling | BoW - ~5 | link |
| Freebase IDs | 1000 | Google News (100B) | 1.4M | Google | word2vec, skip-gram | ? | BoW - ~10 | link |
| Freebase names | 1000 | Google News (100B) | 1.4M | Google | word2vec, skip-gram | ? | BoW - ~10 | link |
| Wikipedia+Gigaword 5 | 50 | Wikipedia+Gigaword 5 (6B) | 400,000 | GloVe | GloVe | AdaGrad | 10+10 | link |
| Wikipedia+Gigaword 5 | 100 | Wikipedia+Gigaword 5 (6B) | 400,000 | GloVe | GloVe | AdaGrad | 10+10 | link |
| Wikipedia+Gigaword 5 | 200 | Wikipedia+Gigaword 5 (6B) | 400,000 | GloVe | GloVe | AdaGrad | 10+10 | link |
| Wikipedia+Gigaword 5 | 300 | Wikipedia+Gigaword 5 (6B) | 400,000 | GloVe | GloVe | AdaGrad | 10+10 | link |
| Common Crawl 42B | 300 | Common Crawl (42B) | 1.9M | GloVe | GloVe | AdaGrad | ? | link |
| Common Crawl 840B | 300 | Common Crawl (840B) | 2.2M | GloVe | GloVe | AdaGrad | ? | link |
| Twitter (2B Tweets) | 25 | Twitter (27B) | ? | GloVe | GloVe | AdaGrad | ? | link |
| Twitter (2B Tweets) | 50 | Twitter (27B) | ? | GloVe | GloVe | AdaGrad | ? | link |
| Twitter (2B Tweets) | 100 | Twitter (27B) | ? | GloVe | GloVe | AdaGrad | ? | link |
| Twitter (2B Tweets) | 200 | Twitter (27B) | ? | GloVe | GloVe | AdaGrad | ? | link |
| Wikipedia dependency | 300 | Wikipedia (?) | 174,015 | Levy & Goldberg | word2vec modified | word2vec | syntactic dependencies | link |
| DBPedia vectors (wiki2vec) | 1000 | Wikipedia (?) | ? | Idio | word2vec | word2vec, skip-gram | BoW, 10 | link |
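Note that the GloVe downloads above are not in the Word2Vec text format the service expects (they lack the header line giving the vocabulary size and dimensionality). A minimal conversion sketch, assuming a gensim release that still ships the `glove2word2vec` helper (newer versions can instead pass `no_header=True` to `load_word2vec_format`) and using the Stanford file names as placeholders:

```python
from gensim.models import KeyedVectors
from gensim.scripts.glove2word2vec import glove2word2vec

# GloVe text files lack the "<vocab_size> <dimensions>" header line used by the
# Word2Vec text format; this helper prepends it. File names are placeholders.
glove2word2vec("glove.6B.300d.txt", "glove.6B.300d.w2v.txt")

# The converted file can then be served like any other Word2Vec text model.
vectors = KeyedVectors.load_word2vec_format("glove.6B.300d.w2v.txt", binary=False)
print(vectors.most_similar("computer", topn=3))
```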

