Name: kgx
Owner: NCATS Data Translator Project - Tangerine Team
Description: knowledge graph exchange tools
Created: 2018-04-24 02:09:07.0
Updated: 2018-05-09 23:41:22.0
Pushed: 2018-05-09 23:41:23.0
Size: 82
Language: Python
A utility library and set of command line tools for exchanging data in knowledge graphs.
The tooling here is partly generic but intended primarily for building the translator-knowledge-graph.
For additional background see the Translator Knowledge Graph Drive
pip install -r requirements.txt
python setup.py install
Use the --help flag to get help. Right now there is a single command:
Usage: kgx dump [OPTIONS] [INPUT]... OUTPUT

  Transforms a knowledge graph from one representation to another

  INPUT  : any number of files or endpoints
  OUTPUT : the output file

Options:
  --input-type TEXT   Extension type of input files: ttl, json, csv, rq, tsv, graphml
  --output-type TEXT  Extension type of output files: ttl, json, csv, rq, tsv, graphml
  --help              Show this message and exit.
CSV and TSV representations require two files: one for the vertex set and one for the edge set. JSON, TTL, and GraphML files represent a whole graph in a single file. For this reason, when kgx writes a CSV/TSV representation it bundles the two resulting files into a single .tar file.
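The two-file layout can be sketched with the Python standard library alone. This is a minimal illustration, not the kgx implementation: the function name write_csv_archive, the member names nodes.csv and edges.csv, and the column names are all assumptions made for the example.

```python
import csv
import os
import tarfile
import tempfile


def write_csv_archive(nodes, edges, out_base):
    """Write nodes and edges to two CSV files and bundle them in <out_base>.tar.

    `nodes` and `edges` are lists of dicts. The column names used here are
    hypothetical and chosen only for illustration.
    """
    archive_path = out_base + ".tar"
    with tempfile.TemporaryDirectory() as tmp:
        node_path = os.path.join(tmp, "nodes.csv")
        edge_path = os.path.join(tmp, "edges.csv")

        # One file for the vertex set.
        with open(node_path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=["id", "name"])
            writer.writeheader()
            writer.writerows(nodes)

        # One file for the edge set.
        with open(edge_path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=["subject", "predicate", "object"])
            writer.writeheader()
            writer.writerows(edges)

        # Bundle both files into a single (uncompressed) tar archive.
        with tarfile.open(archive_path, "w") as tar:
            tar.add(node_path, arcname="nodes.csv")
            tar.add(edge_path, arcname="edges.csv")
    return archive_path
```

Calling write_csv_archive(nodes, edges, "target/x1out") would produce target/x1out.tar, provided the target directory already exists.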
The format is inferred from the file extension. If it cannot be inferred, use the --input-type and --output-type flags to tell the program which formats to use. Not all conversions are currently supported.
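The inference rule described above might look roughly like the following sketch. It is an assumption about the logic, not code taken from kgx; the names infer_format and KNOWN_FORMATS are hypothetical, and the list of formats is copied from the help text.

```python
import os

# Formats listed in the `kgx dump` help text.
KNOWN_FORMATS = {"ttl", "json", "csv", "rq", "tsv", "graphml"}


def infer_format(path, explicit_type=None):
    """Return the format for `path`, preferring an explicit override.

    `explicit_type` plays the role of --input-type / --output-type.
    """
    if explicit_type is not None:
        fmt = explicit_type.lower()
    else:
        # os.path.splitext("a/b.graphml") returns ("a/b", ".graphml")
        fmt = os.path.splitext(path)[1].lstrip(".").lower()
    if fmt not in KNOWN_FORMATS:
        raise ValueError(
            f"cannot determine format for {path!r}; pass an explicit type"
        )
    return fmt
```

With this sketch, a path such as target/x1n.graphml resolves on its own, while an extensionless path such as target/x1out needs the explicit type, which matches the first example below where --output-type=csv is given.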
Here are some examples that mirror the tests:
kgx dump --output-type=csv tests/resources/x1n.csv tests/resources/x1e.csv target/x1out
created at: target/x1out.tar
kgx dump tests/resources/x1n.csv tests/resources/x1e.csv target/x1n.graphml
created at: target/x1n.graphml
kgx dump tests/resources/monarch/biogrid_test.ttl target/bgcopy.csv
created at: target/bgcopy.csv.tar
kgx dump tests/resources/monarch/biogrid_test.ttl target/x1n.graphml
created at: target/x1n.graphml
kgx dump tests/resources/monarch/biogrid_test.ttl target/x1n.json
created at: target/x1n.json
The internal representation is a networkx MultiDiGraph, which is a property graph.
The structure of this graph is expected to conform to the tr-kg standard, briefly summarized here:
Intended to support
Neo4j implements property graphs out of the box. However, some implementations use reification nodes, where a statement is represented as a node of its own rather than as an edge. The transform should allow for de-reification.
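As a rough illustration of de-reification on a networkx MultiDiGraph, the sketch below collapses each reification node into a direct edge that carries the node's properties. The convention it assumes is hypothetical and is not taken from kgx or the tr-kg standard: a reification node has a type attribute equal to "association", a predicate attribute, and outgoing edges keyed "subject" and "object".

```python
import networkx as nx


def dereify(graph, assoc_type="association"):
    """Collapse reification nodes into direct edges.

    Assumed (hypothetical) convention: a reification node has node attribute
    type == assoc_type, a "predicate" node attribute, and outgoing edges
    keyed "subject" and "object" pointing at the two endpoints.
    """
    result = nx.MultiDiGraph()
    reified = {n for n, d in graph.nodes(data=True) if d.get("type") == assoc_type}

    # Copy ordinary nodes unchanged.
    for n, d in graph.nodes(data=True):
        if n not in reified:
            result.add_node(n, **d)

    # Copy edges that do not touch a reification node.
    for u, v, k, d in graph.edges(keys=True, data=True):
        if u not in reified and v not in reified:
            result.add_edge(u, v, key=k, **d)

    # Replace each reification node with one edge carrying its properties.
    for n in reified:
        props = dict(graph.nodes[n])
        props.pop("type")
        predicate = props.pop("predicate")
        roles = {k: v for _, v, k in graph.out_edges(n, keys=True)}
        result.add_edge(roles["subject"], roles["object"], key=predicate, **props)
    return result
```

The remaining node attributes (for example a provenance field) become edge properties, which is what makes the result a plain property graph of the kind Neo4j stores directly.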