Name: hive-json
Owner: Hortonworks Inc
Description: A rough prototype of a tool for discovering Apache Hive schemas from JSON documents.
Created: 2013-03-12 15:52:05.0
Updated: 2018-01-06 16:24:41.0
Pushed: 2017-02-20 22:48:17.0
Size: 35
Language: Java
This project is a rough prototype that I wrote to analyze large collections of JSON documents and discover their Apache Hive schema. I've used it to analyze the log data from githubarchive.org.
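To give a feel for what "discovering a Hive schema" means, here is a minimal sketch that maps one already-parsed JSON value to a Hive type string. It is not the code in this project: the class name HiveTypeSketch and the method hiveType are hypothetical, and the sketch makes simplifying assumptions (JSON objects arrive as java.util.Map, arrays as java.util.List, null values and empty arrays fall back to string, and an array takes the type of its first element). A real tool would also need to merge the types seen across many documents.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of hive-json.
final class HiveTypeSketch {

    /** Returns a Hive type string for a single parsed JSON value. */
    static String hiveType(Object value) {
        // Assumption: a null carries no type information, so fall back to string.
        if (value == null || value instanceof String) {
            return "string";
        }
        if (value instanceof Boolean) {
            return "boolean";
        }
        if (value instanceof Integer || value instanceof Long) {
            return "bigint";
        }
        if (value instanceof Number) {
            return "double";
        }
        if (value instanceof List) {
            List<?> list = (List<?>) value;
            // Assumption: the first element is representative of the whole array.
            String element = list.isEmpty() ? "string" : hiveType(list.get(0));
            return "array<" + element + ">";
        }
        if (value instanceof Map) {
            // Sort field names so the output does not depend on map ordering.
            TreeMap<String, Object> sorted = new TreeMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                sorted.put(String.valueOf(e.getKey()), e.getValue());
            }
            StringBuilder sb = new StringBuilder("struct<");
            boolean first = true;
            for (Map.Entry<String, Object> e : sorted.entrySet()) {
                if (!first) {
                    sb.append(',');
                }
                sb.append(e.getKey()).append(':').append(hiveType(e.getValue()));
                first = false;
            }
            return sb.append('>').toString();
        }
        throw new IllegalArgumentException("Unsupported value: " + value.getClass());
    }

    public static void main(String[] args) {
        Map<String, Object> actor = new LinkedHashMap<>();
        actor.put("login", "omalley");

        Map<String, Object> event = new LinkedHashMap<>();
        event.put("id", 1L);
        event.put("public", true);
        event.put("actor", actor);
        event.put("tags", Arrays.asList("a", "b"));

        System.out.println(hiveType(event));
    }
}
```

The resulting string can be read as the type of a single Hive column, or its top-level struct fields can be used as the columns of a table.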
To build the project, use Maven (3.0.x) from http://maven.apache.org/.
Building the jar:
% mvn package
Run the program:
% bin/find-json-schema *.json.gz
I've uploaded the discovered schema for githubarchive.org to https://gist.github.com/omalley/5125691.