Name: lancaster
Owner: Two Sigma
Description: A python extension wrapper for avro-c
Created: 2016-07-15 15:04:52.0
Updated: 2016-09-14 05:19:27.0
Pushed: 2016-09-04 01:05:27.0
Homepage: null
Size: 62
Language: Python
A python extension wrapper for avro-c.
Currently, lancaster only supports reading a stream of Avro-serialized data. It supports neither writing nor the Avro container file format. For more details on what lancaster is willing to read, see the lancaster spec.
See also the Avro project page.
import lancaster

schema = '{ ... }'
with open('data.avro', 'rb') as f:
    data = list(lancaster.read_stream(schema, f))
lancaster.read_stream() accepts a JSON string describing the schema and a stream to read from, and returns a generator that produces Python versions of the Avro data (dicts, lists, ints, strings).
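For experimenting without a data file, a small in-memory stream can be built by hand. The sketch below encodes a few records using Avro's binary encoding (zigzag varints for ints and longs, length-prefixed UTF-8 for strings) with no container framing. The schema and the helper names (`SCHEMA`, `encode_long`, `encode_string`, `encode_person`, `make_stream`, `read_back`) are hypothetical and only for illustration. Whether lancaster accepts this particular schema, and whether it reads back-to-back records this way, is an assumption to check against the lancaster spec.

```python
import io
import json

# Hypothetical schema for illustration only; not taken from the lancaster docs.
SCHEMA = json.dumps({
    "type": "record",
    "name": "Person",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": "int"},
    ],
})


def encode_long(n):
    """Encode an Avro int/long: zigzag, then a base-128 varint (low group first)."""
    z = (n << 1) ^ (n >> 63)  # zigzag maps small negatives to small positives
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # set the continuation bit
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s):
    """Encode an Avro string: byte length as a long, then the UTF-8 bytes."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def encode_person(name, age):
    """A record is its fields encoded in schema order, with no extra framing."""
    return encode_string(name) + encode_long(age)


def make_stream(people):
    """Concatenate encoded records into a binary file-like object."""
    return io.BytesIO(b"".join(encode_person(n, a) for n, a in people))


def read_back(stream):
    """Assumed usage: hand the schema and the stream to lancaster.read_stream()."""
    import lancaster  # imported lazily so the encoder works without lancaster
    return list(lancaster.read_stream(SCHEMA, stream))
```

If lancaster is installed, `read_back(make_stream([("foo", 3)]))` would be the call to try. The encoder itself has no dependency on lancaster.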
A conda package is provided at anaconda.org. This depends on conda packages providing the required C libraries: libsnappy, jansson, and libavro-c.
conda create -n lancaster -c leif python lancaster
You can also install using just pip or setuptools, assuming you have avro-c and libsnappy installed on your system, which you can probably get from your OS package manager. On Debian and Ubuntu systems, you can install libavro-dev and libsnappy-dev.
pip install lancaster
or
git clone https://github.com/twosigma/lancaster
cd lancaster
python setup.py install
Recursive structures (links) are not supported. Writing Avro data and reading the Avro container file format are also not supported. Pull requests are welcome, but I don't need those features myself yet.
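Because container files are not supported, it can help to check what kind of file you have before reading it. Avro object container files start with the four bytes `Obj` followed by the byte 1, so a minimal sketch of such a check might look like this. The function name `looks_like_container_file` is hypothetical and not part of lancaster, and the check assumes a seekable stream. It is only a heuristic, since a raw stream could in principle begin with the same bytes.

```python
import io

# Magic bytes at the start of an Avro object container file.
AVRO_CONTAINER_MAGIC = b"Obj\x01"


def looks_like_container_file(stream):
    """Peek at the first bytes of a seekable binary stream without consuming them."""
    pos = stream.tell()
    head = stream.read(len(AVRO_CONTAINER_MAGIC))
    stream.seek(pos)  # restore the position so the stream can still be read
    return head == AVRO_CONTAINER_MAGIC
```

A caller could run this check first and report a clear error for container files rather than passing them to lancaster.read_stream().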
Copyright 2016 Two Sigma Open Source, LLC. MIT licensed.