IBMStreams/streamsx.parquet

Name: streamsx.parquet

Owner: IBM Streams

Description: (Incubation) Toolkit providing adapters to Parquet

Created: 2014-11-18 14:01:34.0

Updated: 2017-02-19 09:44:36.0

Pushed: 2016-07-25 11:43:34.0

Homepage: http://ibmstreams.github.io/streamsx.parquet

Size: 79634

Language: Java


README

streamsx.parquet

Support for Streams 4.0 and BI 4.0 is now available!

Parquet is a columnar storage format for Hadoop. It is becoming increasingly popular because of its efficient compression and encoding schemes. For more details, see the Parquet home page: http://parquet.io/
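To make the columnar idea concrete, here is a minimal Java sketch of why storing values column by column tends to compress well: repeated values end up next to each other, so a simple run-length encoding collapses them. This only illustrates the general principle. It is not Parquet's actual implementation and it does not use this toolkit's API. The class name `ColumnarSketch`, its methods, and the sample sensor data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not Parquet code and not part of streamsx.parquet.
public class ColumnarSketch {

    // Pull a single column out of row-oriented records.
    public static List<String> column(String[][] rows, int index) {
        List<String> out = new ArrayList<>();
        for (String[] row : rows) {
            out.add(row[index]);
        }
        return out;
    }

    // Run-length encode a column: each run of equal consecutive values
    // becomes one entry of the form "<count>x<value>".
    public static List<String> runLengthEncode(List<String> values) {
        List<String> runs = new ArrayList<>();
        int start = 0;
        while (start < values.size()) {
            String value = values.get(start);
            int end = start;
            while (end < values.size() && values.get(end).equals(value)) {
                end++;
            }
            runs.add((end - start) + "x" + value);
            start = end;
        }
        return runs;
    }

    public static void main(String[] args) {
        // Made-up tuples: (sensor id, status).
        String[][] rows = {
            {"sensorA", "ok"},
            {"sensorA", "ok"},
            {"sensorA", "warn"},
            {"sensorB", "ok"}
        };
        System.out.println(runLengthEncode(column(rows, 0)));
        System.out.println(runLengthEncode(column(rows, 1)));
    }
}
```

In a row-oriented layout the sensor ids and statuses are interleaved, so runs like these do not form. Grouping values by column is what gives an encoder long runs to work with.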

The Parquet toolkit lets streaming applications write data in Parquet format. The toolkit is implemented in Java and, in its initial version, contains the ParquetSink operator.

Samples showing how to use the ParquetSink operator are available in the samples folder. Installation and configuration details will be published soon.

Toolkit documentation is available at: http://ibmstreams.github.io/streamsx.parquet/


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.