2gis/kafka-connect-hdfs-ext

Name: kafka-connect-hdfs-ext

Owner: 2GIS

Description: Set of extensions for kafka connect hdfs

Created: 2016-07-05 05:38:28.0

Updated: 2018-05-16 09:03:01.0

Pushed: 2018-05-16 09:02:59.0

Homepage:

Size: 17

Language: Java


README


kafka-connect-hdfs-ext

This project provides extensions for the kafka-connect-hdfs project.

Formats

Right now we provide two additional formats: ru.dgis.casino.plain.GzipTextFormat and ru.dgis.casino.plain.PlainTextFormat.

PlainTextFormat saves each message to HDFS as a string and separates messages with \n. The message key is dropped.
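The behaviour described above can be illustrated with a minimal sketch. This is not the project's actual implementation: the class PlainTextSketch and the helper writeValues are hypothetical names, messages are modelled as plain strings rather than Kafka Connect records, and UTF-8 encoding is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class PlainTextSketch {
    // Hypothetical helper: writes each message value followed by '\n'.
    // Keys are never passed in, mirroring the fact that the format drops them.
    static void writeValues(List<String> values, OutputStream out) throws IOException {
        for (String value : values) {
            out.write(value.getBytes(StandardCharsets.UTF_8)); // UTF-8 is an assumption
            out.write('\n');
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeValues(Arrays.asList("first message", "second message"), out);
        System.out.print(out.toString("UTF-8"));
    }
}
```

Each message ends up on its own line, so the resulting files can be read with ordinary line-oriented tools.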

GzipTextFormat is the same format as PlainTextFormat but also performs compression via gzip.
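As a rough illustration of newline-separated text with gzip compression, here is a small round-trip sketch using only the JDK's java.util.zip classes. The names GzipTextSketch, compressLines and readLines are hypothetical and the UTF-8 encoding is an assumption; the real format writes to HDFS rather than to an in-memory buffer.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipTextSketch {
    // Hypothetical helper: writes each value plus '\n' through a gzip stream.
    static byte[] compressLines(List<String> values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
            for (String value : values) {
                gzip.write(value.getBytes(StandardCharsets.UTF_8)); // UTF-8 is an assumption
                gzip.write('\n');
            }
        } // closing the stream writes the gzip trailer
        return bytes.toByteArray();
    }

    // Hypothetical helper: decompresses and splits the text back into lines.
    static List<String> readLines(byte[] compressed) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(compressed)),
                StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = compressLines(Arrays.asList("first message", "second message"));
        System.out.println(readLines(compressed));
    }
}
```

Files produced this way are ordinary gzip streams, so standard gzip-aware tools should be able to read them.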

How to use

Here is an example config for a topic:

connector.class=io.confluent.connect.hdfs.HdfsSinkConnector
format.class=ru.dgis.casino.plain.GzipTextFormat
partitioner.class=io.confluent.connect.hdfs.partitioner.TimeBasedPartitioner
path.format=YYYY/MM/dd
name=SOME_NAME_HERE
topics=TOPIC
hdfs.url=hdfs://YOUR_HADOOP
logs.dir=LOGS_DIR
topics.dir=TOPICS_DIR

flush.size=100000
locale=ru_RU
timezone=Asia/Novosibirsk
