Configuring Apache Flume with an example

prashantblogs4all

Apache Hadoop is becoming a standard data processing framework in large enterprises. Data is generated in massive volumes and needs to be written to HDFS, which gave rise to a new project named Apache Flume. Flume's HDFS and HBase sinks provide a set of features that make it possible to write data in any format supported by these systems, in a MapReduce/Hive/Impala/Pig-friendly way. This post only covers how to install Flume and get it working:

1. Download Apache Flume (this example uses apache-flume-1.4.0-bin.tar.gz) from http://archive.apache.org/dist/flume/ to /usr/local/src

2. cd /usr/local/;tar -xzvf src/apache-flume-1.4.0-bin.tar.gz

3. cd /apache;ln -s /usr/local/apache-flume-1.4.0-bin/ flume

4. cd flume; vi conf/flume.conf (add the following configuration)

# Components of the agent named "agent"
agent.sources = logstream
agent.channels = memoryChannel
# Channel: buffer events in memory between source and sink
agent.channels.memoryChannel.type = memory
# Source: run a command and turn each line of its output into an event
agent.sources.logstream.channels = memoryChannel
agent.sources.logstream.type = exec
agent.sources.logstream.command = tail -f /apache/flume/test
# Sink: drain the channel and write the events to HDFS
agent.sinks = hdfsSink
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.channel = memoryChannel
agent.sinks.hdfsSink.hdfs.path = hdfs://hacluster:8020/flumetest
agent.sinks.hdfsSink.hdfs.fileType = SequenceFile
agent.sinks.hdfsSink.hdfs.writeFormat = Text
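
A common mistake in a file like this is binding a source or sink to a channel name that was never declared. The following is a minimal sketch of a pre-flight check, not something Flume itself ships: the helper names flume_prop and check_flume_wiring are hypothetical, and the script assumes the simple "key = value" layout used above with space-separated component lists.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Flume) to sanity-check flume.conf wiring.

# Print the trimmed value of an exact property key; prints nothing if absent.
flume_prop() {
    awk -v k="$2" '
        {
            eq = index($0, "=")
            if (eq == 0) next
            name = substr($0, 1, eq - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            if (name != k) next
            val = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            print val
            exit
        }' "$1"
}

# Succeed only if each source and sink of agent $2 uses a declared channel.
check_flume_wiring() {
    local file=$1 agent=$2 status=0 name ch bound declared
    # Pad with spaces so whole names can be matched with *" name "*.
    declared=" $(echo $(flume_prop "$file" "$agent.channels")) "

    for name in $(flume_prop "$file" "$agent.sources"); do
        bound=$(flume_prop "$file" "$agent.sources.$name.channels")
        [ -n "$bound" ] || { echo "source $name: no channels set"; status=1; }
        for ch in $bound; do    # a source may feed several channels
            case "$declared" in
                *" $ch "*) ;;
                *) echo "source $name: undeclared channel $ch"; status=1 ;;
            esac
        done
    done

    for name in $(flume_prop "$file" "$agent.sinks"); do
        ch=$(flume_prop "$file" "$agent.sinks.$name.channel")   # exactly one
        if [ -z "$ch" ]; then
            echo "sink $name: no channel set"; status=1; continue
        fi
        case "$declared" in
            *" $ch "*) ;;
            *) echo "sink $name: undeclared channel $ch"; status=1 ;;
        esac
    done
    return "$status"
}

# Demo on a temporary copy of the wiring lines from the configuration above.
conf=$(mktemp)
cat > "$conf" <<'EOF'
agent.sources = logstream
agent.channels = memoryChannel
agent.sources.logstream.channels = memoryChannel
agent.sinks = hdfsSink
agent.sinks.hdfsSink.channel = memoryChannel
EOF
check_flume_wiring "$conf" agent && echo "wiring ok"
rm -f "$conf"
```

If the check passes, the agent can typically be started from /apache/flume with bin/flume-ng agent --conf conf --conf-file conf/flume.conf --name agent -Dflume.root.logger=INFO,console. The --name value has to match the "agent." prefix used in the properties file.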
