Common Flume Agent Configurations: the Hadoop (HDFS) Sink


I. Configuring the Flume Hadoop (HDFS) sink

1. Start the Hadoop cluster

[hadoop@admin2master etc]$ start-all.sh

2. Create the agent configuration file

Create the file hdfs_sink.conf in the /usr/local/src/flume/conf directory

[hadoop@master ~]$ vim /usr/local/src/flume/conf/hdfs_sink.conf

Add the following content

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5140
a1.sources.r1.host = localhost

# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://mycluster/user/flume/syslogtcp
# DataStream writes plain text; do not use it in production.
# Keep comments on their own lines: a properties file has no inline comments,
# so a trailing "# ..." would become part of the value.
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.filePrefix = Syslog
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
a1.sinks.k1.hdfs.useLocalTimeStamp = true

# Use a channel which buffers events in memory
a1.channels.c1.type = memory

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
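
As far as the Flume documentation describes it, hdfs.round, hdfs.roundValue and hdfs.roundUnit only affect time-based escape sequences (such as %Y%m%d or %H%M) in hdfs.path. The path above contains none, so the rounding settings have no visible effect here. The sketch below illustrates what rounding down to 10 minutes would do for a hypothetical path ending in /%Y%m%d/%H%M. The function names are made up for illustration, GNU date is assumed, and the calculation is done in UTC for simplicity, whereas Flume uses the agent's time zone.

```shell
# Round an epoch timestamp (seconds) down to a multiple of STEP minutes
# within the hour, mirroring the idea behind hdfs.round / roundValue / roundUnit.
round_down_minutes() {
  local epoch="$1" step="$2"
  local minute second
  minute=$(date -u -d "@$epoch" +%-M)   # minute of the hour, no zero padding
  second=$(date -u -d "@$epoch" +%-S)   # second of the minute, no zero padding
  echo $(( epoch - second - (minute % step) * 60 ))
}

# Directory an event would land in for a hypothetical path .../%Y%m%d/%H%M
bucket_dir() {
  local epoch="$1" step="$2"
  date -u -d "@$(round_down_minutes "$epoch" "$step")" +%Y%m%d/%H%M
}

# Example: 2020-06-22 06:46:36 UTC with a 10-minute step
bucket_dir 1592808396 10   # prints 20200622/0640
```

With such a path, all events received between 06:40:00 and 06:49:59 would go to the same directory instead of one directory per minute.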

3. Start the Flume agent

[hadoop@master ~]$ /usr/local/src/flume/bin/flume-ng agent -c /usr/local/src/flume/conf/ -f /usr/local/src/flume/conf/hdfs_sink.conf -n a1 -Dflume.root.logger=DEBUG,console

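Leave this terminal open after running the command; the agent runs in the foreground.

-c (--conf) specifies the configuration directory, which holds files such as flume-env.sh and the logging configuration.

-f (--conf-file) specifies the agent configuration file.

-n (--name) is the name of the agent (required); it must match the prefix used in the configuration file, here a1.

-D sets a Java system property at startup. Here flume.root.logger=DEBUG,console overrides the root logger so that messages at DEBUG level and above are printed to the console. The log4j levels, from most to least verbose, are TRACE, DEBUG, INFO, WARN, ERROR and FATAL.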
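
The same command can be written with the long option names, which some readers find easier to follow. The helper below is only a sketch: flume_agent_cmd is a hypothetical function name, and it just prints the command line rather than running it, so it can be inspected first.

```shell
# Print the flume-ng command line using the long option names.
# Arguments: Flume home directory, agent config file, agent name,
# and an optional log level (defaults to INFO).
flume_agent_cmd() {
  local flume_home="$1" conf_file="$2" agent="$3" level="${4:-INFO}"
  printf '%s/bin/flume-ng agent --conf %s/conf --conf-file %s --name %s -Dflume.root.logger=%s,console\n' \
    "$flume_home" "$flume_home" "$conf_file" "$agent" "$level"
}

# Usage: print the command for the agent defined in this article.
#   flume_agent_cmd /usr/local/src/flume /usr/local/src/flume/conf/hdfs_sink.conf a1 DEBUG
# To run it directly (paths must not contain spaces):
#   eval "$(flume_agent_cmd /usr/local/src/flume /usr/local/src/flume/conf/hdfs_sink.conf a1 DEBUG)"
```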

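4. Send a message to the listening port

(1) Open a new terminal and run the following command

[hadoop@master ~]$ telnet localhost 5140

(2) Type any test text and press Enter to send it. The log in the agent terminal shows that the data was received and that a file was created in HDFS.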

[hadoop@master ~]$ telnet localhost 5140

Trying ::1...

telnet: connect to address ::1: Connection refused

Trying 127.0.0.1...

Connected to localhost.

Escape character is '^]'.

hello flume

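The following messages appear in the terminal where the Flume agent was started. (The URIs in this capture use hdfs://master:9000, whereas the configuration above uses hdfs://mycluster; the output was presumably captured with a different hdfs.path.)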

20/06/21 23:46:35 WARN source.SyslogUtils: Event created from Invalid Syslog data.

20/06/21 23:46:36 INFO hdfs.HDFSSequenceFile: writeFormat = Writable, UseRawLocalFileSystem = false

20/06/21 23:46:37 INFO hdfs.BucketWriter: Creating hdfs://master:9000/user/flume/syslogtcp/Syslog.1592808396936.tmp

20/06/21 23:47:07 INFO hdfs.BucketWriter: Closing hdfs://master:9000/user/flume/syslogtcp/Syslog.1592808396936.tmp

20/06/21 23:47:07 INFO hdfs.BucketWriter: Renaming hdfs://master:9000/user/flume/syslogtcp/Syslog.1592808396936.tmp to hdfs://master:9000/user/flume/syslogtcp/Syslog.1592808396936
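
The WARN line "Event created from Invalid Syslog data" is expected here: the plain text "hello flume" has no syslog priority or timestamp header, so the source still creates an event but flags it as invalid. A line in the traditional BSD syslog layout (RFC 3164) should be parsed without that warning. The sketch below builds such a line; syslog_line is a hypothetical helper name, and the use of nc for sending is an assumption about the tools installed.

```shell
# Build one RFC 3164-style line: <PRI>Mmm dd hh:mm:ss HOST TAG: MSG
# PRI = facility * 8 + severity, e.g. 13 = user (1) * 8 + notice (5).
syslog_line() {
  local pri="$1" host="$2" tag="$3" msg="$4"
  # LC_ALL=C keeps the month abbreviation in English, as the format requires.
  printf '<%d>%s %s %s: %s\n' "$pri" "$(LC_ALL=C date '+%b %e %H:%M:%S')" "$host" "$tag" "$msg"
}

# Usage (assumes nc is installed and the agent from step 3 is listening):
#   syslog_line 13 "$(hostname)" flume-test "hello flume" | nc localhost 5140
```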

5. Check the collected data in HDFS

[hadoop@master ~]$ hdfs dfs -ls /user/flume/syslogtcp

Found 1 items

-rw-r--r--   1 hadoop supergroup        127 2020-06-21 23:47 /user/flume/syslogtcp/Syslog.1592808396936
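
The numeric suffix in the file name appears to be a counter that starts at the epoch time in milliseconds when the sink opened its writer, so it can be decoded to see roughly when the file was created. The sketch below assumes GNU date and a finished file (no .tmp suffix); flume_file_time is a hypothetical helper name.

```shell
# Decode the millisecond counter at the end of a Flume HDFS file name
# and print it as a UTC date and time.
flume_file_time() {
  local name="$1"
  local millis="${name##*.}"            # text after the last dot
  date -u -d "@$(( millis / 1000 ))" '+%Y-%m-%d %H:%M:%S'
}

flume_file_time /user/flume/syslogtcp/Syslog.1592808396936   # prints 2020-06-22 06:46:36

# To read the file contents (requires a running cluster):
#   hdfs dfs -cat /user/flume/syslogtcp/Syslog.1592808396936
```

The printed time is in UTC, so it differs from the local timestamps in the agent log by the machine's time zone offset.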

到点睡觉了
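Copyright notice: this is an original article of this site, published by 到点睡觉了 on 2022-01-10; 2,196 characters in total.
Reposting: unless otherwise stated, articles on this site are released under the CC-4.0 license; please credit the source when reposting.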