1. What is Flume?
Flume is a real-time log collection system developed by Cloudera that has earned broad recognition and adoption in the industry. The initial release line of Flume is now collectively known as Flume OG (original generation) and belonged to Cloudera. As Flume's feature set grew, Flume OG's weaknesses became apparent: a bloated code base, poorly designed core components, and non-standard core configuration. In the last OG release, 0.94.0, unstable log delivery was especially severe. To solve these problems, on October 22, 2011 Cloudera completed Flume-728, a milestone overhaul of Flume that refactored the core components, core configuration, and code architecture; the refactored line is collectively known as Flume NG (next generation). The other reason for the overhaul was bringing Flume under the Apache umbrella, with Cloudera Flume renamed Apache Flume.
Flume's characteristics:
Flume is a distributed, reliable, and highly available system for collecting, aggregating, and transporting large volumes of log data. It supports plugging custom data senders into the logging pipeline to gather data, and it can apply simple processing to the data and write it to a variety of consumers (such as text files, HDFS, or HBase).
Flume's data flow is carried end to end by events. An Event is Flume's basic unit of data: it carries the log payload (as a byte array) together with header information. Events are generated by the Sources inside an Agent; when a Source captures an event, it applies source-specific formatting and then pushes the event into one or more Channels. You can think of a Channel as a buffer that holds the event until a Sink has finished processing it. The Sink is responsible for persisting the log, or for pushing the event on to another Source.
Flume's reliability:
When a node fails, logs can be delivered to other nodes without being lost. Flume offers three levels of reliability guarantee, from strongest to weakest: end-to-end (the receiving agent first writes the event to disk, deletes it once delivery succeeds, and re-sends it if delivery fails), store on failure (the strategy Scribe also uses: when the receiver crashes, data is written locally and sending resumes after recovery), and best effort (data is sent to the receiver without any acknowledgement).
Flume's recoverability:
This again rests on the Channel. FileChannel is recommended: events are persisted to the local filesystem (at some cost in performance).
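As a minimal sketch of what that looks like, a file-backed channel only needs a type and two directories; the paths below are assumptions, not from this article:

a1.channels = c1
# file channel: events survive an agent restart because they sit on disk
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /home/hadoop/flume-1.5.0-bin/file-channel/checkpoint
a1.channels.c1.dataDirs = /home/hadoop/flume-1.5.0-bin/file-channel/data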
Some core Flume concepts:
Agent: runs Flume inside a JVM. Each machine runs one agent, but a single agent may contain several sources and sinks.
Client: produces the data; runs in an independent thread.
Source: collects data from the Client and hands it to a Channel.
Sink: collects data from the Channel; runs in an independent thread.
Channel: connects sources and sinks; it works rather like a queue.
Events: can be log records, Avro objects, and so on.
Flume's smallest unit of independent operation is the agent: one agent is one JVM. A single agent is composed of three major components, Source, Sink, and Channel (diagram omitted).
It is worth noting that Flume ships with a large number of built-in Source, Channel, and Sink types, and the different types can be combined freely. How they are combined is driven by the user's configuration file, which makes the system very flexible. For example, a Channel can stage events in memory or persist them to the local disk, and a Sink can write logs into HDFS or HBase, or even into another Source. Flume also lets users build multi-hop flows: several agents can cooperate, with support for fan-in, fan-out, contextual routing, and backup routes, which is exactly where Flume shines (diagram omitted).
2. Where is Flume's official website?
http://flume.apache.org/
3. Where can I download it?
http://www.apache.org/dyn/closer.cgi/flume/1.5.0/apache-flume-1.5.0-bin.tar.gz
4. How do I install it?
1) Unpack the downloaded Flume tarball into the /home/hadoop directory, and you are already 50% done :) Easy, right?
2) Edit the flume-env.sh configuration file; the main thing to set is the JAVA_HOME variable:

root@m1:/home/hadoop/flume-1.5.0-bin# cp conf/flume-env.sh.template conf/flume-env.sh
root@m1:/home/hadoop/flume-1.5.0-bin# vi conf/flume-env.sh
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# If this file is placed at FLUME_CONF_DIR/flume-env.sh, it will be sourced
# during Flume startup.

# Enviroment variables can be set here.
JAVA_HOME=/usr/lib/jvm/java-7-oracle

# Give Flume more memory and pre-allocate, enable remote monitoring via JMX
# JAVA_OPTS="-Xms100m -Xmx200m -Dcom.sun.management.jmxremote"

# Note that the Flume conf directory is always included in the classpath.
# FLUME_CLASSPATH=""

3) Verify the installation:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng version
Flume 1.5.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: 8633220df808c4cd0c13d1cf0320454a94f1ea97
Compiled by hshreedharan on Wed May 7 14:49:18 PDT 2014
From source with checksum a01fe726e4380ba0c9f7a7d222db961f
root@m1:/home/hadoop#

If you see the output above, the installation succeeded.
5. Flume examples
1) Case 1: Avro
Avro can send a given file to Flume; the Avro source uses the Avro RPC mechanism.
a) Create the agent configuration file:

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/avro.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.channels = c1
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

b) Start flume agent a1:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/avro.conf -n a1 -Dflume.root.logger=INFO,console

c) Create the file to send:

root@m1:/home/hadoop# echo "hello world" > /home/hadoop/flume-1.5.0-bin/log.00

d) Send the file with avro-client:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng avro-client -c . -H m1 -p 4141 -F /home/hadoop/flume-1.5.0-bin/log.00
e) On the m1 console you can see output like the following; note the last line:

root@m1:/home/hadoop/flume-1.5.0-bin/conf# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/avro.conf -n a1 -Dflume.root.logger=INFO,console
Info: Sourcing environment configuration script /home/hadoop/flume-1.5.0-bin/conf/flume-env.sh
Info: Including Hadoop libraries found via (/home/hadoop/hadoop-2.2.0/bin/hadoop) for HDFS access
Info: Excluding /home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
...
2014-08-10 10:43:25,112 (New I/O worker #1) [INFO - org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream(NettyServer.java:171)] [id: 0x92464c4f, /192.168.1.50:59850 :> /192.168.1.50:4141] UNBOUND
2014-08-10 10:43:25,112 (New I/O worker #1) [INFO - org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.handleUpstream(NettyServer.java:171)] [id: 0x92464c4f, /192.168.1.50:59850 :> /192.168.1.50:4141] CLOSED
2014-08-10 10:43:25,112 (New I/O worker #1) [INFO - org.apache.avro.ipc.NettyServer$NettyServerAvroHandler.channelClosed(NettyServer.java:209)] Connection to /192.168.1.50:59850 disconnected.
2014-08-10 10:43:26,718 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: { headers:{} body: 68 65 6C 6C 6F 20 77 6F 72 6C 64       hello world }
2) Case 2: Spool
Spool watches the configured directory for new files and reads the data out of them as they appear. Two things to note:
1) Files copied into the spool directory must not be opened for editing afterwards.
2) The spool directory must not contain subdirectories.
a) Create the agent configuration file:

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/spool.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.channels = c1
a1.sources.r1.spoolDir = /home/hadoop/flume-1.5.0-bin/logs
a1.sources.r1.fileHeader = true

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

b) Start flume agent a1:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/spool.conf -n a1 -Dflume.root.logger=INFO,console

c) Drop a file into the /home/hadoop/flume-1.5.0-bin/logs directory:

root@m1:/home/hadoop# echo "spool test1" > /home/hadoop/flume-1.5.0-bin/logs/spool_text.log
d) On the m1 console you should see the logger sink print an Event whose body is "spool test1" (with a file header naming spool_text.log, since fileHeader = true); the spooldir source then renames the consumed file to spool_text.log.COMPLETED.
3) Case 3: Exec
The exec source runs a given command and uses its output as the data source. If you use the tail command, the file has to grow large enough before you will see any output.
a) Create the agent configuration file:
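A representative exec-source configuration; the config file name exec_tail.conf and the tailed file name log_exec_tail are assumptions:

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/exec_tail.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source: follow a local file with tail -F
a1.sources.r1.type = exec
a1.sources.r1.channels = c1
a1.sources.r1.command = tail -F /home/hadoop/flume-1.5.0-bin/log_exec_tail

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the sink to the channel
a1.sinks.k1.channel = c1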
b) Start flume agent a1:
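Assuming the file name above, the launch command follows the same pattern as in the earlier cases:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/exec_tail.conf -n a1 -Dflume.root.logger=INFO,console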
c) Generate enough content in the file:
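One simple way to produce a steady stream of lines (the loop is illustrative):

root@m1:/home/hadoop# for i in $(seq 1 100); do echo "exec tail $i" >> /home/hadoop/flume-1.5.0-bin/log_exec_tail; done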
d) On the m1 console, the logger sink should print one Event per appended line, with bodies running from "exec tail 1" up to "exec tail 100".
4) Case 4: Syslogtcp
Syslogtcp listens on a TCP port and uses it as the data source.
a) Create the agent configuration file:
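A representative syslogtcp configuration; the file name syslog_tcp.conf and port 5140 are assumptions:

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/syslog_tcp.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source: receive syslog messages over TCP
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5140
a1.sources.r1.host = localhost
a1.sources.r1.channels = c1

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the sink to the channel
a1.sinks.k1.channel = c1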
b) Start flume agent a1:
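Same launch pattern, pointing at the new config:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/syslog_tcp.conf -n a1 -Dflume.root.logger=INFO,console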
c) Produce a test syslog message:
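netcat works as a quick sender; the message text is just an illustration:

root@m1:/home/hadoop# echo "hello syslog" | nc localhost 5140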
d) On the m1 console you should see the logger sink print an Event whose body is the message just sent (the syslogtcp source may also warn that the line is not strict RFC syslog, which is harmless here).
5) Case 5: JSONHandler
a) Create the agent configuration file:
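A representative configuration using the built-in HTTP source, whose default handler is org.apache.flume.source.http.JSONHandler; the file name post_json.conf and port 8888 are assumptions:

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/post_json.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source: HTTP source with the (default) JSON handler
a1.sources.r1.type = org.apache.flume.source.http.HTTPSource
a1.sources.r1.port = 8888
a1.sources.r1.channels = c1

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the sink to the channel
a1.sinks.k1.channel = c1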
b) Start flume agent a1:
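Again the standard launch pattern:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/post_json.conf -n a1 -Dflume.root.logger=INFO,console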
c) Generate a JSON-format POST request:
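The JSON handler expects an array of events, each with a headers map and a body string; the header and body values here are illustrative:

root@m1:/home/hadoop# curl -X POST -d '[{ "headers" : {"a" : "b"}, "body" : "hello json"}]' http://localhost:8888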
d) On the m1 console you should see the logger sink print an Event carrying the header a=b and the body "hello json".
6) Case 6: Hadoop sink
For the hadoop-2.2.0 part of the installation and deployment, see the post "ubuntu12.04+hadoop2.2.0+zookeeper3.4.5+hbase0.96.2+hive0.13.1 distributed environment deployment".
a) Create the agent configuration file:
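A representative configuration that reuses the syslogtcp source and writes to HDFS; the file name hdfs_sink.conf, the namenode URI, and the target path are assumptions based on the cluster described above:

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/hdfs_sink.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5140
a1.sources.r1.host = localhost
a1.sources.r1.channels = c1

# Describe the sink: bucket files into HDFS, rounded down to 10-minute buckets
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs://m1:9000/user/flume/syslogtcp
a1.sinks.k1.hdfs.filePrefix = Syslog
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100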
b) Start flume agent a1:
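Using the config name assumed above:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/hdfs_sink.conf -n a1 -Dflume.root.logger=INFO,console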
c) Produce a test syslog message:
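For example (message text illustrative):

root@m1:/home/hadoop# echo "hello flume -> hdfs testing" | nc localhost 5140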
d) On the m1 console you should see the HDFS sink open a temporary .tmp file under the configured path, write the event, and rename the file to its final name when it closes.
e) Open another window on m1 and check on hadoop whether the file was created:
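With the HDFS CLI, against the path configured above:

root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/bin/hadoop fs -ls /user/flume/syslogtcp
root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/bin/hadoop fs -cat '/user/flume/syslogtcp/Syslog*'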
7) Case 7: File Roll Sink
a) Create the agent configuration file:
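A representative file_roll configuration; the file name file_roll.conf and port 5555 are assumptions:

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/file_roll.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5555
a1.sources.r1.host = localhost
a1.sources.r1.channels = c1

# Describe the sink: roll events into files in a local directory
a1.sinks.k1.type = file_roll
a1.sinks.k1.channel = c1
a1.sinks.k1.sink.directory = /home/hadoop/flume-1.5.0-bin/logs

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100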
b) Start flume agent a1:
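Same pattern again:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/file_roll.conf -n a1 -Dflume.root.logger=INFO,console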
c) Produce test log messages:
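For example (message text illustrative):

root@m1:/home/hadoop# echo "hello file_roll 1" | nc localhost 5555
root@m1:/home/hadoop# echo "hello file_roll 2" | nc localhost 5555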
d) Check whether files are being created under /home/hadoop/flume-1.5.0-bin/logs; by default the sink starts a new file every 30 seconds:
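A plain directory listing is enough:

root@m1:/home/hadoop# ls -l /home/hadoop/flume-1.5.0-bin/logs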
8) Case 8: Replicating Channel Selector
Flume supports fanning a flow out from one source to multiple channels. There are two fan-out modes, replicating and multiplexing. With replicating, an event is sent to every configured channel; with multiplexing, an event is sent to only a subset of the available channels. A fan-out flow has to specify the source and the fan-out rules for the channels.
This time we need two machines, m1 and m2.
a) On m1, create the replicating_Channel_Selector configuration file:
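A representative selector configuration: one syslogtcp source replicated into two channels, each drained by an avro sink that forwards to m1 or m2 (the ports are assumptions):

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/replicating_Channel_Selector.conf
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2

# Describe/configure the source: replicate every event into both channels
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5140
a1.sources.r1.host = localhost
a1.sources.r1.channels = c1 c2
a1.sources.r1.selector.type = replicating

# Describe the sinks: forward over avro to m1 and m2
a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.hostname = m1
a1.sinks.k1.port = 5555
a1.sinks.k2.type = avro
a1.sinks.k2.channel = c2
a1.sinks.k2.hostname = m2
a1.sinks.k2.port = 5555

# Two memory channels
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100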
b) On m1, create the replicating_Channel_Selector_avro configuration file:
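The receiving side can be the same avro-source-to-logger agent used in case 1, listening on the port the avro sinks target:

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/replicating_Channel_Selector_avro.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.channels = c1
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 5555

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the sink to the channel
a1.sinks.k1.channel = c1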
c) Copy the two configuration files from m1 to m2:
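With scp, along these lines (paths assumed identical on m2):

root@m1:/home/hadoop# scp /home/hadoop/flume-1.5.0-bin/conf/replicating_Channel_Selector.conf root@m2:/home/hadoop/flume-1.5.0-bin/conf/
root@m1:/home/hadoop# scp /home/hadoop/flume-1.5.0-bin/conf/replicating_Channel_Selector_avro.conf root@m2:/home/hadoop/flume-1.5.0-bin/conf/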
d) Open four windows and start two flume agents on each of m1 and m2:
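On each machine, start the avro receiver first and then the selector agent:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/replicating_Channel_Selector_avro.conf -n a1 -Dflume.root.logger=INFO,console
root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/replicating_Channel_Selector.conf -n a1 -Dflume.root.logger=INFO,console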
e) Then, on either m1 or m2, produce a test syslog message:
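For example:

root@m1:/home/hadoop# echo "hello replicating" | nc localhost 5140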
f) In the sink windows on both m1 and m2 you should see the same Event printed; this shows the message was replicated to both machines.
9) Case 9: Multiplexing Channel Selector
a) On m1, create the Multiplexing_Channel_Selector configuration file:
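A representative multiplexing configuration, routing by the value of a header named type; the header name and the baidu/ali mappings are assumptions:

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/Multiplexing_Channel_Selector.conf
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2

# Describe/configure the source: an HTTP source whose events are routed
# to a channel according to the "type" header
a1.sources.r1.type = org.apache.flume.source.http.HTTPSource
a1.sources.r1.port = 5140
a1.sources.r1.channels = c1 c2
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = type
# type=baidu goes to c1, type=ali goes to c2, anything else falls back to c1
a1.sources.r1.selector.mapping.baidu = c1
a1.sources.r1.selector.mapping.ali = c2
a1.sources.r1.selector.default = c1

# Describe the sinks: forward over avro to m1 and m2
a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.hostname = m1
a1.sinks.k1.port = 5555
a1.sinks.k2.type = avro
a1.sinks.k2.channel = c2
a1.sinks.k2.hostname = m2
a1.sinks.k2.port = 5555

# Two memory channels
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100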
b) On m1, create the Multiplexing_Channel_Selector_avro configuration file; the avro-source-to-logger receiver from case 8 (avro source on port 5555) works here unchanged, apart from the file name.
c) Copy the two configuration files to m2, as in case 8.
d) Open four windows and start two flume agents on each of m1 and m2 (avro receiver first, then the selector agent).
e) Then, on either m1 or m2, send some test events:
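Because the source here is HTTP with the JSON handler, the routing header is set per event; the type values and bodies are illustrative:

root@m1:/home/hadoop# curl -X POST -d '[{ "headers" : {"type" : "baidu"}, "body" : "TEST1"}]' http://localhost:5140
root@m1:/home/hadoop# curl -X POST -d '[{ "headers" : {"type" : "ali"}, "body" : "TEST2"}]' http://localhost:5140
root@m1:/home/hadoop# curl -X POST -d '[{ "headers" : {"type" : "qq"}, "body" : "TEST3"}]' http://localhost:5140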
f) In m1's sink window you should see the events whose type mapped to c1: the baidu one, plus the unmapped qq one picked up by the default rule.
g) In m2's sink window you should see the event routed by type=ali.
As you can see, events are distributed to different channels according to conditions on the headers.
10) Case 10: Flume Sink Processors
With failover, events keep flowing to a single sink in the group; when that sink becomes unavailable, they are automatically sent to the next one.
a) On m1, create the Flume_Sink_Processors configuration file:
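A representative failover configuration: a sink group over two avro sinks, with priorities favoring m2; the ports, priorities, and the two-channel layout are assumptions:

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/Flume_Sink_Processors.conf
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2

# Group the two sinks and fail over between them; the higher
# priority (k2 -> m2) receives the traffic while it is healthy
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10
a1.sinkgroups.g1.processor.maxpenalty = 10000

# Describe/configure the source: replicate into both channels
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5140
a1.sources.r1.channels = c1 c2
a1.sources.r1.selector.type = replicating

# Describe the sinks: avro to m1 and m2
a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.hostname = m1
a1.sinks.k1.port = 5555
a1.sinks.k2.type = avro
a1.sinks.k2.channel = c2
a1.sinks.k2.hostname = m2
a1.sinks.k2.port = 5555

# Two memory channels
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100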
b) On m1, create the Flume_Sink_Processors_avro configuration file; again, the avro-source-to-logger receiver on port 5555 from case 8 works here.
c) Copy the two configuration files to m2, as before.
d) Open four windows and start two flume agents on each of m1 and m2 (avro receiver first, then the sink-processor agent).
e) Then, on either m1 or m2, produce a test log message:
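For example (message text illustrative):

root@m1:/home/hadoop# echo "test failover 1" | nc localhost 5140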
f) Because m2 has the higher priority, the event shows up in m2's sink window, and not in m1's.
g) Now stop the sink on m2 (Ctrl+C) and send more test data:
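For example:

root@m1:/home/hadoop# echo "test failover 2" | nc localhost 5140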
h) In m1's sink window you should now see the test data just sent arrive there instead.
i) Start the sink in m2's window again:
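That is, re-run the avro receiver on m2 (prompt and path assumed to mirror m1):

root@m2:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/Flume_Sink_Processors_avro.conf -n a1 -Dflume.root.logger=INFO,console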
j) Send two more batches of test data:
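For example:

root@m1:/home/hadoop# echo "test failover 3" | nc localhost 5140
root@m1:/home/hadoop# echo "test failover 4" | nc localhost 5140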
k) In m2's sink window you should see the messages land on m2 again: because of its higher priority, traffic fails back to it once it recovers.
11) Case 11: Load balancing Sink Processor
The load_balance type differs from failover in that it offers two policies: round robin and random. In both cases, if the selected sink is unavailable, the processor automatically tries the next available sink.
a) On m1, create the Load_balancing_Sink_Processors configuration file:
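A representative load-balancing configuration; note that only one channel is needed on the sending side (the file name and ports are assumptions):

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/Load_balancing_Sink_Processors.conf
a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1

# Round-robin between the two sinks, backing off from failed ones
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.backoff = true
a1.sinkgroups.g1.processor.selector = round_robin

# Describe/configure the source
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5140
a1.sources.r1.channels = c1

# Describe the sinks: avro to m1 and m2, both fed from the same channel
a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.hostname = m1
a1.sinks.k1.port = 5555
a1.sinks.k2.type = avro
a1.sinks.k2.channel = c1
a1.sinks.k2.hostname = m2
a1.sinks.k2.port = 5555

# One memory channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100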
b) On m1, create the Load_balancing_Sink_Processors_avro configuration file; the avro-source-to-logger receiver on port 5555 works here as well.
c) Copy the two configuration files to m2.
d) Open four windows and start two flume agents on each of m1 and m2.
e) Then, on either m1 or m2, produce test log messages, one line at a time; if you send them too fast, the events tend to all land on one machine:
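For example (message text illustrative):

root@m1:/home/hadoop# echo "test load balance 1" | nc localhost 5140
root@m1:/home/hadoop# echo "test load balance 2" | nc localhost 5140
root@m1:/home/hadoop# echo "test load balance 3" | nc localhost 5140
root@m1:/home/hadoop# echo "test load balance 4" | nc localhost 5140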
f) In m1's sink window you should see roughly half of the events.
g) In m2's sink window you should see the other half.
This shows the round-robin policy is doing its job.
12) Case 12: Hbase sink
a) Before testing, start HBase first; see the deployment post referenced above.
b) Then copy the following files into flume's lib directory:
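A hypothetical list based on what an HBase 0.96 client typically needs on the classpath; the exact jar names and versions depend on your installation:

root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/protobuf-java-2.5.0.jar /home/hadoop/flume-1.5.0-bin/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-client-0.96.2-hadoop2.jar /home/hadoop/flume-1.5.0-bin/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-common-0.96.2-hadoop2.jar /home/hadoop/flume-1.5.0-bin/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-protocol-0.96.2-hadoop2.jar /home/hadoop/flume-1.5.0-bin/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-server-0.96.2-hadoop2.jar /home/hadoop/flume-1.5.0-bin/lib
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/htrace-core-2.04.jar /home/hadoop/flume-1.5.0-bin/lib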
c) Make sure the test_idoall_org table already exists in hbase; for the table's schema and columns, see the hbase section of the deployment post referenced above.
d) On m1, create the hbase_simple configuration file:
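A representative configuration using Flume's built-in HBase sink; the column family name and the choice of RegexHbaseEventSerializer are assumptions that must match your HBase schema:

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/hbase_simple.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5140
a1.sources.r1.host = localhost
a1.sources.r1.channels = c1

# Describe the sink: write each event into the test_idoall_org HBase table
a1.sinks.k1.type = org.apache.flume.sink.hbase.HBaseSink
a1.sinks.k1.channel = c1
a1.sinks.k1.table = test_idoall_org
a1.sinks.k1.columnFamily = name
a1.sinks.k1.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100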
e) Start the flume agent:
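Using the config name assumed above:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/hbase_simple.conf -n a1 -Dflume.root.logger=INFO,console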
f) Produce a test syslog message:
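For example:

root@m1:/home/hadoop# echo "hello hbase from flume" | nc localhost 5140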
g) Now log in to hbase, and you should find that the new data has been inserted:
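Checking from the HBase shell (the scan output will vary):

root@m1:/home/hadoop# /home/hadoop/hbase-0.96.2-hadoop2/bin/hbase shell
hbase(main):001:0> scan "test_idoall_org"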
If you work through all of these flume examples, you will find that flume is genuinely powerful: sources, channels, and sinks can be combined in all sorts of ways to do what you need. As the saying goes, the master opens the door, but the practice is up to you; figure out how to combine flume with your own product and business, and get hands-on with it.
This article is meant as a set of notes, and I hope it helps newcomers get started.
Google a lot. I cannot, and should not, answer everything for you; I can only point out some of the key ideas, and I may even deliberately hold back answers, because the real learning happens in the process of hunting them down.