飘灵儿 posted on 2015-1-16 14:37:49

A complete guide to Flume environment deployment and configuration, with a full set of examples

  1. What is Flume?
  Flume is a real-time log collection system developed by Cloudera that has earned wide recognition and adoption in the industry. The initial releases of Flume are now collectively called Flume OG (original generation) and belonged to Cloudera. As Flume's feature set grew, however, its weaknesses were exposed: a bloated code base, poorly designed core components, and non-standard core configuration. In Flume OG's final release, 0.94.0, unstable log delivery was especially severe. To fix these problems, on October 22, 2011 Cloudera completed Flume-728, a milestone overhaul that rewrote the core components, the core configuration, and the code architecture; the refactored versions are collectively called Flume NG (next generation). Another reason for the change was moving Flume under the Apache umbrella: Cloudera Flume was renamed Apache Flume.

Flume's characteristics:
  Flume is a distributed, reliable, and highly available system for collecting, aggregating, and transporting large volumes of log data. It lets you plug custom data senders into a logging system to gather data, and it can also do simple processing on the data and write it to a variety of receivers (such as text files, HDFS, or HBase).
  Flume's data flow is carried end to end by events (Event). The Event is Flume's basic unit of data: it carries the log data (as a byte array) along with header information. Events are generated by the Sources inside an Agent; when a Source captures an event it applies a source-specific format, then pushes the event into one or more Channels. You can think of a Channel as a buffer that holds an event until a Sink has finished processing it. The Sink is responsible for persisting the log or pushing the event on to another Source.

Flume's reliability
  When a node fails, logs can be passed on to other nodes without being lost. Flume provides three levels of reliability guarantee, from strongest to weakest: end-to-end (on receiving data, the agent first writes the event to disk and deletes it only after the transfer succeeds; if sending fails, it can resend), Store on failure (the strategy scribe also adopts: when the receiver crashes, data is written locally and sending resumes after recovery), and Best effort (no acknowledgement is made after data is sent to the receiver).

Flume's recoverability:
  This again relies on the Channel. FileChannel is recommended: events are persisted in the local file system (at the cost of lower performance).

  Some core Flume concepts:
An Agent runs Flume inside a JVM. Each machine runs one agent, but a single agent can contain multiple sources and sinks.
A Client produces the data and runs in an independent thread.
A Source collects data from the Client and hands it to a Channel.
A Sink collects data from the Channel and runs in an independent thread.
A Channel connects sources and sinks; it works somewhat like a queue.
Events can be log records, Avro objects, and so on.

  Flume's smallest independent unit of operation is the agent: one agent is one JVM. A single agent is built from three major components, Source, Sink, and Channel, as shown in the figure below:

  It is worth noting that Flume ships with a large number of built-in Source, Channel, and Sink types, and the different types can be combined freely. The combinations are driven by a user-supplied configuration file, which is very flexible. For example, a Channel can hold events in memory or persist them to local disk, and a Sink can write logs into HDFS or HBase, or even to another Source. Flume also lets users build multi-level flows, meaning multiple agents can work together, with support for fan-in, fan-out, contextual routing, and backup routes; this is exactly where it shines. As shown in the figure below:

  2. Where is Flume's official website?
  http://flume.apache.org/
  3. Where can I download it?
  http://www.apache.org/dyn/closer.cgi/flume/1.5.0/apache-flume-1.5.0-bin.tar.gz
  4. How do I install it?
    1) Unpack the downloaded flume package into the /home/hadoop directory, and you are already 50% done :) simple, right?
    2) Edit the flume-env.sh configuration file, mainly to set the JAVA_HOME variable
root@m1:/home/hadoop/flume-1.5.0-bin# cp conf/flume-env.sh.template conf/flume-env.sh
root@m1:/home/hadoop/flume-1.5.0-bin# vi conf/flume-env.sh
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# If this file is placed at FLUME_CONF_DIR/flume-env.sh, it will be sourced
# during Flume startup.

# Enviroment variables can be set here.
JAVA_HOME=/usr/lib/jvm/java-7-oracle

# Give Flume more memory and pre-allocate, enable remote monitoring via JMX
# JAVA_OPTS="-Xms100m -Xmx200m -Dcom.sun.management.jmxremote"

# Note that the Flume conf directory is always included in the classpath.
# FLUME_CLASSPATH=""
    3) Verify the installation
root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng version
Flume 1.5.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: 8633220df808c4cd0c13d1cf0320454a94f1ea97
Compiled by hshreedharan on Wed May 7 14:49:18 PDT 2014
From source with checksum a01fe726e4380ba0c9f7a7d222db961f
root@m1:/home/hadoop#
    If the information above appears, the installation succeeded.


  5. Flume examples
    1) Case 1: Avro
    Avro can send a given file to Flume; the Avro source uses the Avro RPC mechanism.
      a) Create the agent configuration file
root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/avro.conf

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.channels = c1
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4141

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
      b) Start flume agent a1
root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/avro.conf -n a1 -Dflume.root.logger=INFO,console
      c) Create the file to send
root@m1:/home/hadoop# echo "hello world" > /home/hadoop/flume-1.5.0-bin/log.00
      d) Send the file with avro-client
root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng avro-client -c . -H m1 -p 4141 -F /home/hadoop/flume-1.5.0-bin/log.00
      e) On the m1 console you can see the following information; note the last line:
root@m1:/home/hadoop/flume-1.5.0-bin/conf# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/avro.conf -n a1 -Dflume.root.logger=INFO,console
Info: Sourcing environment configuration script /home/hadoop/flume-1.5.0-bin/conf/flume-env.sh
Info: Including Hadoop libraries found via (/home/hadoop/hadoop-2.2.0/bin/hadoop) for HDFS access
Info: Excluding /home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/slf4j-api-1.7.5.jar from classpath
Info: Excluding /home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar from classpath
...
-08-10 10:43:25,112 (New I/O worker #1) UNBOUND
-08-10 10:43:25,112 (New I/O worker #1) CLOSED
-08-10 10:43:25,112 (New I/O worker #1) Connection to /192.168.1.50:59850 disconnected.
-08-10 10:43:26,718 (SinkRunner-PollingRunner-DefaultSinkProcessor) Event: { headers:{} body: 68 65 6C 6C 6F 20 77 6F 72 6C 64    hello world }
    2) Case 2: Spool
    The spool source watches the configured directory for new files and reads the data out of them. Note two things:
    1) Files copied into the spool directory must not be opened and edited afterwards.
    2) The spool directory must not contain subdirectories.
      a) Create the agent configuration file

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/spool.conf

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.channels = c1
a1.sources.r1.spoolDir = /home/hadoop/flume-1.5.0-bin/logs
a1.sources.r1.fileHeader = true

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
      b) Start flume agent a1

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/spool.conf -n a1 -Dflume.root.logger=INFO,console
      c) Add a file to the /home/hadoop/flume-1.5.0-bin/logs directory

root@m1:/home/hadoop# echo "spool test1" > /home/hadoop/flume-1.5.0-bin/logs/spool_text.log
      d) On the m1 console the event is logged just as in case 1; because fileHeader = true, the event's headers include the source file path, and once the file has been ingested it is renamed with a .COMPLETED suffix in the spool directory.
    3) Case 3: Exec
    The exec source runs a given command and uses its output as the data source. If you use the tail command, the file has to grow large enough before you see any output.
      a) Create the agent configuration file

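The original snippet was lost here; a minimal sketch of an exec-source config (the file name exec_tail.conf and the tailed path are assumptions; the layout follows case 1):

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/exec_tail.conf

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source: run a command and use its output as events
a1.sources.r1.type = exec
a1.sources.r1.channels = c1
a1.sources.r1.command = tail -F /home/hadoop/flume-1.5.0-bin/log_exec_tail

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1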
      b) Start flume agent a1

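Assuming the config file name above, the agent is started the same way as in case 1:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/exec_tail.conf -n a1 -Dflume.root.logger=INFO,console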
      c) Generate enough content in the file

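For example, a shell loop that appends numbered lines to the tailed file (path as assumed above):

root@m1:/home/hadoop# for i in $(seq 1 100); do echo "exec tail $i" >> /home/hadoop/flume-1.5.0-bin/log_exec_tail; done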
      d) On the m1 console you can see the tailed lines arriving as logger events, much like the output in case 1.
    4) Case 4: Syslogtcp
    The syslogtcp source listens on a TCP port and uses it as the data source.
      a) Create the agent configuration file

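A minimal sketch of a syslogtcp-source config (the file name syslog_tcp.conf and port 5140 are assumptions):

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/syslog_tcp.conf

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source: listen for syslog messages over TCP
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5140
a1.sources.r1.host = localhost
a1.sources.r1.channels = c1

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the sink to the channel
a1.sinks.k1.channel = c1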
      b) Start flume agent a1

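With the file name assumed above:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/syslog_tcp.conf -n a1 -Dflume.root.logger=INFO,console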
      c) Produce a test syslog message

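One easy way to push a line at the port is nc (assuming it is installed, and the port from the sketch above):

root@m1:/home/hadoop# echo "hello idoall.org syslog" | nc localhost 5140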
      d) On the m1 console the logger sink prints the received event, as in case 1.
    5) Case 5: JSONHandler
      a) Create the agent configuration file

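The JSONHandler is the default handler of Flume's HTTP source. A minimal sketch (the file name post_json.conf and port 8888 are assumptions):

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/post_json.conf

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source: an HTTP source whose default handler parses JSON
a1.sources.r1.type = org.apache.flume.source.http.HTTPSource
a1.sources.r1.port = 8888
a1.sources.r1.channels = c1

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the sink to the channel
a1.sinks.k1.channel = c1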
      b) Start flume agent a1

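With the file name assumed above:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/post_json.conf -n a1 -Dflume.root.logger=INFO,console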
      c) Generate a JSON-format POST request

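The HTTP source's JSON handler expects an array of events, each with headers and a body; for example (port as assumed above):

root@m1:/home/hadoop# curl -X POST -d '[{ "headers" : {"a" : "a1", "b" : "b1"}, "body" : "idoall.org_body"}]' http://localhost:8888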
      d) On the m1 console you can see the posted events printed with their headers and bodies, as in case 1.
    6) Case 6: Hadoop sink
    For installing and deploying the hadoop 2.2.0 part, see the article "ubuntu12.04+hadoop2.2.0+zookeeper3.4.5+hbase0.96.2+hive0.13.1 distributed environment deployment".
      a) Create the agent configuration file

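A minimal sketch feeding a syslogtcp source into an HDFS sink (the file name hdfs_sink.conf, port 5140, and the HDFS path are assumptions; adjust the namenode address to your cluster):

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/hdfs_sink.conf

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5140
a1.sources.r1.host = localhost
a1.sources.r1.channels = c1

# Describe the sink: write events into HDFS, rounding the path timestamp down to 10 minutes
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs://m1:9000/user/flume/syslogtcp
a1.sinks.k1.hdfs.filePrefix = Syslog
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100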
      b) Start flume agent a1

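With the file name assumed above:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/hdfs_sink.conf -n a1 -Dflume.root.logger=INFO,console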
      c) Produce a test syslog message

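Same nc pattern as in case 4 (port as assumed above):

root@m1:/home/hadoop# echo "hello idoall flume -> hadoop testing one" | nc localhost 5140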
      d) On the m1 console you can watch the HDFS sink report the file it is creating and writing.
      e) Open another window on m1 and check on hadoop whether the file was created

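For instance (paths as assumed in the config sketch):

root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/bin/hadoop fs -ls /user/flume/syslogtcp
root@m1:/home/hadoop# /home/hadoop/hadoop-2.2.0/bin/hadoop fs -cat /user/flume/syslogtcp/Syslog*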
    7) Case 7: File Roll Sink
      a) Create the agent configuration file

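A minimal sketch (the file name file_roll.conf and port 5555 are assumptions; sink.directory must already exist):

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/file_roll.conf

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5555
a1.sources.r1.host = localhost
a1.sources.r1.channels = c1

# Describe the sink: roll events into files in a local directory
a1.sinks.k1.type = file_roll
a1.sinks.k1.channel = c1
a1.sinks.k1.sink.directory = /home/hadoop/flume-1.5.0-bin/logs

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100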
      b) Start flume agent a1

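With the file name assumed above:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/file_roll.conf -n a1 -Dflume.root.logger=INFO,console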
      c) Produce some test log data

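Same nc pattern as before (port as assumed above):

root@m1:/home/hadoop# echo "hello idoall.org syslog" | nc localhost 5555
root@m1:/home/hadoop# echo "hello idoall.org syslog 2" | nc localhost 5555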
      d) Check whether files appear under /home/hadoop/flume-1.5.0-bin/logs; by default a new file is rolled every 30 seconds

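root@m1:/home/hadoop# ls -l /home/hadoop/flume-1.5.0-bin/logs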
    8) Case 8: Replicating Channel Selector
    Flume supports fanning out the flow from one source to multiple channels. There are two fan-out modes: replicating and multiplexing. In the replicating case, an event is sent to every configured channel; in the multiplexing case, an event is sent to only a subset of the available channels. A fan-out flow needs rules that specify the source and the fan-out channels.
    This time we need two machines, m1 and m2.
      a) Create the replicating_Channel_Selector configuration file on m1
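A sketch of the fan-out agent (ports 5140/5555 are assumptions; every event is replicated into both channels and forwarded to both machines):

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/replicating_Channel_Selector.conf

a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2

# Describe/configure the source: replicate every event to both channels
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5140
a1.sources.r1.host = localhost
a1.sources.r1.channels = c1 c2
a1.sources.r1.selector.type = replicating

# Two avro sinks, one toward each machine's collector
a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.hostname = m1
a1.sinks.k1.port = 5555

a1.sinks.k2.type = avro
a1.sinks.k2.channel = c2
a1.sinks.k2.hostname = m2
a1.sinks.k2.port = 5555

# Two memory channels
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100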
      b) Create the replicating_Channel_Selector_avro configuration file on m1

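A sketch of the collector that runs on both machines (an avro source feeding a logger sink; port 5555 as assumed above):

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/replicating_Channel_Selector_avro.conf

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.channels = c1
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 5555

# Describe the sink
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100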
      c) Copy the two configuration files from m1 to m2

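For example (assuming the same install path on m2):

root@m1:/home/hadoop# scp /home/hadoop/flume-1.5.0-bin/conf/replicating_Channel_Selector.conf root@m2:/home/hadoop/flume-1.5.0-bin/conf/
root@m1:/home/hadoop# scp /home/hadoop/flume-1.5.0-bin/conf/replicating_Channel_Selector_avro.conf root@m2:/home/hadoop/flume-1.5.0-bin/conf/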
      d) Open four windows and start two flume agents on each of m1 and m2

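On each machine, start the avro collector first and then the fan-out agent (same flume-ng syntax as in case 1); repeat both commands on m2:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/replicating_Channel_Selector_avro.conf -n a1 -Dflume.root.logger=INFO,console
root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/replicating_Channel_Selector.conf -n a1 -Dflume.root.logger=INFO,console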
      e) Then, on either m1 or m2, produce a test syslog message

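Same nc pattern as in case 4 (port as assumed above):

root@m1:/home/hadoop# echo "hello idoall.org syslog" | nc localhost 5140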
      f) In the sink windows on both m1 and m2 you can see the same event logged, which shows the information was replicated to both machines.
    
9) Case 9: Multiplexing Channel Selector
      a) Create the Multiplexing_Channel_Selector configuration file on m1
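A sketch of the multiplexing agent (the HTTP source port 5140 and the header name/mapping values are assumptions):

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/Multiplexing_Channel_Selector.conf

a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1 c2

# Describe/configure the source: route each event by the value of its "type" header
a1.sources.r1.type = org.apache.flume.source.http.HTTPSource
a1.sources.r1.port = 5140
a1.sources.r1.channels = c1 c2
a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = type
# Events whose "type" header is "baidu" go to c1, "ali" to c2; anything else falls back to c1
a1.sources.r1.selector.mapping.baidu = c1
a1.sources.r1.selector.mapping.ali = c2
a1.sources.r1.selector.default = c1

# Two avro sinks, one toward each machine's collector
a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.hostname = m1
a1.sinks.k1.port = 5555

a1.sinks.k2.type = avro
a1.sinks.k2.channel = c2
a1.sinks.k2.hostname = m2
a1.sinks.k2.port = 5555

# Two memory channels
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.channels.c2.type = memory
a1.channels.c2.capacity = 1000
a1.channels.c2.transactionCapacity = 100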
      b) Create the Multiplexing_Channel_Selector_avro configuration file on m1; it is the same avro-source-to-logger collector layout as in case 8, listening on 0.0.0.0:5555, and is used on both machines.
      c) Copy the two configuration files to m2, with the same scp commands as in case 8.
      d) Open four windows and start two flume agents on each of m1 and m2, collector first, exactly as in case 8 but pointing at the Multiplexing_* files.
      e) Then, on either m1 or m2, post some test events.
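For example, posting events with different header values (header name and values as assumed in the sketch above):

root@m1:/home/hadoop# curl -X POST -d '[{ "headers" : {"type" : "baidu"}, "body" : "idoall_TEST1"}]' http://localhost:5140
root@m1:/home/hadoop# curl -X POST -d '[{ "headers" : {"type" : "ali"}, "body" : "idoall_TEST2"}]' http://localhost:5140
root@m1:/home/hadoop# curl -X POST -d '[{ "headers" : {"type" : "qq"}, "body" : "idoall_TEST3"}]' http://localhost:5140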
      f) In m1's sink window you can see the events whose header mapped to m1's channel.
      g) In m2's sink window you can see the events whose header mapped to m2's channel.
    As you can see, events are routed to different channels according to the different conditions in the header.

    10) Case 10: Flume Sink Processors
    In failover mode, events always go to one of the sinks; only when that sink becomes unavailable are they automatically sent to the next one.

      a) Create the Flume_Sink_Processors configuration file on m1
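A sketch of the failover agent (ports and priority values are assumptions; k2, pointing at m2, gets the higher priority, which is what step f relies on):

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/Flume_Sink_Processors.conf

a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1

# Group the two sinks into a failover processor; the higher priority wins
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = failover
a1.sinkgroups.g1.processor.priority.k1 = 5
a1.sinkgroups.g1.processor.priority.k2 = 10
a1.sinkgroups.g1.processor.maxpenalty = 10000

# Describe/configure the source
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5140
a1.sources.r1.host = localhost
a1.sources.r1.channels = c1

# Two avro sinks sharing one channel, one toward each machine's collector
a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.hostname = m1
a1.sinks.k1.port = 5555

a1.sinks.k2.type = avro
a1.sinks.k2.channel = c1
a1.sinks.k2.hostname = m2
a1.sinks.k2.port = 5555

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100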
      b) Create the Flume_Sink_Processors_avro configuration file on m1: the same avro-source-to-logger collector as in case 8, listening on 0.0.0.0:5555.
      c) Copy the two configuration files to m2, as in case 8.
      d) Open four windows and start two flume agents on each of m1 and m2, collector first, pointing at the Flume_Sink_Processors_* files.
      e) Then, on either m1 or m2, produce a test log message (the same nc test as in case 4).
      f) Because m2's sink has the higher priority, the event shows up in m2's sink window and not in m1's.
      g) Now stop the sink on m2 (Ctrl+C) and send test data again.
      h) In m1's sink window you can now see the two test messages that were just sent.
      i) Start the sink in m2's window again (the same flume-ng command as in step d).
      j) Send two more batches of test data.
      k) The messages appear in m2's sink window again: because of the priorities, log traffic falls back to m2.

    11) Case 11: Load balancing Sink Processor
    What distinguishes load_balance from failover is that load_balance has two selection policies: round-robin and random. In either case, if the selected sink is unavailable, the processor automatically tries to send to the next available sink.

      a) Create the Load_balancing_Sink_Processors configuration file on m1
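A sketch of the load-balancing agent (port 5140 and the round_robin choice are assumptions; the selector could just as well be random):

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/Load_balancing_Sink_Processors.conf

a1.sources = r1
a1.sinks = k1 k2
a1.channels = c1

# Group the two sinks into a load-balancing processor
a1.sinkgroups = g1
a1.sinkgroups.g1.sinks = k1 k2
a1.sinkgroups.g1.processor.type = load_balance
a1.sinkgroups.g1.processor.backoff = true
a1.sinkgroups.g1.processor.selector = round_robin

# Describe/configure the source
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5140
a1.sources.r1.host = localhost
a1.sources.r1.channels = c1

# Two avro sinks sharing one channel, one toward each machine's collector
a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.hostname = m1
a1.sinks.k1.port = 5555

a1.sinks.k2.type = avro
a1.sinks.k2.channel = c1
a1.sinks.k2.hostname = m2
a1.sinks.k2.port = 5555

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100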
      b) Create the Load_balancing_Sink_Processors_avro configuration file on m1: again the avro-source-to-logger collector from case 8, listening on 0.0.0.0:5555.
      c) Copy the two configuration files to m2, as in case 8.
      d) Open four windows and start two flume agents on each of m1 and m2, collector first, pointing at the Load_balancing_* files.
      e) Then, on either m1 or m2, produce test log lines one at a time; if you send them too fast, they tend to all land on one machine.
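For example, sending numbered lines with a pause in between (port as assumed above):

root@m1:/home/hadoop# for i in 1 2 3 4; do echo "idoall.org test $i" | nc localhost 5140; sleep 1; done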
      f) In m1's sink window you can see part of the test lines.
      g) In m2's sink window you can see the rest of them.
    This shows that the round-robin mode is doing its job.

    12) Case 12: Hbase sink

      a) Before testing, first start hbase by following "ubuntu12.04+hadoop2.2.0+zookeeper3.4.5+hbase0.96.2+hive0.13.1 distributed environment deployment".

      b) Then copy the following files into flume:
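The HBase sink needs HBase's client jars on Flume's classpath. A sketch, assuming the HBase layout from the referenced article (the exact jar set and versions depend on your install):

root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/protobuf-java-*.jar /home/hadoop/flume-1.5.0-bin/lib/
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-client-*.jar /home/hadoop/flume-1.5.0-bin/lib/
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-common-*.jar /home/hadoop/flume-1.5.0-bin/lib/
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-protocol-*.jar /home/hadoop/flume-1.5.0-bin/lib/
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/hbase-server-*.jar /home/hadoop/flume-1.5.0-bin/lib/
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/htrace-core-*.jar /home/hadoop/flume-1.5.0-bin/lib/
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/zookeeper-*.jar /home/hadoop/flume-1.5.0-bin/lib/
root@m1:/home/hadoop# cp /home/hadoop/hbase-0.96.2-hadoop2/lib/guava-*.jar /home/hadoop/flume-1.5.0-bin/lib/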
      c) Make sure the test_idoall_org table already exists in hbase; for its schema and fields, see the hbase table-creation code in the article referenced above.

      d) Create the hbase_simple configuration file on m1
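A sketch of hbase_simple.conf (port 5140, the column family, and the serializer choice are assumptions; the column family must match how test_idoall_org was created):

root@m1:/home/hadoop# vi /home/hadoop/flume-1.5.0-bin/conf/hbase_simple.conf

a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = syslogtcp
a1.sources.r1.port = 5140
a1.sources.r1.host = localhost
a1.sources.r1.channels = c1

# Describe the sink: write each event into an HBase table
a1.sinks.k1.type = hbase
a1.sinks.k1.channel = c1
a1.sinks.k1.table = test_idoall_org
a1.sinks.k1.columnFamily = name
a1.sinks.k1.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100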
      e) Start the flume agent
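With the file name assumed above:

root@m1:/home/hadoop# /home/hadoop/flume-1.5.0-bin/bin/flume-ng agent -c . -f /home/hadoop/flume-1.5.0-bin/conf/hbase_simple.conf -n a1 -Dflume.root.logger=INFO,console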
      f) Produce a test syslog message
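Same nc pattern as before (port as assumed above):

root@m1:/home/hadoop# echo "hello idoall.org from flume" | nc localhost 5140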
      g) Now log in to hbase, and you will find the new data has been inserted
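For example, in the HBase shell (path per the referenced article):

root@m1:/home/hadoop# /home/hadoop/hbase-0.96.2-hadoop2/bin/hbase shell
hbase(main):001:0> scan 'test_idoall_org'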
    Having worked through all these flume examples, you will find that flume really is powerful: the pieces can be combined in all sorts of ways to do whatever job you need. As the saying goes, the master leads you through the door, but the practice is up to you; go get hands-on and figure out how to combine flume with your own product and business.

    This post is meant as a set of notes; I hope it can be of help to those who are just getting started.

Google a lot: I cannot, and should not, answer everything for you. I can only point out some of the key spots, and I may even deliberately hold back an answer, because the real gain lies in the process of hunting it down yourself.

只想知道 posted on 2015-1-17 18:04:08

A follow-up like this helps others who search the mailing lists/newsgroups/forums to find the complete solution that helped you, which may well be useful to them too.

飘灵儿 posted on 2015-1-21 07:47:49

Small, obvious mistakes are still most conveniently fixed with vi. Larger programs later on will have to be debugged under Linux, since some header files cannot be found in VC.

因胸联盟 posted on 2015-1-30 11:52:07

For readers whose English is not that good, Chinese editions such as Red Flag Linux and NeoShine Linux are a better fit. Some Linux sites offer free downloads of various distributions, but those are not really suitable for Linux beginners.

蒙在股里 posted on 2015-2-16 01:13:47

To be honest, I didn't dig up much material for the write-up the teacher asked for; I just wrote down whatever came to mind, so it is a bit disorganized. I rarely mention programming, because that is already covered in the lab report and repeating it here would be redundant.

小女巫 posted on 2015-3-4 21:19:44

Much of it embodies the hard work of many IT talents. Once we have mastered it, we could even build an OS of our own!

透明 posted on 2015-3-11 21:06:03

I am studying the embedded systems track, so I took this elective course this semester.

若天明 posted on 2015-3-19 14:03:35

You should have some understanding of Linux's history and characteristics: Linux is a preemptive multitasking, multi-user operating system; its greatest strength lies in its power as a server platform, and it supports a wide range of applications and development tools.

老尸 posted on 2015-3-28 13:01:02

As Linux technology matures and improves, its application domains and market share keep growing rapidly. Its main applications today are server systems and embedded systems, yet its footprint already extends across nearly every industry.