I have one Flume agent collecting events and forwarding them to a second Flume agent, which then writes to HDFS, but the Flume-to-HDFS step now fails with the error below.

(screenshot of the error)

The key line is: Failed to start agent because dependencies were not found in classpath. The screenshot above shows the full error; any help would be appreciated.
Below is my configuration file:
```
# master_agent
master_agent.channels = c2
master_agent.sources = s2
master_agent.sinks = k2

# master_agent avro source
master_agent.sources.s2.type = avro
master_agent.sources.s2.bind = master1
master_agent.sources.s2.port = 41415
master_agent.sources.s2.channels = c2

# master_agent file channel
master_agent.channels.c2.type = file
master_agent.channels.c2.capacity = 100000
master_agent.channels.c2.transactionCapacity = 1000

# master_agent hdfs sink
master_agent.sinks.k2.type = hdfs
master_agent.sinks.k2.channel = c2
master_agent.sinks.k2.hdfs.path = hdfs://master1:9000/hdfs
master_agent.sinks.k2.hdfs.filePrefix = test-
master_agent.sinks.k2.hdfs.inUsePrefix = _
master_agent.sinks.k2.hdfs.inUseSuffix = .tmp
master_agent.sinks.k2.hdfs.fileType = DataStream
master_agent.sinks.k2.hdfs.writeFormat = Text
master_agent.sinks.k2.hdfs.batchSize = 1000
master_agent.sinks.k2.hdfs.callTimeout = 6000
```
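For context on the error itself: this message usually means the Hadoop client classes (hadoop-common, hadoop-hdfs and their dependencies) are not visible to the Flume JVM, so the HDFS sink cannot load. A hedged sketch of the usual remedy, assuming Hadoop at /opt/hadoop (a hypothetical path — adjust to the actual install):

```
# conf/flume-env.sh (sketch; /opt/hadoop is a hypothetical path)
FLUME_CLASSPATH="$FLUME_CLASSPATH:/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/common/lib/*:/opt/hadoop/share/hadoop/hdfs/*"

# alternatively, run the agent on a node where the `hadoop` command is on PATH
# (or HADOOP_HOME is set); bin/flume-ng then picks up the Hadoop classpath itself
```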

Other related questions
Performance problems when Flume collects data into HDFS
I am currently hitting performance and other problems with Flume collecting data and writing it into HDFS, roughly as follows: data is read from the xx/xx directory on machine 10, sinked to the Flume agent on machine 08, and the agent on 08 writes to HDFS on machine 07. A file of 30-odd MB takes a very long time to write, and sometimes there are out-of-memory errors. ![图片说明](https://img-ask.csdn.net/upload/201503/12/1426162664_624860.jpg)

The configuration file on machine 08:

```
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.bind = r09n08
a1.sources.r1.port = 55555
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = timestamp

# hdfs sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://r09n07:8020/project/dame/input/%Y%m%d/%H
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.filePrefix = hdfs-
a1.sinks.k1.hdfs.rollInterval = 0
#a1.sinks.k1.hdfs.fileSuffix = .log
#a1.sinks.k1.hdfs.round = true
#a1.sinks.k1.hdfs.roundValue = 1
#a1.sinks.k1.hdfs.roundUnit = minute
a1.sinks.k1.hdfs.rollSize = 67108864
a1.sinks.k1.hdfs.rollCount = 0
#a1.sinks.k1.hdfs.writeFormat = Text

# Use a channel which buffers events in file
a1.channels = c1
a1.channels.c1.type = memory
#a1.channels.c1.checkpointDir=/home/nids/wg/apache-flume-1.5.2-bin/checkpoint
#a1.channels.c1.dataDirs=/home/nids/wg/apache-flume-1.5.2-bin/datadir
a1.sinks.k1.hdfs.batchSize = 10000
#a1.sinks.k1.hdfs.callTimeout = 6000
#a1.sinks.k1.hdfs.appendTimeout = 6000
#a1.channels.c1.type = memory
a1.channels.c1.capacity = 100000
a1.channels.c1.transactionCapacity = 10000
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

The configuration file on machine 10:

```
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe the sink
a1.sinks.k1.type = logger

####
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /home/nids/wg/apache-flume-1.5.2-bin/ceshi12
a1.sources.r1.fileHeader = false
a1.sources.r1.channels = c1
####

# Describe/configure the source
#a1.sources.r1.type = avro
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# avro sink
a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.hostname = r09n08
a1.sinks.k1.port = 55555

# Use a channel which buffers events in file
a1.channels = c1
a1.channels.c1.type = memory
#a1.channels.c1.checkpointDir = /home/nids/wg/apache-flume-1.5.2-bin/checkpoint
#a1.channels.c1.dataDirs = /home/nids/wg/apache-flume-1.5.2-bin/datadir
a1.sinks.k1.hdfs.batchSize = 10000
#a1.channels.c1.type = memory
a1.channels.c1.capacity = 100000
a1.channels.c1.transactionCapacity = 10000

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

Please help. Sometimes only part of the data gets written and then it simply stops. Processing a single file works fine; it is directory scanning, with a memory-type channel, that performs terribly. I cannot tell where the problem is.
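A hedged sketch for the machine-08 agent, not a confirmed fix: the checkpointDir/dataDirs lines already commented out in the config suggest a file channel was considered. A file channel avoids the memory channel's out-of-memory failure mode at the cost of some throughput; the paths below simply reuse the ones from the question:

```
# sketch: file channel instead of memory channel (paths reused from the question)
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /home/nids/wg/apache-flume-1.5.2-bin/checkpoint
a1.channels.c1.dataDirs = /home/nids/wg/apache-flume-1.5.2-bin/datadir
a1.channels.c1.capacity = 100000
a1.channels.c1.transactionCapacity = 10000
```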
Can Flume monitor a specified directory on HDFS?
I need to monitor a directory on HDFS and ship newly added files to another HDFS cluster. Flume is the only tool I have thought of so far. Could anyone help, or suggest another component that can do this?
Problems integrating flume + kafka + hdfs
I wanted to build a flume+hdfs+kafka+storm+mysql system for real-time log analysis and storage, but the Flume log-collection part never works end to end. Flume's own log shows no errors and I do not know how to debug further; please help. The cluster layout and config files are below. There are five machines, node1~node5: node3~node5 are log-collection agents, node1~node2 are Flume collectors. The data is stored twice, one copy to Kafka and one copy to HDFS.

The agent configuration file:

```
#def
agent.sources = src_spooldir
agent.channels = file memory
agent.sinks = collector_avro1 collector_avro2

# sources
agent.sources.src_spooldir.type = spooldir
agent.sources.src_spooldir.channels = file memory
agent.sources.src_spooldir.spoolDir = /data/flume/spoolDir
agent.sources.src_spooldir.selector.type = multiplexing
agent.sources.src_spooldir.fileHeader = true

# channels
agent.channels.file.type = file
agent.channels.file.checkpointDir = /data/flume/checkpoint
agent.channels.file.dataDirs = /data/flume/data
agent.channels.memory.type = memory
agent.channels.memory.capacity = 10000
agent.channels.memory.transactionCapacity = 10000
agent.channels.memory.byteCapacityBufferPercentage = 20
agent.channels.memory.byteCapacity = 800000

# sinks
agent.sinks.collector_avro1.type = avro
agent.sinks.collector_avro1.channel = file
agent.sinks.collector_avro1.hostname = node1
agent.sinks.collector_avro1.port = 45456
agent.sinks.collector_avro2.type = avro
agent.sinks.collector_avro2.channel = memory
agent.sinks.collector_avro2.hostname = node2
agent.sinks.collector_avro2.port = 4545
```

The collector configuration file:

```
#def
agent.sources = src_avro
agent.channels = file memory
agent.sinks = hdfs kafka

# sources
agent.sources.src_avro.type = avro
agent.sources.src_avro.channels = file memory
agent.sources.src_avro.bind = node1
agent.sources.src_avro.port = 45456
agent.sources.src_avro.selector.type = replicating

# channels
agent.channels.file.type = file
agent.channels.file.checkpointDir = /data/flume/checkpoint
agent.channels.file.dataDirs = /data/flume/data
agent.channels.memory.type = memory
agent.channels.memory.capacity = 10000
agent.channels.memory.transactionCapacity = 10000
agent.channels.memory.byteCapacityBufferPercentage = 20
agent.channels.memory.byteCapacity = 800000

# sinks
agent.sinks.hdfs.type = hdfs
agent.sinks.hdfs.channel = file
agent.sinks.hdfs.hdfs.path = hdfs://node1/flume/events/%y-%m-%d/%H%M/%S
agent.sinks.hdfs.hdfs.filePrefix = log_%Y%m%d_%H
agent.sinks.hdfs.hdfs.fileSuffix = .txt
agent.sinks.hdfs.hdfs.useLocalTimeStamp = true
agent.sinks.hdfs.hdfs.writeFormat = Text
agent.sinks.hdfs.hdfs.rollCount = 0
agent.sinks.hdfs.hdfs.rollSize = 1024
agent.sinks.hdfs.hdfs.rollInterval = 0
agent.sinks.kafka.type = org.apache.flume.sink.kafka.KafkaSink
agent.sinks.kafka.channel = memory
agent.sinks.kafka.kafka.topic = test
agent.sinks.kafka.kafka.bootstrap.servers = node3:9092,node4:9092,node5:9092
agent.sinks.kafka.kafka.flumeBatchSize = 20
agent.sinks.kafka.kafka.producer.acks = 1
agent.sinks.kafka.kafka.producer.linger.ms = 1
agent.sinks.kafka.kafka.producer.compression.type = snappy
```

In the end, neither HDFS nor Kafka receives any data.
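One thing worth checking (a hedged observation based on the Flume user guide, not a confirmed diagnosis): a multiplexing channel selector normally needs selector.header plus selector.mapping.* entries to route events, and none are configured on the agent's source here. If the intent is to copy every event to both channels (one copy toward Kafka, one toward HDFS), a replicating selector is the usual choice:

```
# sketch: copy every event to both channels instead of header-based routing
agent.sources.src_spooldir.selector.type = replicating
```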
Low throughput from Flume's hdfs sink
Hello everyone, I have run into a problem. When my Flume writes files to HDFS, throughput is low, roughly 1 GB every 3 minutes. Testing separately, a plain put reaches 8 GB per minute, and a file sink also manages about 1 GB per minute. The logs show no exceptions; at DEBUG level I can only see that each block commit takes close to 20 seconds. Can anyone help analyze the cause?

```
client.sources = r1
client.channels = c1
client.sinks = k1

client.sources.r1.type = spooldir
client.sources.r1.spoolDir = /var/data/tmpdata
client.sources.r1.fileSuffix = .COMPLETED
client.sources.r1.deletePolicy = never
client.sources.r1.batchSize = 500
client.sources.r1.channels = c1

client.channels.c1.type = memory
client.channels.c1.capacity = 1000000
client.channels.c1.transactionCapacity = 50000
client.channels.c1.keep-alive = 3

client.sinks.k1.type = hdfs
client.sinks.k1.hdfs.path = /flume/events/%Y%m%d/%H
client.sinks.k1.hdfs.useLocalTimeStamp = true
client.sinks.k1.hdfs.rollInterval = 3600
client.sinks.k1.hdfs.rollSize = 1000000000
client.sinks.k1.hdfs.rollCount = 0
client.sinks.k1.hdfs.batchSize = 500
client.sinks.k1.hdfs.callTimeout = 30000
client.sinks.k1.hdfs.fileType = DataStream
client.sinks.k1.channel = c1
```

```
12 Aug 2015 16:14:24,739 DEBUG [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:126) - Checking file:../conf/flume-client.conf for changes
12 Aug 2015 16:14:54,740 DEBUG [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:126) - Checking file:../conf/flume-client.conf for changes
12 Aug 2015 16:15:24,740 DEBUG [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:126) - Checking file:../conf/flume-client.conf for changes
12 Aug 2015 16:15:54,741 DEBUG [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:126) - Checking file:../conf/flume-client.conf for changes
12 Aug 2015 16:16:24,742 DEBUG [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:126) - Checking file:../conf/flume-client.conf for changes
12 Aug 2015 16:16:54,742 DEBUG [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:126) - Checking file:../conf/flume-client.conf for changes
12 Aug 2015 16:17:24,743 DEBUG [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:126) - Checking file:../conf/flume-client.conf for changes
12 Aug 2015 16:17:54,744 DEBUG [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:126) - Checking file:../conf/flume-client.conf for changes
12 Aug 2015 16:18:24,745 DEBUG [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:126) - Checking file:../conf/flume-client.conf for changes
12 Aug 2015 16:18:54,746 DEBUG [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:126) - Checking file:../conf/flume-client.conf for changes
12 Aug 2015 16:19:24,746 DEBUG [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:126) - Checking file:../conf/flume-client.conf for changes
```

The log shows nothing wrong; it is just slow.
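A hedged tuning sketch rather than a confirmed fix: with hdfs.batchSize = 500 the sink flushes to HDFS every 500 events, so each ~20-second commit is amortized over very little data. Raising the source and sink batch sizes together (values below are illustrative) reduces the number of flush round-trips; the channel's transactionCapacity must stay at or above the largest batch size:

```
# sketch: larger batches, fewer HDFS flush round-trips (values illustrative)
client.sources.r1.batchSize = 5000
client.sinks.k1.hdfs.batchSize = 5000
# keep the channel transaction size >= the largest batch size
client.channels.c1.transactionCapacity = 50000
```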
flume 1.5.2: writing log4j logs to HDFS fails with "Unexpected exception from downstream."
1. The conf file:

```
agent1.sources = source1
agent1.channels = channel1
agent1.sinks = snik1

# source
agent1.sources.source1.type = avro
agent1.sources.source1.bind = nnode
agent1.sources.source1.port = 44446
agent1.sources.source1.threads = 5

# channel
agent1.channels.channel1.type = memory
agent1.channels.channel1.capacity = 100000
agent1.channels.channel1.transactionCapacity = 1000
agent1.channels.channel1.keep-alive = 30
agent1.channels.channel1.byteCapacityBufferPercentage = 20
# agent1.channels.channel1.byteCapacity = 200M

# sink
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.hdfs.path = /flume/
agent1.sinks.sink1.hdfs.fileType = DataStream
agent1.sinks.sink1.hdfs.filePrefix = event_%y-%m-%d_%H_%M_%S
agent1.sinks.sink1.hdfs.fileSuffix = .log
agent1.sinks.sink1.hdfs.writeFormat = Text
agent1.sinks.sink1.hdfs.rollInterval = 30
agent1.sinks.sink1.hdfs.rollSize = 1024
agent1.sinks.sink1.hdfs.rollCount = 0
agent1.sinks.sink1.hdfs.idleTimeout = 20
agent1.sinks.sink1.hdfs.batchSize = 100

#
agent1.sources.source1.channels = channel1
agent1.sinks.sink1.channel = channel1
```

2. The HDFS cluster is hdfs://cluster; the two namenode hosts are nnode and dnode1.

3. Java code:

```
package com.invic.hdfs;

import java.io.IOException;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;
import org.apache.hadoop.fs.permission.FsAction;
import org.apache.hadoop.fs.permission.FsPermission;
import org.apache.log4j.Logger;

/**
 *
 * @author lucl
 *
 */
public class MyHdfs {
    public static void main(String[] args) throws IOException {
        System.setProperty("hadoop.home.dir", "E:\\Hadoop\\hadoop-2.6.0\\hadoop-2.6.0\\");
        Logger logger = Logger.getLogger(MyHdfs.class);

        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://cluster");
        conf.set("dfs.nameservices", "cluster");
        conf.set("dfs.ha.namenodes.cluster", "nn1,nn2");
        conf.set("dfs.namenode.rpc-address.cluster.nn1", "nnode:8020");
        conf.set("dfs.namenode.rpc-address.cluster.nn2", "dnode1:8020");
        conf.set("dfs.client.failover.proxy.provider.cluster", "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");

        for (int i = 0; i < 500; i++) {
            String str = "the sequence is " + i;
            logger.info(str);
        }

        try {
            Thread.sleep(10);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.exit(0);
    }
}
```

4. log4j:

```
log4j.rootLogger=info,stdout,flume

### stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n

### flume
log4j.appender.flume=org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.layout=org.apache.log4j.PatternLayout
log4j.appender.flume.Hostname=nnode
log4j.appender.flume.Port=44446
log4j.appender.flume.UnsafeMode=true
```

5. Execution result:

![backend error](https://img-ask.csdn.net/upload/201505/26/1432569606_584763.png)
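One observation on the conf file above (hedged, but visible in the file itself): the declaration line names the sink snik1 while every sink property configures sink1, so the configured sink is likely never activated. A corrected declaration:

```
# the declared sink name must match the configured one
agent1.sinks = sink1
```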
Using flume 1.7 on Windows to upload to HDFS: "Files" is not recognized as an internal or external command
As the title says: the upload already works on my Win10 machine, but I wanted to try another machine (the eventual goal is to test two Flume agents uploading). On Win7 it keeps failing and I do not know why; the configuration was copied straight from the Win10 machine and I have checked it many times, so it should not be wrong. See the screenshots for the exact error. Thanks!

![图片说明](https://img-ask.csdn.net/upload/201705/15/1494819272_342762.png)
![图片说明](https://img-ask.csdn.net/upload/201705/15/1494819279_176572.png)
Problem using LZO compression with Flume
Currently I use Flume to pull log data, with a Flume interceptor routing the logs to different Kafka topics; then another Flume reads the topics and writes the data to HDFS with LZO compression. The LZO compression step fails with the following error:

```
2020-01-30 19:38:12,842 (conf-file-poller-0) [WARN - org.apache.hadoop.util.NativeCodeLoader.<clinit>(NativeCodeLoader.java:62)] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2020-01-30 19:38:13,294 (conf-file-poller-0) [ERROR - org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:426)] Sink k1 has been removed due to an error during configuration
java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
	at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:139)
	at org.apache.flume.sink.hdfs.HDFSEventSink.getCodec(HDFSEventSink.java:313)
	at org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:237)
	at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
	at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:411)
	at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:102)
	at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:141)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
	at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:132)
	... 13 more
2020-01-30 19:38:13,297 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:42)] Creating instance of sink: k2, type: hdfs
2020-01-30 19:38:13,356 (conf-file-poller-0) [ERROR - org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:426)] Sink k2 has been removed due to an error during configuration
java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
	at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:139)
	at org.apache.flume.sink.hdfs.HDFSEventSink.getCodec(HDFSEventSink.java:313)
	at org.apache.flume.sink.hdfs.HDFSEventSink.configure(HDFSEventSink.java:237)
	at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
	at org.apache.flume.node.AbstractConfigurationProvider.loadSinks(AbstractConfigurationProvider.java:411)
	at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:102)
	at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:141)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
```

The Flume (Kafka to HDFS) configuration file:

```
## components
a1.sources = r1 r2
a1.channels = c1 c2
a1.sinks = k1 k2

## source1
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.batchSize = 5000
a1.sources.r1.batchDurationMillis = 2000
a1.sources.r1.kafka.bootstrap.servers = bigdata01:9092,bigdata02:9092,bigdata03:9092
a1.sources.r1.kafka.topics = topic_start

## source2
a1.sources.r2.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r2.batchSize = 5000
a1.sources.r2.batchDurationMillis = 2000
a1.sources.r2.kafka.bootstrap.servers = bigdata01:9092,bigdata02:9092,bigdata03:9092
a1.sources.r2.kafka.topics = topic_event

## channel1
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /opt/modules/apache-flume-1.7.0-bin/checkpoint/behavior1
a1.channels.c1.dataDirs = /opt/modules/apache-flume-1.7.0-bin/data/behavior1/
a1.channels.c1.maxFileSize = 2146435071
a1.channels.c1.capacity = 1000000
a1.channels.c1.keep-alive = 6

## channel2
a1.channels.c2.type = file
a1.channels.c2.checkpointDir = /opt/modules/apache-flume-1.7.0-bin/checkpoint/behavior2
a1.channels.c2.dataDirs = /opt/modules/apache-flume-1.7.0-bin/data/behavior2/
a1.channels.c2.maxFileSize = 2146435071
a1.channels.c2.capacity = 1000000
a1.channels.c2.keep-alive = 6

## sink1
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://bigdata01:8020/origin_data/gmall/log/topic_start/%Y-%m-%d
a1.sinks.k1.hdfs.filePrefix = logstart-
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = second

## sink2
a1.sinks.k2.type = hdfs
a1.sinks.k2.hdfs.path = hdfs://bigdata01:8020/origin_data/gmall/log/topic_event/%Y-%m-%d
a1.sinks.k2.hdfs.filePrefix = logevent-
a1.sinks.k2.hdfs.round = true
a1.sinks.k2.hdfs.roundValue = 10
a1.sinks.k2.hdfs.roundUnit = second

## avoid producing lots of small files
a1.sinks.k1.hdfs.rollInterval = 10
a1.sinks.k1.hdfs.rollSize = 134217728
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k2.hdfs.rollInterval = 10
a1.sinks.k2.hdfs.rollSize = 134217728
a1.sinks.k2.hdfs.rollCount = 0

## output file type
a1.sinks.k1.hdfs.fileType = CompressedStream
a1.sinks.k2.hdfs.fileType = CompressedStream
a1.sinks.k1.hdfs.codeC = lzop
a1.sinks.k2.hdfs.codeC = lzop

## wiring
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
a1.sources.r2.channels = c2
a1.sinks.k2.channel = c2
```

The core-site.xml in Hadoop:

```
<property>
  <name>io.compression.codecs</name>
  <value>
    org.apache.hadoop.io.compress.GzipCodec,
    org.apache.hadoop.io.compress.DefaultCodec,
    org.apache.hadoop.io.compress.BZip2Codec,
    org.apache.hadoop.io.compress.SnappyCodec,
    com.hadoop.compression.lzo.LzoCodec,
    com.hadoop.compression.lzo.LzopCodec
  </value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
```

The LZO jar is already in the corresponding Hadoop directory:

```
/opt/modules/hadoop-2.7.2/share/hadoop/common/hadoop-lzo-0.4.20.jar
```

Could this be an environment variable problem? Urgent — waiting online.
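For what it is worth (a hedged suggestion, not a confirmed fix): the ClassNotFoundException is thrown inside Flume's own JVM, so hadoop-lzo has to be visible on Flume's classpath as well, not only in Hadoop's share directory. A sketch reusing the paths already given in the question; the conf/flume-env.sh location is the stock one but still an assumption:

```
# sketch: copy the jar into Flume's lib directory (paths taken from the question)
cp /opt/modules/hadoop-2.7.2/share/hadoop/common/hadoop-lzo-0.4.20.jar \
   /opt/modules/apache-flume-1.7.0-bin/lib/

# or reference it from conf/flume-env.sh instead:
# FLUME_CLASSPATH="$FLUME_CLASSPATH:/opt/modules/hadoop-2.7.2/share/hadoop/common/hadoop-lzo-0.4.20.jar"
```

The "Unable to load native-hadoop library" WARN line hints that the native LZO libraries may also need to be visible to Flume, but that is a separate, hedged guess.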
Flume uploading files to Hadoop: fine when there are no files, but throws DistributeFileSystem not found when a file appears?
Flume is configured and the setup is distributed. With no files present it runs fine, but as soon as I upload a file into the directory it watches, it reports: Unable to deliver event. Exception follows. org.apache.flume.EventDeliveryException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.hdfs.DistributeFileSystem not found. Yet I have already added hadoop-hdfs-2.7.3.jar. Why can it not be found? Please advise.
Blank lines appear in data collected by a custom Flume source
After I wrote a custom Flume source, the data collected onto HDFS contains blank lines. Has anyone run into this?
Data gets truncated when Flume reads CSV data
This concerns the files written by the hdfs sink. It appears that when a line of data exceeds 16 bytes, Flume cuts the record into two pieces while building the event. ![图片说明](https://img-ask.csdn.net/upload/202001/16/1579146530_754802.png) From what I found online, everyone points at DEFAULT_MAX_BYTES in EventHelper, but nobody gives an actual solution. I want the event body to hold the whole line! Help, please — urgent!!!

```
private static final int DEFAULT_MAX_BYTES = 16;
```

Related material: https://www.maiyewang.com/archives/23888
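A hedged pointer rather than a confirmed diagnosis: EventHelper's DEFAULT_MAX_BYTES only limits how many bytes of an event body are printed when Flume logs an event, so it would explain truncated log output rather than truncated data. Actual line truncation is more often the line deserializer's length cap on a spooling-directory source, which defaults to 2048 bytes. A sketch of raising it (the agent/source names a1/r1 are hypothetical):

```
# sketch: raise the spooldir line deserializer cap (default 2048 bytes);
# agent/source names a1/r1 are hypothetical
a1.sources.r1.deserializer.maxLineLength = 65536
```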
Importing real-time data into HDFS: how can the write pressure be relieved?
My current requirement is to use Flume for data-source monitoring and transport, with Kafka in between as a buffer against write pressure, finally importing into HDFS for later big-data analysis. Someone I asked suggested using streaming between Kafka and HDFS. I would like to hear how you would design this to relieve the HDFS write pressure.
Flume errors at runtime: no configfilters configured and an invalid regex property
The Flume configuration file:

```
a1.sources = s1
a1.channels = c1
a1.sinks = k1

a1.sources.s1.type = spooldir
a1.sources.s1.channels = c1
a1.sources.s1.spoolDir = /home/frankyu/serverlogs
a1.source.s1.ignorePattern = ^(.)*\\.tmp$

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://hadoop1:9000/flume
a1.sinks.k1.hdfs.writeFormat = Text
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.rollCount = 10
a1.sinks.k1.channel = c1

a1.channels.c1.type = memory
```

The error messages:

```
2019-03-14 02:27:33,799 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateConfigFilterSet(FlumeConfiguration.java:623)] Agent configuration for 'a1' has no configfilters.
2019-03-14 02:27:33,790 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty(FlumeConfiguration.java:1161)] Invalid property specified: source.s1.ignorePattern
2019-03-14 02:27:33,796 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration.<init>(FlumeConfiguration.java:126)] Configuration property ignored: a1.source.s1.ignorePattern = ^(.)*\.tmp$
```

Thanks for any help!
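For reference, the second warning names the offending key exactly: the file uses the singular a1.source.s1 where Flume only recognizes the plural a1.sources.s1 (the "has no configfilters" line is merely a warning and is harmless). The corrected property:

```
# "sources" must be plural for the property to be recognized
a1.sources.s1.ignorePattern = ^(.)*\\.tmp$
```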
Can spark streaming plus flume or kafka be used to detect real-time network data?
I already have a trained machine-learning classification model, stored on HDFS, that can score data in LibSVMFile format. It was trained on features extracted from traffic data over many time windows (for example, many 1-second slices). Streaming splits the input stream into micro-batches; can a micro-batch be read from a pcap file? Feature extraction, and therefore model training, operates on pcap files. Flume and Kafka can both transport txt, but can they transport pcap files? Which technologies could store the incoming network stream as pcap files, the way tcpdump does, while also providing Kafka-style buffering? Finally, is it technically feasible for spark streaming to use the classification model to extract features from and score the network stream, and to integrate that with a firewall?
Offline installation of cloudera manager: downloading resources from the master node times out when installing the agent
Error log:

```
[19/Nov/2018 16:16:04 +0000] 2789 MainThread stacks_collection_manager INFO Using max_uncompressed_file_size_bytes: 5242880
[19/Nov/2018 16:16:04 +0000] 2789 MainThread __init__ INFO Importing metric schema from file /opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.10.2-py2.6.egg/cmf/monitor/schema.json
[19/Nov/2018 16:16:04 +0000] 2789 MainThread agent INFO Supervised processes will add the following to their environment (in addition to the supervisor's env): {'CDH_PARQUET_HOME': '/usr/lib/parquet', 'JSVC_HOME': '/usr/libexec/bigtop-utils', 'CMF_PACKAGE_DIR': '/opt/cloudera-manager/cm-5.10.2/lib64/cmf/service', 'CDH_HADOOP_BIN': '/usr/bin/hadoop', 'MGMT_HOME': '/opt/cloudera-manager/cm-5.10.2/share/cmf', 'CDH_IMPALA_HOME': '/usr/lib/impala', 'CDH_YARN_HOME': '/usr/lib/hadoop-yarn', 'CDH_HDFS_HOME': '/usr/lib/hadoop-hdfs', 'PATH': '/sbin:/usr/sbin:/bin:/usr/bin', 'CDH_HUE_PLUGINS_HOME': '/usr/lib/hadoop', 'CM_STATUS_CODES': u'STATUS_NONE HDFS_DFS_DIR_NOT_EMPTY HBASE_TABLE_DISABLED HBASE_TABLE_ENABLED JOBTRACKER_IN_STANDBY_MODE YARN_RM_IN_STANDBY_MODE', 'KEYTRUSTEE_KP_HOME': '/usr/share/keytrustee-keyprovider', 'CLOUDERA_ORACLE_CONNECTOR_JAR': '/usr/share/java/oracle-connector-java.jar', 'CDH_SQOOP2_HOME': '/usr/lib/sqoop2', 'KEYTRUSTEE_SERVER_HOME': '/usr/lib/keytrustee-server', 'CDH_MR2_HOME': '/usr/lib/hadoop-mapreduce', 'HIVE_DEFAULT_XML': '/etc/hive/conf.dist/hive-default.xml', 'CLOUDERA_POSTGRESQL_JDBC_JAR': '/opt/cloudera-manager/cm-5.10.2/share/cmf/lib/postgresql-9.0-801.jdbc4.jar', 'CDH_KMS_HOME': '/usr/lib/hadoop-kms', 'CDH_HBASE_HOME': '/usr/lib/hbase', 'CDH_SQOOP_HOME': '/usr/lib/sqoop', 'WEBHCAT_DEFAULT_XML': '/etc/hive-webhcat/conf.dist/webhcat-default.xml', 'CDH_OOZIE_HOME': '/usr/lib/oozie', 'CDH_ZOOKEEPER_HOME': '/usr/lib/zookeeper', 'CDH_HUE_HOME': '/usr/lib/hue', 'CLOUDERA_MYSQL_CONNECTOR_JAR': '/usr/share/java/mysql-connector-java.jar', 'CDH_HBASE_INDEXER_HOME': '/usr/lib/hbase-solr', 'CDH_MR1_HOME': '/usr/lib/hadoop-0.20-mapreduce', 'CDH_SOLR_HOME': '/usr/lib/solr', 'CDH_PIG_HOME': '/usr/lib/pig', 'CDH_SENTRY_HOME': '/usr/lib/sentry', 'CDH_CRUNCH_HOME': '/usr/lib/crunch', 'CDH_LLAMA_HOME': '/usr/lib/llama/', 'CDH_HTTPFS_HOME': '/usr/lib/hadoop-httpfs', 'ROOT': '/opt/cloudera-manager/cm-5.10.2/lib64/cmf', 'CDH_HADOOP_HOME': '/usr/lib/hadoop', 'CDH_HIVE_HOME': '/usr/lib/hive', 'ORACLE_HOME': '/usr/share/oracle/instantclient', 'CDH_HCAT_HOME': '/usr/lib/hive-hcatalog', 'CDH_KAFKA_HOME': '/usr/lib/kafka', 'CDH_SPARK_HOME': '/usr/lib/spark', 'TOMCAT_HOME': '/usr/lib/bigtop-tomcat', 'CDH_FLUME_HOME': '/usr/lib/flume-ng'}
[19/Nov/2018 16:16:04 +0000] 2789 MainThread agent INFO To override these variables, use /etc/cloudera-scm-agent/config.ini. Environment variables for CDH locations are not used when CDH is installed from parcels.
[19/Nov/2018 16:16:04 +0000] 2789 MainThread agent INFO Created /opt/cloudera-manager/cm-5.10.2/run/cloudera-scm-agent/process
[19/Nov/2018 16:16:04 +0000] 2789 MainThread agent INFO Chmod'ing /opt/cloudera-manager/cm-5.10.2/run/cloudera-scm-agent/process to 0751
[19/Nov/2018 16:16:04 +0000] 2789 MainThread agent INFO Created /opt/cloudera-manager/cm-5.10.2/run/cloudera-scm-agent/supervisor
[19/Nov/2018 16:16:04 +0000] 2789 MainThread agent INFO Chmod'ing /opt/cloudera-manager/cm-5.10.2/run/cloudera-scm-agent/supervisor to 0751
[19/Nov/2018 16:16:04 +0000] 2789 MainThread agent INFO Created /opt/cloudera-manager/cm-5.10.2/run/cloudera-scm-agent/flood
[19/Nov/2018 16:16:04 +0000] 2789 MainThread agent INFO Chowning /opt/cloudera-manager/cm-5.10.2/run/cloudera-scm-agent/flood to cloudera-scm (498) cloudera-scm (498)
[19/Nov/2018 16:16:04 +0000] 2789 MainThread agent INFO Chmod'ing /opt/cloudera-manager/cm-5.10.2/run/cloudera-scm-agent/flood to 0751
[19/Nov/2018 16:16:04 +0000] 2789 MainThread agent INFO Created /opt/cloudera-manager/cm-5.10.2/run/cloudera-scm-agent/supervisor/include
[19/Nov/2018 16:16:04 +0000] 2789 MainThread agent INFO Chmod'ing /opt/cloudera-manager/cm-5.10.2/run/cloudera-scm-agent/supervisor/include to 0751
[19/Nov/2018 16:16:04 +0000] 2789 MainThread agent ERROR Failed to connect to previous supervisor.
Traceback (most recent call last):
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.10.2-py2.6.egg/cmf/agent.py", line 2073, in find_or_start_supervisor
    self.configure_supervisor_clients()
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.10.2-py2.6.egg/cmf/agent.py", line 2254, in configure_supervisor_clients
    supervisor_options.realize(args=["-c", os.path.join(self.supervisor_dir, "supervisord.conf")])
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/options.py", line 1599, in realize
    Options.realize(self, *arg, **kw)
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/options.py", line 333, in realize
    self.process_config()
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/options.py", line 341, in process_config
    self.process_config_file(do_usage)
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/options.py", line 376, in process_config_file
    self.usage(str(msg))
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/supervisor-3.0-py2.6.egg/supervisor/options.py", line 164, in usage
    self.exit(2)
SystemExit: 2
[19/Nov/2018 16:16:04 +0000] 2789 MainThread tmpfs INFO Successfully mounted tmpfs at /opt/cloudera-manager/cm-5.10.2/run/cloudera-scm-agent/process
[19/Nov/2018 16:16:05 +0000] 2789 MainThread agent INFO Trying to connect to newly launched supervisor (Attempt 1)
[19/Nov/2018 16:16:05 +0000] 2789 MainThread agent INFO Supervisor version: 3.0, pid: 2821
[19/Nov/2018 16:16:05 +0000] 2789 MainThread agent INFO Successfully connected to supervisor
[19/Nov/2018 16:16:05 +0000] 2789 MainThread status_server INFO Using maximum impala profile bundle size of 1073741824 bytes.
[19/Nov/2018 16:16:05 +0000] 2789 MainThread status_server INFO Using maximum stacks log bundle size of 1073741824 bytes.
[19/Nov/2018 16:16:05 +0000] 2789 MainThread _cplogging INFO [19/Nov/2018:16:16:05] ENGINE Bus STARTING
[19/Nov/2018 16:16:05 +0000] 2789 MainThread _cplogging INFO [19/Nov/2018:16:16:05] ENGINE Started monitor thread '_TimeoutMonitor'.
[19/Nov/2018 16:16:06 +0000] 2789 MainThread _cplogging INFO [19/Nov/2018:16:16:06] ENGINE Serving on yingzhi01.com:9000
[19/Nov/2018 16:16:06 +0000] 2789 MainThread _cplogging INFO [19/Nov/2018:16:16:06] ENGINE Bus STARTED
[19/Nov/2018 16:16:06 +0000] 2789 MainThread __init__ INFO New monitor: (<cmf.monitor.host.HostMonitor object at 0x2990c50>,)
[19/Nov/2018 16:16:06 +0000] 2789 MonitorDaemon-Scheduler __init__ INFO Monitor ready to report: ('HostMonitor',)
[19/Nov/2018 16:16:06 +0000] 2789 MainThread agent INFO Setting default socket timeout to 30
[19/Nov/2018 16:16:06 +0000] 2789 Monitor-HostMonitor network_interfaces INFO NIC iface eth0 doesn't support ETHTOOL (95)
[19/Nov/2018 16:16:06 +0000] 2789 Monitor-HostMonitor throttling_logger ERROR Error getting directory attributes for /opt/cloudera-manager/cm-5.10.2/log/cloudera-scm-agent
Traceback (most recent call last):
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.10.2-py2.6.egg/cmf/monitor/dir_monitor.py", line 90, in _get_directory_attributes
    name = pwd.getpwuid(uid)[0]
KeyError: 'getpwuid(): uid not found: 1106'
[19/Nov/2018 16:16:06 +0000] 2789 MainThread heartbeat_tracker INFO HB stats (seconds): num:1 LIFE_MIN:0.22 min:0.22 mean:0.22 max:0.22 LIFE_MAX:0.22
[19/Nov/2018 16:16:06 +0000] 2789 MainThread agent INFO CM server guid: dceeafae-a884-42f1-ba7b-4ee187ef3bef
[19/Nov/2018 16:16:06 +0000] 2789 MainThread agent INFO Using parcels directory from server provided value: /opt/cloudera/parcels
[19/Nov/2018 16:16:06 +0000] 2789 MainThread agent WARNING Expected user root for /opt/cloudera/parcels but was cloudera-scm
[19/Nov/2018 16:16:06 +0000] 2789 MainThread agent WARNING Expected group root for /opt/cloudera/parcels but was cloudera-scm
[19/Nov/2018 16:16:06 +0000] 2789 MainThread agent INFO Created /opt/cloudera/parcel-cache
[19/Nov/2018 16:16:06 +0000] 2789 MainThread agent INFO Chowning /opt/cloudera/parcel-cache to root (0) root (0)
[19/Nov/2018 16:16:06 +0000] 2789 MainThread agent INFO Chmod'ing /opt/cloudera/parcel-cache to 0755
[19/Nov/2018 16:16:06 +0000] 2789 MainThread parcel INFO Agent does create users/groups and apply file permissions
[19/Nov/2018 16:16:06 +0000] 2789 MainThread downloader INFO Downloader path: /opt/cloudera/parcel-cache
[19/Nov/2018 16:16:06 +0000] 2789 MainThread parcel_cache INFO Using /opt/cloudera/parcel-cache for parcel cache
[19/Nov/2018 16:16:06 +0000] 2789 MainThread agent INFO Flood daemon (re)start attempt
[19/Nov/2018 16:16:06 +0000] 2789 MainThread agent INFO Created /opt/cloudera/parcels/.flood
[19/Nov/2018 16:16:06 +0000] 2789 MainThread agent INFO Chowning /opt/cloudera/parcels/.flood to cloudera-scm (498) cloudera-scm (498)
[19/Nov/2018 16:16:06 +0000] 2789 MainThread agent INFO Chmod'ing /opt/cloudera/parcels/.flood to 0755
[19/Nov/2018 16:16:06 +0000] 2789 MainThread agent INFO Triggering supervisord update.
[19/Nov/2018 16:16:36 +0000] 2789 MainThread downloader ERROR Failed rack peer update: timed out
[19/Nov/2018 16:16:36 +0000] 2789 MainThread agent INFO Active parcel list updated; recalculating component info.
[19/Nov/2018 16:16:36 +0000] 2789 MainThread throttling_logger WARNING CMF_AGENT_JAVA_HOME environment variable host override will be deprecated in future. JAVA_HOME setting configured from CM server takes precedence over host agent override. Configure JAVA_HOME setting from CM server.
[19/Nov/2018 16:16:36 +0000] 2789 MainThread throttling_logger INFO Identified java component java8 with full version JAVA_HOME=/opt/modules/jdk1.8.0_144 java version "1.8.0_144" Java(TM) SE Runtime Environment (build 1.8.0_144-b01) Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode) for requested version .
[19/Nov/2018 16:16:36 +0000] 2789 MainThread agent WARNING Long HB processing time: 30.6659779549
[19/Nov/2018 16:16:36 +0000] 2789 MainThread agent WARNING Delayed HB: 15s since last
[19/Nov/2018 16:16:44 +0000] 2789 Monitor-HostMonitor throttling_logger ERROR Timeout with args ['ntpdc', '-np'] None
[19/Nov/2018 16:16:44 +0000] 2789 Monitor-HostMonitor throttling_logger ERROR Failed to collect NTP metrics
Traceback (most recent call last):
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.10.2-py2.6.egg/cmf/monitor/host/ntp_monitor.py", line 48, in collect
    self.collect_ntpd()
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.10.2-py2.6.egg/cmf/monitor/host/ntp_monitor.py", line 66, in collect_ntpd
    result, stdout, stderr = self._subprocess_with_timeout(args, self._timeout)
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.10.2-py2.6.egg/cmf/monitor/host/ntp_monitor.py", line 38, in _subprocess_with_timeout
    return subprocess_with_timeout(args, timeout)
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.10.2-py2.6.egg/cmf/subprocess_timeout.py", line 94, in subprocess_with_timeout
    raise Exception("timeout with args %s" % args)
Exception: timeout with args ['ntpdc', '-np']
[19/Nov/2018 16:17:06 +0000] 2789 DnsResolutionMonitor throttling_logger INFO Using java location: '/opt/modules/jdk1.8.0_144/bin/java'.
[19/Nov/2018 16:17:06 +0000] 2789 MainThread downloader ERROR Failed rack peer update: timed out
[19/Nov/2018 16:17:06 +0000] 2789 MainThread agent WARNING Long HB processing time: 30.1082139015
[19/Nov/2018 16:17:06 +0000] 2789 MainThread agent WARNING Delayed HB: 15s since last
[19/Nov/2018 16:17:36 +0000] 2789 MainThread downloader ERROR Failed rack peer update: timed out
[19/Nov/2018 16:17:36 +0000] 2789 MainThread agent WARNING Long HB processing time: 30.1235852242
[19/Nov/2018 16:17:36 +0000] 2789 MainThread agent WARNING Delayed HB: 15s since last
[19/Nov/2018 16:18:07 +0000] 2789 MainThread downloader ERROR Failed rack peer update: timed out
[19/Nov/2018 16:18:07 +0000] 2789 MainThread agent WARNING Long HB processing time: 30.1040799618
[19/Nov/2018 16:18:07 +0000] 2789 MainThread agent WARNING Delayed HB: 15s since last
[19/Nov/2018 16:18:37 +0000] 2789 MainThread downloader ERROR Failed rack peer update: timed out
[19/Nov/2018 16:18:37 +0000] 2789 MainThread agent WARNING Long HB processing time: 30.1849529743
[19/Nov/2018 16:18:37 +0000] 2789 MainThread agent WARNING Delayed HB: 15s since last
[19/Nov/2018 16:19:07 +0000] 2789 MainThread downloader ERROR Failed rack peer update: timed out
[19/Nov/2018 16:19:07 +0000] 2789 MainThread agent WARNING Long HB processing time: 30.1211960316
[19/Nov/2018 16:19:07 +0000] 2789 MainThread agent WARNING Delayed HB: 15s since last
[19/Nov/2018 16:19:37 +0000] 2789 MainThread downloader ERROR Failed rack peer update: timed out
[19/Nov/2018 16:19:37 +0000] 2789 MainThread agent WARNING Long HB processing time: 30.1215620041
[19/Nov/2018 16:19:37 +0000] 2789 MainThread agent WARNING Delayed HB: 15s since last
[19/Nov/2018 16:20:01 +0000] 2789 CP Server Thread-4 _cplogging INFO 192.168.164.35 - - [19/Nov/2018:16:20:01] "GET /heartbeat HTTP/1.1" 200 2 "" "NING/1.0"
[19/Nov/2018 16:20:04 +0000] 2789 CP Server Thread-5 _cplogging INFO 192.168.164.35 - - [19/Nov/2018:16:20:04] "GET /heartbeat HTTP/1.1" 200 2 "" "NING/1.0"
[19/Nov/2018 16:20:07 +0000] 2789 MainThread downloader ERROR Failed rack peer update: timed out
[19/Nov/2018 16:20:07 +0000] 2789 MainThread agent WARNING Long HB processing time: 30.1212861538
[19/Nov/2018 16:20:07 +0000] 2789 MainThread agent WARNING Delayed HB: 15s since last
[19/Nov/2018 16:20:37 +0000] 2789 MainThread downloader ERROR Failed rack peer update: timed out
[19/Nov/2018 16:20:37 +0000] 2789 MainThread agent WARNING Long HB processing time: 30.1753029823
[19/Nov/2018 16:20:37 +0000] 2789 MainThread agent WARNING Delayed HB: 15s since last
[19/Nov/2018 16:20:37 +0000] 2789 Thread-13 downloader INFO Fetching torrent: http://yingzhi01.com:7180/cmf/parcel/download/CDH-5.10.2-1.cdh5.10.2.p0.5-el6.parcel.torrent
[19/Nov/2018 16:20:37 +0000] 2789 Thread-13 downloader INFO Starting download of: http://yingzhi01.com:7180/cmf/parcel/download/CDH-5.10.2-1.cdh5.10.2.p0.5-el6.parcel
[19/Nov/2018 16:21:07 +0000] 2789 Thread-13 downloader ERROR Unexpected exception during download
Traceback (most recent call last):
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.10.2-py2.6.egg/cmf/downloader.py", line 279, in download
    self.client.AddTorrent(torrent_url)
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.10.2-py2.6.egg/flood/util/cmd.py", line 159, in __call__
    return self.fn.__get__(self.binding)(*args, **kwargs)
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.10.2-py2.6.egg/flood/util/rpc.py", line 68, in <lambda>
    return lambda *pargs, **kwargs: self._invoke(*pargs, **kwargs)
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.10.2-py2.6.egg/flood/util/rpc.py", line 77, in _invoke
    return rpcClient.requestor.request(self.schema.name, msg)
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.10.2-py2.6.egg/flood/util/rpc.py", line 129, in requestor
    return avro.ipc.Requestor(self.SCHEMA, self.transceiver)
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.10.2-py2.6.egg/flood/util/rpc.py", line 125, in transceiver
    return avro.ipc.HTTPTransceiver(self.server.host, self.server.port)
  File "/opt/cloudera-manager/cm-5.10.2/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 469, in __init__
    self.conn.connect()
  File "/usr/lib64/python2.6/httplib.py", line 771, in connect
    self.timeout)
  File "/usr/lib64/python2.6/socket.py", line 567, in create_connection
    raise error, msg
timeout: timed out
[19/Nov/2018 16:21:07 +0000] 2789 Thread-13 downloader INFO Finished download [ url: http://yingzhi01.com:7180/cmf/parcel/download/CDH-5.10.2-1.cdh5.10.2.p0.5-el6.parcel, state: exception, total_bytes: 0, downloaded_bytes: 0, start_time: 2018-11-19 16:20:37, download_end_time: , end_time: 2018-11-19 16:21:07, code: 600, exception_msg: timed out, path: None ]
[19/Nov/2018 16:21:07 +0000] 2789 MainThread downloader ERROR Failed rack peer update: timed out
[19/Nov/2018 16:21:07 +0000] 2789 MainThread agent WARNING Long HB processing time: 30.1247620583
[19/Nov/2018 16:21:07 +0000] 2789 MainThread agent WARNING Delayed HB: 15s since last
[19/Nov/2018 16:21:07 +0000] 2789 Thread-13 downloader INFO Fetching torrent: http://yingzhi01.com:7180/cmf/parcel/download/CDH-5.10.2-1.cdh5.10.2.p0.5-el6.parcel.torrent
[19/Nov/2018 16:21:08 +0000] 2789 Thread-13 downloader INFO Starting download of: http://yingzhi01.com:7180/cmf/parcel/download/CDH-5.10.2-1.cdh5.10.2.p0.5-el6.parcel
[19/Nov/2018 16:21:38 +0000] 2789 Thread-13 downloader ERROR Unexpected exception during download
```

After that it just keeps repeating the timeout error. Any pointers would be appreciated.