timruning 2016-09-15 15:52 Acceptance rate: 50%

Spark read error: Premature EOF from inputStream

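Main problem: java.io.EOFException: Premature EOF from inputStream
The error occurs with both textFile and newAPIHadoopFile.
My Spark job throws this error every time it reads the data.
It cannot even get through count or repartition, and the data is read much more slowly than usual.
Judging from the data files, the data looks evenly distributed, so this should not be a data skew problem.
The error output is below: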

 16/09/15 23:27:57 ERROR yarn.ApplicationMaster: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 41 in stage 0.0 failed 4 times, most recent failure: Lost task 41.3 in stage 0.0 (TID 5736, dn076179.heracles.sohuno.com): java.io.EOFException: Premature EOF from inputStream
    at com.hadoop.compression.lzo.LzopInputStream.readFully(LzopInputStream.java:75)
    at com.hadoop.compression.lzo.LzopInputStream.readHeader(LzopInputStream.java:114)
    at com.hadoop.compression.lzo.LzopInputStream.<init>(LzopInputStream.java:54)
    at com.hadoop.compression.lzo.LzopCodec.createInputStream(LzopCodec.java:83)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:102)
    at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:133)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:104)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:66)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:70)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
    Driver stacktrace:
org.apache.spark.SparkException: Job aborted due to stage failure: Task 41 in stage 0.0 failed 4 times, most recent failure: Lost task 41.3 in stage 0.0 (TID 5736, dn076179.heracles.sohuno.com): java.io.EOFException: Premature EOF from inputStream
    at com.hadoop.compression.lzo.LzopInputStream.readFully(LzopInputStream.java:75)
    at com.hadoop.compression.lzo.LzopInputStream.readHeader(LzopInputStream.java:114)
    at com.hadoop.compression.lzo.LzopInputStream.<init>(LzopInputStream.java:54)
    at com.hadoop.compression.lzo.LzopCodec.createInputStream(LzopCodec.java:83)
    at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.initialize(LineRecordReader.java:102)
    at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:133)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:104)
    at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:66)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:70)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:70)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
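
Note on the trace: the exception is raised in LzopInputStream.readHeader, that is, while the lzop file header is being read and before any record has been decoded. One possible cause (an assumption, nothing in this thread confirms it) is an empty or truncated .lzo file among the inputs. The sketch below is a first-pass check for that case: it reads the first 9 bytes of each file and compares them with the lzop magic number. The class name LzopHeaderCheck and the result strings are made up for illustration, and the code reads local files only, so files on HDFS would have to be copied locally first or the code adapted to the Hadoop FileSystem API.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of Spark or hadoop-lzo.
public class LzopHeaderCheck {
    // The 9-byte magic number at the start of an lzop file.
    static final byte[] LZOP_MAGIC = {
        (byte) 0x89, 0x4c, 0x5a, 0x4f, 0x00, 0x0d, 0x0a, 0x1a, 0x0a
    };

    /** Returns "OK", "TOO_SHORT", or "BAD_MAGIC" for the given local file. */
    static String check(Path file) throws IOException {
        byte[] head = new byte[LZOP_MAGIC.length];
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            // readFully throws EOFException when the file ends before 9 bytes.
            in.readFully(head);
        } catch (EOFException e) {
            return "TOO_SHORT";
        }
        return Arrays.equals(head, LZOP_MAGIC) ? "OK" : "BAD_MAGIC";
    }

    public static void main(String[] args) throws IOException {
        // Usage: java LzopHeaderCheck part-00000.lzo part-00001.lzo ...
        for (String arg : args) {
            Path p = Paths.get(arg);
            System.out.println(check(p) + "\t" + p);
        }
    }
}
```

A file reported as TOO_SHORT or BAD_MAGIC would be a candidate for the failing task's input. An OK result only means the magic number is present; the rest of the header or the compressed blocks could still be damaged.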


1 answer

  • devmiao 2016-09-15 15:56