m0_37608723 2018-08-09 08:14 Acceptance rate: 40%

Submitting jobs to a Flink cluster from multiple threads fails

Caused by: org.apache.flink.client.program.ProgramInvocationException: The program execution failed: Could not upload the program's JAR files to the JobManager.
at org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:454)
at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:99)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:400)
at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:76)
at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:345)
... 14 common frames omitted
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Could not upload the program's JAR files to the JobManager.
at org.apache.flink.runtime.client.JobClient.submitJobDetached(JobClient.java:410)
at org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:451)
... 19 common frames omitted
Caused by: java.io.IOException: Could not retrieve the JobManager's blob port.
at org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:745)
at org.apache.flink.runtime.jobgraph.JobGraph.uploadUserJars(JobGraph.java:565)
at org.apache.flink.runtime.client.JobClient.submitJobDetached(JobClient.java:407)
... 20 common frames omitted
Caused by: java.io.IOException: PUT operation failed: Could not transfer error message
at org.apache.flink.runtime.blob.BlobClient.putInputStream(BlobClient.java:512)
at org.apache.flink.runtime.blob.BlobClient.put(BlobClient.java:374)
at org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:771)
at org.apache.flink.runtime.blob.BlobClient.uploadJarFiles(BlobClient.java:740)
... 22 common frames omitted
Caused by: java.io.IOException: Could not transfer error message
at org.apache.flink.runtime.blob.BlobClient.readExceptionFromStream(BlobClient.java:799)
at org.apache.flink.runtime.blob.BlobClient.receivePutResponseAndCompare(BlobClient.java:537)
at org.apache.flink.runtime.blob.BlobClient.putInputStream(BlobClient.java:508)
... 25 common frames omitted
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.ipc.RemoteException
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.flink.util.InstantiationUtil$ClassLoaderObjectInputStream.resolveClass(InstantiationUtil.java:64)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
at java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:501)
at java.lang.Throwable.readObject(Throwable.java:914)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:290)
at org.apache.flink.runtime.blob.BlobClient.readExceptionFromStream(BlobClient.java:795)
... 27 common frames omitted


2 answers

  • m0_37608723 2018-09-11 02:45

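    Answering my own question. When a job is submitted, Flink uploads the job's JAR file to the remote host (how it is uploaded depends on how the cluster is deployed). The file ends up stored on HDFS, and the TaskManagers then download it from HDFS.
    During the upload a file name is generated for the JAR, and that name is derived from the JAR's bytes (in the extreme case, a single extra space in the source code behind the JAR is enough to produce a different file name).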
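To make the content-derived naming concrete, here is a minimal sketch. It is not Flink's code: `ContentName` is a hypothetical helper, SHA-1 is chosen only for illustration (the answer does not say which hash Flink uses), and short strings stand in for the bytes of a real JAR.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not part of Flink): derives a file name from content alone.
final class ContentName {
    static String of(byte[] content) {
        try {
            // Assumption: SHA-1 is used here purely as an example of a content hash.
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            byte[] digest = md.digest(content);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 should be available on every Java platform", e);
        }
    }

    public static void main(String[] args) {
        // Stand-ins for JAR bytes; a real client would read the JAR file instead.
        byte[] jarA = "class Job {}".getBytes(StandardCharsets.UTF_8);
        byte[] jarB = "class Job {}".getBytes(StandardCharsets.UTF_8);
        byte[] jarC = "class Job { }".getBytes(StandardCharsets.UTF_8);

        System.out.println(ContentName.of(jarA).equals(ContentName.of(jarB))); // identical bytes: same name
        System.out.println(ContentName.of(jarA).equals(ContentName.of(jarC))); // one extra space: different name
    }
}
```

Two uploads of byte-identical content therefore target the same name, while any change to the bytes produces a new one.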
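    So the same JAR always yields the same file name, and it is written to the same path. Multiple threads then read and write the same file at once, a classic case of concurrent access to a shared resource. That is the root cause of the error above.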
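The answer identifies the cause but does not state a fix. One possible client-side workaround, sketched below under that caveat, is to make sure that submissions of the same JAR never overlap. `SerializedSubmitter` and the `"same-jar"` key are hypothetical names, and the `Runnable` stands in for whatever call actually submits the job; no Flink API is used here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper (not part of Flink): runs submissions of the same JAR one at a time.
final class SerializedSubmitter {
    // One lock object per content-derived name, e.g. a hash of the JAR bytes.
    private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();

    void submit(String contentName, Runnable submission) {
        Object lock = locks.computeIfAbsent(contentName, k -> new Object());
        synchronized (lock) {
            submission.run(); // placeholder for the real job submission call
        }
    }

    // Demo: several threads "submit" the same JAR; returns the highest number
    // of submissions that were ever in progress at the same moment.
    static int maxConcurrentSubmissions(int threads) {
        SerializedSubmitter submitter = new SerializedSubmitter();
        AtomicInteger active = new AtomicInteger();
        AtomicInteger maxActive = new AtomicInteger();

        Runnable task = () -> submitter.submit("same-jar", () -> {
            int now = active.incrementAndGet();
            maxActive.accumulateAndGet(now, Math::max);
            try {
                Thread.sleep(5); // simulate the time an upload takes
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            active.decrementAndGet();
        });

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                futures.add(pool.submit(task));
            }
            for (Future<?> f : futures) {
                f.get(); // wait for every simulated submission to finish
            }
        } catch (Exception e) {
            throw new IllegalStateException("demo submission failed", e);
        } finally {
            pool.shutdown();
        }
        return maxActive.get();
    }
}
```

A lock like this only coordinates threads inside one JVM. If several client processes submit the same JAR, they would need some other form of coordination, which this sketch does not cover.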

    This answer was selected by the asker as the best answer.