落花 2024-05-15 10:14

Job submission fails on a Docker-deployed Flink on YARN cluster

Flink configuration file (config.yaml)

blob:
  server:
    port: '6124'
taskmanager:
  memory:
    process:
      size: 1728m
  bind-host: 0.0.0.0
  numberOfTaskSlots: 3
jobmanager:
  execution:
    failover-strategy: region
  rpc:
    address: jobmanager
    port: 6123
  memory:
    process:
      size: 1600m
  bind-host: 0.0.0.0
state:
  savepoints:
    dir: hdfs://master:8020/flink/savepoints
  backend: filesystem
  checkpoints:
    dir: hdfs://master:8020/flink/checkpoints
query:
  server:
    port: '6125'
rest:
  bind-address: 0.0.0.0
  address: 0.0.0.0
parallelism:
  default: 1
fs:
  hdfs:
    hadoopconf: /hadoop/etc/hadoop/
env:
  java:
    opts:
      all: --add-exports=java.base/sun.net.util=ALL-UNNAMED --add-exports=java.rmi/sun.rmi.registry=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED --add-exports=java.security.jgss/sun.security.krb5=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.text=ALL-UNNAMED --add-opens=java.base/java.time=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.locks=ALL-UNNAMED

Flink startup command

 docker run \
 -itd \
 --name=jobmanager \
 --hostname=jobmanager \
 --privileged=true \
 --restart always \
 --net hadoop \
 -p 8081:8081 \
 -p 8082:8082 \
 -e FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager" \
 -e HADOOP_CONF_DIR=/hadoop/etc/hadoop \
 -e JAVA_TOOL_OPTIONS="-Djava.security.auth.login.config=/secret/jaas.conf" \
 -v /hadoop/master/conf/yarn-site.xml:/hadoop/etc/hadoop/yarn-site.xml \
 -v /hadoop/master/conf/hdfs-site.xml:/hadoop/etc/hadoop/hdfs-site.xml \
 -v /hadoop/master/conf/core-site.xml:/hadoop/etc/hadoop/core-site.xml \
 -v /flink/conf/config.yaml:/opt/flink/conf/config.yaml \
 -v /flink/conf/jaas.conf:/secret/jaas.conf \
 -v /flink/log:/opt/flink/log \
 -v /flink/data:/tmp \
 -v /flink/conf/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:/opt/flink/lib/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar \
 flink:1.19.0-scala_2.12 jobmanager
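Before starting the container, it is worth confirming that every host-side path in the -v mounts above actually exists: Docker silently creates an empty directory for a missing bind-mount source, which would leave the container without the Hadoop and Flink config files. A minimal sanity-check sketch, with the paths copied from the command above:

```shell
#!/usr/bin/env bash
# Verify that each host-side path bind-mounted into the jobmanager
# container exists before running `docker run`.
check_mounts() {
  local missing=0 p
  for p in "$@"; do
    if [ ! -e "$p" ]; then
      echo "missing: $p"
      missing=1
    fi
  done
  return "$missing"
}

# Host paths taken from the docker run command above.
check_mounts \
  /hadoop/master/conf/yarn-site.xml \
  /hadoop/master/conf/hdfs-site.xml \
  /hadoop/master/conf/core-site.xml \
  /flink/conf/config.yaml \
  /flink/conf/jaas.conf || echo "create the files above before starting the container"
```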

Hadoop configuration file
yarn-site.xml

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

    <property>
        <name>yarn.application.classpath</name>
        <value>/usr/local/hadoop/etc/hadoop, /usr/local/hadoop/share/hadoop/common/*, /usr/local/hadoop/share/hadoop/common/lib/*, /usr/local/hadoop/share/hadoop/hdfs/*, /usr/local/hadoop/share/hadoop/hdfs/lib/*, /usr/local/hadoop/share/hadoop/mapreduce/*, /usr/local/hadoop/share/hadoop/mapreduce/lib/*, /usr/local/hadoop/share/hadoop/yarn/*, /usr/local/hadoop/share/hadoop/yarn/lib/*</value>
    </property>

    <property>
        <description>
            Number of seconds after an application finishes before the nodemanager's
            DeletionService will delete the application's localized file directory
            and log directory.

            To diagnose Yarn application problems, set this property's value large
            enough (for example, to 600 = 10 minutes) to permit examination of these
            directories. After changing the property's value, you must restart the
            nodemanager in order for it to have an effect.

            The roots of Yarn applications' work directories is configurable with
            the yarn.nodemanager.local-dirs property (see below), and the roots
            of the Yarn applications' log directories is configurable with the
            yarn.nodemanager.log-dirs property (see also below).
        </description>
        <name>yarn.nodemanager.delete.debug-delay-sec</name>
        <value>600</value>
    </property>

    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>
    <property>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
    </property>
    <property>
       <name>yarn.nodemanager.pmem-check-enabled</name>
       <value>false</value>
    </property>
    <property>
        <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
        <value>80</value>
    </property>
    <property>
        <name>yarn.resourcemanager.am.max-attempts</name>
        <value>4</value>
        <description>
        The maximum number of application master execution attempts.
        </description>
    </property>
    <property>
       <name>yarn.resourcemanager.address</name>
       <value>master:8032</value>
    </property>
    <property>
       <name>yarn.resourcemanager.scheduler.address</name>
       <value>master:8030</value>
    </property>
    <property>
       <name>yarn.resourcemanager.webapp.address</name>
       <value>master:8088</value>
    </property>
    <property>
       <name>yarn.resourcemanager.resource-tracker.address</name>
       <value>master:8031</value>
    </property>
    <property>
       <name>yarn.resourcemanager.admin.address</name>
       <value>master:8033</value>
    </property>
</configuration>
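For reference, the 0.0.0.0:8030 seen in the error log below is YARN's built-in default for yarn.resourcemanager.scheduler.address; it shows up whenever the process reading the configuration does not see that property. A small sketch for checking what a given yarn-site.xml actually defines (the sample file is created only for demonstration; point get_yarn_prop at the real file, e.g. /hadoop/master/conf/yarn-site.xml or the copy inside a container):

```shell
#!/usr/bin/env bash
# Print the <value> of a named property from a yarn-site.xml-style file.
# Empty output means the property is absent, so YARN will use its default
# (0.0.0.0:8030 for the scheduler address).
get_yarn_prop() {
  local prop="$1" file="$2"
  grep -A1 "<name>${prop}</name>" "$file" \
    | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Demonstration on a sample fragment; use the real yarn-site.xml in practice.
cat > /tmp/yarn-site-sample.xml <<'EOF'
<configuration>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
</configuration>
EOF

get_yarn_prop yarn.resourcemanager.scheduler.address /tmp/yarn-site-sample.xml  # → master:8030
```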

Hadoop startup commands

docker run -itd \
--name master \
--privileged=true \
--hostname master \
--restart always \
--net hadoop \
-v /hadoop/web-basic/hadoop-http-auth-signature-secret:/home/huser/hadoop/hadoop-http-auth-signature-secret \
-v /hadoop/master/conf/hdfs-site.xml:/hadoop/etc/hadoop/hdfs-site.xml \
-v /hadoop/master/conf/yarn-site.xml:/hadoop/etc/hadoop/yarn-site.xml \
-v /hadoop/master/conf/slaves:/hadoop/etc/hadoop/slaves \
-v /hadoop/master/data:/root/hdfs \
-v /hadoop/master/log:/hadoop/logs \
-p 50070:50070 \
-p 8020:8020 \
-p 8088:8088 \
hadoop:2.8.2-jdk1.8

docker run -itd \
--name slave1 \
--privileged=true \
--hostname slave1 \
--restart always \
--net hadoop \
-p 50071:50075 \
-p 8042:8042 \
-v /hadoop/master/conf/hdfs-site.xml:/hadoop/etc/hadoop/hdfs-site.xml \
-v /hadoop/master/conf/yarn-site.xml:/hadoop/etc/hadoop/yarn-site.xml \
-v /hadoop/slave1/data:/root/hdfs \
-v /hadoop/slave1/log:/hadoop/logs \
hadoop:2.8.2-jdk1.8

Error log from the Flink job submission

2024-05-15 02:05:23,208 INFO  org.apache.flink.runtime.jobmaster.JobMaster                 [] - Could not resolve ResourceManager address pekko.tcp://flink@slave1:46545/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address pekko.tcp://flink@slave1:46545/user/rpc/resourcemanager_*.
2024-05-15 02:05:43,249 INFO  org.apache.flink.runtime.jobmaster.JobMaster                 [] - Could not resolve ResourceManager address pekko.tcp://flink@slave1:46545/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address pekko.tcp://flink@slave1:46545/user/rpc/resourcemanager_*.
2024-05-15 02:05:53,696 INFO  org.apache.hadoop.ipc.Client                                 [] - Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-05-15 02:05:54,708 INFO  org.apache.hadoop.ipc.Client                                 [] - Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-05-15 02:05:55,714 INFO  org.apache.hadoop.ipc.Client                                 [] - Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-05-15 02:05:56,717 INFO  org.apache.hadoop.ipc.Client                                 [] - Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-05-15 02:05:57,722 INFO  org.apache.hadoop.ipc.Client                                 [] - Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-05-15 02:05:58,738 INFO  org.apache.hadoop.ipc.Client                                 [] - Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-05-15 02:05:59,747 INFO  org.apache.hadoop.ipc.Client                                 [] - Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-05-15 02:06:00,760 INFO  org.apache.hadoop.ipc.Client                                 [] - Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-05-15 02:06:01,776 INFO  org.apache.hadoop.ipc.Client                                 [] - Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-05-15 02:06:02,789 INFO  org.apache.hadoop.ipc.Client                                 [] - Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2024-05-15 02:06:02,790 WARN  org.apache.hadoop.ipc.Client                                 [] - Failed to connect to server: 0.0.0.0/0.0.0.0:8030: retries get failed due to exceeded maximum allowed retries number: 10
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_151]
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_151]
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:685) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:788) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at org.apache.hadoop.ipc.Client$Connection.access$3500(Client.java:410) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1550) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at org.apache.hadoop.ipc.Client.call(Client.java:1381) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at org.apache.hadoop.ipc.Client.call(Client.java:1345) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at com.sun.proxy.$Proxy31.registerApplicationMaster(Unknown Source) ~[?:?]
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:106) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_151]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_151]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_151]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at com.sun.proxy.$Proxy32.registerApplicationMaster(Unknown Source) ~[?:?]
    at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:236) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:228) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl.registerApplicationMaster(AMRMClientAsyncImpl.java:168) ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0]
    at org.apache.flink.yarn.YarnResourceManagerDriver.registerApplicationMaster(YarnResourceManagerDriver.java:580) ~[flink-dist-1.19.0.jar:1.19.0]
    at org.apache.flink.yarn.YarnResourceManagerDriver.initializeInternal(YarnResourceManagerDriver.java:189) ~[flink-dist-1.19.0.jar:1.19.0]
    at org.apache.flink.runtime.resourcemanager.active.AbstractResourceManagerDriver.initialize(AbstractResourceManagerDriver.java:92) ~[flink-dist-1.19.0.jar:1.19.0]
    at org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager.initialize(ActiveResourceManager.java:196) ~[flink-dist-1.19.0.jar:1.19.0]
    at org.apache.flink.runtime.resourcemanager.ResourceManager.startResourceManagerServices(ResourceManager.java:283) ~[flink-dist-1.19.0.jar:1.19.0]
    at org.apache.flink.runtime.resourcemanager.ResourceManager.onStart(ResourceManager.java:254) ~[flink-dist-1.19.0.jar:1.19.0]
    at org.apache.flink.runtime.rpc.RpcEndpoint.internalCallOnStart(RpcEndpoint.java:198) ~[flink-dist-1.19.0.jar:1.19.0]
    at org.apache.flink.runtime.rpc.pekko.PekkoRpcActor$StoppedState.lambda$start$0(PekkoRpcActor.java:618) ~[flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) ~[flink-dist-1.19.0.jar:1.19.0]
    at org.apache.flink.runtime.rpc.pekko.PekkoRpcActor$StoppedState.start(PekkoRpcActor.java:617) ~[flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleControlMessage(PekkoRpcActor.java:190) ~[flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:33) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:29) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at scala.PartialFunction.applyOrElse(PartialFunction.scala:127) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at scala.PartialFunction.applyOrElse$(PartialFunction.scala:126) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:29) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:175) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.actor.Actor.aroundReceive(Actor.scala:547) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.actor.Actor.aroundReceive$(Actor.scala:545) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.actor.AbstractActor.aroundReceive(AbstractActor.scala:229) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.actor.ActorCell.receiveMessage(ActorCell.scala:590) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.actor.ActorCell.invoke(ActorCell.scala:557) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.dispatch.Mailbox.processMailbox(Mailbox.scala:280) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.dispatch.Mailbox.run(Mailbox.scala:241) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.dispatch.Mailbox.exec(Mailbox.scala:253) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) [?:1.8.0_151]
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) [?:1.8.0_151]
    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) [?:1.8.0_151]
    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157) [?:1.8.0_151]
2024-05-15 02:06:03,297 INFO  org.apache.flink.runtime.jobmaster.JobMaster                 [] - Could not resolve ResourceManager address pekko.tcp://flink@slave1:46545/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address pekko.tcp://flink@slave1:46545/user/rpc/resourcemanager_*.
2024-05-15 02:06:05,905 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.
2024-05-15 02:06:05,907 INFO  org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - Shutting YarnApplicationClusterEntryPoint down with application status UNKNOWN. Diagnostics Cluster entrypoint has been closed externally..
2024-05-15 02:06:05,909 INFO  org.apache.flink.runtime.blob.BlobServer                     [] - Stopped BLOB server at 0.0.0.0:6124
2024-05-15 02:06:05,912 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Shutting down rest endpoint.
2024-05-15 02:06:05,929 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Removing cache directory /tmp/flink-web-91ab7e74-b196-463f-9321-cd732e69018a/flink-web-ui
2024-05-15 02:06:05,930 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - http://172.18.0.3:45696 lost leadership
2024-05-15 02:06:05,930 INFO  org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Shut down complete.
2024-05-15 02:06:05,930 INFO  org.apache.flink.runtime.entrypoint.component.DispatcherResourceManagerComponent [] - Closing components.
2024-05-15 02:06:05,931 INFO  org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess [] - Stopping SessionDispatcherLeaderProcess.
2024-05-15 02:06:05,931 INFO  org.apache.flink.runtime.resourcemanager.ResourceManagerServiceImpl [] - Stopping resource manager service.
2024-05-15 02:06:05,933 INFO  org.apache.flink.runtime.dispatcher.StandaloneDispatcher     [] - Stopping dispatcher pekko.tcp://flink@slave1:46545/user/rpc/dispatcher_0.
2024-05-15 02:06:05,933 INFO  org.apache.flink.runtime.dispatcher.StandaloneDispatcher     [] - Stopping all currently running jobs of dispatcher pekko.tcp://flink@slave1:46545/user/rpc/dispatcher_0.
2024-05-15 02:06:05,936 INFO  org.apache.flink.runtime.jobmaster.JobMaster                 [] - Stopping the JobMaster for job 'CarTopSpeedWindowingExample' (04ac0e11213894671170f793c7dc69ad).
2024-05-15 02:06:05,937 INFO  org.apache.flink.runtime.dispatcher.StandaloneDispatcher     [] - Job 04ac0e11213894671170f793c7dc69ad reached terminal state SUSPENDED.
2024-05-15 02:06:05,940 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph       [] - Job CarTopSpeedWindowingExample (04ac0e11213894671170f793c7dc69ad) switched from state RUNNING to SUSPENDED.
org.apache.flink.util.FlinkException: Scheduler is being stopped.
    at org.apache.flink.runtime.scheduler.SchedulerBase.closeAsync(SchedulerBase.java:665) ~[flink-dist-1.19.0.jar:1.19.0]
    at org.apache.flink.runtime.jobmaster.JobMaster.stopScheduling(JobMaster.java:1083) ~[flink-dist-1.19.0.jar:1.19.0]
    at org.apache.flink.runtime.jobmaster.JobMaster.stopJobExecution(JobMaster.java:1046) ~[flink-dist-1.19.0.jar:1.19.0]
    at org.apache.flink.runtime.jobmaster.JobMaster.onStop(JobMaster.java:451) ~[flink-dist-1.19.0.jar:1.19.0]
    at org.apache.flink.runtime.rpc.RpcEndpoint.internalCallOnStop(RpcEndpoint.java:239) ~[flink-dist-1.19.0.jar:1.19.0]
    at org.apache.flink.runtime.rpc.pekko.PekkoRpcActor$StartedState.lambda$terminate$0(PekkoRpcActor.java:574) ~[flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.flink.runtime.concurrent.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:83) ~[flink-dist-1.19.0.jar:1.19.0]
    at org.apache.flink.runtime.rpc.pekko.PekkoRpcActor$StartedState.terminate(PekkoRpcActor.java:573) ~[flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.flink.runtime.rpc.pekko.PekkoRpcActor.handleControlMessage(PekkoRpcActor.java:196) ~[flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:33) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:29) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at scala.PartialFunction.applyOrElse(PartialFunction.scala:127) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at scala.PartialFunction.applyOrElse$(PartialFunction.scala:126) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:29) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:175) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:176) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.actor.Actor.aroundReceive(Actor.scala:547) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.actor.Actor.aroundReceive$(Actor.scala:545) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.actor.AbstractActor.aroundReceive(AbstractActor.scala:229) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.actor.ActorCell.receiveMessage(ActorCell.scala:590) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.actor.ActorCell.invoke(ActorCell.scala:557) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.dispatch.Mailbox.processMailbox(Mailbox.scala:280) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.dispatch.Mailbox.run(Mailbox.scala:241) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at org.apache.pekko.dispatch.Mailbox.exec(Mailbox.scala:253) [flink-rpc-akka6778bd91-f9f9-42c3-9931-74761b6b8337.jar:1.19.0]
    at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289) [?:1.8.0_151]
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056) [?:1.8.0_151]
    at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692) [?:1.8.0_151]
    at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157) [?:1.8.0_151]
2024-05-15 02:06:05,948 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph       [] - Source: Car data generator source -> Timestamps/Watermarks (1/1) (79bf2a5120430db0548b2253abb42dec_cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from SCHEDULED to CANCELING.
2024-05-15 02:06:05,948 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph       [] - Source: Car data generator source -> Timestamps/Watermarks (1/1) (79bf2a5120430db0548b2253abb42dec_cbc357ccb763df2852fee8c4fc7d55f2_0_0) switched from CANCELING to CANCELED.
2024-05-15 02:06:05,948 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph       [] - Discarding the results produced by task execution 79bf2a5120430db0548b2253abb42dec_cbc357ccb763df2852fee8c4fc7d55f2_0_0.
2024-05-15 02:06:05,949 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph       [] - GlobalWindows -> Sink: Print to Std. Out (1/1) (79bf2a5120430db0548b2253abb42dec_90bea66de1c231edf33913ecd54406c1_0_0) switched from SCHEDULED to CANCELING.
2024-05-15 02:06:05,949 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph       [] - GlobalWindows -> Sink: Print to Std. Out (1/1) (79bf2a5120430db0548b2253abb42dec_90bea66de1c231edf33913ecd54406c1_0_0) switched from CANCELING to CANCELED.
2024-05-15 02:06:05,949 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph       [] - Discarding the results produced by task execution 79bf2a5120430db0548b2253abb42dec_90bea66de1c231edf33913ecd54406c1_0_0.
2024-05-15 02:06:05,950 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator    [] - Stopping checkpoint coordinator for job 04ac0e11213894671170f793c7dc69ad.
2024-05-15 02:06:05,952 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph       [] - Job 04ac0e11213894671170f793c7dc69ad has been suspended.
2024-05-15 02:06:05,953 INFO  org.apache.flink.runtime.source.coordinator.SourceCoordinator [] - Closing SourceCoordinator for source Source: Car data generator source.
2024-05-15 02:06:05,954 INFO  org.apache.flink.runtime.source.coordinator.SourceCoordinator [] - Source coordinator for source Source: Car data generator source closed.
2024-05-15 02:06:05,955 INFO  org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore [] - Shutting down
2024-05-15 02:06:05,961 WARN  org.apache.flink.client.deployment.application.ApplicationDispatcherBootstrap [] - Application has been cancelled because the ApplicationDispatcherBootstrap is being stopped.
2024-05-15 02:06:05,962 INFO  org.apache.flink.runtime.dispatcher.StandaloneDispatcher     [] - Stopped dispatcher pekko.tcp://flink@slave1:46545/user/rpc/dispatcher_0.

It keeps failing to connect. My configuration specifies master:8030, so why is it connecting to slave1? Also, my Hadoop cluster runs the NameNode and DataNodes in separate containers; once I started a DataNode on the NameNode host, the connection succeeded. Why is that?


2 answers

  • 阿里嘎多学长 2024-05-15 10:14

    The following content was generated jointly by ChatGPT and 阿里嘎多学长; if it helps, please consider accepting the answer:


    The problem you want to solve

    You are deploying Flink on a YARN cluster with Docker; job submission fails with connection errors, so the job cannot start.

    Possible causes

    1. Configuration mismatch: your configuration specifies the ResourceManager address master:8030, but the error log shows Flink retrying pekko.tcp://flink@slave1:46545. Note that this pekko address is Flink's own internal RPC endpoint (the ResourceManager component inside the ApplicationMaster, which YARN happened to launch on slave1), while the actual YARN failure is the retry loop against 0.0.0.0:8030.
    2. Network problems: the Docker network configuration may prevent Flink from resolving or reaching the YARN ResourceManager.
    3. Hadoop configuration: yarn-site.xml sets yarn.resourcemanager.hostname to master, but if the per-service address properties (such as yarn.resourcemanager.scheduler.address) are not visible to the process that needs them, YARN falls back to its defaults, e.g. 0.0.0.0:8030 for the scheduler.

    Suggested fixes

    1. Check the network configuration: make sure the Docker network allows Flink to reach the YARN ResourceManager. If you use a custom network, verify its settings and DNS resolution (e.g. that master resolves from every container).
    2. Synchronize the configuration: make sure the ResourceManager address and ports in the Hadoop configuration that Flink reads match those of the running cluster, on every node, including the NodeManagers that host the ApplicationMaster.
    3. Update the startup command: pass the ResourceManager location explicitly through environment variables or mounted configuration files.
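For the network check, a minimal reachability probe that can be run from inside any of the containers (e.g. via docker exec jobmanager bash): it relies only on bash's /dev/tcp pseudo-device and coreutils timeout, so no extra tools are assumed.

```shell
#!/usr/bin/env bash
# Report whether a TCP port is reachable, using bash's /dev/tcp device.
probe() {
  local host="$1" port="$2"
  if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

# From the jobmanager container these should all print "open":
probe master 8032   # yarn.resourcemanager.address
probe master 8030   # yarn.resourcemanager.scheduler.address
probe master 8020   # HDFS NameNode RPC
```

If 8030 prints "closed" from the jobmanager container but "open" from the host, the problem is the Docker network, not the YARN configuration.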

    Suggested code

    Here is a suggested change to the Flink startup command. One caution: yarn.resourcemanager.hostname and the yarn.resourcemanager.*.address properties are Hadoop settings, not Flink options; Flink's YARN client and ApplicationMaster read them from the yarn-site.xml located via HADOOP_CONF_DIR, so they belong in that file (on every node, including the NodeManagers that launch the ApplicationMaster) rather than in FLINK_PROPERTIES. FLINK_PROPERTIES itself takes newline-separated Flink options:

    docker run \
      ...
      -e FLINK_PROPERTIES="jobmanager.rpc.address: jobmanager" \
      -e HADOOP_CONF_DIR=/hadoop/etc/hadoop \
      ...
      flink:1.19.0-scala_2.12 jobmanager


    How to run it

    1. Make sure your Docker environment is up and the Hadoop and Flink containers are running.
    2. Update the Flink startup command according to the changes above.
    3. Restart the Flink JobManager container with the modified command.
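After restarting, one quick way to confirm the fix is to count the ResourceManager retry lines in the JobManager log: zero matches suggests the ApplicationMaster registered successfully. The log file path below is an assumption based on the -v /flink/log mount; adjust it to whatever file actually appears there (or pipe docker logs jobmanager into the same grep).

```shell
#!/usr/bin/env bash
# Count the YARN ResourceManager connection-retry lines in a log file.
count_rm_retries() {
  grep -c 'Retrying connect to server: 0.0.0.0/0.0.0.0:8030' "$1" 2>/dev/null || true
}

# Hypothetical log file name under the mounted /flink/log directory.
count_rm_retries /flink/log/jobmanager.log
```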

    Expected result

    With the corrected configuration, the Flink JobManager should connect to the YARN ResourceManager, and job submission should no longer fail with connection errors.

