Ayingwa 2025-05-16 12:48

Flink checkpoint error

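The Flink job runs with parallelism 4 and the Kafka topic has 10 partitions. After running for 2 days, the job fails with the error below. How should I handle this? The Flink version is 1.12 and the Kafka version is 3.6.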

java.io.IOException: Could not perform checkpoint 2926 for operator 指令解析 -> Sink: Unnamed (4/4)#0.
    at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:895)
    at org.apache.flink.streaming.runtime.io.CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:113)
    at org.apache.flink.streaming.runtime.io.SingleCheckpointBarrierHandler.processBarrier(SingleCheckpointBarrierHandler.java:127)
    at org.apache.flink.streaming.runtime.io.CheckpointedInputGate.handleEvent(CheckpointedInputGate.java:174)
    at org.apache.flink.streaming.runtime.io.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:151)
    at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:157)
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:67)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:372)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:186)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:575)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:539)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.runtime.checkpoint.CheckpointException: Could not complete snapshot 2926 for operator 指令解析 -> Sink: Unnamed (4/4)#0. Failure reason: Checkpoint was declined.
    at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:226)
    at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:158)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:343)
    at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointStreamOperator(SubtaskCheckpointCoordinatorImpl.java:603)
    at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.buildOperatorSnapshotFutures(SubtaskCheckpointCoordinatorImpl.java:529)
    at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.takeSnapshotSync(SubtaskCheckpointCoordinatorImpl.java:496)
    at org.apache.flink.streaming.runtime.tasks.SubtaskCheckpointCoordinatorImpl.checkpointState(SubtaskCheckpointCoordinatorImpl.java:266)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$9(StreamTask.java:924)
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:47)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:914)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:882)
    ... 13 more
Caused by: java.lang.IllegalStateException: Pending record count must be zero at this point: 1
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.flush(FlinkKafkaProducer.java:1048)
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.preCommit(FlinkKafkaProducer.java:961)
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.preCommit(FlinkKafkaProducer.java:100)
    at org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction.snapshotState(TwoPhaseCommitSinkFunction.java:321)
    at org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer.snapshotState(FlinkKafkaProducer.java:1057)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.trySnapshotFunctionState(StreamingFunctionUtils.java:120)
    at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.snapshotFunctionState(StreamingFunctionUtils.java:101)
    at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.snapshotState(AbstractUdfStreamOperator.java:90)
    at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.snapshotState(StreamOperatorStateHandler.java:197)
    ... 23 more

3 answers

  • 阿里嘎多学长 2025-05-16 12:48

    Compiled by 阿里嘎多学长 from AIGC-generated content.

    Solutions for the Flink checkpoint error

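    A Flink checkpoint fails when an operator cannot snapshot its state. In the stack trace from the question, the innermost cause is "IllegalStateException: Pending record count must be zero at this point: 1", thrown from FlinkKafkaProducer.flush during preCommit, which suggests that the Kafka sink still had one record outstanding when the checkpoint was taken. Some things to check: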

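    1. Check the checkpoint interval: checkpointing is disabled by default, so there is no built-in 5-minute interval; the interval is whatever the job passes to env.enableCheckpointing(...) or sets on CheckpointConfig. You can try a longer interval to reduce how often checkpoints run.
    env.getCheckpointConfig().setCheckpointInterval(300000); // 5 minutes, in milliseconds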
    
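    2. Check the checkpoint storage directory: checkpoints are written to the configured checkpoint directory. If that location runs out of space, Flink may fail to persist checkpoints. Try adding storage or cleaning up unused files.
    3. Check the Kafka partition count: here the Flink parallelism is 4 and the topic has 10 partitions, so every subtask has partitions to read and this setup looks fine. Only when the parallelism exceeds the partition count do some source subtasks sit idle; in that case, reduce the parallelism or add partitions.
    4. Check the state size: very large state can make checkpoints slow or cause them to fail. If the state is too large, try reducing it or enabling Flink's snapshot compression.
    5. Check the Flink configuration: a wrong setting in the Flink configuration file can also cause checkpoint failures, so verify that the checkpoint-related settings are correct.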
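
    To make the partition arithmetic in the list above concrete, here is a minimal sketch of how 10 partitions spread over 4 subtasks. It assumes a plain round-robin rule (partition modulo parallelism); the class PartitionAssignmentSketch and its assign method are hypothetical names used only for illustration, and the Kafka connector's real assignment may start from a different subtask.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model (an assumption, not Flink source code):
// partition p is read by subtask (p % parallelism).
class PartitionAssignmentSketch {

    // Returns, for each subtask index, the partition ids it would read.
    static List<List<Integer>> assign(int partitions, int parallelism) {
        List<List<Integer>> bySubtask = new ArrayList<>();
        for (int s = 0; s < parallelism; s++) {
            bySubtask.add(new ArrayList<>());
        }
        for (int p = 0; p < partitions; p++) {
            bySubtask.get(p % parallelism).add(p);
        }
        return bySubtask;
    }

    public static void main(String[] args) {
        // The setup from the question: 10 partitions, parallelism 4.
        List<List<Integer>> assignment = assign(10, 4);
        for (int s = 0; s < assignment.size(); s++) {
            System.out.println("subtask " + s + " -> partitions " + assignment.get(s));
        }
    }
}
```

    Under this model two subtasks read three partitions and two read two, so no subtask is idle. With fewer partitions than subtasks, for example assign(2, 4), the last subtasks get an empty list.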

    If none of these resolves the problem, check the Flink log files (JobManager and TaskManager) for more detailed error information.
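
    When reading the logs, it may help to have a mental model of the check behind the message "Pending record count must be zero at this point: 1". The sketch below is a simplified model under stated assumptions, not the FlinkKafkaProducer source: it assumes the sink counts each record handed to the producer, decrements the counter when the send is acknowledged, and expects the counter to be zero after a flush. The class PendingRecordsSketch and its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicLong;

// Simplified model (an assumption, not the connector's code) of a sink
// that tracks records sent but not yet acknowledged.
class PendingRecordsSketch {
    private final AtomicLong pendingRecords = new AtomicLong();
    private final Queue<Runnable> inFlightCallbacks = new ArrayDeque<>();

    // Hand a record to the "producer": count it and queue its callback.
    void send(String record) {
        pendingRecords.incrementAndGet();
        inFlightCallbacks.add(() -> pendingRecords.decrementAndGet());
    }

    // Simulate the broker acknowledging up to n in-flight records.
    void acknowledge(int n) {
        for (int i = 0; i < n && !inFlightCallbacks.isEmpty(); i++) {
            inFlightCallbacks.poll().run();
        }
    }

    // The check performed before the checkpoint may proceed.
    void checkFlushed() {
        long pending = pendingRecords.get();
        if (pending != 0) {
            throw new IllegalStateException(
                "Pending record count must be zero at this point: " + pending);
        }
    }

    public static void main(String[] args) {
        PendingRecordsSketch sink = new PendingRecordsSketch();
        sink.send("a");
        sink.send("b");
        sink.acknowledge(1); // only one of the two records is acknowledged
        try {
            sink.checkFlushed();
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

    In this model the check fails whenever a record was counted but its acknowledgement never decremented the counter, so the producer-side errors logged just before the failed checkpoint are the most useful place to look.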



Question events

  • Question created on May 16