请看下面的代码
object KeybyTest {
def main(args: Array[String]): Unit = {
val env = StreamExecutionEnvironment.getExecutionEnvironment
val random = new Random()
env.setParallelism(4)
val parallelism = env.getParallelism
val stream = env.addSource(new SensorSource)
.map(v=>{
(v.id,(random.nextInt().abs % parallelism).toString) //_.2 为分组字段
})
.keyBy(1)
.print()
env.execute()
}
}
设置并行度为4
随机生成0到3的数作为分区字段,发现有一个subtask的没有数据进来.
输出结果如下:
2> (sensor_0,3)
2> (sensor_1,3)
2> (sensor_2,1)
2> (sensor_4,3)
2> (sensor_5,1)
2> (sensor_1,1)
3> (sensor_0,0)
1> (sensor_2,2)
3> (sensor_4,0)
3> (sensor_5,0)
1> (sensor_2,2)
有两个key值跑到同一个分区去了,这怎么解决?