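Test environment: Spark 2.4 + Python 3.7
Production environment: Spark 2.2 + Python 3.7 (this environment cannot be changed)
The job runs fine in the test environment, but in production it fails with the following error: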
21/08/19 16:17:17 ERROR Executor: Exception in task 0.0 in stage 18.0 (TID 18)
org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/usr/bch/1.5.0/spark/python/lib/pyspark.zip/pyspark/rdd.py", line 1339, in takeUpToNumLeft
StopIteration

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/bch/1.5.0/spark/python/lib/pyspark.zip/pyspark/worker.py", line 177, in main
    process()
  File "/usr/bch/1.5.0/spark/python/lib/pyspark.zip/pyspark/worker.py", line 172, in process
    serializer.dump_stream(func(split_index, iterator), outfile)
  File "/usr/bch/1.5.0/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 268, in dump_stream
    vs = list(itertools.islice(iterator, batch))
RuntimeError: generator raised StopIteration
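
For context, the final "RuntimeError: generator raised StopIteration" looks like the PEP 479 behavior that Python 3.7 enables by default: a StopIteration that escapes from inside a generator is converted into a RuntimeError. The sketch below reproduces that pattern in plain Python, without Spark. The function names take_up_to and take_up_to_safe are hypothetical and are not PySpark's code; they only assume that the failing generator calls next() on an exhausted iterator, which is what the traceback suggests.

```python
def take_up_to(iterator, num):
    """Hypothetical generator: calls next() without guarding against exhaustion."""
    taken = 0
    while taken < num:
        # If the iterator runs out, next() raises StopIteration inside the
        # generator; on Python 3.7+ this surfaces as RuntimeError (PEP 479).
        yield next(iterator)
        taken += 1


def take_up_to_safe(iterator, num):
    """Same logic, but ends the generator cleanly when the input is exhausted."""
    taken = 0
    while taken < num:
        try:
            yield next(iterator)
        except StopIteration:
            return  # a plain return is the PEP 479-compatible way to stop
        taken += 1


def demo():
    """Ask for 5 items from a 2-item iterator with both variants."""
    try:
        unsafe_result = list(take_up_to(iter([1, 2]), 5))
    except RuntimeError as exc:
        unsafe_result = str(exc)
    safe_result = list(take_up_to_safe(iter([1, 2]), 5))
    return unsafe_result, safe_result


if __name__ == "__main__":
    print(demo())
```

If this is indeed the cause, the error would be triggered whenever an action asks for more rows than a partition holds, so it would depend on the Spark and Python version combination rather than on my job's logic.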
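Given that the production environment cannot be changed, how can I resolve this error there?
Thanks in advance for any pointers.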