proplume
2019-03-01 09:09
Acceptance rate: 100%
Viewed 1.0k times

Error when running Spark code from PyCharm

Versions:
win10
spark-2.3.0-bin-hadoop2.6
python3.5
jdk1.8.0_161
The exact same code runs fine in Jupyter.
The code being executed:

        try:
            sc.stop()
        except Exception:
            pass
        from pyspark import SparkContext
        sc = SparkContext()
        # sc.master
        rdd = sc.textFile("rating2.csv")
        ratings = rdd.map(lambda line: line.split(";"))
        # ALS expects (user, product, rating) tuples with numeric fields
        ratingsRDD = ratings.map(lambda x: (int(x[0]), int(x[1]), float(x[2])))
        ratingsRDD.persist()
        # train the model
        from pyspark.mllib.recommendation import ALS
        model = ALS.train(ratingsRDD, 5, 5, 0.01)
        # recommend users for a book (id_book is defined elsewhere)
        user_com = model.recommendUsers(int(id_book), 6)

PyCharm error output:

19/03/01 08:50:45 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[main,5,main]
java.util.NoSuchElementException: key not found: _PYSPARK_DRIVER_CALLBACK_HOST
    at scala.collection.MapLike$class.default(MapLike.scala:228)
    at scala.collection.AbstractMap.default(Map.scala:59)
    at scala.collection.MapLike$class.apply(MapLike.scala:141)
    at scala.collection.AbstractMap.apply(Map.scala:59)
    at org.apache.spark.api.python.PythonGatewayServer$$anonfun$main$1.apply$mcV$sp(PythonGatewayServer.scala:50)
    at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1302)
    at org.apache.spark.api.python.PythonGatewayServer$.main(PythonGatewayServer.scala:37)
    at org.apache.spark.api.python.PythonGatewayServer.main(PythonGatewayServer.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Process finished with exit code -1073740791 (0xC0000409)


1 answer

  • proplume 2019-03-10 23:46
    Accepted answer

    Locate the python\lib folder inside your Spark installation; for example, mine is F:\spark\spark-2.3.0-bin-hadoop2.6\python\lib


    Extract the two zip archives there (pyspark.zip and the py4j-*-src.zip), then copy the extracted packages into the Lib\site-packages folder of the Python environment PyCharm uses.

    (In PyCharm: File → Settings → Project Interpreter shows which Python environment your project uses.)

    For example, mine is C:\Users\boos\PycharmProjects\untitled\venv\Lib\site-packages


    Spark code now runs successfully in PyCharm.


    Note: if Lib\site-packages already contains pyspark or py4j packages installed some other way, it is best to remove them first and then paste in the two packages from the Spark distribution as described above, so the Python packages match your Spark build (a version mismatch between them is a common cause of the _PYSPARK_DRIVER_CALLBACK_HOST error).
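    An alternative to copying files is to put the Spark distribution's own Python sources on sys.path at the top of the script (this is essentially what the findspark package does). A minimal sketch, assuming SPARK_HOME points at an unpacked Spark build such as the one above; add_spark_to_path is a hypothetical helper name:

    ```python
    import glob
    import os
    import sys

    def add_spark_to_path(spark_home):
        """Make Spark's bundled PySpark importable without copying files.

        spark_home is assumed to point at an unpacked Spark distribution,
        e.g. F:\\spark\\spark-2.3.0-bin-hadoop2.6 on the asker's machine.
        """
        python_dir = os.path.join(spark_home, "python")
        # Spark ships Py4J as a versioned source zip under python/lib
        py4j_zips = glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip"))
        if not py4j_zips:
            raise FileNotFoundError("no py4j-*-src.zip found under " + python_dir)
        paths = [python_dir, py4j_zips[0]]
        for p in paths:
            if p not in sys.path:
                sys.path.insert(0, p)
        # spark-submit reads SPARK_HOME to locate the Spark jars
        os.environ["SPARK_HOME"] = spark_home
        return paths
    ```

    Calling this before `from pyspark import SparkContext` keeps the PySpark sources tied to the Spark build itself, so they cannot drift out of sync with the installed Spark version.
    
    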

