I'm querying data with Spark SQL inside a virtual machine.
Here is my code:
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.sql.Encoder
import spark.implicits._
case class cars(city: String, keyword: String, totals: String, title: String, time: String, mileage: String, output: String, gearbox: String, price: String, standard: String, nprice: String, transfer: String, output1: String, gearbox1: String, color: String, keys: String)
// Read the tab-separated file, map each row's 16 fields onto the case class, and convert to a DataFrame
val carsDF = spark.sparkContext.textFile("file:///usr/local/bigdatacase/dataset/cars.txt").map(_.split("\t")).map(attributes => cars(attributes(0), attributes(1), attributes(2), attributes(3), attributes(4), attributes(5), attributes(6), attributes(7), attributes(8), attributes(9), attributes(10), attributes(11), attributes(12), attributes(13), attributes(14), attributes(15).trim)).toDF()
carsDF.createOrReplaceTempView("cars")
carsDF.groupBy("city").count().show()
But the result looks like this:
+----+-----+
|city|count|
+----+-----+
| ?沈阳| 1|
| ?大同| 1|
| ?宁波| 1|
| ?徐州| 2|
| 南京| 41|
| ?厦门| 3|
| ?青岛| 1|
| ?长春| 5|
| ?珠海| 3|
| ?烟台| 1|
| ?邯郸| 1|
| 徐州| 915|
| 晋江| 1043|
| 长沙| 174|
| 沈阳| 984|
| 张家口| 894|
| ?武汉| 1|
| 哈尔冰| 194|
| 西安| 159|
| ?晋江| 4|
+----+-----+
only showing top 20 rows
I want the query results without the question marks. How can I do that?
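
One possible explanation, offered as an assumption rather than a diagnosis: only a few rows show the "?", and the same city also appears without it (for example 徐州 with 915 and ?徐州 with 2). That pattern would fit an invisible leading character in some rows of cars.txt, such as a byte order mark (U+FEFF), a zero-width space, or a non-breaking space, which the terminal cannot render and prints as "?". The sketch below is a minimal, Spark-free helper under that assumption. The names CityCleaner, cleanCity, and codePoints are hypothetical and not part of the original code.

```scala
// Hypothetical helper (not from the original post): strips invisible
// characters from both ends of a field and helps inspect what a field contains.
object CityCleaner {
  // Treated as "invisible": whitespace and space separators (including
  // U+00A0 and U+3000), control characters, and Unicode format characters
  // (category Cf, which includes U+FEFF and U+200B).
  private def isInvisible(c: Char): Boolean =
    c.isWhitespace || c.isSpaceChar || c.isControl ||
      Character.getType(c) == Character.FORMAT.toInt

  // Remove invisible characters from the start and the end of the string.
  def cleanCity(raw: String): String =
    raw.dropWhile(isInvisible).reverse.dropWhile(isInvisible).reverse

  // Diagnostic: render every character as a U+XXXX code point, so the
  // character displayed as "?" can be identified.
  def codePoints(s: String): String =
    s.map(c => f"U+${c.toInt}%04X").mkString(" ")
}
```

To try it, wrap the first field in the map step as CityCleaner.cleanCity(attributes(0)) and rerun the groupBy. It may be worth first applying CityCleaner.codePoints to one of the affected city values: if it reports U+003F, the file contains a literal question mark and this helper will not remove it, in which case something like stripPrefix("?") on that field would be the thing to try instead.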