qq_53401077
2022-05-11 17:38

Why does knn.score(X_train, Y_train) raise "The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()."?

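Problem description and background

This is my first time using the sklearn library. Everything works with the datasets bundled with the library, but it fails with data I scraped myself. I would be very grateful if someone could point me in the right direction.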

Relevant code

Code

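import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import OneHotEncoder

# data_X, data_Y: the scraped features and labels, loaded earlier (not shown)
encoder = OneHotEncoder()
data_X = encoder.fit_transform(data_X)  # returns a scipy sparse matrix
# the labels are one-hot encoded too, so data_Y also becomes a sparse matrix
data_Y = encoder.fit_transform(np.array(data_Y).reshape(-1, 1))
X_train, X_test, Y_train, Y_test = train_test_split(data_X, data_Y, test_size=0.3, random_state=41)
print(X_train)


knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, Y_train)
#knn.predict(Y_test)
knn.score(X_train, Y_train)  # this line raises the ValueError shown below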

Error

Traceback (most recent call last):
  File "C:\Users\20673\AppData\Local\Programs\Python\Python39\lib\code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "C:\Program Files\JetBrains\PyCharm 2021.3.1\plugins\python\helpers\pydev\_pydev_bundle\pydev_umd.py", line 198, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm 2021.3.1\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "D:/untitled/3.py", line 31, in <module>
    knn.score(X_train,Y_train)
  File "D:\untitled\venv\lib\site-packages\sklearn\base.py", line 500, in score
    return accuracy_score(y, self.predict(X), sample_weight=sample_weight)
  File "D:\untitled\venv\lib\site-packages\sklearn\utils\validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "D:\untitled\venv\lib\site-packages\sklearn\metrics\_classification.py", line 202, in accuracy_score
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "D:\untitled\venv\lib\site-packages\sklearn\metrics\_classification.py", line 85, in _check_targets
    type_pred = type_of_target(y_pred)
  File "D:\untitled\venv\lib\site-packages\sklearn\utils\multiclass.py", line 261, in type_of_target
    if is_multilabel(y):
  File "D:\untitled\venv\lib\site-packages\sklearn\utils\multiclass.py", line 163, in is_multilabel
    labels = np.unique(y)
  File "<__array_function__ internals>", line 5, in unique
  File "D:\untitled\venv\lib\site-packages\numpy\lib\arraysetops.py", line 272, in unique
    ret = _unique1d(ar, return_index, return_inverse, return_counts)
  File "D:\untitled\venv\lib\site-packages\numpy\lib\arraysetops.py", line 333, in _unique1d
    ar.sort()
  File "D:\untitled\venv\lib\site-packages\scipy\sparse\base.py", line 283, in __bool__
    raise ValueError("The truth value of an array with more than one "
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().
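
The last frames of the traceback are in scipy\sparse\base.py, which suggests the labels reached accuracy_score as a scipy sparse matrix. That would fit the code above: OneHotEncoder returns a sparse matrix by default, and data_Y was passed through it. The following is only a minimal sketch of one possible fix, not a confirmed solution for the original data: it keeps the labels one-dimensional with LabelEncoder and one-hot encodes only the features. The sample values in data_X and data_Y are hypothetical stand-ins for the scraped data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder, OneHotEncoder

# Hypothetical stand-in for the scraped data: two categorical feature
# columns containing Chinese text, and one categorical label per row.
data_X = np.array([
    ["北京", "晴"], ["上海", "雨"], ["北京", "雨"], ["上海", "晴"],
] * 5)
data_Y = np.array(["高", "低", "低", "高"] * 5)

# Features: one-hot encode. A sparse result is fine for X.
X = OneHotEncoder(handle_unknown="ignore").fit_transform(data_X)

# Labels: keep them as a plain 1-D array. LabelEncoder maps each class
# to an integer instead of producing a sparse one-hot matrix.
y = LabelEncoder().fit_transform(data_Y)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=41
)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

train_acc = knn.score(X_train, y_train)
test_acc = knn.score(X_test, y_test)
pred = knn.predict(X_test)  # predict takes features, not labels

print(train_acc, test_acc)
```

Note that scikit-learn classifiers also accept string labels directly, so the LabelEncoder step is optional. The essential change in this sketch is that the labels are no longer one-hot encoded.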

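My approach and what I have tried

Because the data contains Chinese characters, I searched online and found that one-hot encoding can be used. However, I still do not understand what the encoded data means when it is printed in this parenthesised form: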
(0, 188) 1.0
(0, 1175) 1.0
(1, 565) 1.0
(1, 1802) 1.0
(2, 328) 1.0
(2, 1827) 1.0
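
The parenthesised lines above are how a scipy sparse matrix prints itself: each line reads (row, column) value, and only the non-zero entries are stored, so every position not listed is 0. As a rough illustration in plain Python (no sklearn involved), the six printed entries can be grouped by row. The helper name nonzero_columns is made up for this sketch.

```python
# The six printed entries from the question, as (row, column, value) triples.
entries = [
    (0, 188, 1.0), (0, 1175, 1.0),
    (1, 565, 1.0), (1, 1802, 1.0),
    (2, 328, 1.0), (2, 1827, 1.0),
]


def nonzero_columns(entries):
    """Group the column indices of the stored (non-zero) values by row."""
    by_row = {}
    for row, col, _value in entries:
        by_row.setdefault(row, []).append(col)
    return by_row


columns = nonzero_columns(entries)
for row, cols in sorted(columns.items()):
    print(f"row {row}: 1.0 at columns {cols}, 0 everywhere else")
```

Each row here has exactly two entries equal to 1.0, which would be consistent with two original feature columns each contributing one "hot" position, though that is only a guess about the scraped data. On the real matrix, calling X_train.toarray() returns the equivalent dense NumPy array, which can be large when there are thousands of columns.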

Desired result

I hope someone can help me answer this question.
