jiushizheweidao asked on 2015.07.15 20:10

Python vector space model similarity calculation: help needed, the script keeps failing when run
# Compute the similarity between two strings s and s1 using the vector space model

from math import sqrt
from collections import Counter
import re

def vsm_distance(s,s1):

    # Convert s and s1 to dictionary form (dictionary{word: term frequency})
    mylist=re.findall(r"\w+",s)
    ss=Counter( mylist)
    mylist1=re.findall(r"\w+",s1)
    ss1=Counter( mylist1)
    # Vector space computation
    c = set(ss.keys())&set(ss1.keys())
    if not c:
        return 0
    x = sum([ss.get(i)*ss1.get(i) for i in c])
    sq1 = sqrt(sum([pow(ss.get(i),2) for i in ss.values()]))
    sq2 = sqrt(sum([pow(ss1.get(i),2) for i in ss1.values()]))
    p = float(x)/(sq1*sq2)
    return p

s="KBA is to give a chance to non-popular entities information to be updated as soon as a useful information is published on the internet. The KBA organizershave built up a stream-corpus which is a huge corpus of timestamped web documents that can be processed chronologically. Hence it is possible to simulate a real time system. The documents come from newswires, blogs, forums, review, memetracker….. In addition, a set of target entities, coming from wikipedia or from twitter, has been selected for their ambiguity or unpopularity. And last but not least, more than 60000 documents have been annotated so that systems can train on it. The train period starts on documents published from october 2011 until februray, and the test period starts from februray 2012 to februray 2013."

s1="The KBA track is divided in two tasks:CCR(Cumulative Citation Recommendation) and SSF(Streaming Slot Filling). CCR task is to filter out documents worth citing in a profile of an entity(e.g., wikipedia or freebase article). SSF task is to detect changes on given slots for each of the target entities. This article is focused only on CCR task."

vsm_distance(s,s1)

6 answers

oyljerry   2015.07.15 20:54

When you say it fails to run, do you get a syntax error, or is the result incorrect?

jiushizheweidao: Could you help me solve this?
more than 2 years ago
jiushizheweidao: Below is the output after running it
more than 2 years ago

jiushizheweidao   2015.07.16 09:27

Traceback (most recent call last):
File "", line 1, in
File "D:\Python\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 682, in runfile
execfile(filename, namespace)
File "D:\Python\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 71, in execfile
exec(compile(scripttext, filename, 'exec'), glob, loc)
File "E:/我的文档/Python Scripts/filefour.py", line 23, in
File "E:/我的文档/Python Scripts/filefour.py", line 17, in vsm_distance
TypeError: unsupported operand type(s) for ** or pow(): 'NoneType' and 'int'
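
The last line of the traceback suggests that pow() received None instead of a number. In the sq1 and sq2 lines of the posted function, the loop variable i comes from ss.values(), so it is a count rather than a word, and ss.get(i) then looks that count up as if it were a key. A minimal reproduction, assuming a made-up three-word list (the sample words are illustrative only and not taken from the script's input):

```python
from collections import Counter

ss = Counter(["the", "the", "KBA"])

# ss.values() yields the counts (2 and 1), not the words, so each
# ss.get(i) looks up a count as if it were a key and finds nothing.
lookups = [ss.get(i) for i in ss.values()]
print(lookups)  # [None, None]

# Squaring one of those None results raises the TypeError from the traceback.
try:
    pow(lookups[0], 2)
    error = None
except TypeError as exc:
    error = exc
print(error)
```

Iterating over the words themselves, or squaring the values directly, avoids the None lookups.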

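oyljerry   2015.07.16 09:31

It looks like something is wrong with the data you pass in (s, s1, and so on), which makes the later processing fail. Try printing the values right at the start of the function and see what you get.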
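
For reference, s and s1 in the posted script are ordinary strings, so printing them is unlikely to reveal the problem. The TypeError in the traceback points instead at the norm computation, where counts are used as dictionary keys. Below is a minimal corrected sketch, assuming the intent is cosine similarity over raw word counts. The function name cosine_similarity and the short test strings are illustrative choices, not part of the original script:

```python
import re
from collections import Counter
from math import sqrt


def cosine_similarity(s, s1):
    """Cosine similarity between the word-count vectors of two strings."""
    counts = Counter(re.findall(r"\w+", s))
    counts1 = Counter(re.findall(r"\w+", s1))
    # Only words present in both strings contribute to the dot product.
    shared = set(counts) & set(counts1)
    if not shared:
        return 0.0
    dot = sum(counts[w] * counts1[w] for w in shared)
    # Square the counts directly instead of looking them up as keys.
    norm = sqrt(sum(v * v for v in counts.values()))
    norm1 = sqrt(sum(v * v for v in counts1.values()))
    return dot / (norm * norm1)


print(cosine_similarity("a b b", "b c"))
```

Running a script does not echo a bare expression, so the call is wrapped in print here. The original final line vsm_distance(s,s1) would show nothing even once the function works.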

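CSDNXIAON   2015.07.16 15:31

Vector space model document similarity calculation, an implementation (C#)
----------------------Hello comrade, I am Xiao N, the CSDN Q&A bot, sent by the organization to offer you a reference answer. The programming has not yet succeeded; comrade, you must keep working!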
