qq_15768629 2017-11-30 04:51 Acceptance rate: 0%
Views: 2734
Closed

A problem implementing item2vec in Python

from gensim.models import Word2Vec
import logging  
import sys
reload(sys)
sys.setdefaultencoding('utf8')
from sklearn.model_selection import train_test_split
c = []

def load_sequence(from_path):
    with open(from_path) as fp:
        # Each line of the file is one comma-separated item sequence
        for line in fp:
            c.append(line.strip().split(","))

def main():
    logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)  
    load_sequence('E:\\wordpython\\1105\\to666.txt')  # load the corpus
    c_train,c_text = train_test_split(c,test_size=0.2)
    # Train a skip-gram model. sg=1 selects skip-gram (gensim defaults to sg=0, CBOW); the default window is 5.
    model = Word2Vec(c_train, size=20, window=3, min_count=1, workers=1, iter=3, sample=1e-4, negative=20, sg=1)
    test_size = float(len(c_text))
    hit = 0.0
    for current_pattern in c_text:
        if len(current_pattern) < 2:
            test_size -= 1.0
            continue
        # Reduce the current pattern in the test set by removing the last item
        last_item = current_pattern.pop()

        # Keep the items of the reduced pattern that are also in the model's vocabulary
        items = [it for it in current_pattern if it in model.wv.vocab]
        if len(items) <= 2:
            test_size -= 1.0
            continue

        # Predict the most similar items to items
        prediction = model.wv.most_similar(positive=items, topn=20)

        # Check if the item that we have removed from the test, last_item, is among
        # the predicted ones.
        for predicted_item, score in prediction:
            if predicted_item == last_item:
                hit += 1.0
    print('Accuracy like measure: {}'.format(hit / test_size))

if __name__ == "__main__":  
    main()

Why do I get No handlers could be found for logger "gensim.models.doc2vec"? I'm not even using doc2vec.
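
One likely explanation, offered as an assumption rather than a confirmed diagnosis: `from gensim.models import Word2Vec` loads the whole `gensim.models` package, doc2vec included, and in Python 2 the logging module prints "No handlers could be found for logger ..." when a logger emits a record before any handler has been configured. In the script above, `logging.basicConfig` only runs inside `main()`, well after the import. The sketch below uses only the standard library; `simulate_library_import` and its message text are hypothetical stand-ins for whatever gensim logs at import time. It shows that a handler attached beforehand receives such a record.

```python
import io
import logging


def configure_logging(stream):
    """Attach a formatted stream handler to the root logger."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s : %(levelname)s : %(message)s")
    )
    root = logging.getLogger()
    root.addHandler(handler)
    root.setLevel(logging.INFO)
    return handler


def simulate_library_import():
    """Hypothetical stand-in for a module that logs while it is being imported."""
    logger = logging.getLogger("gensim.models.doc2vec")
    logger.warning("hypothetical import-time warning")


buffer = io.StringIO()
configure_logging(buffer)    # 1. configure logging first
simulate_library_import()    # 2. only then trigger the "import"
captured = buffer.getvalue()
print(captured)
```

If that assumption holds, moving the `logging.basicConfig(...)` call above the gensim import in the script should let the real message through in place of the "No handlers" notice.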


1 answer

  • qq_36873653 2017-11-30 06:07

    # Check if the item that we have removed from the test, last_item, is among
    # the predicted ones.
