分毫析厘 2023-10-10 19:41
Resolved

LDA topic analysis and visualization

I have recently been learning LDA topic analysis and wrote a small demo. At the final visualization step, the line vis_data = gensimvis.prepare(lda_model, corpus, dictionary) fails with this error: "A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker."


```python
from gensim import corpora
from gensim.models import LdaModel
from gensim.models.coherencemodel import CoherenceModel
import pyLDAvis
import pyLDAvis.gensim_models as gensimvis

# Sample corpus: each document is a list of tokens
documents = [
    ["apple", "banana", "orange", "fruit", "juice"],
    ["car", "vehicle", "drive", "road", "traffic"],
    ["python", "programming", "code", "language", "computer"],
    ["car", "apple", "computer", "orange", "apple"],
    ["apple", "car", "orange", "car", "computer"],
    # add more documents...
]

# Build the dictionary (maps each token to an integer id)
dictionary = corpora.Dictionary(documents)

# Convert each document to a bag-of-words list of (token_id, count) pairs
corpus = [dictionary.doc2bow(doc) for doc in documents]

# Train the LDA model
num_topics = 3  # number of topics
lda_model = LdaModel(corpus, num_topics=num_topics, id2word=dictionary, passes=15)

# Print each topic with its top words
topics = lda_model.print_topics(num_words=5)
for topic in topics:
    print(topic)

# Compute the topic coherence score (optional)
coherence_model = CoherenceModel(model=lda_model, texts=documents, dictionary=dictionary, coherence='c_v')
coherence_score = coherence_model.get_coherence()
print("Coherence Score:", coherence_score)

# Convert to pyLDAvis format and visualize (the error is raised on the prepare call)
vis_data = gensimvis.prepare(lda_model, corpus, dictionary)
pyLDAvis.display(vis_data)
```

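Why does this happen? The dataset is tiny and the only thing running on the machine is Jupyter. I would appreciate help from anyone who knows this area; it has been bothering me for several days.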

2 answers

  • 分毫析厘 2023-10-20 21:34

    Upgrade or downgrade the joblib library so that its version is compatible with the installed pyLDAvis.

    Accepted by the asker as the best answer.
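
Before changing anything, it helps to see which versions are actually installed. The sketch below is only an illustration: `installed_versions` is a hypothetical helper name, and the list of packages to check (joblib, pyLDAvis, gensim) is an assumption based on the libraries mentioned in this thread. The thread does not say which joblib version pairs with which pyLDAvis version, so the output is a starting point for comparison, not a fix in itself.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return a dict mapping each package name to its installed
    version string, or None if the package is not installed."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Package list is an assumption: the libraries named in this thread.
    report = installed_versions(["joblib", "pyLDAvis", "gensim"])
    for name, ver in report.items():
        print(f"{name}: {ver or 'not installed'}")
```

Run it in the same Jupyter kernel that raises the error, since a notebook kernel can use a different environment than the terminal. If the versions cannot be changed, a commonly used workaround is to pass n_jobs=1 to gensimvis.prepare, which forwards it to pyLDAvis.prepare and avoids starting worker processes; the thread does not confirm whether that helps in this case.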

Question events

  • Closed by the system on October 28
  • Answer accepted on October 20
  • Question created on October 10
