download12749 2009-03-02 17:21

How to group similar news [closed]

I'm trying to build an RSS news-fetching server that collects all the news about a topic from a few sites. These sites often carry similar stories with nearly the same information. How could such stories be grouped, for example by displaying the first one followed by a summary of links to the others?

Does anybody have experience with this?


3 answers

  • duandi6531 2009-03-02 17:25

    Look for keywords (e.g., split the description into words and remove any of the 100 or so most common words), then cluster the items by co-occurrence of these keywords. Often just looking at the longest word will give you a good quick approximation.
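
A minimal Python sketch of that keyword step, under stated assumptions: the `STOP_WORDS` set here is a tiny illustrative stand-in for a real list of the 100 or so most common words, and the function names `extract_keywords` and `longest_keyword` are made up for this example.

```python
import re

# Assumption: a tiny illustrative stop list. A real one would hold
# roughly the 100 most common words of the language.
STOP_WORDS = frozenset({
    "the", "a", "an", "and", "or", "of", "to", "in", "on", "for",
    "is", "are", "was", "with", "at", "by", "as", "it", "its",
    "that", "this", "from",
})


def extract_keywords(text):
    """Lowercase the text, split it into words, and drop stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {word for word in words if word not in STOP_WORDS}


def longest_keyword(text):
    """Quick approximation: the longest remaining word, or None if none remain."""
    keywords = extract_keywords(text)
    if not keywords:
        return None
    # Sorting key (length, word) makes ties resolve deterministically.
    return max(keywords, key=lambda word: (len(word), word))
```

Two items can then be compared by the size of the intersection of their keyword sets; how large an overlap counts as "similar" is a tuning choice.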

    In other words, if you have a table of "topic groups", you can assign each item to a new or existing topic group as it comes in. First, see if any of the existing topic groups shares enough keywords with the new item; if one does, put the item there. If none does, create a new topic group with the item's keywords and add the item as its first member.
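
The assignment step described above could be sketched roughly as follows. This is an illustration, not a definitive implementation: the in-memory list of dicts stands in for the "topic groups" table, and both the name `assign_to_group` and the `min_shared=2` threshold for "enough keywords" are assumptions.

```python
def assign_to_group(groups, item_id, keywords, min_shared=2):
    """Put an item into the first group sharing enough keywords, else start a new group.

    groups     -- list of dicts, each with "keywords" (set) and "members" (list)
    item_id    -- any identifier for the incoming news item
    keywords   -- set of keywords already extracted from the item
    min_shared -- assumed threshold for "enough" shared keywords
    Returns the index of the group the item ended up in.
    """
    for index, group in enumerate(groups):
        if len(group["keywords"] & keywords) >= min_shared:
            group["members"].append(item_id)
            return index
    # No existing group matched: the item founds a new group with its own keywords.
    groups.append({"keywords": set(keywords), "members": [item_id]})
    return len(groups) - 1


groups = []
assign_to_group(groups, "site-a/1", {"apple", "iphone", "launch"})
assign_to_group(groups, "site-b/7", {"apple", "iphone", "price"})
assign_to_group(groups, "site-c/3", {"election", "senate", "vote"})
```

In this sketch a group keeps the keywords of its first member and never updates them. The first member of each group can serve as the displayed story, with the remaining members listed as related links.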

    -- MarkusQ
