dongshuzui0335 2009-08-17 17:11

Reading XML data only when it has been updated

I can already parse RSS with PHP. What I want is to fetch only the updated content, and to do nothing when the feed has no new entries.

For example, I have this RSS file. If there is no new content, nothing should happen; if there is new content, I want to send my users only the latest RSS update and not resend what they already have. I'm parsing and sending the title and link only.

I use a cron job to check for updates every hour. My question is: how can I tell that the feed has been updated and is not the same as the last one? Here's the PHP file that I'm using to read the RSS. Do I write the last content to a file and compare the two, or is there another way to determine that the content is now different from the last?

Update: I had to resurrect this post because I'm still trying to get it to work. Although I accepted a few answers, they have been very hard to implement. For example, the hashing option looked like a good idea initially, but as thousands of RSS feeds would be checked, it would be almost impossible to hash them all.

Someone also suggested HTTP caching, but I couldn't find a simple demo, so I'm practically stuck.

Any further suggestions would be highly appreciated.

5 answers

  • douya6229 2009-10-13 22:46

    You could use hashes for this, in two ways:

    1. To ease updating: when requesting an update, you hash the whole feed and compare the result with the hash from the last run. If they are identical, you know that the feed did not change and can stop before even parsing it.
    2. To identify changes: while parsing, you hash each item and compare it to the hashes stored from previous runs. If it matches one, you know that you've seen it before.
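
    The two uses above could be sketched roughly as follows. This is only an illustration under some assumptions: the feed is RSS 2.0 (`channel` containing `item` elements), the hash is `md5` (the answer does not name an algorithm), the last feed hash lives in a plain file, and the function names `feedHasChanged` and `newItems` are made up for this example.

    ```php
    <?php
    // Returns true when the raw feed body differs from the hash stored in $hashFile.
    // $hashFile is a hypothetical path; any persistent store would do.
    function feedHasChanged(string $rawFeed, string $hashFile): bool
    {
        $newHash = md5($rawFeed);
        $oldHash = is_file($hashFile) ? trim((string) file_get_contents($hashFile)) : '';
        if ($newHash === $oldHash) {
            return false; // identical feed: the caller can skip parsing entirely
        }
        file_put_contents($hashFile, $newHash);
        return true;
    }

    // Returns the items whose hash is not yet in $seenHashes and records them there.
    // Only title and link are hashed, since those are the fields being sent out.
    function newItems(string $rawFeed, array &$seenHashes): array
    {
        $xml = simplexml_load_string($rawFeed);
        if ($xml === false) {
            return []; // unparsable feed: report nothing new
        }
        $fresh = [];
        foreach ($xml->channel->item as $item) {
            $title = (string) $item->title;
            $link  = (string) $item->link;
            $hash  = md5($title . "\n" . $link);
            if (!isset($seenHashes[$hash])) {
                $seenHashes[$hash] = true;
                $fresh[] = ['title' => $title, 'link' => $link];
            }
        }
        return $fresh;
    }
    ```

    In the hourly cron run you would call `feedHasChanged()` first and only call `newItems()` when it returns true. Some feeds rewrite a timestamp element on every request, in which case the whole-feed hash changes even though no item did; the per-item check still filters those runs down to nothing.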

    If the feed provides GUIDs for its items, you can refine this process by storing GUID-to-hash pairs. That makes the comparison quicker, because each item is compared only against the stored hash for its own GUID instead of against every previously seen item.
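
    A minimal sketch of the GUID-to-hash idea, assuming each parsed item is an array with `guid`, `title` and `link` keys and that the stored pairs are kept in a PHP array between runs (for example loaded from a file or database). The function name `classifyItem` and the array layout are invented for illustration.

    ```php
    <?php
    // $stored maps guid => content hash from earlier runs (hypothetical storage format).
    // Returns 'new', 'changed' or 'unchanged' and updates $stored accordingly.
    function classifyItem(array $item, array &$stored): string
    {
        $hash = md5($item['title'] . "\n" . $item['link']);
        $guid = $item['guid'];

        if (!isset($stored[$guid])) {
            $stored[$guid] = $hash; // never seen this guid before
            return 'new';
        }
        if ($stored[$guid] !== $hash) {
            $stored[$guid] = $hash; // same guid, but title or link was edited
            return 'changed';
        }
        return 'unchanged';
    }
    ```

    Each lookup is a single array access by GUID, so the cost per item does not grow with the number of stored entries.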

    You'd still need some expiration/purge mechanism to keep the number of stored hashes within bounds, but given that you only store relatively short strings (their length depends on the chosen hash algorithm), you should be able to keep quite a backlog before running into performance problems.
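
    One possible purge step, sketched under the assumption that you store a last-seen Unix timestamp next to each key (a GUID or a hash). The function name `purgeOldEntries` and both limits are hypothetical; it uses arrow functions, so it needs PHP 7.4 or later.

    ```php
    <?php
    // $stored maps a key (guid or hash) => Unix timestamp of when it was last seen.
    // Returns a copy without stale entries, capped at $maxEntries newest entries.
    function purgeOldEntries(array $stored, int $now, int $maxAgeSeconds, int $maxEntries): array
    {
        // Drop entries that have not appeared in the feed within the age limit.
        $kept = array_filter($stored, fn($seenAt) => $now - $seenAt <= $maxAgeSeconds);

        // If there are still too many, keep only the most recently seen ones.
        arsort($kept);
        return array_slice($kept, 0, $maxEntries, true);
    }
    ```

    The age limit should be comfortably longer than the time an item normally stays in the feed, otherwise a purged item that is still listed would be treated as new and sent again.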

    The asker selected this answer as the best answer.