duancaiyi7567 2011-12-15 22:01

Strategy for a content aggregation service

I have built RSS, Twitter, and other content aggregators for clients using PHP/MySQL. It typically involves a cron job, some feed parsing, and inserting data into a database for storage and later re-publishing, deleting, archiving, etc. Nothing ground-breaking.
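The cron-pass workflow described above (fetch, parse, insert for later re-publishing) can be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: sqlite3 stands in for MySQL, the feed XML is inlined instead of fetched over HTTP, and deduplication by `guid` is an assumed design choice.

```python
# One pass of a cron-style aggregation job: parse RSS items and
# upsert them into a database, skipping items already stored.
import sqlite3
import xml.etree.ElementTree as ET

# Inlined sample feed; in a real job this would be downloaded per feed URL.
FEED_XML = """<rss version="2.0"><channel>
  <title>Example feed</title>
  <item><guid>a1</guid><title>First post</title><link>http://example.com/1</link></item>
  <item><guid>a2</guid><title>Second post</title><link>http://example.com/2</link></item>
</channel></rss>"""

def store_feed(conn, feed_xml):
    """Parse RSS items and insert them, deduplicating on guid."""
    conn.execute("""CREATE TABLE IF NOT EXISTS items (
        guid TEXT PRIMARY KEY, title TEXT, link TEXT)""")
    root = ET.fromstring(feed_xml)
    inserted = 0
    for item in root.iter("item"):
        cur = conn.execute(
            "INSERT OR IGNORE INTO items (guid, title, link) VALUES (?, ?, ?)",
            (item.findtext("guid"), item.findtext("title"), item.findtext("link")))
        inserted += cur.rowcount  # 0 when the guid was already present
    conn.commit()
    return inserted

conn = sqlite3.connect(":memory:")
print(store_feed(conn, FEED_XML))  # first pass inserts 2 items
print(store_feed(conn, FEED_XML))  # re-running the job inserts 0 duplicates
```

The `INSERT OR IGNORE` on a primary key is what makes the cron job safe to re-run on the same feed without creating duplicate rows.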

But now I am tasked with building an aggregator service for a public audience. I imagine this will need to scale quickly, as each person with access to the service can add dozens, if not hundreds, of source feeds. Within a few months we may be regularly parsing thousands of feeds, and maybe 100,000 within a year, or more with any luck.

I guess the ultimate model is something similar to what Google Reader does.

So, what is a good strategy for this? Multiple overlapping crons, continuously running, reading feeds, and connecting to APIs to pull content? Should I plan to run multiple instances on Elastic Cloud or something as demand grows?


3 answers

  • doulu4413 2011-12-15 22:55

I would not overlap crons; it will get really nasty in the end. I guess you should have one system that dispatches work and multiple servers accepting and processing it, returning actions and results as needed. On the other hand, there are many cloud solutions available worldwide that might work even better.
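The "one dispatcher, many workers" idea in this answer can be sketched with a work queue: a single scheduler enqueues each due feed exactly once per cycle, and a pool of workers drains the queue, so no two jobs ever poll the same feed at once the way stacked crons can. This is a hedged sketch, not the answerer's design; `fetch_feed` is a hypothetical placeholder for real download-and-parse logic, and threads stand in for separate servers.

```python
# Dispatcher/worker sketch: one scheduler feeds a queue, N workers drain it.
import queue
import threading

def fetch_feed(url):
    # Placeholder for HTTP download + feed parsing on a worker.
    return f"parsed:{url}"

def worker(jobs, results, lock):
    while True:
        url = jobs.get()
        if url is None:          # sentinel: shut this worker down
            jobs.task_done()
            break
        parsed = fetch_feed(url)
        with lock:               # results list is shared across workers
            results.append(parsed)
        jobs.task_done()

jobs, results, lock = queue.Queue(), [], threading.Lock()
workers = [threading.Thread(target=worker, args=(jobs, results, lock))
           for _ in range(4)]
for w in workers:
    w.start()

# The single dispatcher enqueues every due feed exactly once per cycle,
# so work is never duplicated no matter how many workers are running.
for url in (f"http://example.com/feed/{i}" for i in range(20)):
    jobs.put(url)
for _ in workers:
    jobs.put(None)
jobs.join()
for w in workers:
    w.join()
print(len(results))  # 20 feeds processed, each exactly once
```

Scaling then means adding workers (or worker machines behind a real message queue), while the dispatcher stays the single source of truth for what is due.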

