douqianbiao4216 2009-11-02 00:11
Views: 15
Accepted

PHP and Twitter | Creating an index engine

Here is what I have in mind:

1) Create a service that runs every hour or so and searches for tweets matching specific criteria

2) I also need to filter out garbage (the index engine needs to be smart enough, kind of like an anti-spam service)

What are the best strategies/ideas to accomplish this?

PS

Does anyone know whether an anti-spam engine has already been created for Twitter?

2 answers

  • douqingzhi0980 2009-11-02 00:30

    Well, for starters, the best place to begin is probably the Twitter API (the second link on Google); get your search working first. If your server stack is of the *nix persuasion, using cron to schedule a wget/curl request to your search page is probably the simplest strategy. Unfortunately my Windows task scheduling knowledge is sorely lacking, but I'm certain there are better ways than using the crusty Task Scheduler.
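
    To make the hourly job a bit more concrete, here is a minimal sketch of what a single scheduled run might look like. It is written in Python purely for illustration; a PHP CLI script could follow the same shape. Everything named here is an assumption: `run_once`, the state file, and especially `fetch_tweets`, which stands in for whatever search call you end up making and does not correspond to any real Twitter API function. The only idea it shows is remembering the highest tweet ID already seen, so each run indexes only new results.

```python
import json
import os


def load_last_id(state_path):
    """Return the highest tweet ID seen by earlier runs, or 0 on the first run."""
    if not os.path.exists(state_path):
        return 0
    with open(state_path) as fh:
        return json.load(fh)["last_id"]


def save_last_id(state_path, last_id):
    """Persist the highest tweet ID so the next scheduled run can skip old results."""
    with open(state_path, "w") as fh:
        json.dump({"last_id": last_id}, fh)


def run_once(fetch_tweets, query, state_path):
    """One scheduled run: fetch results, keep only unseen tweets, update state.

    fetch_tweets is a hypothetical callable (query -> list of dicts with
    "id" and "text" keys); wire it to whatever search endpoint you use.
    """
    last_id = load_last_id(state_path)
    fresh = [t for t in fetch_tweets(query) if t["id"] > last_id]
    if fresh:
        save_last_id(state_path, max(t["id"] for t in fresh))
    return fresh
```

    A crontab entry along the lines of `0 * * * * /usr/bin/python3 /path/to/poll_tweets.py` (the script path is made up) would then invoke a script built around `run_once` at the top of every hour.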

    Finally, for your filtering: writing your own Bayesian classifier may be overkill if there is a service you can subscribe to, but I am not aware of any for Twitter. Bayesian classifiers are quite common, and I'm certain a little research with your favorite search engine will turn up either a canned solution or at least some direction on how to create your own. Keep in mind that spam is relative, so you have to train your classifier, which is a bit time consuming at the start. In fact, PHP might not be the best language for the task; the training could be done by a separate program that your crontab also calls periodically.
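
    For a sense of scale, a bare-bones naive Bayes filter fits in a few dozen lines. The sketch below is in Python rather than PHP and is only an illustration under simple assumptions: word-level tokens, two labels ("spam" and "ham"), and add-one smoothing. The class name `NaiveBayesFilter` and its methods are invented for this example, and a canned library would likely do better on real data.

```python
import math
import re
from collections import Counter


class NaiveBayesFilter:
    """Two-class naive Bayes over word counts, with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    @staticmethod
    def tokenize(text):
        # Lowercase and keep word-like chunks, including #hashtags and @mentions.
        return re.findall(r"[a-z0-9#@']+", text.lower())

    def train(self, text, label):
        """Record one hand-labelled example; label must be "spam" or "ham"."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(self.tokenize(text))

    def _log_score(self, tokens, label, vocab_size):
        # Log prior for the class, smoothed so an untrained class is not log(0).
        total_docs = sum(self.doc_counts.values())
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        # Add the smoothed log likelihood of each token under this class.
        total_words = sum(self.word_counts[label].values())
        for tok in tokens:
            count = self.word_counts[label][tok]
            score += math.log((count + 1) / (total_words + vocab_size))
        return score

    def is_spam(self, text):
        """Return True when the spam score exceeds the ham score."""
        tokens = self.tokenize(text)
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        vocab_size = max(len(vocab), 1)
        spam = self._log_score(tokens, "spam", vocab_size)
        ham = self._log_score(tokens, "ham", vocab_size)
        return spam > ham
```

    Usage is just a series of `train(text, "spam")` and `train(text, "ham")` calls on tweets you have labelled by hand, followed by `is_spam(text)` on new ones. The labelling is the time-consuming part mentioned above, and the filter is only as good as the examples it has seen.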

    I realize that this is very high level, but the links should be enough of a jumping off point to get you started in the right direction.

    This answer was selected by the asker as the best answer.