Writing a web bot [closed]

Today it occurred to me to write a web bot (crawler/spider) in PHP that crawls only news websites. I started by reading articles about crawlers and then ran into this issue:

How can a bot recognize that a URL/post/article/text is related to news?

The only solution I came up with is to check the content for particular keywords, but I don't think that is a good, workable practice. At the very least, it is not perfect.

So any ideas for better solutions are appreciated.


2 answers

dougan1465 2013-06-30 13:21
You could use preg_match to match the keywords; the technique is simple and works quite well:

$text = "News: Flooding is expected today";
$news_found = preg_match("/(news|sensation|discovery)/i", $text); // 1 if any keyword matches, 0 otherwise

I see no reason to think that is not a good solution.
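Since the question worries that a single keyword hit is not reliable, here is a minimal sketch of a slightly stricter variant of the same preg-based idea: count whole-word keyword hits and require a minimum number of them. The function names countNewsKeywords and looksLikeNews, the keyword list, and the threshold of 2 are all assumptions made up for illustration; they are not part of the original answer, and this is still only a heuristic.

```php
<?php
// Hypothetical helper (not from the original answer): counts how many
// whole-word keyword occurrences appear in $text.
function countNewsKeywords(string $text, array $keywords): int
{
    // Escape each keyword so it is treated literally inside the pattern.
    $escaped = array_map(function ($k) {
        return preg_quote($k, '/');
    }, $keywords);

    // \b limits matches to whole words, so "newsletter" does not count as "news".
    // i = case-insensitive, u = treat pattern and subject as UTF-8.
    $pattern = '/\b(?:' . implode('|', $escaped) . ')\b/iu';

    $count = preg_match_all($pattern, $text);

    // preg_match_all returns false on error; treat that as zero hits.
    return $count === false ? 0 : $count;
}

// Hypothetical decision rule: call the text "news" when it contains
// at least $minHits keyword occurrences. The default of 2 is arbitrary.
function looksLikeNews(string $text, array $keywords, int $minHits = 2): bool
{
    return countNewsKeywords($text, $keywords) >= $minHits;
}

// Illustrative keyword list; a real crawler would need a tuned list.
$keywords = ['news', 'breaking', 'reported', 'headline'];
$text = 'News: Flooding is expected today, officials reported.';

var_dump(looksLikeNews($text, $keywords));
```

Requiring several hits and matching on word boundaries reduces false positives compared with a single preg_match, but it remains keyword matching and will still misclassify some pages.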