I'm asking for your opinions and experiences on this.
Our CMS reads information from the HTTP_USER_AGENT string. We recently discovered a bug in the code: we forgot to check whether HTTP_USER_AGENT is present at all. (It can be absent, but honestly we simply skipped the check because we didn't expect that to happen.) Those cases resulted in an error. So we fixed it and added tracking: if HTTP_USER_AGENT is not set, an alert is sent to our tracking system.
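For illustration, the fix described above might look roughly like the following. This is only a sketch in Python, not the CMS's actual code: it assumes a CGI/WSGI-style `environ` mapping that carries the `HTTP_USER_AGENT` key, and both `get_user_agent` and the `on_missing` callback (standing in for the tracking alert) are hypothetical names.

```python
from typing import Callable, Mapping, Optional


def get_user_agent(
    environ: Mapping[str, str],
    on_missing: Optional[Callable[[Mapping[str, str]], None]] = None,
) -> str:
    """Return the User-Agent from a CGI/WSGI-style environ, or "" if absent.

    `on_missing` is a hypothetical hook standing in for the tracking alert;
    it is called once when the header is missing or blank.
    """
    # .get() avoids the KeyError that a direct lookup raises when the
    # client did not send the header at all.
    user_agent = (environ.get("HTTP_USER_AGENT") or "").strip()
    if not user_agent and on_missing is not None:
        on_missing(environ)
    return user_agent


# Usage: collect the requests that arrived without a user-agent.
missing_requests = []
get_user_agent({"REMOTE_ADDR": "203.0.113.7"}, on_missing=missing_requests.append)
```

Treating a blank header the same as a missing one is a design choice made for this sketch; the original post only mentions the "not set" case.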
We now have data from many websites over the past few months, and the stats show that this is really rare: about 0.05-0.1%.
Another interesting observation: these requests are always single. We didn't find any case where such a "user" had multiple page views in the same session.
This got us thinking. Should we treat these requests as robots and simply block them? Or would that be a serious mistake?
Googlebot and other "good robots" always send HTTP_USER_AGENT info.
I know that firewalls or proxy servers MAY alter (or remove) the user-agent info, but I can't confirm this from our stats.
What are your experiences? Has anyone here done any research on this topic?
Other posts I found on Stack Overflow simply accept the fact that "it is possible this info is not sent". But why don't we question that for a moment? Is it really normal?
HTTP_USER_AGENT is not set - is this normal, or is it probably a robot?
2 answers
- doupafu6980 2013-02-15 11:35
I would consider the lack of a user-agent abnormal for genuine users. However, it is still a (rare) possibility, which may be caused by a firewall, proxy or privacy software stripping the user-agent.
A request missing a user-agent is most likely a bot or script (not necessarily a search engine crawler), although of course you can't say for sure.
Other factors that may indicate a bot/script:
- Only requesting the page itself and failing to request resources on the page such as images, CSS and JavaScript.
- A very short time between requests from page to page (such as within the same second).
- Failing to send cookies or session IDs on subsequent requests where a cookie should have been set. Keep in mind, though, that genuine users may have cookies disabled.
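These indicators could be combined into a simple check along the following lines. This is only a rough sketch in Python: the `VisitSignals` fields, the `bot_indicators` function and the one-second threshold are all hypothetical, and how the signals are collected is left out. It returns the list of indicators that fired rather than a verdict, since none of them is conclusive on its own.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VisitSignals:
    """Hypothetical per-visitor summary; field names are illustrative only."""
    has_user_agent: bool
    fetched_page_assets: bool                    # images, CSS, JavaScript
    min_seconds_between_pages: Optional[float]   # None if only one page view
    returned_cookie: Optional[bool]              # None if no cookie was expected yet


def bot_indicators(signals: VisitSignals) -> List[str]:
    """List which of the indicators above apply to this visitor."""
    reasons = []
    if not signals.has_user_agent:
        reasons.append("missing user-agent")
    if not signals.fetched_page_assets:
        reasons.append("no asset requests")
    gap = signals.min_seconds_between_pages
    if gap is not None and gap < 1.0:  # assumed threshold: "within the same second"
        reasons.append("rapid page requests")
    if signals.returned_cookie is False:  # genuine users may also have cookies disabled
        reasons.append("no cookie returned")
    return reasons


# Usage: a single request with no user-agent and no follow-up asset requests.
print(bot_indicators(VisitSignals(False, False, None, None)))
```

How many indicators should be required before blocking a request is a policy decision this sketch deliberately leaves open.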
This answer was selected by the asker as the best answer.