Making crawlers skip a JavaScript code block

I have a website in PHP that passes certain PHP variables to JavaScript variables. Google crawled it, and this generates errors and duplicate content. Is there any way to make the Google crawler ignore the declaration of these variables in JavaScript?

    echo '<script language="javascript">var '.$item['Nombre'].'="'.$descripcion.'";</script>';

Sorry for my English.

douhang1913 2013-11-13 12:44
Google crawling javascript code and considering it duplicate? I have never heard of this problem before. Some of my pages have inlined javascript (if the content is small), that means the same <script>...</script> on every page.

There are also cases where I output javascript variables more-or-less the same way you do. Google never marked it as "duplicate content".

Description from here:

Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin. Examples of non-malicious duplicate content could include:

Discussion forums that can generate both regular and stripped-down pages targeted at mobile devices

Store items shown or linked via multiple distinct URLs

Printer-only versions of web pages

You can get this kind of error if you have the same content on more than one of your pages, but Google does not parse JavaScript as content (although you can never know for sure what Google does or does not do). In the same way, Google will not mark your <head> tag as duplicate, and there is no penalty for having the same layout (menu, footer, etc.) on every page.

You can put that <script> tag in an <aside> tag just to be sure.

The HTML <aside> element represents a section of a page that consists of content that is tangentially related to the content around it, which could be considered separate from that content. Such sections are often represented as sidebars or as inserts. They often contain side explanations, like a glossary definition; more loosely related stuff, like advertisements; the biography of the author; or in web-applications, profile information or related blog links.

This means that the content will be more or less ignored by Google when indexing the page. It will not be marked as duplicate, since it could be a commercial.

Also drop the language="javascript" attribute from your script tags. I doubt that it would confuse google in any way, since that attribute is deprecated (use type instead) and nothing takes it into account nowadays. But if google bot does, the correct value would be text/javascript instead of simply javascript. It is possible that google does not recognise the value javascript and parses it as unknown type of text content.

The default type of the script is text/javascript, so it is safe to omit.
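
As a rough illustration of the advice above, here is a minimal sketch of the echo from the question with the language attribute dropped. The helper name buildScriptTag() and the sample values are hypothetical, not taken from the original site. The use of json_encode() is an extra assumption on my part: it quotes the value safely, which may help if the reported errors come from quotes or markup inside the description, but the question does not confirm that this is the cause.

```php
<?php
// Hypothetical sample values standing in for $item['Nombre'] and $descripcion.
$item = ['Nombre' => 'producto1'];
$descripcion = 'Silla "plegable" </script> de madera';

/**
 * Build a <script> element that declares one JavaScript variable.
 * buildScriptTag() is an illustrative helper, not part of the original code.
 */
function buildScriptTag(string $name, string $value): string
{
    // Only accept names that are simple, valid JavaScript identifiers.
    if (!preg_match('/^[A-Za-z_$][A-Za-z0-9_$]*$/', $name)) {
        throw new InvalidArgumentException('Invalid JavaScript identifier: ' . $name);
    }

    // json_encode() quotes and escapes the string; the JSON_HEX_* flags turn
    // <, >, &, ' and " into \uXXXX escapes so the value cannot close the tag.
    $flags = JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT;

    // No language or type attribute: text/javascript is the default.
    return '<script>var ' . $name . ' = ' . json_encode($value, $flags) . ';</script>';
}

echo buildScriptTag($item['Nombre'], $descripcion);
```

This does not hide anything from the crawler; it only keeps the generated markup valid, so that a description containing quotes or tags cannot leak out of the script element and be read as page content.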

Above all I suspect that the problem is not the existence of JS variables, but some other thing like GET parameters in your URL. GET parameters can be dealt with by configuring URL Parameters correctly in Webmaster Tools.
This answer was selected by the asker as the best answer.


Question posted 2013-11-13 11:35. Tags: html, javascript, php. 4 answers, one accepted.