doutuo3899 2012-07-15 03:21

How to stop Google from crawling nonexistent pages

While developing my site, I made a typo in one place. All my pages are at dir1/dir2/page.htm/par1-par2, but my typo produced dir1/dir2/page/par1-par2 (note: without .htm).

The typo was in production for only one day, but Google keeps crawling those links. How do I stop Google from doing that?

By the way, it's not just one page, but hundreds or thousands of pages.


3 Answers

  • doujia3441 2012-07-15 03:35

    Try using robots.txt to deny crawler access to those URLs:

    http://www.robotstxt.org/robotstxt.html

    http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449

    Test your robots.txt here: http://www.frobee.com/robots-txt-check/

    Pattern rules:

      • Patterns must begin with /, because robots.txt patterns always match absolute URL paths.
      • * matches zero or more of any character.
      • $ at the end of a pattern matches the end of the URL; elsewhere, $ matches itself.
      • A trailing * is redundant, because robots.txt patterns already match any URL that begins with the pattern.

    (Note that * and $ are extensions supported by Googlebot; not every crawler honors them.)
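    As a sketch of the approach above, a plain prefix rule is enough here, since the typo paths (`/dir1/dir2/page/...`) and the real paths (`/dir1/dir2/page.htm/...`) differ at the prefix level. Using the paths from the question and a hypothetical `example.com` host, Python's standard `urllib.robotparser` can verify the rule:

    ```python
    import urllib.robotparser

    # Prefix rule: blocks /dir1/dir2/page/... (the typo URLs) but not
    # /dir1/dir2/page.htm/... (the real pages), because prefix matching
    # compares the literal path characters.
    robots_txt = """\
    User-agent: *
    Disallow: /dir1/dir2/page/
    """

    rp = urllib.robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())

    # Typo URL: blocked
    print(rp.can_fetch("*", "http://example.com/dir1/dir2/page/par1-par2"))      # False
    # Correct URL: still crawlable
    print(rp.can_fetch("*", "http://example.com/dir1/dir2/page.htm/par1-par2"))  # True
    ```

    Keep in mind that robots.txt only stops future crawling; pages already indexed may linger until Google recrawls, or can be removed faster via the removal tool described in the Google support link above.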
    
    Accepted as the best answer by the asker.
(2 more answers not shown)
