doutuo3899 2012-07-15 03:21
While developing my site, I made a typo in one place. All my pages have URLs of the form dir1/dir2/page.htm/par1-par2, but the typo produced links of the form dir1/dir2/page/par1-par2 (note: without .htm).

It was in production for only one day, but Google keeps crawling those links. How can I stop Google from doing that?

By the way, this affects not just one page but hundreds or thousands of pages.

  • doujia3441 2012-07-15 03:35

    Try using robots.txt to deny access to these URLs.

    Test robots.txt here:

    Patterns must begin with / because robots.txt patterns always match absolute URL paths.
    * matches zero or more of any character.
    $ at the end of a pattern matches the end of the URL; elsewhere $ matches itself.
    * at the end of a pattern is redundant, because robots.txt patterns always match any URL that begins with the pattern.
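
    For the URLs in the question, a rule such as "Disallow: /dir1/dir2/page/" under "User-agent: *" should cover the mistyped links without touching the real page.htm ones. The Python sketch below only illustrates the four matching rules listed above so a pattern can be checked locally; the function names are hypothetical, and this is not Google's actual matcher.

```python
import re


def robots_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex (illustrative only)."""
    if not pattern.startswith("/"):
        raise ValueError("robots.txt patterns must begin with '/'")
    # A '$' at the very end anchors the match at the end of the URL path.
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # '*' matches zero or more characters; everything else, including a '$'
    # that is not at the end, is escaped and therefore matched literally.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + (r"\Z" if anchored else ""))


def is_blocked(pattern: str, path: str) -> bool:
    """Return True if `path` falls under the Disallow pattern `pattern`."""
    # re.match anchors only at the start, so an unanchored pattern behaves
    # as a prefix match, which is why a trailing '*' is redundant.
    return robots_pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    rule = "/dir1/dir2/page/"  # assumed Disallow value for the mistyped URLs
    for path in ("/dir1/dir2/page/par1-par2", "/dir1/dir2/page.htm/par1-par2"):
        print(path, "blocked" if is_blocked(rule, path) else "allowed")
```

    The rule ends with a slash on purpose: "/dir1/dir2/page" without it would also be a prefix of "/dir1/dir2/page.htm/..." and block the correct pages too.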
