duan2428 2014-12-29 00:10

How to handle Google indexing "pages" that do not exist

I build dynamic websites whose structure is stored hierarchically in the database (my own CMS). I use the adjacency list model to manage these database tables (PHP and MySQL through PDO).

I detected that Google is indexing pages that it should not.

An example of a tree structure used for navigation:

home
  about us
  products
    productgroup 1
    productgroup 2
  contact
    support
    sales

Imagine this structure as a pull-down menu with links to the pages. When I select products -> productgroup 1, I get a URL like www.domain.com/products/productgroup-1. The page pulls its data from the database based on the last URI element (productgroup-1, a slug version of the title) and shows it in my template. I do not query all elements, only the last (I should, I know).
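For illustration, here is a minimal sketch of what querying every URI element (rather than only the last one) could look like with an adjacency list. It uses Python with an in-memory SQLite database purely as a stand-in for PHP/PDO and MySQL. The table name `pages`, its columns, and the helpers `build_demo_db` and `resolve_path` are assumptions made for this example, not the asker's actual schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory adjacency list table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages ("
        "id INTEGER PRIMARY KEY, parent_id INTEGER, slug TEXT NOT NULL)"
    )
    rows = [
        (1, None, "home"),
        (2, 1, "about-us"),
        (3, 1, "products"),
        (4, 3, "productgroup-1"),
        (5, 3, "productgroup-2"),
        (6, 1, "contact"),
        (7, 6, "support"),
        (8, 6, "sales"),
    ]
    conn.executemany("INSERT INTO pages VALUES (?, ?, ?)", rows)
    return conn


def resolve_path(conn, path, root_id=1):
    """Walk the URL one segment at a time, starting at the root node.

    Returns the id of the page the full path points to, or None as soon
    as a segment is not a child of the node matched so far.
    """
    current = root_id
    for slug in (segment for segment in path.split("/") if segment):
        row = conn.execute(
            "SELECT id FROM pages WHERE parent_id = ? AND slug = ?",
            (current, slug),
        ).fetchone()
        if row is None:
            return None
        current = row[0]
    return current
```

With a lookup like this, `/products/productgroup-1` resolves to a page, while a path such as `/contact/productgroup-1` does not, even though its last segment is a valid slug.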

So far so good. Google is indexing this page as expected:

http://www.domain.com/products/productgroup-1

But when I use Google Webmaster Tools, I see a lot of indexed pages that return 404s, like:

http://www.domain.com/products
http://www.domain.com/contact

And so forth.

These pages are empty and have no link in the navigation structure.

I have designed my structure so that these pages return a 404 error. Webmaster Tools confirms this, but Google keeps indexing these pages. I know I can use robots.txt to disallow Google's search bot and keep it from indexing these URLs. Is there another way to do this? Should I return a 403 instead of a 404?

I am in the dark here.


  • dqdt45183 2014-12-29 06:51

    You should do a few things:

    1. Use a 301 (permanent) redirect to send these empty pages to a relevant page.

    2. Submit a sitemap to Google Webmaster Tools.

      • This is a definitive list of URLs in your site.

      • Having a sitemap will not remove the 404 URLs already indexed on Google, but it will inform Google of all the "official" URLs in your site and the intended crawl frequency.

      • Read more in the Google Webmaster Tools documentation.

    3. Check your HTML code for references to "/products" or "/contact". Googlebot would not be crawling these URLs otherwise.
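    As a rough sketch of points 1 and 2, the snippet below redirects an empty container page to its first child that has content, and builds a sitemap that lists only pages with content. It is written in Python for illustration only. The `PAGES` list, the `has_content` flag, and the helpers `redirect_target`, `respond`, and `sitemap_xml` are hypothetical names, and choosing the first child as the "relevant page" is just one possible policy.

```python
from xml.sax.saxutils import escape

BASE_URL = "http://www.domain.com"

# (path, has_content) in navigation order. has_content is a hypothetical
# flag marking nodes that are real pages rather than empty menu containers.
PAGES = [
    ("/", True),
    ("/about-us", True),
    ("/products", False),
    ("/products/productgroup-1", True),
    ("/products/productgroup-2", True),
    ("/contact", False),
    ("/contact/support", True),
    ("/contact/sales", True),
]


def redirect_target(path):
    """Return the first content page below an empty container, else None."""
    prefix = path.rstrip("/") + "/"
    for candidate, has_content in PAGES:
        if has_content and candidate.startswith(prefix):
            return candidate
    return None


def respond(path):
    """Return (status_code, location) for a request path."""
    known = dict(PAGES)
    if known.get(path):
        return 200, None  # a real page: render it
    if path in known:
        target = redirect_target(path)
        if target:
            return 301, BASE_URL + target  # empty container: redirect
    return 404, None  # unknown path, or a container with no content below it


def sitemap_xml():
    """Build a sitemap that lists only the pages that have content."""
    urls = [BASE_URL + path for path, has_content in PAGES if has_content]
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}</urlset>\n"
    )
```

    Here `respond("/products")` yields a 301 to the first product group, and the sitemap leaves out `/products` and `/contact` entirely, so the only URLs submitted are the ones that should be indexed.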

    This answer was selected as the best answer by the asker.