dpkrh2444 2013-09-29 18:42

Duplicate data in the database after crawling [duplicate]

I am using Simple HTML DOM to crawl data from a site into my database and display it on my web page. But every time I run the file, the same data is inserted into the database again. How can I check whether the data is already present in the database before inserting it? Here is my crawling script:

 <?php

 // Connect once with mysqli (the old mysql_* functions were removed in PHP 7).
 $con = mysqli_connect("localhost", "root", "", "crawling") or die("cannot connect");

 include "domcrawl.php"; // Simple HTML DOM parser
 $url = "http://www.bgr.in/category/reviews/";
 $html = file_get_html($url);
 //$arr=$html->find('table[class=findList] tbody tr td[class=result_text]');
 $m = $html->find('img');
 $b = $html->find('a');
 $c = $html->find('p');

 $imghead = $b[21]->innertext; // headline text
 $img = $m[3];                 // image element
 $imgtext = $c[0];             // first paragraph element

 // This INSERT runs on every execution, which is where the duplicate rows come from.
 $sql = sprintf(
     "INSERT INTO image1 (head, image, text, name) VALUES ('%s', '%s', '%s', '%s')",
     mysqli_real_escape_string($con, (string) $imghead),
     mysqli_real_escape_string($con, (string) $img),
     mysqli_real_escape_string($con, (string) $imgtext),
     mysqli_real_escape_string($con, "gm")
 );
 mysqli_query($con, $sql);

 // Read the first stored row back for display.
 $result = mysqli_query($con, "SELECT head, image, text FROM image1 WHERE name='gm'");
 $row = mysqli_fetch_assoc($result);

 echo "<br><br>";
 echo $row['head'];
 echo "<br><br>";
 echo $row['image'];
 echo $row['text'];

 ?>

2 answers

  • doubo7131 2013-09-29 18:46

    Assuming 'date' => $node->getElementsByTagName('pubDate')->item(0)->nodeValue is line 11, there seems to be no element with tag pubDate, which is why $node->getElementsByTagName('pubDate')->item(0) returns either null or false.
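
For the question as asked (avoiding duplicate rows on every crawl), one possible approach is to look for an existing row before inserting. The following is only a minimal sketch under stated assumptions: the helper name `insertIfNew` is hypothetical, the sample values are made up, a row is treated as a duplicate when its `head` and `name` match, and an in-memory SQLite database accessed through PDO stands in for the MySQL table `image1` so the example runs on its own.

```php
<?php
// Hypothetical helper: insert a crawled row only if the same head/name pair
// is not stored yet. Returns true when a row was inserted, false otherwise.
function insertIfNew(PDO $pdo, string $head, string $image, string $text, string $name): bool
{
    $check = $pdo->prepare("SELECT COUNT(*) FROM image1 WHERE head = ? AND name = ?");
    $check->execute([$head, $name]);
    if ((int) $check->fetchColumn() > 0) {
        return false; // already present, skip the insert
    }

    $insert = $pdo->prepare("INSERT INTO image1 (head, image, text, name) VALUES (?, ?, ?, ?)");
    $insert->execute([$head, $image, $text, $name]);
    return true;
}

// Demo setup (assumption): in-memory SQLite in place of the MySQL database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec("CREATE TABLE image1 (head TEXT, image TEXT, text TEXT, name TEXT)");

// Simulate running the crawler twice with the same scraped values.
$first  = insertIfNew($pdo, "Sample headline", "<img src='a.png'>", "Sample text", "gm");
$second = insertIfNew($pdo, "Sample headline", "<img src='a.png'>", "Sample text", "gm");
$count  = (int) $pdo->query("SELECT COUNT(*) FROM image1")->fetchColumn();

var_dump($first, $second, $count); // $first is true, $second is false, $count is 1
```

The check-then-insert pattern is easy to follow, but two crawler runs executing at the same moment could both pass the check. If that matters, a UNIQUE index on the column that identifies a row, combined with MySQL's `INSERT IGNORE`, would let the database itself reject the duplicate.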
