douba9425 2015-10-05 17:25 采纳率: 100%
浏览 22
已采纳

MySQL PHP如果表中的关键字短语在句子表显示结果中不存在

I have two tables: Sentences, Negatives.

I would like to select column Sentences.sentence that does not contain any record in Negatives.negphrase.

There are 200k records in Sentences and 50k records in Negatives.

Sentences.sentence Sample Data
=============================

 - university lab on campus
 - laboratory designs
 - lab coats
 - math lab
 - methane production
 - meth lab

Negatives.negphrase Sample Data
======================================

 - coats
 - math lab
 - meth

Desired Result Set
==================

 - university lab on campus
 - laboratory designs
 - methane production

I tried using the result of a different question of mine but the database timed out:

SELECT Sentences.sentence
FROM Sentences, Negatives
GROUP BY Sentences.sentence
HAVING (((Max(InStr(" " & sentence & " "," " & negphrase & " ")))=0));

MY ANSWER

So I'll give the correct answer to Richard b/c his solution does work for smaller record sets, but not large ones. Here is the PHP code I used to put all negative keywords in an array, then loop through that array with an UPDATE clause to mark a new column 'negmatch' in the Sentences table. I'll use that in another WHERE clause to select Sentences.sentence WHERE negmatch <> 1.

I only have to run this code once for all negphrases, then when I add additional keywords I use the same code but without a loop to search the sentences again (code not shown below). The code takes 6.5 minutes to loop through 2800 UPDATE clauses so the initial load is pretty long, but once it is done it doesn't have to be done again.

<?php
$mysqli = new mysqli("localhost", "myuser", "myuserpassword", "database");

/* check connection */
if ($mysqli->connect_errno) {
    printf("Connect failed: %s
", $mysqli->connect_error);
    exit();
}

if ($result = $mysqli->query("SELECT negphrase FROM negatives")) {  
    $row_cnt = $result->num_rows;
    printf("Negative keywords have %d rows.
", $row_cnt); //print count of rows

    while($row = $result->fetch_assoc()){ //loop through all results by row
        foreach( $row  AS $value ) {
        $negative[] = $value;
    }
}


    /* free result set */
    $result->close();

    $data = array_values($negative); // get only values
    $data = array_filter($data);
    $datacount = 1;
    foreach($data as $val) { //loop through array to build MySQL WHERE clause


            $updatequery = "UPDATE Sentences SET negmatch=1 WHERE sentence REGEXP '[[:<:]]" . trim($val) . "[[:>:]]'";
            echo $updatequery  . "<br />";

            mysqli_query($mysqli,$updatequery) or die (mysqli_error($mysqli));
            echo $datacount . " " . trim($val) ."<br />";
            $datacount++;

        }

}
$mysqli->close();


    unset($result, $row, $mysqli,$value,$negative,$data,$val,$updatequery,$datacount,$row_cnt);

?>

  • 写回答

2条回答 默认 最新

  • douhan8610 2015-10-05 18:28
    关注

    Use negative left join, this will return only rows from Senteces table that doesn't match Negatives table based on rule

    select * from Sentences s 
    left join Negatives n 
    on (concat(" ",s.sentence," ") like concat("% ",n.negphrase," %"))
    where n.negphrase is null
    

    tested on data bellow

    CREATE TABLE IF NOT EXISTS `Negatives` (
      `negphrase` varchar(255) NOT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
    
    INSERT INTO `Negatives` (`negphrase`) VALUES
    ('coats'),
    ('math lab'),
    ('meth');
    
    CREATE TABLE IF NOT EXISTS `Sentences` (
      `sentence` varchar(255) NOT NULL
    ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
    
    INSERT INTO `Sentences` (`sentence`) VALUES
    ('university lab on campus'),
    ('laboratory designs'),
    ('lab coats'),
    ('math lab'),
    ('methane production'),
    ('meth lab'),
    ('testing sentence');  
    
    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(1条)

报告相同问题?

悬赏问题

  • ¥30 YOLO检测微调结果p为1
  • ¥20 求快手直播间榜单匿名采集ID用户名简单能学会的
  • ¥15 DS18B20内部ADC模数转换器
  • ¥15 做个有关计算的小程序
  • ¥15 MPI读取tif文件无法正常给各进程分配路径
  • ¥15 如何用MATLAB实现以下三个公式(有相互嵌套)
  • ¥30 关于#算法#的问题:运用EViews第九版本进行一系列计量经济学的时间数列数据回归分析预测问题 求各位帮我解答一下
  • ¥15 setInterval 页面闪烁,怎么解决
  • ¥15 如何让企业微信机器人实现消息汇总整合
  • ¥50 关于#ui#的问题:做yolov8的ui界面出现的问题