dongpu2694 2014-12-07 11:45

Comparing a MySQL database value with PHP character by character

I'm trying to compare a value stored in a row of my DB with a given string, character by character, and get back the id of the row that best fits the given data. For example, my DB contains the user David with the sequence AAA, and I want to compare it with an input sequence ABA, so I'd like to receive a match percentage (66.6% in this case). This is what I have so far, but I don't know how to go on:

$uname = $_POST['sequence'];
$query = "SELECT name FROM dna WHERE sequence = '$uname'";

$result = mysql_query($query);

while($row = mysql_fetch_array($result))
{
    echo $row['name'];
}   

1 answer

  • drflkphi675447 2014-12-07 12:12

    To get the similarity as a percentage, you can use the PHP function similar_text(). It compares the two strings and, if you pass a variable as the third argument, stores the similarity percentage in that variable (it is passed by reference).

    $string_1 = 'AAA'; 
    $string_2 = 'ABA'; 
    
    similar_text($string_1, $string_2, $percent); 
    
    echo $percent; 
    // 66.666666666667
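
    Note that similar_text() works on common substrings, not on positions: 'ABC' and 'CAB' share the substring 'AB', so it reports a non-zero similarity even though no position holds the same character. If a strictly position-by-position comparison is wanted, a minimal sketch could look like the following. The function name positional_match_percent is hypothetical, and treating the extra characters of the longer string as mismatches is an assumption.

```php
<?php
// Hypothetical helper (not part of the original answer): percentage of
// positions at which two strings hold the same character.
function positional_match_percent(string $a, string $b): float
{
    $longest = max(strlen($a), strlen($b));
    if ($longest === 0) {
        return 100.0; // assumption: two empty strings count as identical
    }

    $shortest = min(strlen($a), strlen($b));
    $matches = 0;
    for ($i = 0; $i < $shortest; $i++) {
        if ($a[$i] === $b[$i]) {
            $matches++;
        }
    }

    // Assumption: positions missing from the shorter string count as mismatches.
    return $matches / $longest * 100;
}

echo positional_match_percent('AAA', 'ABA'), "\n"; // 2 of 3 positions match
echo positional_match_percent('ABC', 'CAB'), "\n"; // no position matches
```

    For sequences of equal length this is the share of identical positions. Which of the two measures fits better depends on whether shifted but otherwise identical sequences should count as similar.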
    

    The database part is a bit more work. Keep in mind that the real problem is comparing one string against, say, 1 million rows. In general, one wouldn't do that with characters: the sequences would be encoded as bits, and to compare bits you would simply use bit shifts. Anyway, when working with chars/strings, rolling row requests or limited queries can help, too. That means you ask the DB for chunks of, let's say, 500 rows and do the calculation on each chunk. Whether this is needed depends on the number of rows and the memory use of the dataset.

    A very basic implementation could look like this:

    // incoming user input
    $string_1 = $_POST['sequence'];
    
    // highest similarity percentage found so far and the id of its row
    $bestValue = array('row_id' => 0, 'similarity' => 0);
    
    // $rows is assumed to hold all rows fetched from the database, keyed by row id
    foreach ($rows as $id => $row)
    {
        // sequence stored in the current row (the "sequence" column from the question)
        $string_2 = $row['sequence'];
    
        // calculate similarity; $percent is set by reference
        similar_text($string_1, $string_2, $percent);
    
        // if the calculated similarity is higher, update the "best" value
        if ($percent > $bestValue['similarity']) {
            $bestValue = array('row_id' => $id, 'similarity' => $percent);
        }
    }
    
    var_dump($bestValue);
    

    After all DB rows have been processed, $bestValue will contain the highest percentage and its row id.
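
    The chunked reading mentioned earlier could be sketched roughly as follows. This is only an illustration under assumptions: best_match_chunked and the $fetchChunk callback are hypothetical names, and each row is assumed to be an array with the keys 'id' and 'sequence'. With MySQL, the callback would typically wrap a query with LIMIT and OFFSET; here an in-memory array stands in for the table.

```php
<?php
// Sketch only: $fetchChunk is a hypothetical callback that returns up to
// $limit rows starting at $offset, each row an array with 'id' and 'sequence'.
function best_match_chunked(string $needle, callable $fetchChunk, int $chunkSize = 500): array
{
    $best = ['row_id' => null, 'similarity' => -1.0];
    $offset = 0;

    while (true) {
        $rows = $fetchChunk($offset, $chunkSize);
        if (count($rows) === 0) {
            break; // no more rows to process
        }

        foreach ($rows as $row) {
            similar_text($needle, $row['sequence'], $percent);
            if ($percent > $best['similarity']) {
                $best = ['row_id' => $row['id'], 'similarity' => $percent];
            }
        }

        $offset += $chunkSize;
    }

    return $best;
}

// Usage with an in-memory array standing in for the database table.
$table = [
    ['id' => 1, 'sequence' => 'AAA'],
    ['id' => 2, 'sequence' => 'ABA'],
    ['id' => 3, 'sequence' => 'BBB'],
];
$fetch = function (int $offset, int $limit) use ($table): array {
    return array_slice($table, $offset, $limit);
};

$best = best_match_chunked('ABA', $fetch, 2);
echo $best['row_id'], ' ', $best['similarity'], "\n";
```

    Only one chunk is held in memory at a time, so the chunk size trades memory use against the number of round trips to the database.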

    The loop can be adapted in all kinds of ways, for instance:

    • switch from first-match update (>) to last-match update (>=)
    • stop the iteration at the first exact (100%) match
    • store all row ids that share the same similarity (multi-row match)
    • if you don't need multi-row matches, you might drop the array and use two variables for the row id and the percentage
    • add proper error handling and escaping, and use mysqli
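
    As an illustration of the multi-row match, the following sketch collects every row id that shares the highest similarity. The function name best_matches is hypothetical, and $rows is assumed to map row id to sequence string.

```php
<?php
// Sketch only: collect every row id that shares the highest similarity.
// $rows is assumed to map row id => sequence string.
function best_matches(string $needle, array $rows): array
{
    $bestPercent = -1.0;
    $bestIds = [];

    foreach ($rows as $id => $sequence) {
        similar_text($needle, $sequence, $percent);

        if ($percent > $bestPercent) {
            // strictly better: start a new list of best rows
            $bestPercent = $percent;
            $bestIds = [$id];
        } elseif ($percent == $bestPercent) {
            // tie with the current best: remember this row as well
            $bestIds[] = $id;
        }
    }

    return ['row_ids' => $bestIds, 'similarity' => $bestPercent];
}

$result = best_matches('ABA', [1 => 'AAA', 2 => 'BBB', 3 => 'ABB']);
print_r($result);
```

    In this example rows 1 and 3 both have two characters in common with the input, so both ids are returned.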

    Be warned: this isn't the most efficient approach, especially when working with large datasets. If you need this for anything beyond a hobby project or homework, use a tool that is optimized for the job, such as EMBOSS (http://emboss.sourceforge.net/).
