duanhan4763 2016-04-25 15:37
浏览 159

使用mysql提高数据库查询的速度

I have the following query for a type-ahead search (as you type into the form it displays matches in a drop down). This query worked well until I switched to a database with about a million records. Now it takes 15 seconds for the match to be displayed. Because search hits are displayed as you type, the query is inside a loop. Is there anything about this query that can be changed to speed it up?

$diagnosis = isset($_GET['diagnosis']) ? $_GET['diagnosis'] : '';

$data = array();

if ($diagnosis) {
$query = explode(' ', $diagnosis);

for ($i = 0, $c = count($query); $i < $c; $i ++) {
    $query[$i] = '+' . mysql_real_escape_string($query[$i]) . '*';
}

$query = implode(' ', $query);

$sql = "SELECT diagnosis, icd9, MATCH(diagnosis) AGAINST('$query' IN BOOLEAN MODE) AS relevance 
        FROM icd10 WHERE MATCH(diagnosis) AGAINST('$query' IN BOOLEAN MODE) HAVING relevance > 0 ORDER BY relevance ";

$r = mysql_query($sql);

    while ($row = mysql_fetch_array($r)) {
        $data[] = $row;
    }
}

echo json_encode($data);
exit;
  • 写回答

1条回答 默认 最新

  • duandi2853 2016-04-25 19:57
    关注

    You can try some stuff:

    First, make sure you have a fulltext index for diagnosis. Second, make sure you have a fulltext index for diagnosis! A million rows isn't that much (depending on the number of words in diagnosis of course), so that just already might be the problem.

    Then try the following code:

    SELECT diagnosis, icd9, MATCH(diagnosis) AGAINST('$query' IN BOOLEAN MODE) AS relevance 
    FROM icd10 ORDER BY relevance desc limit 30
    

    (It might not be obvious that this is faster, and it might not be, so just try it).

    If you need to support short words, e.g. if 3 digit icd9-codes are entered often, you should check your ft_min_word_len / innodb_ft_min_token_size-values (depending on your database) to make sure they are included in the index - but be aware it will increase your index size. Maybe check the stopwords.

    You didn't specify your setup; you can often improve general database performance by e.g. changing settings, hdds or ram. Especially ram.

    Some general ideas: You might want to call the function asynchronously (the user should be able to type while the query runs). As soon as you hit less than 30 results (or whatever limit you set), you can just filter the remaining results on the fly in php (as long as the query gets longer/no words are removed) - it's the closest you get to a cache. Or set the limit to 1000 and filter manually afterwards, php regex is fast too, you just need a score-function. Depending on your data, you might want to not run the query when you just add a single letter to the query (every text will contain a word beginning with an "a", so you might not get a better result - that might not be the case for "q" though). That won't reduce runtime of the query, but you can just save one execution.

    评论

报告相同问题?

悬赏问题

  • ¥15 下图接收小电路,谁知道原理
  • ¥15 装 pytorch 的时候出了好多问题,遇到这种情况怎么处理?
  • ¥20 IOS游览器某宝手机网页版自动立即购买JavaScript脚本
  • ¥15 手机接入宽带网线,如何释放宽带全部速度
  • ¥30 关于#r语言#的问题:如何对R语言中mfgarch包中构建的garch-midas模型进行样本内长期波动率预测和样本外长期波动率预测
  • ¥15 ETLCloud 处理json多层级问题
  • ¥15 matlab中使用gurobi时报错
  • ¥15 这个主板怎么能扩出一两个sata口
  • ¥15 不是,这到底错哪儿了😭
  • ¥15 2020长安杯与连接网探