dpxnrx11199 2017-11-08 08:04
浏览 39
已采纳

优化SQL查询以从大量MySQL数据库中获取数据

I am having a problem getting data from a large amount MySQL database.

With the below code it is ok to get the list of 10K patients and 5K appointments which is our test server.

However, on our live server, the number of patients is over 100K and the number of appointments is over 300K and when I run the code after a while it gives 500 error.

I need the list of the patients whose patient_treatment_status is 1 or 3 and has no appointment after one month from their last appointment. (The below code is working for small amount of patients and appointments)

How can I optimise the first database query so there will be no need the second database query in the foreach loop?

<?php
ini_set('memory_limit', '-1');
ini_set('max_execution_time', 0);

require_once('Db.class.php');

$patients = $db->query("
SELECT
    p.id, p.first_name, p.last_name, p.phone, p.mobile,
    LatestApp.lastAppDate
FROM
    patients p
LEFT JOIN (SELECT patient_id, MAX(start_date) AS lastAppDate FROM appointments WHERE appointment_status = 4) LatestApp ON p.id = LatestApp.patient_id
WHERE
    p.patient_treatment_status = 1 OR p.patient_treatment_status = 3
ORDER BY
    p.id
");

foreach ($patients as $row) {
    $one_month_after_the_last_appointment = date('Y-m-d', strtotime($row['lastAppDate'] . " +1 month"));
    $appointment_check = $db->single("SELECT COUNT(id) FROM appointments WHERE patient_id = :pid AND appointment_status = :a0 AND (start_date >= :a1 AND start_date <= :a2)", array("pid"=>"{$row['id']}","a0"=>"1","a1"=>"{$row['lastAppDate']}","a2"=>"$one_month_after_the_last_appointment"));

    if($appointment_check == 0){
        echo $patient_id = $row['id'].' - '.$row['lastAppDate'].' - '.$one_month_after_the_last_appointment. '<br>';
    }
}
?>
  • 写回答

1条回答 默认 最新

  • dter8514 2017-11-08 10:25
    关注

    First off, this subquery likely does not do what you think it does.

    SELECT patient_id, MAX(start_date) AS lastAppDate 
    FROM appointments WHERE appointment_status = 4
    

    Without a GROUP BY clause, that subquery will simply take the maximum start_date of all appointments with appointment_status=4, and then arbitrarily pick one patient_id. To get the results you want you'll need to GROUP BY patient_id.

    For your overall question, try the following query:

    SELECT
        p.id, p.first_name, p.last_name, p.phone, p.mobile,
        LatestApp.lastAppDate
    FROM
        patients p
    INNER JOIN (
        SELECT patient_id,
        MAX(start_date) AS lastAppDate
        FROM appointments
        WHERE appointment_status = 4
        GROUP BY patient_id
    ) LatestApp ON p.id = LatestApp.patient_id
    WHERE
        (p.patient_treatment_status = 1
        OR p.patient_treatment_status = 3)
        AND NOT EXISTS (
            SELECT 1
            FROM appointments a
            WHERE a.patient_id = p.patient_id
            AND a.appointment_status = 1
            AND a.start_date >= LatestApp.lastAppDate
            AND a.start_date < DATE_ADD(LatestApp.lastAppDate,INTERVAL 1 MONTH)
    )
    ORDER BY
        p.id
    

    Add the following index, if it doesn't already exist:

    ALTER TABLE appointments
    ADD INDEX (`patient_id`, `appointment_status`, `start_date`)
    

    Report how this performs and if the data appears correct. Provide SHOW CREATE TABLE patient and SHOW CREATE TABLE appointments for further assistance related to performance.

    Also, try the query above without the AND NOT EXISTS clause, together with the second query you use. It is possible that running 2 queries may be faster than trying to run them together, in this situation.

    Note that I used an INNER JOIN to find the latest appointment. This will result in all patients that have never had an appointment to not be included in the query. If you need those added, just UNION the results those found by selecting from patients that have never had an appointment.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥15 求指导ADS低噪放设计
  • ¥15 CARSIM前车变道设置
  • ¥50 三种调度算法报错 有实例
  • ¥15 关于#python#的问题,请各位专家解答!
  • ¥200 询问:python实现大地主题正反算的程序设计,有偿
  • ¥15 smptlib使用465端口发送邮件失败
  • ¥200 总是报错,能帮助用python实现程序实现高斯正反算吗?有偿
  • ¥15 对于squad数据集的基于bert模型的微调
  • ¥15 为什么我运行这个网络会出现以下报错?CRNN神经网络
  • ¥20 steam下载游戏占用内存