dpxnrx11199 2017-11-08 08:04
浏览 39
已采纳

优化SQL查询以从大量MySQL数据库中获取数据

I am having a problem getting data from a large amount MySQL database.

With the below code it is ok to get the list of 10K patients and 5K appointments which is our test server.

However, on our live server, the number of patients is over 100K and the number of appointments is over 300K and when I run the code after a while it gives 500 error.

I need the list of the patients whose patient_treatment_status is 1 or 3 and has no appointment after one month from their last appointment. (The below code is working for small amount of patients and appointments)

How can I optimise the first database query so there will be no need the second database query in the foreach loop?

<?php
ini_set('memory_limit', '-1');
ini_set('max_execution_time', 0);

require_once('Db.class.php');

$patients = $db->query("
SELECT
    p.id, p.first_name, p.last_name, p.phone, p.mobile,
    LatestApp.lastAppDate
FROM
    patients p
LEFT JOIN (SELECT patient_id, MAX(start_date) AS lastAppDate FROM appointments WHERE appointment_status = 4) LatestApp ON p.id = LatestApp.patient_id
WHERE
    p.patient_treatment_status = 1 OR p.patient_treatment_status = 3
ORDER BY
    p.id
");

foreach ($patients as $row) {
    $one_month_after_the_last_appointment = date('Y-m-d', strtotime($row['lastAppDate'] . " +1 month"));
    $appointment_check = $db->single("SELECT COUNT(id) FROM appointments WHERE patient_id = :pid AND appointment_status = :a0 AND (start_date >= :a1 AND start_date <= :a2)", array("pid"=>"{$row['id']}","a0"=>"1","a1"=>"{$row['lastAppDate']}","a2"=>"$one_month_after_the_last_appointment"));

    if($appointment_check == 0){
        echo $patient_id = $row['id'].' - '.$row['lastAppDate'].' - '.$one_month_after_the_last_appointment. '<br>';
    }
}
?>
  • 写回答

1条回答 默认 最新

  • dter8514 2017-11-08 10:25
    关注

    First off, this subquery likely does not do what you think it does.

    SELECT patient_id, MAX(start_date) AS lastAppDate 
    FROM appointments WHERE appointment_status = 4
    

    Without a GROUP BY clause, that subquery will simply take the maximum start_date of all appointments with appointment_status=4, and then arbitrarily pick one patient_id. To get the results you want you'll need to GROUP BY patient_id.

    For your overall question, try the following query:

    SELECT
        p.id, p.first_name, p.last_name, p.phone, p.mobile,
        LatestApp.lastAppDate
    FROM
        patients p
    INNER JOIN (
        SELECT patient_id,
        MAX(start_date) AS lastAppDate
        FROM appointments
        WHERE appointment_status = 4
        GROUP BY patient_id
    ) LatestApp ON p.id = LatestApp.patient_id
    WHERE
        (p.patient_treatment_status = 1
        OR p.patient_treatment_status = 3)
        AND NOT EXISTS (
            SELECT 1
            FROM appointments a
            WHERE a.patient_id = p.patient_id
            AND a.appointment_status = 1
            AND a.start_date >= LatestApp.lastAppDate
            AND a.start_date < DATE_ADD(LatestApp.lastAppDate,INTERVAL 1 MONTH)
    )
    ORDER BY
        p.id
    

    Add the following index, if it doesn't already exist:

    ALTER TABLE appointments
    ADD INDEX (`patient_id`, `appointment_status`, `start_date`)
    

    Report how this performs and if the data appears correct. Provide SHOW CREATE TABLE patient and SHOW CREATE TABLE appointments for further assistance related to performance.

    Also, try the query above without the AND NOT EXISTS clause, together with the second query you use. It is possible that running 2 queries may be faster than trying to run them together, in this situation.

    Note that I used an INNER JOIN to find the latest appointment. This will result in all patients that have never had an appointment to not be included in the query. If you need those added, just UNION the results those found by selecting from patients that have never had an appointment.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥15 如何实现H5在QQ平台上的二次分享卡片效果?
  • ¥15 python爬取bilibili校园招聘网站
  • ¥30 求解达问题(有红包)
  • ¥15 请解包一个pak文件
  • ¥15 不同系统编译兼容问题
  • ¥100 三相直流充电模块对数字电源芯片在物理上它必须具备哪些功能和性能?
  • ¥30 数字电源对DSP芯片的具体要求
  • ¥20 antv g6 折线边如何变为钝角
  • ¥30 如何在Matlab或Python中 设置饼图的高度
  • ¥15 nginx中的CORS策略应该如何配置