doubianyan9749 2012-12-14 09:20
浏览 33
已采纳

过滤MySQL结果的最佳实践

I want to implement a filter-function in my PHP project. To implement a filter, I usually just add a WHERE clause in my query to show filtered results.

My Problem is: These filters require not only a smple added WHERE clause, but a huge Query including multiple JOINs. The resulting Query has > 30 lines.

Later, there should also be a search function which would then also require this huge query. I wonder if this is a good practice or if I should add a "redundant" Database column to my database table where I compute the attribute I need for filtering on every update. With this column, I wouldnt have my huge query on different places over my project, but have a redundant column.

What do you think?

Greetings

As questioned, here the table structure/code. This is not the exact code, because there is also a revision system which makes it even more complex, but for understanding this is enough:

table submissions:

ID (primary)
(additionalColumns)

table reports:

ID (primary)
submissionID (reference to submission table)
(additionalColumns)

table report_objects:

reportID (reference to reports table, multiple report_object for one report)

table accounting:

ID (primary)
reportID (reference to reports table, multiple accountings for one report)
(additionalColumns)

table accounting_objects:

ID
accountingID (reference to accounting table, multiple accounting_object for one accounting)
(additionalColumns)

For a submission, one or multiple reports are being create with multiple objects to account (report_objects). For each report, I can create multiple accountings, where each accounting is for a few objects of the report. The accounted report_objects are stored in accounting_object

My query/filter checks, if each report_object of a submissionID is accounted (accounting_object exists) for one submissionID.

  • 写回答

1条回答 默认 最新

  • dongquan8753 2012-12-14 10:21
    关注

    There isn't one definitive answer and, in practice, if it works and runs quickly enough for your needs then you can leave it as is. Optimization is always something you can come back to.

    Joining correctly

    If you are simply checking for the existence of a join table and only including the results with that join you can do this through the correct LEFT / RIGHT JOIN expressions. This is always the first call.

    Expressiveness

    Also be as expressive as you can with SQL, you want to give it the best chance to optimize your query, there are keywords such as EXISTS, for example, make sure to use them.

    Denormalization

    You can add in a column that stores the computed value, the complexity that arises out of this is ensuring that the value is always up to date. This can be done by triggers or manually. The pros:

    • It is the easiest method of getting around slowness introduced by computed columns.

    The cons:

    • Ruins your nice normalized schema
    • If you do it manually in code, you will forget to do it somewhere, causing headaches.
    • Triggers can be a bit of a pain.

    Materialized view

    This is like denormalization but prevents polluting your normalized tables by created a stored view. This is achieved in MySQL by storing the result of your complex select into a results table when the values change. Again, the same as denormalization, the complexity is keeping this up to date. It is typically done with triggers. This can be a pain but keeps the complexity out of your schema. As mentioned by@eggyal it isn't a supported feature of MySQL yet so you will have to DIY... Materialized views with MySQL

    Pros:

    • Keeps dirty denormalized stuff away from your nice normalized schema.

    Cons:

    • Materialized views aren't supported so setting them up requires work.
    • If you trigger the refresh of your views in code you get stale data, but isn't quite as painful as the single column staleness of denormalization.
    • Triggers can be a bit of a pain.

    If you aren't sure, and it really matters, do some benchmarking.

    EDIT If you code has this query in one form or another across your code base then that has the possibility of cause headaches in future as you will have to remember to change the statements in all of the places if or when they change.

    If by doing the above you have made your statements really simple and concise then they may differ enough from each other for it to not be a problem.

    You can do some things to help you out:

    1. Put all of the related queries in a single place, i.e. a single class or script that handles this query in its various forms. This way at least all of the changes are limited to the one file.
    2. You can, to help yourself out a bit more, do a bit of refactoring it to remove duplication between the queries.

    Also, If you feel the database information is too exposed to the code you may want to abstract it behind a view.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥15 Vue3 大型图片数据拖动排序
  • ¥15 划分vlan后不通了
  • ¥15 GDI处理通道视频时总是带有白色锯齿
  • ¥20 用雷电模拟器安装百达屋apk一直闪退
  • ¥15 算能科技20240506咨询(拒绝大模型回答)
  • ¥15 自适应 AR 模型 参数估计Matlab程序
  • ¥100 角动量包络面如何用MATLAB绘制
  • ¥15 merge函数占用内存过大
  • ¥15 使用EMD去噪处理RML2016数据集时候的原理
  • ¥15 神经网络预测均方误差很小 但是图像上看着差别太大