dpr26232 2015-12-30 15:42
浏览 13

处理大型mysql表以获取报告

I have to get some monthly reports from some big MySQL tables (around 200,000,000 records now). While tables are almost static (previous records are not changed and just new records are added to them)

  • is it a good idea to store this huge tables to another database (like Oracle or ...) and then run the report queries?
  • If it sound rational what are possible destination options?

In essence I want to know what database specialists do in this situation.

  • 写回答

1条回答 默认 最新

  • douzangdang2225 2015-12-30 16:06
    关注

    Extracting your large tables and loading them into some other commercial make and model of RDBMS (like Oracle) is probably a waste of time, and certainly a waste of funds for licensing. Avoid that unless you absolutely need some kind of vendor specific reporting feature that you can't implement in MySQL.

    When you mention "monthly" reports, it sounds possible that you need to retrieve time-limited data from these tables. For example, you might have five years' worth of information in the tables, but you only need information from the last six months. You can optimize that kind of thing with appropriate indexes.

    For example, you might have a query like this to extract sales for a six month period ending with the month before the present month.

    SELECT COUNT(*) num_sales, SUM(sale_value) total_sales,
           LAST_DAY(sale_date) month_ending
      FROM sales
     WHERE sale_date >= LAST_DAY(CURDATE()) + INTERVAL 1 DAY - INTERVAL 7 MONTH
       AND sale_date < LAST_DAY(CURDATE()) + INTERVAL 1 DAY - INTERVAL 1 MONTH
       AND customer_id = 12345
     GROUP BY LAST_DAY(sale_date)
    

    If you have an index in your table on the following columns

    customer_id, sale_date, sale_value
    

    this query will be reasonably fast. It will take a while to build the indexes you don't have, but that's OK because it's a one-time deal.

    If you need to have a separate database server for reporting, you should explore setting up MySQL in a master-slave replication setup. That way your reporting server will always be close to up-to-date. Read this: http://dev.mysql.com/doc/refman/5.7/en/replication.html

    One other thing: You may want to use the MyISAM access method for these enormous tables rather than InnoDB. It sounds like you don't, often, UPDATE them. Rather, you INSERT new rows. And, it sounds like your reporting work doesn't absolutely need the most recent rows. So, the InnoDB transaction support isn't helping you much.

    评论

报告相同问题?

悬赏问题

  • ¥20 西门子S7-Graph,S7-300,梯形图
  • ¥50 用易语言http 访问不了网页
  • ¥50 safari浏览器fetch提交数据后数据丢失问题
  • ¥15 matlab不知道怎么改,求解答!!
  • ¥15 永磁直线电机的电流环pi调不出来
  • ¥15 用stata实现聚类的代码
  • ¥15 请问paddlehub能支持移动端开发吗?在Android studio上该如何部署?
  • ¥20 docker里部署springboot项目,访问不到扬声器
  • ¥15 netty整合springboot之后自动重连失效
  • ¥15 悬赏!微信开发者工具报错,求帮改