dqxsuig64994 2017-09-29 15:58 Acceptance rate: 0%
Views: 33
Accepted

Processing 10 million datasets - PHP and SQL [closed]

We're using PHP 7 and have a MySQL database running on a web server with only 128 MB of RAM, and we have a problem processing a very large number of records.

In short: we have 40,000 products and want to collect data for each of them to find out whether it needs to be updated. The query that collects this data from another table with 10 million records takes 1.2 seconds, because it contains several SUM functions. We have to run the query for each product individually, because the time range that is relevant for the SUM differs from product to product.

Because of the number of queries, the function that iterates over all products times out (after 5 minutes). That is why we implemented a cronjob that calls the function every 5 minutes; each run continues with the product where the previous run stopped. Even so, it takes ~30 hours until all 40,000 products have been processed, since each cron run handles only about 100 products.

How can we deal with this amount of data? Is there a way to parallelize it, e.g. with pthreads, or does somebody have another idea? Could a server upgrade be a solution?
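For reference, the resume-where-it-stopped cron approach described above can be sketched roughly as follows. This is only a minimal sketch, written in Python with `sqlite3` standing in for PHP and MySQL so that it is self-contained. The `products` and `batch_state` tables, the `process_batch` function, and the `process_product` callback are hypothetical names, not the actual code.

```python
import sqlite3
import time


def process_batch(conn, process_product, time_budget_s=240.0, clock=time.monotonic):
    """Process products in id order, resuming after the stored checkpoint.

    Stops when the time budget is used up or no products remain.
    Returns the number of products handled in this run.
    """
    # Hypothetical single-row table that remembers the last finished product.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS batch_state ("
        " id INTEGER PRIMARY KEY CHECK (id = 1),"
        " last_product_id INTEGER NOT NULL)"
    )
    conn.execute(
        "INSERT OR IGNORE INTO batch_state (id, last_product_id) VALUES (1, 0)"
    )
    conn.commit()

    deadline = clock() + time_budget_s
    handled = 0
    while clock() < deadline:
        (last_id,) = conn.execute(
            "SELECT last_product_id FROM batch_state WHERE id = 1"
        ).fetchone()
        row = conn.execute(
            "SELECT id FROM products WHERE id > ? ORDER BY id LIMIT 1",
            (last_id,),
        ).fetchone()
        if row is None:
            # Assumption: once every product was seen, the next run starts over.
            conn.execute("UPDATE batch_state SET last_product_id = 0 WHERE id = 1")
            conn.commit()
            break
        product_id = row[0]
        process_product(conn, product_id)  # the expensive per-product query
        # Store the checkpoint after each product so a killed run loses nothing.
        conn.execute(
            "UPDATE batch_state SET last_product_id = ? WHERE id = 1",
            (product_id,),
        )
        conn.commit()
        handled += 1
    return handled
```

The time budget is deliberately below the 5-minute timeout so a run can stop cleanly. At about 100 products per 5-minute run, 40,000 products need 400 runs, i.e. 2,000 minutes or roughly 33 hours, which is in line with the ~30 hours reported above.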

Thanks a lot! Nadine


1 answer

  • douquanqiao6788 2017-09-29 17:18

    Parallel processing will require resources as well, so on 128 MB it will not help.

    Monitor your system to see where the bottleneck is. Most probably it is the memory, since there is so little of it. Once you have found the bottleneck resource, you will have to increase it. No amount of tuning and tinkering will fix an overloaded server.
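As a rough illustration of what such monitoring could look at on a Linux server, the sketch below reads `/proc/meminfo` and reports how much memory is still available and how much swap is in use. It assumes a Linux host; the helper names `parse_meminfo` and `memory_pressure` are made up for this example, and in practice tools such as `top` or `vmstat` give the same picture.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value as reported}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_pressure(info):
    """Return (available_fraction, swap_used) from parsed meminfo values."""
    # MemAvailable is missing on very old kernels; fall back to MemFree.
    available = info.get("MemAvailable", info.get("MemFree", 0))
    swap_used = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return available / info["MemTotal"], swap_used


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as fh:
        fraction, swap_used = memory_pressure(parse_meminfo(fh.read()))
    print(f"available memory: {fraction:.0%}, swap in use: {swap_used} kB")
```

A low available fraction together with steadily growing swap usage while the cron job runs would point to memory as the bottleneck.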

    If you can see that it is not a server resource issue (!), the problem could be at the query level (too many joins, missing indexes, ...). Your 5-minute timeout could also be increased.
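To make the query-level point concrete, here is a minimal sketch of two possible changes: a composite index that matches the per-product time-range filter, and a single grouped query that lets each product bring its own range start. The schema (`products.range_start`, `sales.sold_at`, `sales.amount`) and the index name are assumptions for illustration, and `sqlite3` stands in for MySQL so the example runs on its own. The grouped query is one possible rewrite, not something the original setup is known to support.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory stand-in for the products and data tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE products (id INTEGER PRIMARY KEY, range_start TEXT NOT NULL);
        CREATE TABLE sales (
            product_id INTEGER NOT NULL,
            sold_at    TEXT    NOT NULL,
            amount     INTEGER NOT NULL
        );
        -- Composite index: equality on product_id, then a range on sold_at.
        CREATE INDEX idx_sales_product_time ON sales (product_id, sold_at);
        """
    )
    conn.executemany(
        "INSERT INTO products VALUES (?, ?)",
        [(1, "2017-09-01"), (2, "2017-09-15"), (3, "2017-09-20")],
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            (1, "2017-08-30", 5), (1, "2017-09-02", 7), (1, "2017-09-20", 3),
            (2, "2017-09-10", 4), (2, "2017-09-16", 6),
        ],
    )
    conn.commit()
    return conn


# One query per product, as in the question.
PER_PRODUCT = "SELECT SUM(amount) FROM sales WHERE product_id = ? AND sold_at >= ?"

# One query for all products; each product contributes its own range start.
ALL_PRODUCTS = """
    SELECT p.id, COALESCE(SUM(s.amount), 0)
    FROM products AS p
    LEFT JOIN sales AS s
           ON s.product_id = p.id AND s.sold_at >= p.range_start
    GROUP BY p.id
    ORDER BY p.id
"""


def query_plan(conn, sql, params=()):
    """Return the planner's description so index usage can be checked."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)
```

On MySQL the equivalent check is to run `EXPLAIN` on the real query and see whether it uses an index or scans all 10 million rows.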

    But start with the server.

    This answer was selected by the asker as the best answer.
