douzhao7445 2015-09-21 11:47
Views: 68

Checking whether a record to be inserted or updated already exists in MySQL

Every week I need to load 50K~200K rows from a raw CSV file into my system.

My current solution is to load the CSV into a temp table (emptied after the process), then run a stored procedure that distributes the data to the relevant tables in my system. If a record already exists, it runs an UPDATE query (about 80% of the records in the CSV are already in my system tables); if it does not exist, it inserts the record.

The problem I am facing now is that the tables have grown to a few million records, approx. 5~6 million per table. The "SELECT EXISTS" check seems very slow, and after I changed to LEFT JOINs on the tables in batches it was still slow. Even when I load just 5K records, the stored procedure may take a few hours to finish.

Is there a good, faster way to handle a large number of records when comparing tables to decide whether to insert or update?

Thanks!!

Jack


2 answers

  • doudiandi6967 2015-09-21 12:09

    The following process should reduce your processing time.

    First try to update the record and check the number of rows affected. If the number of rows affected is 0, insert the record.

    But make sure the update always modifies modified_Date; if the table has no modified_Date column, add one. Otherwise, when the new and old records contain identical data, the UPDATE changes nothing and reports 0 rows affected, and the process will then run an INSERT for a record that already exists.
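
    The update-then-insert flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in so it runs without a MySQL server, and the table `items`, its columns, and the helper `upsert` are hypothetical names that do not come from the question. Note that SQLite's `rowcount` counts matched rows, whereas MySQL by default reports only rows that actually changed, which is why the answer recommends always writing a fresh modified_Date.

```python
import sqlite3


def upsert(conn, record_id, name, modified_date):
    """Try an UPDATE first; INSERT only when no row was affected.

    `items` and its columns are hypothetical, chosen for illustration.
    """
    cur = conn.execute(
        "UPDATE items SET name = ?, modified_date = ? WHERE id = ?",
        (name, modified_date, record_id),
    )
    if cur.rowcount == 0:
        # No existing row was touched, so treat the record as new.
        conn.execute(
            "INSERT INTO items (id, name, modified_date) VALUES (?, ?, ?)",
            (record_id, name, modified_date),
        )
        return "inserted"
    return "updated"


# In-memory SQLite database standing in for the MySQL target table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, modified_date TEXT)"
)

results = [
    upsert(conn, 1, "alpha", "2015-09-14"),  # id 1 is new
    upsert(conn, 1, "alpha", "2015-09-21"),  # same data, newer modified_date
    upsert(conn, 2, "beta", "2015-09-21"),   # id 2 is new
]
conn.commit()
row_count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(results, row_count)
```

    The WHERE clause relies on the primary key, so each UPDATE is an index lookup rather than a scan; whether this is fast enough on tables with millions of rows would need to be measured on the real data.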
