doujin8673 2015-07-27 04:44

Optimal/most efficient way to update a database

I have a data set with more than 10,000 records (and it will grow in the future), structured as below:

[
  ['name' => 'name1', 'url' => 'url1', 'visit' => 120],
  ['name' => 'name2', 'url' => 'url2', 'visit' => 250],
  ..........
]

The key combination (name, url) can occur more than once. In that case I need the sum of visit across all records that share the same name and url.

Finally, I want to insert these values into a database. I see two ways to do this:

  1. Create another array with the unique (name, url) combinations and the sum of visit for each.
  2. Update/insert into the db for each record in a loop.
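For reference, method 1 can be sketched roughly as follows. This is only an illustration in Python rather than the language of the data above, and the function name `aggregate_visits` and the sample records are made up for the example. It keeps one running total per (name, url) pair.

```python
from collections import defaultdict


def aggregate_visits(records):
    """Sum 'visit' for every distinct (name, url) pair (hypothetical helper)."""
    totals = defaultdict(int)
    for rec in records:
        # The tuple (name, url) acts as the composite key.
        totals[(rec["name"], rec["url"])] += rec["visit"]
    return dict(totals)


# Made-up sample data in the same shape as the question's records.
records = [
    {"name": "name1", "url": "url1", "visit": 120},
    {"name": "name2", "url": "url2", "visit": 250},
    {"name": "name1", "url": "url1", "visit": 30},
]

totals = aggregate_visits(records)
```

Note that the extra structure grows with the number of distinct (name, url) pairs, not with the total number of records, so it is never larger than the input itself.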

Which of these is the better solution, or is there a better way altogether?

I know the first method may run into memory issues for a large data set. The second method means many database hits, and I would like to know the disadvantage(s) of going that way.

Any help or insight would be appreciated.


1 answer

  • douwen9540 2015-07-27 05:07

    I do big database updates like this myself and spent ages trying different solutions.

    Instead of:

    1. Check whether the record exists, e.g. select count(id) from data where name='name' and url='url'
    2. If not found, insert the record
    3. If found, add to the existing sum

    I would try this:

    1. Set a unique key (for example, a composite primary key) on the name and url fields of your data table.
    2. Try a normal insert and check whether it succeeds.
    3. If it fails (a row already exists for that name and url, because the two fields together must be unique), add the new value to the sum stored in the existing row.
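The three steps above could look roughly like this. It is a minimal sketch only, using Python's standard-library sqlite3 with an in-memory database so that it runs on its own. The table name data comes from the query above, while the column types, the helper `add_visits`, and the sample rows are assumptions for illustration.

```python
import sqlite3


def add_visits(conn, name, url, visit):
    """Insert a row; if (name, url) already exists, add to its total (hypothetical helper)."""
    try:
        conn.execute(
            "INSERT INTO data (name, url, visit) VALUES (?, ?, ?)",
            (name, url, visit),
        )
    except sqlite3.IntegrityError:
        # The composite primary key rejected the duplicate, so update instead.
        conn.execute(
            "UPDATE data SET visit = visit + ? WHERE name = ? AND url = ?",
            (visit, name, url),
        )


conn = sqlite3.connect(":memory:")
# Step 1: make (name, url) unique via a composite primary key (assumed schema).
conn.execute(
    "CREATE TABLE data ("
    " name TEXT NOT NULL,"
    " url TEXT NOT NULL,"
    " visit INTEGER NOT NULL,"
    " PRIMARY KEY (name, url))"
)

# Made-up sample records in the same shape as the question's data.
sample = [("name1", "url1", 120), ("name2", "url2", 250), ("name1", "url1", 30)]

# Steps 2 and 3, with the whole batch committed as a single transaction.
with conn:
    for name, url, visit in sample:
        add_visits(conn, name, url, visit)

rows = conn.execute(
    "SELECT name, url, visit FROM data ORDER BY name, url"
).fetchall()
```

Wrapping the loop in one transaction avoids a commit per record, which is usually a large part of the cost of many small writes. Some databases can also do the insert-or-add in a single statement (MySQL has INSERT ... ON DUPLICATE KEY UPDATE, for instance), which may be worth checking for whichever database you use.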