dousi1097 2012-10-02 02:48

PHP MongoDB execute() locks the collection

I'm using MongoDB from the command line to loop through the documents that match a particular condition, copy each one into another collection, and remove it from the original collection.

db.coll1.find({'status' : 'DELETED'}).forEach(
    function(e) {db.deleted.insert(e);  db.coll1.remove({_id:e._id});  });

This works, but I need to script it so that it moves all the matching documents in coll1 to the deleted collection every day (or every hour) via a cron job. I'm using PHP, so I figured I would write the script using the Mongo PHP library:

$db->execute("db.coll1.find({'status' : 'DELETED'}).forEach(
    function(e) {  db.deleted.insert(e); db.coll1.remove({_id: e._id});  })");

This works, but unlike the mongo command line, $db->execute() is eval'd on the server, which takes a lock until the execution block finishes and holds off all other writes to the collection. I can't do that in my production environment.

Is there a way to run this from a PHP script, without manually logging into Mongo and running the command, and without taking the lock?

If I use:

$db->selectCollection('coll1')->find(array('status' => 'DELETED'))

and iterate through the cursor, I can select the documents, save them to the deleted collection, and delete them from coll1. However, this seems like a lot of bandwidth, pulling every document down to the client only to save it straight back to the server.
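
For concreteness, here is a minimal sketch of that client-side loop using the legacy Mongo PHP driver (the 'mydb' database name is just a placeholder):

<?php
// Minimal sketch: pull the matching documents to the client, copy each
// one into the 'deleted' collection, then remove it from 'coll1'.
$mongo = new Mongo();                  // legacy driver, connects to localhost:27017
$db    = $mongo->selectDB('mydb');     // placeholder database name

$coll1   = $db->selectCollection('coll1');
$deleted = $db->selectCollection('deleted');

$cursor = $coll1->find(array('status' => 'DELETED'));
foreach ($cursor as $doc) {
    $deleted->insert($doc);                          // copy the document over
    $coll1->remove(array('_id' => $doc['_id']));     // then delete the original
}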

Any suggestions?


1 Answer

  • dongshao1021 2012-10-02 07:51

    Is there a way to run this from a PHP script, without manually logging into Mongo and running the command, and without taking the lock?

    As you stated, the best thing is to do it client side. As for the bandwidth: unless you have a pre-90s network, it will most likely be a very small amount compared to what you already use for everything else, including replica set traffic.

    What you could do instead is warehouse each delete at the moment it actually happens (in your application) rather than in one daily batch, and then, once a day, go back through the original collection and remove all of the flagged documents. That way the bandwidth is spread throughout the day, and cleaning up your production collection becomes a single remove command (see the sketch below).
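
    A minimal sketch of that approach with the legacy Mongo PHP driver (the deleteDocument() helper and the 'mydb' database name are illustrative, not part of your code):

    <?php
    // Called from the application whenever a document is "deleted":
    // warehouse a copy immediately and flag the original.
    function deleteDocument($db, array $doc) {
        $db->selectCollection('deleted')->insert($doc);
        $db->selectCollection('coll1')->update(
            array('_id' => $doc['_id']),
            array('$set' => array('status' => 'DELETED'))
        );
    }

    // Daily cron job: one remove() clears everything already warehoused.
    $mongo = new Mongo();
    $db    = $mongo->selectDB('mydb');
    $db->selectCollection('coll1')->remove(array('status' => 'DELETED'));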

    Another alternative would be to use a map/reduce (MR) job and have its output written to that collection.
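
    For completeness, a rough sketch of the map/reduce route with the legacy driver (note that MR output wraps every document as {_id: ..., value: ...}, so the 'deleted' collection would not hold exact copies; collection and database names are placeholders):

    <?php
    // Rough sketch: run a map/reduce over the flagged documents and merge
    // the output into the 'deleted' collection.
    $mongo = new Mongo();
    $db    = $mongo->selectDB('mydb');

    $db->command(array(
        'mapreduce' => 'coll1',
        'map'       => new MongoCode('function() { emit(this._id, this); }'),
        'reduce'    => new MongoCode('function(key, values) { return values[0]; }'),
        'query'     => array('status' => 'DELETED'),
        'out'       => array('merge' => 'deleted'),
    ));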

    That said, warehousing deletes in this manner is usually more work than it is worth. It is normally better to keep them in your main collection and write your queries around the deleted flag (which you are probably already doing, given that you don't warehouse them immediately).

    This answer was accepted by the asker.
