dourui7186 2011-01-24 23:09

Linux memory management and large files

I'm fetching image objects from a remote server, then uploading them to Rackspace's Cloud Files using their API. I'm wondering a) how I can make this process more efficient, and b) assuming I'll need to purchase more memory, what a reasonable amount of RAM would be for this task (the current development server has just 512MB).

The script does the following:

  • Queries my local database for a set of ids (around 1 thousand)
  • For each id, queries a remote server, which returns 10-20 image objects of 25-30 KB each
  • Creates a Cloud Files container, based on the id from my db
  • For each image object returned from the remote server, creates an image object in my container and writes the image data to that object
  • Updates the row in the local db with the datetime the images were added

This executes relatively quickly on a small set of ids; however, 100 ids (so 700-1k images) can take 5-10 minutes, and anything more than that seems to run indefinitely. I have tried the following, with little success:

  • using PHP's set_time_limit to kill the script after a couple of minutes, figuring that would release the memory allocated to the execution and let me pick up where I left off, working through the set in smaller pieces. However, the timeout error is never thrown
  • unsetting the array key containing the image object after it's uploaded (not just the reference inside the loop).

PHP's memory_limit is set to 128MB. Running the 'top' command, I saw that user 'www-data' was consuming 16% of memory. That user no longer appears in the list, but I continue to see this:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2400 mysql     20   0  161m 8220 2808 S    0  1.6 11:12.69  mysqld

...but the TIME+ never changes. I see that there is still 1 task running, yet these values never change:

Mem:    508272k total,   250616k used,   257656k free,     4340k buffers

Apologies for the lengthy post. I'm not entirely sure which parts of it (if any) are useful. This is not my area of expertise, so I'm grasping at straws a little. Thanks in advance for your help.

1 answer

  • douweidao3882 2011-01-25 03:39

    MySQL is a daemon: it keeps running and stays in memory until it dies or you kill it. TIME+ is the total CPU time the process has used since it was last started. If the process is idle (%CPU = 0), TIME+ will not increment, since no CPU time is being consumed.
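
To see why TIME+ stays put for an idle process, here is a rough PHP sketch (not from the original post; the helper names cpu_seconds and time_it are made up for illustration). It compares wall-clock time with the CPU time reported by getrusage() for a sleeping task and for a busy loop. TIME+ in top only counts the CPU kind of time.

```php
<?php
// Returns user + system CPU time consumed by this process, in seconds.
function cpu_seconds(): float {
    $u = getrusage();
    return $u['ru_utime.tv_sec'] + $u['ru_utime.tv_usec'] / 1e6
         + $u['ru_stime.tv_sec'] + $u['ru_stime.tv_usec'] / 1e6;
}

// Measures wall-clock time and CPU time spent inside $work.
function time_it(callable $work): array {
    $wall = microtime(true);
    $cpu = cpu_seconds();
    $work();
    return ['wall' => microtime(true) - $wall, 'cpu' => cpu_seconds() - $cpu];
}

// Sleeping: time passes but almost no CPU is used, like an idle daemon.
$idle = time_it(function () { usleep(300000); });

// Spinning: the process burns CPU for roughly the same wall-clock duration.
$busy = time_it(function () {
    $end = microtime(true) + 0.3;
    while (microtime(true) < $end) { }
});

printf("idle: wall %.2fs, cpu %.2fs\n", $idle['wall'], $idle['cpu']);
printf("busy: wall %.2fs, cpu %.2fs\n", $busy['wall'], $busy['cpu']);
```

On a typical Linux machine the idle case should report close to zero CPU seconds even though about 0.3s of wall time passed, which is the situation mysqld is in when its TIME+ does not move.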

    Have you checked whether the Cloud Files API is leaking handles of some sort? You may be unsetting the image object you retrieved from the remote service (service -> you), but the Cloud Files API still has to send that image back out the door (you -> Rackspace), and that could be leaking somewhere.
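
One way to look for that kind of leak, as a minimal sketch: log how PHP's memory use changes after each id is processed. process_id() is a hypothetical stand-in for the real fetch-and-upload step, and measure_growth() is likewise an illustrative name; neither is part of the Cloud Files API. The sketch assumes PHP 7 or later. Note that memory_get_usage() only covers PHP's own allocator, so a leak inside an extension or in open sockets may not show up here and would need to be watched with top or ps instead.

```php
<?php
// Hypothetical stand-in for "fetch the images for one id and upload them".
// It only allocates ~25 KB strings, roughly the image size in the question.
function process_id(int $id): int {
    $images = [];
    for ($i = 0; $i < 15; $i++) {
        $images[] = str_repeat('x', 25 * 1024);
    }
    $count = count($images);
    unset($images); // drop the only reference so PHP can free the data
    return $count;
}

// Runs $work for each id and records memory growth relative to the start.
function measure_growth(array $ids, callable $work): array {
    $baseline = memory_get_usage();
    $growth = [];
    foreach ($ids as $id) {
        $work($id);
        gc_collect_cycles(); // collect reference cycles before measuring
        $growth[$id] = memory_get_usage() - $baseline;
    }
    return $growth;
}

$growth = measure_growth(range(1, 50), 'process_id');
foreach ($growth as $id => $bytes) {
    if ($id % 10 === 0) {
        printf("id %d: %+d bytes since start\n", $id, $bytes);
    }
}
```

Small fluctuations are normal (the $growth array itself takes a little space). A figure that climbs steadily with every id once the real upload code is swapped in for process_id() would point at something being held onto between iterations.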

    Selected as the best answer by the asker.
