dourui7186 2011-01-24 23:09

Linux memory management and large files

I'm acquiring image objects from a remote server, then attempting to upload them to Rackspace's Cloud Files using their API. Wondering a) how I can make this process more efficient, and b) assuming I'll need to purchase more memory, what a reasonable amount of RAM might be to accomplish this task (current development server is just 512MB).

In executing the script I'm:

  • Querying my local database for a set of ids (around 1 thousand)
  • For each id, querying a remote server, which returns 10-20 image objects of 25-30 KB each
  • Create a Cloud Files container, based on the id from my db
  • For each image object returned from the remote server, create an image object in my container, and write image data to that object
  • Update row in local db with datetime of added images
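
The steps above can be sketched as a loop that holds only one image in memory at a time. The original script is PHP; this is a language-neutral sketch in Python, and `fetch_images`, `upload_image`, and `mark_done` are hypothetical callables standing in for the remote server, the Cloud Files API, and the local database update, none of which are shown in this post.

```python
from typing import Callable, Iterable, List


def process_ids(
    ids: List[int],
    fetch_images: Callable[[int], Iterable[bytes]],
    upload_image: Callable[[int, int, bytes], None],
    mark_done: Callable[[int], None],
) -> int:
    """Upload every image for every id and return the number of images uploaded."""
    uploaded = 0
    for item_id in ids:
        # Iterate lazily: if fetch_images is a generator, the 10-20 images
        # for an id are never all held in memory together.
        for index, data in enumerate(fetch_images(item_id)):
            upload_image(item_id, index, data)
            uploaded += 1
        # Record completion only after all images for this id are uploaded.
        mark_done(item_id)
    return uploaded


if __name__ == "__main__":
    def fake_fetch(item_id: int):
        # Stand-in for the remote server: three images of about 30 KB each.
        for _ in range(3):
            yield b"x" * 30_000

    sent = []
    done = []
    total = process_ids(
        list(range(10)),
        fake_fetch,
        lambda item_id, index, data: sent.append((item_id, index, len(data))),
        done.append,
    )
    print(total, len(done))
```

Whether this helps depends on where the memory actually goes: if the upload client itself retains data, streaming on the fetch side will not be enough.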

This executes relatively quickly on a small set of ids. However, 100 ids (so 700-1k images) can take 5-10 min, and anything more than that seems to run indefinitely. I have tried the following, with little success:

  • using PHP's set_time_limit to kill the script after a couple of minutes, figuring that would purge the memory allocated to execution, allowing me to pick up where I left off and work through the set in smaller pieces. However, the timeout error is never thrown
  • unsetting the array key containing the image object after it's uploaded (not just the reference inside the loop).
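
One way to work through the set in smaller pieces without relying on a timeout is to have each run select only the ids that have not been marked as uploaded, capped at a fixed batch size. A minimal sketch in Python (the original script is PHP), assuming a hypothetical `items` table whose `images_added_at` column stays NULL until the final step fills it in; SQLite stands in for the local database, and `upload_all` is a hypothetical callable that fetches and uploads every image for one id.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Callable, List


def pending_ids(conn: sqlite3.Connection, limit: int) -> List[int]:
    """Return up to `limit` ids whose images have not been uploaded yet."""
    rows = conn.execute(
        "SELECT id FROM items WHERE images_added_at IS NULL ORDER BY id LIMIT ?",
        (limit,),
    ).fetchall()
    return [row[0] for row in rows]


def run_batch(
    conn: sqlite3.Connection,
    upload_all: Callable[[int], None],
    limit: int = 25,
) -> int:
    """Process one batch and return how many ids were completed."""
    completed = 0
    for item_id in pending_ids(conn, limit):
        upload_all(item_id)
        conn.execute(
            "UPDATE items SET images_added_at = ? WHERE id = ?",
            (datetime.now(timezone.utc).isoformat(), item_id),
        )
        # Commit per id so an interrupted run loses at most one id of work.
        conn.commit()
        completed += 1
    return completed


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, images_added_at TEXT)")
    conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in range(1, 61)])
    # Each call stands in for one short-lived run of the script (for example from cron).
    while run_batch(conn, upload_all=lambda item_id: None, limit=25):
        pass
    print(pending_ids(conn, 10))
```

Because each run is a fresh process, whatever memory a run accumulates is returned to the system when it exits.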

PHP's memory_limit is set to 128MB, and running the 'top' command I saw that user 'www-data' was consuming 16% of memory. That user no longer appears in the list, but I continue to see this:

PID   USER   PR  NI  VIRT  RES   SHR   S  %CPU  %MEM  TIME+     COMMAND
2400  mysql  20  0   161m  8220  2808  S  0     1.6   11:12.69  mysqld

...but the TIME+ never changes. I see that there is still 1 task running, yet these values never change:

Mem:    508272k total,   250616k used,   257656k free,     4340k buffers

Apologies for the lengthy post - not entirely sure what (if any of that) is useful. This is not my area of expertise so grasping at straws a little. Thanks in advance for your help.

1 answer

  • douweidao3882 2011-01-25 03:39

    MySQL is a daemon: it keeps running and stays in memory until it dies or you kill it. TIME+ is the total CPU time the process has used since it was last started. If the process is idle (%CPU = 0), TIME+ will not increment, since no CPU time is being consumed.

    Have you checked if the cloudfiles API is leaking handles of some sort? You may be unsetting the image object you've retrieved from your service (service->you), but the Cloudfiles API still has to send that image back out the door (you->Rackspace), and that could be leaking somewhere.
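
One way to test the leak theory is to sample the process's open file descriptors and peak memory every few iterations and watch for steady growth. A minimal sketch in Python for Linux (the original script is PHP, where `memory_get_usage()` plays a similar role for memory); `step` is a hypothetical callable standing in for one fetch-and-upload cycle.

```python
import os
import resource
from typing import Callable, List, Tuple


def open_fd_count() -> int:
    """Count this process's open file descriptors via /proc (Linux only).

    Listing the directory briefly opens one descriptor itself, so the value
    carries a constant offset; only the change between samples matters.
    """
    return len(os.listdir("/proc/self/fd"))


def peak_rss_kb() -> int:
    """Peak resident set size; ru_maxrss is reported in kilobytes on Linux."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def sample_growth(
    step: Callable[[], None], iterations: int, every: int = 10
) -> List[Tuple[int, int, int]]:
    """Run `step` repeatedly and record (iteration, open fds, peak RSS in KB)."""
    samples = []
    for i in range(1, iterations + 1):
        step()
        if i % every == 0:
            samples.append((i, open_fd_count(), peak_rss_kb()))
    return samples


if __name__ == "__main__":
    leaked = []

    def leaky_step() -> None:
        # Deliberately keeps a handle open to imitate a leaking client.
        leaked.append(open(os.devnull, "rb"))

    for iteration, fds, rss in sample_growth(leaky_step, 50):
        print(iteration, fds, rss)

    for handle in leaked:
        handle.close()
```

A descriptor count that climbs by a fixed amount per sample points at unclosed connections or files; a flat count with rising memory points at retained data instead.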

    The asker selected this as the best answer.