dourui7186
2011-01-24 23:09

Linux memory management and large files

I'm acquiring image objects from a remote server, then attempting to upload them to Rackspace's Cloud Files using their API. Wondering a) how I can make this process more efficient, and b) assuming I'll need to purchase more memory, what a reasonable amount of RAM might be to accomplish this task (current development server is just 512MB).

In executing the script I'm:

  • Querying my local database for a set of ids (around 1 thousand)
  • For each id, querying a remote server, which returns 10-20 image objects of 25-30 KB each
  • Creating a Cloud Files container, named after the id from my db
  • For each image object returned from the remote server, creating an image object in my container and writing the image data to that object
  • Updating the row in the local db with the datetime the images were added
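
For reference, the steps above can be sketched roughly as follows. This is only an illustration, assuming PHP 7 or later: `processIds` and the three callables are hypothetical stand-ins for the remote query, the Cloud Files write, and the local UPDATE, not real API calls. If the fetch callable returns a generator, only one image body needs to be in memory at a time.

```php
<?php
/**
 * For each id: fetch its images, upload each one, then record completion.
 * $fetchImages, $upload and $markDone are placeholders (assumptions) for the
 * remote query, the Cloud Files write and the local database update.
 * Returns the number of images uploaded.
 */
function processIds(array $ids, callable $fetchImages, callable $upload, callable $markDone): int
{
    $uploaded = 0;
    foreach ($ids as $id) {
        // $fetchImages may return an array or a generator of name => data.
        foreach ($fetchImages($id) as $name => $data) {
            $upload($id, $name, $data);
            $uploaded++;
        }
        // Mark the id only after all of its images went through.
        $markDone($id);
    }
    return $uploaded;
}

// Usage with fake data: two ids, two tiny "images" each.
$done = [];
$count = processIds(
    [1, 2],
    function (int $id): Generator {
        yield 'a.jpg' => 'x';
        yield 'b.jpg' => 'y';
    },
    function (int $id, string $name, string $data): void {
        // The real upload would go here.
    },
    function (int $id) use (&$done): void {
        $done[] = $id;
    }
);
```

Marking an id as done only after its last image is uploaded means an interrupted run can be restarted by selecting the ids that still have no datetime.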

This executes relatively quickly on a small set of ids. However, 100 ids (so 700-1k images) can take 5-10 minutes, and anything more than that seems to run indefinitely. I have tried the following, with little success:

  • using PHP's set_time_limit to kill the script after a couple of minutes, figuring that would purge the memory allocated to execution and let me pick up where I left off, working through the set in smaller pieces. However, the timeout error is never thrown
  • unsetting the array key containing the image object after it's uploaded (not just the reference inside the loop).
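
One possible reason the timeout never fires: according to the PHP manual, on Linux the limit set by `set_time_limit()` counts only the time the script itself spends executing, so time spent waiting on network streams or database queries is not included. A minimal sketch of an alternative, assuming PHP 7 or later, is to track wall-clock time yourself and stop cleanly between ids. The name `processWithinBudget` and the `$handle` callback are made up for illustration.

```php
<?php
/**
 * Runs $handle on each id until $budgetSeconds of wall-clock time have
 * passed. Returns the ids that were NOT processed, so a later run can
 * resume with them.
 */
function processWithinBudget(array $ids, callable $handle, float $budgetSeconds): array
{
    $deadline = microtime(true) + $budgetSeconds;
    // The deadline is checked between ids, never in the middle of one.
    while ($ids !== [] && microtime(true) < $deadline) {
        $id = array_shift($ids);
        $handle($id);
    }
    return $ids;
}

// Usage: each fake "upload" waits 20 ms and the budget is 50 ms,
// so only the first few of the ten ids are handled in this run.
$processed = [];
$remaining = processWithinBudget(
    range(1, 10),
    function (int $id) use (&$processed): void {
        usleep(20000); // stands in for the remote fetch and upload
        $processed[] = $id;
    },
    0.05
);
```

Because each run ends at an id boundary, the script can be invoked repeatedly (for example from cron) and every invocation starts with fresh memory.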

PHP's memory_limit is set to 128MB, and running the 'top' command I saw that user 'www-data' was consuming 16% of memory. That user no longer appears in the list, but I continue to see this:

PID   USER   PR  NI  VIRT  RES   SHR   S  %CPU  %MEM  TIME+     COMMAND
2400  mysql  20  0   161m  8220  2808  S  0     1.6   11:12.69  mysqld

...but the TIME+ never changes. I see that there is still 1 task running, yet these values never change:

Mem:    508272k total,   250616k used,   257656k free,     4340k buffers

Apologies for the lengthy post - not entirely sure what (if any) of that is useful. This is not my area of expertise, so I'm grasping at straws a little. Thanks in advance for your help.


1 answer

  • douweidao3882 2011-01-25 03:39
    Accepted answer

    MySQL's a daemon - it'll keep running and sit in memory until it dies or you kill it. The TIME+ column is how much CPU time it has used since it was last restarted. If it's idle (%CPU = 0), then TIME+ will not increment, since no CPU time is being consumed.
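
To see the difference between CPU time and elapsed time for yourself, here is a small sketch using PHP's `getrusage()`. The helper name `cpuSeconds` is just for illustration. The process sleeps for 0.3 s, so wall-clock time advances while the CPU time it has consumed barely moves, which is the same reason TIME+ stays put for an idle mysqld.

```php
<?php
// Returns the CPU seconds (user + system) this process has consumed so far.
function cpuSeconds(): float
{
    $u = getrusage();
    return $u['ru_utime.tv_sec'] + $u['ru_utime.tv_usec'] / 1e6
         + $u['ru_stime.tv_sec'] + $u['ru_stime.tv_usec'] / 1e6;
}

$cpuBefore  = cpuSeconds();
$wallBefore = microtime(true);

usleep(300000); // idle for 0.3 s: the process is waiting, not computing

$cpuDelta  = cpuSeconds() - $cpuBefore;
$wallDelta = microtime(true) - $wallBefore;

printf("wall: %.3fs, cpu: %.3fs\n", $wallDelta, $cpuDelta);
```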

    Have you checked whether the Cloud Files API is leaking handles of some sort? You may be unsetting the image object you've retrieved from your service (service->you), but the Cloud Files API still has to send that image back out the door (you->Rackspace), and that could be leaking somewhere.
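
One rough way to check for that kind of leak is to measure PHP's memory use before and after a batch of calls. The sketch below assumes PHP 7 or later; `memoryGrowth` is a hypothetical helper, and the two closures only imitate a well-behaved step and a leaking one rather than calling the real Cloud Files API. Note that `memory_get_usage()` reports only memory handed out by PHP's own allocator, so a leak inside a native library would show up in the process's RES column in top instead.

```php
<?php
/**
 * Calls $step $iterations times and returns how many more bytes of
 * PHP-managed memory are in use afterwards than before.
 */
function memoryGrowth(callable $step, int $iterations): int
{
    gc_collect_cycles();
    $before = memory_get_usage();
    for ($i = 0; $i < $iterations; $i++) {
        $step($i);
    }
    gc_collect_cycles();
    return memory_get_usage() - $before;
}

// A step that releases its buffer after use.
$clean = function (int $i): void {
    $data = str_repeat('x', 30000); // roughly one 30 KB image
    unset($data);
};

// A step that deliberately keeps every buffer alive, imitating a leak.
$kept = [];
$leaky = function (int $i) use (&$kept): void {
    $kept[] = str_repeat('x', 30000);
};

$cleanGrowth = memoryGrowth($clean, 200);
$leakyGrowth = memoryGrowth($leaky, 200);
printf("clean: %d bytes, leaky: %d bytes\n", $cleanGrowth, $leakyGrowth);
```

Wrapping the real upload call in the same way and watching whether the growth scales with the number of images should tell you whether the memory is being held on the PHP side.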
