I have a cron job that was taking up too much memory (and I later discovered that it was also hitting the PHP timeout on my server). I decided to try to fix it by refactoring the code to load a smaller amount of data to operate on. Assuming that Record is the class responsible for interfacing with the database table 'Record', the original code looked similar to this:
$allRecords = Record::getAll(); // $allRecords contains Record instances for every record in the db, > 100k array elements
do{
    $records = array_splice($allRecords, 0, 500); // take 500 records off the front
    foreach($records as $record){
        $record->doStuff(); // modify some data and save it back to the database
    }
}while(!empty($allRecords));
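For reference, here is a standalone sketch of that batching pattern using plain ints instead of Record instances (the Record class from my codebase isn't shown here), just to illustrate how array_splice drains the array 500 elements at a time:

```php
<?php
// Standalone sketch of the batching loop above, with ints standing in
// for Record instances.
$allRecords = range(1, 1250); // pretend these are 1250 records
$batches = 0;
do {
    // array_splice with offset 0 and length 500 removes and returns the
    // first 500 elements, shrinking $allRecords in place
    $records = array_splice($allRecords, 0, 500);
    $batches++;
} while (!empty($allRecords));
// 1250 elements are consumed in batches of 500, 500, and 250
```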
Now the code looks like this:
$ids = Record::getAllIDs(); // $ids is an array of ints: the id numbers for every record in the database
do{
    $records = [];
    foreach($ids as $key => $id){
        $records[] = new Record($id); // add a single Record instance to the working pool
        unset($ids[$key]); // remove the id so we don't process it twice
        if(count($records) === 500){ // only want 500 at a time to save memory
            break;
        }
    }
    foreach($records as $record){
        $record->doStuff(); // same as before
    }
}while(!empty($ids));
Obviously this takes up less system memory, which is what I set out to do in the first place. What I'm confused about is why it also takes significantly less time. This cron regularly took over an hour to finish, and now it takes between 15 and 20 minutes to operate on the same number of records. Does anybody have any ideas as to why that might be the case? In the end I'm still loading the same number of records from the db, and I figured that breaking it up like this would make it slower, not faster.