I wrote a PHP script to pull tweets from the Twitter firehose and store them in a database. Ideally I want to just let it run so that it collects tweets over time, so it's wrapped in a while(1) loop.
This is a problem because the script times out. If I run it in a browser, it stops after about 30 seconds and gives me an Error 324.
Question: Is there a way to have it run for a fixed amount of time (say 20 seconds), kill itself, and then restart, all from a cron job? (PS: I don't know how to write a cron job.)
Background: The site is hosted on GoDaddy, and I would ideally like to run this on my hosting server there.
The Script:
<?php
$start = time();
$expAddress = "HOSTNAME";
$expUser = "USERNAME";
$expPwd = "PASSWORD";
$database = "DBNAME";
$opts = array(
'http' => array(
'method' => "POST",
'content' => 'keywords,go,here',
)
);
// Open connection to stream
$db = mysql_connect($expAddress, $expUser, $expPwd);
mysql_select_db($database, $db);
$context = stream_context_create($opts);
while (1) {
$instream = fopen('https://USERNAME:PASSWORD@stream.twitter.com/1/statuses/filter.json','r' ,false, $context);
while(! feof($instream)) {
if(time() - $start > 5) { // stop after 5 seconds
break 2; // exit both loops; a plain break only leaves the inner loop and reopens the stream
}
if(! ($line = stream_get_line($instream, 100000, "\n"))) {
continue;
}
else {
$tweet = json_decode($line);
// Clean before storing
// LOTS OF VARIABLES FOR BELOW...REMOVED FOR READABILITY
// Send to database
$ok = mysql_query("INSERT INTO tweets
(created_at, from_user, from_user_id, latitude, longitude, tweet_id, language_code,
place_name, profile_img_url, source, text, retweet_count, followers_count,
friends_count, listed_count, favorites_count)
VALUES
(NOW(), '$from_user', '$from_user_id', '$latitude', '$longitude', '$tweet_id', '$language_code',
'$place_name', '$profile_img_url', '$source', '$text', '$retweet_count', '$followers_count',
'$friends_count', '$listed_count', '$favorites_count')");
if (!$ok) { echo "Mysql Error: ".mysql_error(); }
flush();
}
}
}
?>
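
To make the question concrete, here is a minimal sketch of what I mean by "run for a fixed amount of time, then exit". It is not working collector code: the helper name runFor, the $budgetSeconds parameter, the usleep stand-in for reading a tweet, and the crontab line in the comment (path and schedule included) are all placeholders I made up for illustration.

```php
<?php
// Hypothetical helper: call $work over and over until $budgetSeconds
// have elapsed, then return how many times it ran.
function runFor($budgetSeconds, callable $work)
{
    $deadline = microtime(true) + $budgetSeconds;
    $iterations = 0;
    while (microtime(true) < $deadline) {
        $work();
        $iterations++;
    }
    return $iterations;
}

// Assumed crontab entry (placeholder path), starting one run per minute:
// * * * * * php /path/to/collect.php

// Stand-in for "read one line from the stream and store it".
$count = runFor(1, function () {
    usleep(100000); // pretend each read takes about 0.1 s
});

echo "Processed $count items before exiting\n";
```

The idea would be that the script ends on its own once the time budget is used up, and cron starts a fresh copy on the next tick, so nothing has to be killed from outside.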