dseomy1964 2016-01-24 12:49
Views: 34

Cron job and PHP script problem

I am trying to debug a confusing situation.

I have a scraper script that uses file locking to prevent multiple instances and also blocks non-CLI execution.

I have set it up in cron to run every minute. The scraper performs some database operations and writes status information to two text files, which I monitor through a jQuery page so I can see things live.

That way I can see whether the script is idle or running, the last string it worked on, and other details. I also query the database for the number of pending sites the scraper still has to handle.

The problem now is that the number of pending sites keeps going down, but the check says the script is idle, and nothing is written to the two text files.

When I run ps aux | grep php, I see these two lines:

root      3159  0.0  0.0   4392   664 ?        Ss   17:57   0:00 /bin/sh -c php /var/www/html/workers/scraper.php
root      3160  0.0  1.5 298100 15824 ?        S    17:57   0:00 php /var/www/html/workers/scraper.php

And when I run it myself with php scraper.php, it runs (it shouldn't, because acquiring the file lock should fail at this point), and I get this:

root      3159  0.0  0.0   4392   664 ?        Ss   17:57   0:00 /bin/sh -c php /var/www/html/workers/scraper.php
root      3160  0.0  1.5 298100 15824 ?        S    17:57   0:00 php /var/www/html/workers/scraper.php
root      3295  0.0  1.4 297852 15208 pts/2    S+   18:11   0:00 php scraper.php

Any idea what's going on? I have been struggling to find what's wrong. In short, cron does run the script and the scraping gets done, but that run seems to ignore the file lock and does not write statuses to the text files, which looks very strange.
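
One thing that may be worth checking, as a guess and not a confirmed diagnosis: cron typically starts jobs from the invoking user's home directory, not from the script's directory, so relative file names can resolve to different files under cron than in a manual run from the workers directory. A minimal diagnostic sketch follows; describePaths() is a hypothetical helper introduced only for illustration. Running it both from cron and by hand and comparing the output would show whether the two runs see the same lock.txt.

```php
<?php
// Diagnostic sketch: report where a relative file name resolves for this run.
// describePaths() is a hypothetical helper, not part of the original script.
function describePaths(string $name): array
{
    // getcwd() returns false on failure; fall back to a placeholder.
    $cwd = getcwd() ?: '(unknown)';

    return [
        'cwd'            => $cwd,     // directory that relative paths resolve against
        'script_dir'     => __DIR__,  // directory containing this file
        'relative_to'    => $cwd . DIRECTORY_SEPARATOR . $name,
        'next_to_script' => __DIR__ . DIRECTORY_SEPARATOR . $name,
    ];
}

foreach (describePaths('lock.txt') as $label => $value) {
    echo $label, ': ', $value, PHP_EOL;
}
```

If 'relative_to' and 'next_to_script' differ between the cron run and the manual run, the two processes are locking and writing different files.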

Here is the code:

<?php
error_reporting(E_ALL);
if (PHP_SAPI !== 'cli' || isset($_SERVER['HTTP_USER_AGENT'])) {
    exit('cli only');
}
// Get the site lists.
require_once("writedata.php");
require_once("dbconnect.php");

// Set a file lock to track script status.
$file = fopen("lock.txt", "w+");
// Exclusive, non-blocking lock.
if (flock($file, LOCK_EX | LOCK_NB)) {
    // Lock acquired. Continue work.
    echo "Started Running";
    // ALL THE MAIN WORK HAPPENS HERE.
    $sql = "SELECT site FROM `sites` WHERE status=0 LIMIT 2500";
    $result = $db->query($sql);

    while ($row = $result->fetch_assoc()) {
        $site = $row['site'];
        file_put_contents("errors.txt", $site);

        // Scraper functions and parsing functions removed.
    }
} else {
    echo "Script already running";
    exit;
}
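
The script above opens lock.txt and errors.txt by relative name, so which files it touches depends on the working directory of the process. A hedged sketch of one way to make that independent of how the script is launched is to anchor every path to __DIR__. The helper acquireLock() and the "example status" text are hypothetical and only illustrate the idea; this is not a verified fix for the behaviour described above.

```php
<?php
// Sketch: take the lock on a path anchored to the script's directory,
// so cron runs and manual runs contend for the same file.
// acquireLock() is a hypothetical helper, not part of the original script.
function acquireLock(string $path)
{
    // 'c' opens for writing and creates the file without truncating it.
    $handle = fopen($path, 'c');
    if ($handle === false) {
        return null; // the lock file could not be opened at all
    }
    if (!flock($handle, LOCK_EX | LOCK_NB)) {
        fclose($handle);
        return null; // another process already holds the lock
    }
    return $handle; // keep this handle open for as long as the lock is needed
}

$lock = acquireLock(__DIR__ . '/lock.txt');
if ($lock === null) {
    echo "Script already running", PHP_EOL;
} else {
    echo "Started Running", PHP_EOL;
    // Status files get the same treatment as the lock file.
    file_put_contents(__DIR__ . '/errors.txt', "example status" . PHP_EOL);
    flock($lock, LOCK_UN);
    fclose($lock);
}
```

An alternative with a similar effect would be to change directory in the crontab entry before invoking php, which leaves the script itself unchanged.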