dongyi0114 2012-07-16 16:42
36 views
Accepted

PHP file crawler

I'm wondering how to recursively search through a website's directory tree (the one the script is uploaded to), open and read every file, and search each one for a specific string.

For example, I might request this:

search.php?string=hello%20world

This would run the search and then output something like:

"hello world found inside"

httpdocs
/index.php
/contact.php

httpdocs/private/
../prviate.php
../morestuff.php
../tastey.php

httpdocs/private/love
../../goodness.php

I don't want it to crawl by following links, because private and unlinked files are around too, but I'd like every other non-binary file to be searched.

Many thanks,

Owen

3 answers

  • dongzhan5286 2012-07-16 16:59

    Two immediate solutions come to mind.

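    1) Using grep through PHP's exec() function (only if the server allows it):

    $query = $_GET['string'];
    $found = array();
    // escapeshellarg() adds its own quotes, so its result must not be wrapped in extra quotes.
    // The "--" stops grep from treating a query that starts with "-" as an option.
    exec('grep -Ril -- ' . escapeshellarg($query) . ' ' . escapeshellarg($_SERVER['DOCUMENT_ROOT']), $found);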
    

    Once finished, the path of every file that contains the query will be in $found. You can iterate through this array and process or display it as needed. Note that grep treats the query as a regular expression unless you add the -F option.
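    The question asks for output grouped by folder. As a rough sketch of one way to post-process $found, the helpers below group matching paths by directory relative to the document root. The function names groupByDirectory() and renderGroups() are made up for this illustration and are not part of the original answer.

```php
<?php
// Hypothetical helper: group matching paths by their directory,
// relative to $root, so they can be listed folder by folder.
function groupByDirectory(array $paths, string $root): array
{
    $root = rtrim($root, '/');
    $groups = array();
    foreach ($paths as $path) {
        // Strip the root prefix so the output shows site-relative paths.
        if (strpos($path, $root . '/') === 0) {
            $path = substr($path, strlen($root) + 1);
        }
        // dirname() returns "." for files directly under $root.
        $groups[dirname($path)][] = basename($path);
    }
    ksort($groups); // list directories in a stable order
    return $groups;
}

// Hypothetical helper: turn the grouped result into plain text.
function renderGroups(array $groups): string
{
    $out = '';
    foreach ($groups as $dir => $files) {
        $out .= $dir . "\n";
        foreach ($files as $file) {
            $out .= '  /' . $file . "\n";
        }
    }
    return $out;
}
```

    When printing the result on a web page, pass it through htmlspecialchars() first, for example inside a pre element, since file names are not guaranteed to be safe HTML.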

    2) Recursively loop through the folder and open each file, search for the string, and save it if found:

    function search($path, $query, &$found) {
        if (is_file($path)) {
            $contents = file_get_contents($path);
            if ($contents !== false && strpos($contents, $query) !== false) {
                // file contains the query string
                $found[] = $path;
            }
        } elseif (is_dir($path)) {
            $dh = opendir($path);
            if ($dh === false) {
                return; // directory could not be opened
            }
            // compare with false explicitly: an entry named "0" is falsy
            while (($entry = readdir($dh)) !== false) {
                if ($entry != '.' && $entry != '..') {
                    // call search() on the found file/directory
                    search($path . '/' . $entry, $query, $found);
                }
            }
            closedir($dh);
        }
    }
    
    $query = $_GET['string'];
    $found = array();
    search($_SERVER['DOCUMENT_ROOT'], $query, $found);
    

    This should (untested) recursively search into each subfolder/file for the requested string. If it's found, it will be in the variable $found.
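    The question also asks to leave binary files out, which neither snippet above does. A minimal sketch of the same recursive search using PHP's SPL directory iterators follows. It assumes a simple heuristic that is not from the original answer: a file is treated as binary if its first 8 KB contain a NUL byte. The names looksBinary() and searchTree() are made up for this illustration.

```php
<?php
// Assumed heuristic: a NUL byte in the first 8 KB marks a file as binary.
// Files that cannot be read are also reported as binary so they get skipped.
function looksBinary(string $path): bool
{
    $head = file_get_contents($path, false, null, 0, 8192);
    return $head === false || strpos($head, "\0") !== false;
}

// Walk $root recursively and return the sorted paths of all
// non-binary files whose contents include $query.
function searchTree(string $root, string $query): array
{
    $found = array();
    if ($query === '') {
        return $found; // an empty query would match every file
    }
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS),
        RecursiveIteratorIterator::LEAVES_ONLY,
        RecursiveIteratorIterator::CATCH_GET_CHILD // skip unreadable subdirectories
    );
    foreach ($iterator as $file) {
        if (!$file->isFile() || !$file->isReadable()) {
            continue;
        }
        $path = $file->getPathname();
        if (looksBinary($path)) {
            continue;
        }
        $contents = file_get_contents($path);
        if ($contents !== false && strpos($contents, $query) !== false) {
            $found[] = $path;
        }
    }
    sort($found);
    return $found;
}
```

    It could be called the same way as search() above, for example $found = searchTree($_SERVER['DOCUMENT_ROOT'], $query);. Like the original, it reads each file fully into memory, so very large files may need a size limit.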

    The asker selected this as the best answer.