douchuitang0331 2012-10-29 14:27

Porting a grep function from Python to PHP

I have a Python script that I need to port to PHP. It recursively searches a given directory and builds a string based on regex searches. The first function I am trying to port is below. It takes a regex and a base directory, recursively searches all files in that directory for the regex, and builds a list of the matching strings.

import os
import re

def grep(regex, base_dir):
    matches = []
    for path, dirs, files in os.walk(base_dir):
        for filename in files:
            fullpath = os.path.join(path, filename)
            with open(fullpath, 'r') as f:
                content = f.read()
                # findall returns every match in the file, not just the first
                matches.extend(re.findall(regex, content))
    return matches
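
To pin down what a PHP port has to reproduce, here is a minimal, self-contained sketch that runs the function on a throwaway directory tree. The file names and the pattern are made up for illustration; the point is that every match from every file ends up in one flat list.

```python
import os
import re
import tempfile


def grep(regex, base_dir):
    """Collect every regex match from every file under base_dir."""
    matches = []
    for path, dirs, files in os.walk(base_dir):
        for filename in files:
            with open(os.path.join(path, filename), 'r') as f:
                matches.extend(re.findall(regex, f.read()))
    return matches


# Build a throwaway tree: one file at the top level, one in a subdirectory.
# The names a.txt, sub and b.txt are hypothetical.
with tempfile.TemporaryDirectory() as base:
    os.mkdir(os.path.join(base, 'sub'))
    with open(os.path.join(base, 'a.txt'), 'w') as f:
        f.write('open here, open there')
    with open(os.path.join(base, 'sub', 'b.txt'), 'w') as f:
        f.write('reopen')
    result = grep(r'\w*open\w*', base)

# Sorted because os.walk does not guarantee the order files are visited in.
print(sorted(result))  # ['open', 'open', 'reopen']
```

Note that a.txt contributes two matches, so a port that stops at the first match per file would return a shorter list.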

I never use PHP except for basic GET parameter manipulation. I grabbed some directory-walking code from the web, and I am struggling to make it work like the Python function above because I know so little of the PHP API.

function findFiles($dir = '.', $pattern = '/./'){
  $prefix = $dir . '/';
  $dir = dir($dir);
  while (false !== ($file = $dir->read())){
    if ($file === '.' || $file === '..') continue;
    $file = $prefix . $file;
    if (is_dir($file)) findFiles($file, $pattern);
    if (preg_match($pattern, $file)){
      echo $file . "\n";
    }
  }
}
1 answer

  • doujiazong0322 2012-10-30 22:11

    Here is my solution:

    <?php 
    
    class FileGrep {
        private $dirs;      // Scanned directories list
        private $files;     // Found files list
        private $matches;   // Matches list
    
        function __construct() {
            $this->dirs = array();
            $this->files = array();
            $this->matches = array();
        }
    
        function findFiles($path, $recursive = TRUE) {
            $this->dirs[] = realpath($path);
            foreach (scandir($path) as $file) {
                if (($file != '.') && ($file != '..')) {
                    $entry = "{$path}/{$file}";
                    $fullname = realpath($entry);
                    // Test the unresolved path: realpath() has already resolved any link in $fullname
                    if (is_dir($fullname) && !is_link($entry) && $recursive) {
                        if (!in_array($fullname, $this->dirs)) {
                            $this->findFiles($fullname, $recursive);
                        }
                    } else if (is_file($fullname)){
                        $this->files[] = $fullname;
                    }
                }
            }
            return($this->files);
        }
    
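        function searchFiles($pattern) {
            $this->matches = array();
            foreach ($this->files as $file) {
                // Compare with false explicitly so that a file containing only "0" is still searched
                $contents = file_get_contents($file);
                if ($contents !== false) {
                    // preg_match_all() collects every match, like Python's re.findall()
                    if (preg_match_all($pattern, $contents, $matches) > 0) {
                        //echo $file."\n";
                        $this->matches = array_merge($this->matches, $matches[0]);
                    }
                }
            }
            return($this->matches);
        }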
    }
    
    
    // Usage example:
    
    $fg = new FileGrep();
    $files = $fg->findFiles('.');               // List all the files in current directory and its subdirectories
    $matches = $fg->searchFiles('/open/');      // Search for the "open" string in all those files
    
    ?>
    <html>
        <body>
            <pre><?php print_r($matches) ?></pre>
        </body>
    </html>
    

    Be aware that:

    • It reads each file into memory in full before searching it, so large files may require a lot of memory (check the "memory_limit" setting in your php.ini file).
    • It doesn't handle Unicode specially. For UTF-8 files, add the "u" modifier to the pattern (for example '/open/u') so that "preg_match" treats the pattern and the contents as UTF-8; for other encodings, convert the contents first or look at the "mb_ereg" family of functions.
    • It doesn't follow symbolic links.

    In conclusion, although it's not the most efficient solution, it should work.
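
The memory caveat applies equally to the original Python function, which also reads each file in full. As a rough sketch of the alternative, the variant below scans one line at a time. It assumes that no match ever spans a line break, and the name grep_lines and the errors='replace' setting are choices made for this illustration, not part of the original script. The same idea could be carried over to PHP by reading with fgets() in a loop.

```python
import os
import re
import tempfile


def grep_lines(regex, base_dir):
    """Like grep(), but scans one line at a time to bound memory use."""
    pattern = re.compile(regex)
    matches = []
    for path, dirs, files in os.walk(base_dir):
        for filename in files:
            fullpath = os.path.join(path, filename)
            # errors='replace' is an assumption: keep going on undecodable bytes
            with open(fullpath, 'r', errors='replace') as f:
                for line in f:  # the file object yields one line at a time
                    matches.extend(pattern.findall(line))
    return matches


# Hypothetical sample file used only to exercise the function.
with tempfile.TemporaryDirectory() as base:
    with open(os.path.join(base, 'log.txt'), 'w') as f:
        f.write('open one\nclosed\nopen two\n')
    found = grep_lines(r'open \w+', base)

print(found)  # ['open one', 'open two']
```

The trade-off is that a pattern meant to match across several lines will silently find nothing, so this only suits line-oriented searches.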

    This answer was accepted by the asker as the best answer.