douxin2002 2011-04-28 22:18

PHP extraction and parsing: basic question

I have some files (about 500) with no extension.
I managed to view their contents, though; they contain some weird tags and stuff.

I need to extract all IP addresses from them. For example, line 2 always contains an IP address like this: (71.129.195.163)

Also, a lot of lines contain HTML tags like < a href = "http://www.xyz.com" >. I need to get the domain name from each of them, like xyz.com.

Could someone assist this PHP newbie? I know how to read the entire file into a string and all that, but since PHP is powerful, I am looking for a sweet and simple way to achieve this.

Thanks a lot


1 answer

  • dongyi3776 2011-04-28 22:34

    Regular expressions are great for this.

    To find all IPs in a file:

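    // Each octet must be 0-255; the lookarounds reject matches that sit inside a longer run of digits.
    $ipPattern = '/(?<!\d)(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]\d|\d)(?:[.](?:25[0-5]|2[0-4]\d|1\d\d|[1-9]\d|\d)){3}(?!\d)/';
    
    $ips = array();
    preg_match_all($ipPattern, $fileContents, $ips);
    $ips = $ips[0]; // index 0 holds the full matches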
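
    As an alternative sketch (not part of the original answer), the regular expression can stay loose and PHP's built-in filter_var() can decide which candidates are valid IPv4 addresses. The function name extractIps and the sample text are assumptions made for illustration; the layout of the real files is unknown.

```php
<?php
// Return the unique, valid IPv4 addresses found in $text.
function extractIps($text)
{
    // Loose candidate pattern: four dot-separated groups of 1-3 digits,
    // not preceded by a digit or dot and not followed by a digit.
    preg_match_all('/(?<![\d.])\d{1,3}(?:\.\d{1,3}){3}(?!\d)/', $text, $matches);

    $ips = array();
    foreach ($matches[0] as $candidate) {
        // filter_var() returns false for out-of-range octets such as 999.
        if (filter_var($candidate, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false) {
            $ips[] = $candidate;
        }
    }

    // Drop duplicates and reindex from 0.
    return array_values(array_unique($ips));
}

// Hypothetical sample content.
$sample = "header\n(71.129.195.163) ok\nbad 999.1.1.1 and 10.0.0.1, again 10.0.0.1";
print_r(extractIps($sample));
```

    This trades a long pattern for a second validation pass, which may be easier to read and maintain.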
    

    To find all links:

    // Optional whitespace around "=", case-insensitive; group 1 captures the URL between the quotes.
    $linkPattern = '/href\s*=\s*[\'"](.+?)[\'"]/i';
    
    $links = array();
    preg_match_all($linkPattern, $fileContents, $links);
    
    $links = $links[1];
    

    The file content is assumed to be in $fileContents. Run this code for every file. If you need to collect all IPs and links, then merge the per-file results into two big arrays:

    // before processing the first file:
    $allIps = array();
    $allLinks = array();
    
    // after each run of the above code do:
    $allIps = array_merge($allIps, $ips);
    $allLinks = array_merge($allLinks, $links);
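
    The question asks for the domain name (xyz.com) rather than the full URL. A minimal sketch of that last step, assuming the href values are absolute URLs with a scheme, could use parse_url(). The function names extractDomains and collectDomains and the directory argument are hypothetical and not from the original answer.

```php
<?php
// Return the unique host names of all href values in $text, without a leading "www.".
function extractDomains($text)
{
    // The backreference \1 makes the closing quote match the opening one.
    preg_match_all('/href\s*=\s*([\'"])(.+?)\1/i', $text, $matches);

    $domains = array();
    foreach ($matches[2] as $url) {
        // parse_url() yields no host for relative or scheme-less URLs; skip those.
        $host = parse_url($url, PHP_URL_HOST);
        if (is_string($host) && $host !== '') {
            $domains[] = preg_replace('/^www\./i', '', strtolower($host));
        }
    }

    return array_values(array_unique($domains));
}

// Collect domains from every regular file directly inside $dir (assumed layout).
function collectDomains($dir)
{
    $paths = glob($dir . '/*');
    if ($paths === false) {
        $paths = array();
    }

    $all = array();
    foreach ($paths as $path) {
        if (!is_file($path)) {
            continue;
        }
        $contents = file_get_contents($path);
        if ($contents !== false) {
            $all = array_merge($all, extractDomains($contents));
        }
    }

    return array_values(array_unique($all));
}
```

    Calling collectDomains() with the folder that holds the 500 files would then return one de-duplicated list. Stripping only "www." is a simplification; other subdomains are kept as they are.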
    
    The asker selected this answer as the best answer.