douchuang1861 2017-01-24 05:33

Parsing an m3u file with PHP and cURL

I hope you can help me out. I have the following .m3u file:

#EXTM3U
#EXTINF:-1 tvg-id="" tvg-name="A&E" tvg-logo="" group-title="ENTRETENIMIENTO",A&E
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
#EXTINF:-1 tvg-id="" tvg-name="ABC Puerto Rico" tvg-logo="" group-title="NACIONALES",ABC Puerto Rico
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts
#EXTINF:-1 tvg-id="" tvg-name="Animal Planet" tvg-logo="" group-title="ENTRETENIMIENTO",Animal Planet
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/185.ts

As you can see, the file starts with the main tag #EXTM3U. Below it, each video has an information tag (#EXTINF:-1 ...) followed by a line with the video link (http:// .....).

Can you tell me explicitly how I can parse this whole file (it's a pretty large one) and save the fields in an array, for example videos[ ], so that later I can access every video's attributes? For example, videos[0]['title'] would give the title of the first video, and videos[42]['link'] would give the link to video #42.

I am already using curl to get the file content into a variable like this:

<?php
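   $handler = curl_init("link to m3u file");
   // Without CURLOPT_RETURNTRANSFER, curl_exec() prints the body and returns true.
   curl_setopt($handler, CURLOPT_RETURNTRANSFER, true);
   $response = curl_exec($handler);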
   curl_close($handler); 
   echo $response;
?>

What I need now is to parse the curl response and save all the video information into an array where I can access every attribute of every video.

I know I must use a regexp or something like that; I just don't understand how. Can you please help me with some code? Thank you so much.

  • dongsao8279 2017-01-24 06:24

    Behold the magic of regex

    $string = <<<CUT
    #EXTM3U
    #EXTINF:-1 tvg-id="" tvg-name="A&E" tvg-logo="" group-title="ENTRETENIMIENTO",A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
    http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
    #EXTINF:-1 tvg-id="" tvg-name="ABC Puerto Rico" tvg-logo="" group-title="NACIONALES",ABC Puerto Rico
    http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts
    CUT;
    
    preg_match_all('/(?P<tag>#EXTINF:-1)|(?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)")|(?<something>,[^\n]+)|(?<url>http[^\s]+)/', $string, $match );
    
    $count = count( $match[0] );
    
    $result = [];
    $index = -1;
    
    for( $i =0; $i < $count; $i++ ){
        $item = $match[0][$i];
    
        if( !empty($match['tag'][$i])){
            //is a tag - start a new entry by incrementing the result index
            ++$index;
        }elseif( !empty($match['prop_key'][$i])){
            //is a prop - store the value under its key
            $result[$index][$match['prop_key'][$i]] = $match['prop_val'][$i];
        }elseif( !empty($match['something'][$i])){
            //is the text after the comma (leading comma included)
            $result[$index]['something'] = $item;
        }elseif( !empty($match['url'][$i])){
            //is the stream url
            $result[$index]['url'] = $item ;
        }
    }
    
    var_export( $result );
    

    Returns

    array (
      0 => 
      array (
        'tvg-name' => 'A&E',
        'group-title' => 'ENTRETENIMIENTO',
        'something' => ',A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts',
        'url' => 'http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts',
      ),
      1 => 
      array (
        'tvg-name' => 'ABC Puerto Rico',
        'group-title' => 'NACIONALES',
        'something' => ',ABC Puerto Rico',
        'url' => 'http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts',
      ),
    )
    

    Seriously though, I have no clue what some of these fields are, the one I called something for example. Anyway, this should get you started.
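
    For what it's worth, in the extended M3U format the text after the comma on an #EXTINF line is the display title, so something is the title the question asks for. Below is a minimal line-by-line sketch that builds the videos array from the question. It assumes the playlist is already in a string and that titles contain no commas; parseM3u and the title/link keys are made-up names for illustration, not part of the code above.

```php
<?php
// Hypothetical helper: turns extended-M3U text into a list of entries.
function parseM3u(string $content): array
{
    $videos = [];
    $current = null; // entry being built from the last #EXTINF line

    foreach (preg_split('/\R/', $content) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue;
        }
        if (strpos($line, '#EXTINF:') === 0) {
            $current = ['title' => '', 'link' => ''];
            // key="value" attributes; [^"]* also keeps empty values such as tvg-id=""
            if (preg_match_all('/([-a-z]+)="([^"]*)"/', $line, $attrs, PREG_SET_ORDER)) {
                foreach ($attrs as $attr) {
                    $current[$attr[1]] = $attr[2];
                }
            }
            // Assumption: the title is whatever follows the last comma on the line.
            $comma = strrpos($line, ',');
            if ($comma !== false) {
                $current['title'] = trim(substr($line, $comma + 1));
            }
        } elseif ($line[0] !== '#' && $current !== null) {
            // The first non-comment line after #EXTINF is taken as the stream URL.
            $current['link'] = $line;
            $videos[] = $current;
            $current = null;
        }
    }
    return $videos;
}

$playlist = <<<M3U
#EXTM3U
#EXTINF:-1 tvg-id="" tvg-name="A&E" tvg-logo="" group-title="ENTRETENIMIENTO",A&E
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
#EXTINF:-1 tvg-id="" tvg-name="ABC Puerto Rico" tvg-logo="" group-title="NACIONALES",ABC Puerto Rico
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts
M3U;

$videos = parseM3u($playlist);
echo $videos[0]['title'], "\n";
echo $videos[1]['link'], "\n";
```

    Going line by line keeps memory use predictable on a large playlist and avoids the ordering concerns of one big alternation, at the cost of being stricter about the file layout.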

    For the regex, it's actually pretty simple once it's broken down. The real trick is using preg_match_all instead of preg_match.

    Here is our regex

     /(?P<tag>#EXTINF:-1)|(?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)")|(?<something>,[^\n]+)|(?<url>http[^\s]+)/
    

    First we will break it down into more manageable bits. These are separated by the pipe |, which means or. Each one can be thought of as a separate pattern: match this one or the next one. The order can matter, because at any given position the alternatives are tried left to right and the first one that matches wins. So you have to be careful not to have a regex that can match in two places (if you don't want that). However, it can be used to your advantage too, as I will show below. This is really what we are dealing with

     (?P<tag>#EXTINF:-1)
    
     (?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)")
    
     (?<something>,[^\n]+)
    
     (?<url>http[^\s]+)
    

    Four regular expressions. In all of these, (?P<name>...) is a named capture group; it just makes things more readable and the bits easier to find. If you look at the conditions I use to find the matches, for example !empty($match['tag'][$i]), we can use the tag key because of the named capture group; otherwise it would be 1. With several regexes joined together, numbering the groups 1 2 3 gets messy, especially since the result is nested, so tag would be $match[1][$i], and so on. Anyway, once the group syntax is taken out we have

    • #EXTINF:-1 matches this string literally
    • (?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)") is more complicated. (?: .. ) is a non-capture group; it groups the key and value so they are found in one match, and so land at the same index in the match array, without being captured together. Broken down, this is ([-a-z]+)=\"([^"]+)\", or: match a word followed by =, then ", then anything but a ", ending with ". Basically one side captures the key and the other the value, excluding the double quotes. Because [^"]+ needs at least one character, empty values such as tvg-id="" are skipped
    • ,[^\n]+ starts with a comma, then anything but a line return
    • and last, http[^\s]+ matches a url
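
    To see what the named groups buy you, here is a small sketch that runs just the key/value piece on a single attribute. The variable name $m is arbitrary.

```php
<?php
// A named group is stored twice: once under its name and once under its position.
preg_match('/(?P<prop_key>[-a-z]+)="(?P<prop_val>[^"]+)"/', 'tvg-name="A&E"', $m);

var_dump($m['prop_key'] === $m[1]); // bool(true)
var_dump($m['prop_val'] === $m[2]); // bool(true)
echo $m['prop_key'], ' => ', $m['prop_val'], "\n"; // tvg-name => A&E
```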

    Now remember what I said about order being important. In the first entry of my test string, the url http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts also appears right after ,A&E` on the #EXTINF line. On its own it would match the last expression, but the scan reaches the comma first, so the 3rd expression swallows everything up to the end of the line, url included, and number 4 never gets a chance at that copy.
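
    The effect can be reproduced in isolation with only the last two expressions. This is a sketch on a two-line input cut down from the test string; the variable names are arbitrary.

```php
<?php
// Two-branch version of the pattern: "something" is listed before "url".
$pattern = '/(?<something>,[^\n]+)|(?<url>http[^\s]+)/';
$text = ",A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts\n"
      . "http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts";

preg_match_all($pattern, $text, $m);

// $m[0] holds the full matches in the order they were found.
print_r($m[0]);

// Only the second url was matched by the "url" branch; the first one was
// consumed as part of the match that started at the comma.
$urls = array_values(array_filter($m['url']));
print_r($urls);
```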

    Hope that helps. Granted, you'll need a basic grasp of regex; this is not really the place for a full tutorial on that, and you can find better examples than I can provide in a few short minutes.

    Just for the sake of completeness, here is part of what preg_match_all returns

    (
        [0] => Array(
                [0] => #EXTINF:-1
                [1] => tvg-name="A&E"
                [2] => group-title="ENTRETENIMIENTO"
                [3] => ,A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
                [4] => http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
                [5] => #EXTINF:-1
                [6] => tvg-name="ABC Puerto Rico"
                [7] => group-title="NACIONALES"
                [8] => ,ABC Puerto Rico
                [9] => http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts
            )
        [tag] => Array(
                [0] => #EXTINF:-1
                [1] => 
                [2] => 
                [3] => 
                [4] => 
                [5] => #EXTINF:-1
                [6] => 
                [7] => 
                [8] => 
                [9] => 
            )
        [1] => Array(
                [0] => #EXTINF:-1
                [1] => 
                [2] => 
                [3] => 
                [4] => 
                [5] => #EXTINF:-1
                [6] => 
                [7] => 
                [8] => 
                [9] => 
            )
        [prop_key] => Array(
                [0] => 
                [1] => tvg-name
                [2] => group-title
                [3] => 
                [4] => 
                [5] => 
                [6] => tvg-name
                [7] => group-title
                [8] => 
                [9] => 
            )
        [2] => Array( ... duplicate of prop_key .. ) 
       etc. 
    )
    

    The way to find an item in the above array: look at the for loop when it runs the first time, at index 0. The main part of the match, $match[0], contains all the matches, while the tag array has a non-empty entry only where that regex matched. We can correlate the two using the $i index.

        if( !empty($match['tag'][$i])){
            //is a tag increment the result index
            ++$index;
        }
    

    The condition checks that $match['tag'][$i] is not empty. If you look at $match['tag'][0], when $i = 0, you will see that it is indeed not empty. On the second loop $match['tag'][1] is empty but $match['prop_key'][1] is not, so we know that when $i = 1 the item is a prop_key match. That's how that works.
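
    The same loop can also be written with the PREG_SET_ORDER flag of preg_match_all, where each match arrives as its own small array and no $i correlation is needed. This is a rough sketch on a shortened one-entry input; the shortened $string and the $sets name are my own, not taken from the code above.

```php
<?php
$string = "#EXTINF:-1 tvg-name=\"A&E\" group-title=\"ENTRETENIMIENTO\",A&E\n"
        . "http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts\n";

$regex = '/(?P<tag>#EXTINF:-1)|(?:(?P<prop_key>[-a-z]+)="(?P<prop_val>[^"]+)")'
       . '|(?<something>,[^\n]+)|(?<url>http[^\s]+)/';

// PREG_SET_ORDER: $sets[$n] holds all groups of the n-th match.
preg_match_all($regex, $string, $sets, PREG_SET_ORDER);

$result = [];
$index = -1;
foreach ($sets as $set) {
    // Groups after the last one that matched may be absent from $set;
    // empty() handles a missing key without raising a notice.
    if (!empty($set['tag'])) {
        ++$index;
    } elseif (!empty($set['prop_key'])) {
        $result[$index][$set['prop_key']] = $set['prop_val'];
    } elseif (!empty($set['something'])) {
        $result[$index]['something'] = $set[0];
    } elseif (!empty($set['url'])) {
        $result[$index]['url'] = $set[0];
    }
}

print_r($result);
```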

    -ps- if you can find a way to remove the duplicated numeric indexes, please share it with me ... lol ... these are the normal matches I would get if I didn't use named capture groups; as I said, it can get messy.
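
    One possible way to drop the numeric duplicates after the fact is to filter the match array by key type with array_filter and ARRAY_FILTER_USE_KEY (PHP 5.6 or later). This is only a sketch on a small pattern; the group names key and val are made up, and note that the full-match entry at index 0 is dropped as well.

```php
<?php
preg_match_all(
    '/(?P<key>[-a-z]+)="(?P<val>[^"]+)"/',
    'tvg-name="A&E" group-title="NACIONALES"',
    $match
);

// Keep only the entries whose key is a string, i.e. the named groups.
$named = array_filter($match, 'is_string', ARRAY_FILTER_USE_KEY);

print_r(array_keys($match)); // 0, key, 1, val, 2
print_r(array_keys($named)); // key, val
```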
