Behold the magic of regex:
$string = <<<CUT
#EXTM3U
#EXTINF:-1 tvg-id="" tvg-name="A&E" tvg-logo="" group-title="ENTRETENIMIENTO",A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
#EXTINF:-1 tvg-id="" tvg-name="ABC Puerto Rico" tvg-logo="" group-title="NACIONALES",ABC Puerto Rico
http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts
CUT;
preg_match_all('/(?P<tag>#EXTINF:-1)|(?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)")|(?<something>,[^\n]+)|(?<url>http[^\s]+)/', $string, $match );
$count = count( $match[0] );
$result = [];
$index = -1;
for( $i = 0; $i < $count; $i++ ){
    $item = $match[0][$i];
    if( !empty($match['tag'][$i]) ){
        //is a tag - increment the result index
        ++$index;
    }elseif( !empty($match['prop_key'][$i]) ){
        //is a prop - store the value under its key
        $result[$index][$match['prop_key'][$i]] = $match['prop_val'][$i];
    }elseif( !empty($match['something'][$i]) ){
        //is the text after the comma - keep the whole match
        $result[$index]['something'] = $item;
    }elseif( !empty($match['url'][$i]) ){
        //is a url - keep the whole match
        $result[$index]['url'] = $item;
    }
}
var_export( $result );
Returns
array (
0 =>
array (
'tvg-name' => 'A&E',
'group-title' => 'ENTRETENIMIENTO',
'something' => ',A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts',
'url' => 'http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts',
),
1 =>
array (
'tvg-name' => 'ABC Puerto Rico',
'group-title' => 'NACIONALES',
'something' => ',ABC Puerto Rico',
'url' => 'http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts',
),
)
Seriously though, I have no clue what some of this is, something for example. Anyway, this should get you started.
For the regex, it's actually pretty simple when it's broken down. The real trick is in using preg_match_all instead of preg_match.
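To see why that matters, here is a minimal sketch. The sample line and the variable names are made up for illustration. preg_match stops after the first match, while preg_match_all keeps scanning the rest of the string.

```php
<?php
// Illustrative input only: one EXTINF line with two non-empty properties.
$line = '#EXTINF:-1 tvg-name="A&E" group-title="ENTRETENIMIENTO",A&E';
$pattern = '/(?P<prop_key>[-a-z]+)="(?P<prop_val>[^"]+)"/';

// preg_match() stops at the first match.
preg_match($pattern, $line, $first);
echo $first['prop_key'], "\n"; // tvg-name

// preg_match_all() collects every match in the string.
$total = preg_match_all($pattern, $line, $all);
echo $total, "\n";                          // 2
echo implode(', ', $all['prop_key']), "\n"; // tvg-name, group-title
```

With preg_match alone you would have to loop and re-run the pattern from an offset to get at the second property.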
Here is our regex
/(?P<tag>#EXTINF:-1)|(?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)")|(?<something>,[^\n]+)|(?<url>http[^\s]+)/
First we will break it down into more manageable bits. These are separated by the pipe |, which means "or". Each one can be thought of as a separate pattern: match this one or the next one. Now, the order can be important, because the alternatives are tried left to right, so if one on the left matches, it stops there. So you have to be careful not to have a regex that can match in two places (if you don't want that). However, it can be used to your advantage too, as I will show below. This is really what we are dealing with:
(?P<tag>#EXTINF:-1)
(?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)")
(?<something>,[^\n]+)
(?<url>http[^\s]+)
Four regular expressions. In all of these, (?P<name>...) (or the equivalent (?<name>...)) is a named capture group; it just makes things more readable and the bits easier to find. If you look at the conditions I use to find the matches, for example !empty($match['tag'][$i]), we can use the tag index/key because of the named capture group; otherwise it would be 1. With a number of regexes all together, having 1, 2, 3 can get messy, especially if you consider that this is actually nested, so it would be $match[1][$i] for tag, etc. Anyway, once that is taken out we have:
- #EXTINF:-1
  match this string literally
- (?:(?P<prop_key>[-a-z]+)=\"(?P<prop_val>[^"]+)")
  this is more complicated. (?: .. ) is a non-capture group; this is so the key and value wind up at the same index in the match array but are not captured together. Broken down, this is ([-a-z]+)=\"([^"]+)\", or match a word (lowercase letters and hyphens) followed by =, then ", then anything but a ", ending with ". Basically one side captures the key, the other the value, excluding the double quotes. Note that [^"]+ needs at least one character, so empty properties such as tvg-id="" are skipped.
- ,[^\n]+
  starts with a comma, then anything but a line return
- and last, http[^\s]+
  a url: http followed by anything but whitespace
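As a rough check of the pieces above, each sub-pattern can be run on its own against one sample entry. The $entry string, the $parts array and its labels are names made up for this sketch, and the named groups are left out to keep it short.

```php
<?php
// Sample entry, shortened from the playlist above (illustrative only).
$entry = "#EXTINF:-1 tvg-name=\"ABC Puerto Rico\" group-title=\"NACIONALES\",ABC Puerto Rico\n"
       . "http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts";

// The four sub-patterns, keyed by a label chosen for this sketch.
$parts = [
    'tag'       => '/#EXTINF:-1/',
    'prop'      => '/([-a-z]+)="([^"]+)"/',
    'something' => '/,[^\n]+/',
    'url'       => '/http[^\s]+/',
];

$found = [];
foreach ($parts as $label => $regex) {
    preg_match_all($regex, $entry, $m);
    $found[$label] = $m[0]; // keep the full matches only
    echo $label, ': ', implode(' | ', $m[0]), "\n";
}
```

Each label should report exactly the part of the entry it was written for, which is a quick way to test a sub-pattern before gluing everything together with |.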
Now remember what I said about order being important. This url, http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts, would match the last expression, except that in the first entry it is part of ,A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts, which starts with a comma and so matches the 3rd one. That text is consumed there, so it never gets to number 4.
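A small sketch of that effect, using only the last two branches of the pattern. The variable names are made up for illustration, and the second call simply drops the leading comma to show the difference.

```php
<?php
// Line taken from the sample playlist: the URL is glued onto the title.
$line = ',A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts';
$regex = '/(?<something>,[^\n]+)|(?<url>http[^\s]+)/';

// The comma comes first, so the "something" branch eats the whole line.
preg_match_all($regex, $line, $m);
echo count($m[0]), "\n";       // 1
echo $m['something'][0], "\n"; // the whole line, URL included
var_dump($m['url'][0] === ''); // bool(true): the url branch never matched

// Without the leading comma, the url branch is free to match.
preg_match_all($regex, substr($line, 1), $n);
echo $n['url'][0], "\n"; // http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
```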
Hope that helps. Granted, you'll need a basic grasp of regex; this is not really the place for a full tutorial on that, and you can find better examples than I can provide in a few short minutes.
Just for the sake of completeness, here is part of what preg_match_all returns:
(
[0] => Array(
[0] => #EXTINF:-1
[1] => tvg-name="A&E"
[2] => group-title="ENTRETENIMIENTO"
[3] => ,A&E`http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
[4] => http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/76.ts
[5] => #EXTINF:-1
[6] => tvg-name="ABC Puerto Rico"
[7] => group-title="NACIONALES"
[8] => ,ABC Puerto Rico
[9] => http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts
)
[tag] => Array(
[0] => #EXTINF:-1
[1] =>
[2] =>
[3] =>
[4] =>
[5] => #EXTINF:-1
[6] =>
[7] =>
[8] =>
[9] =>
)
[1] => Array(
[0] => #EXTINF:-1
[1] =>
[2] =>
[3] =>
[4] =>
[5] => #EXTINF:-1
[6] =>
[7] =>
[8] =>
[9] =>
)
[prop_key] => Array(
[0] =>
[1] => tvg-name
[2] => group-title
[3] =>
[4] =>
[5] =>
[6] => tvg-name
[7] => group-title
[8] =>
[9] =>
)
[2] => Array( ... duplicate of prop_key .. )
etc.
)
The way to find the item in the above array is to look at the for loop. When it runs the first time, at index 0, the main part of the match, $match[0], contains all the matches, but the tag array only has a non-empty entry where that part of the regex matched. We can correlate them using the $i index.
if( !empty($match['tag'][$i]) ){
    //is a tag - increment the result index
    ++$index;
}
This checks whether $match['tag'][$i] is not empty. If you look at $match['tag'][0], when $i = 0, you will see that indeed it is not empty. On the second loop $match['tag'][1] is empty but $match['prop_key'][1] is not, so we know that when $i = 1 the item is a prop_key match. That's how that works.
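The same correlation can also be sketched with the PREG_SET_ORDER flag, which is not what the code above uses. With that flag preg_match_all groups the results per match instead of per capture group, so no $i index is needed. The shortened playlist and the $sets name are assumptions for this example.

```php
<?php
// Shortened sample playlist (illustrative only).
$string = "#EXTM3U\n"
        . "#EXTINF:-1 tvg-name=\"ABC Puerto Rico\" group-title=\"NACIONALES\",ABC Puerto Rico\n"
        . "http://nxtv.tk:8080/live/jarenas/iDKZrC56xZ/96.ts\n";

$regex = '/(?P<tag>#EXTINF:-1)|(?:(?P<prop_key>[-a-z]+)="(?P<prop_val>[^"]+)")'
       . '|(?<something>,[^\n]+)|(?<url>http[^\s]+)/';

// PREG_SET_ORDER: $sets[$i] holds every group of match number $i.
preg_match_all($regex, $string, $sets, PREG_SET_ORDER);

$result = [];
$index  = -1;
foreach ($sets as $set) {
    // empty() also covers groups that are missing from a set.
    if (!empty($set['tag'])) {
        ++$index; // a new entry starts at each tag
    } elseif (!empty($set['prop_key'])) {
        $result[$index][$set['prop_key']] = $set['prop_val'];
    } elseif (!empty($set['something'])) {
        $result[$index]['something'] = $set[0];
    } elseif (!empty($set['url'])) {
        $result[$index]['url'] = $set[0];
    }
}
print_r($result);
```

Which layout reads better is mostly a matter of taste; the branching logic is the same either way.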
-ps- If you can find a way to remove the duplicated numeric indexes, please share it with me ... lol ... these are the normal matches I would get if I didn't use named capture groups; as I said, it can get messy.
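One possible way to drop the numeric indexes after the fact, offered as a sketch rather than a built-in option of preg_match_all: filter the match array by key and keep only the string keys. The sample line and the $full and $named names are assumptions for this example.

```php
<?php
// Illustrative input only.
$line = '#EXTINF:-1 tvg-name="A&E" group-title="ENTRETENIMIENTO",A&E';
preg_match_all('/(?P<prop_key>[-a-z]+)="(?P<prop_val>[^"]+)"/', $line, $match);

// Keep only the string keys, i.e. the named groups (needs PHP 5.6+ for
// ARRAY_FILTER_USE_KEY). The full matches under key 0 are dropped as well,
// so they are saved first.
$full  = $match[0];
$named = array_filter($match, 'is_string', ARRAY_FILTER_USE_KEY);

print_r(array_keys($named)); // prop_key, prop_val
```

This only tidies the array up afterwards; the regex engine still fills in both the named and the numbered entries.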