Hi everyone, I have a regex question. I want to parse this log file, and right now I want to get the keys and values inside the SESSION block.
The problem is that the logs don't all look the same: some of them lack the # characters enclosing 'SESSION'. However, they all contain the word SESSION to start off the block of variables, and that block is always followed by another block that contains either "POST" or "API CURL CALL".
So I most likely have to use quantifiers to disregard everything outside those two markers, and then match every key/value pair (separated by :) between them...
That's a mouthful just talking about it... I'm completely stumped, so I turn to you guys for some guidance and help in this matter. The goal is to parse these shitty logs into something I can actually read quickly and understand.
I'm creating a class in PHP to do that and spit out some nice HTML-formatted logs. This is the log file as it stands:
[05:40:40] ################
[05:40:40] #### SOURCE ####: /zalo/vn/interface.call.php
[05:40:40] #### REQUEST ####: /zalo/vn/interface.call.php
[05:40:40] #### Refer: http://app.com/zalo/vn/?v=1&adsid=d6e5f33e5a94d9fafaf15dc0cf4a1e5&sub_id=170100sf01435487523&sub_id1=232s5
[05:40:40] #### SESSION #####
[05:40:40] v: 1
[05:40:40] adsid: d6e5f33e5a94d93sfsf5dc0cf4a1e5
[05:40:40] sub_id: 799e12b08fa1edes1d7bgsg0506a6e9
[05:40:40] landingpage: http%3A%2F%2Fapp.com%2Fzalo%2Fvn%2Finterface.call.php
[05:40:40] c_id: da21bae82c02d1e2b8168d57cd3fbab7
[05:40:40] nId: 3943
[05:40:40] partner: Marvel
[05:40:40] country_code: 84
[05:40:40] country: VN
[05:40:40] url: http://app.com/zalo/vn/
[05:40:40] campaign_id: 1066
[05:40:40] source: web
[05:40:40] msisdn: 906346534
[05:40:40] Phone: 906346534
[05:40:40] #### POST ####
[05:40:40] action: subscribe
[05:40:40] Phone: 906346534
[05:40:40] ################
[05:40:40] #### API CURL CALL ####
Ideally, what I'd want to keep is this section:
v: 1
adsid: d6e5f33e5a94d93sfsf5dc0cf4a1e5
sub_id: 799e12b08fa1edes1d7bgsg0506a6e9
landingpage: http%3A%2F%2Fapp.com%2Fzalo%2Fvn%2Finterface.call.php
c_id: da21bae82c02d1e2b8168d57cd3fbab7
nId: 3943
partner: Marvel
country_code: 84
country: VN
url: http://app.com/zalo/vn/
campaign_id: 1066
source: web
msisdn: 906346534
Phone: 906346534
I probably need a lookbehind-lookahead combination of some sort:
(?=SESSION).*?(?<=POST)
Something along these lines, but it should also drop the timestamps and the actual SESSION and POST keywords, which I don't need.
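To make the goal concrete, here is a rough sketch of one way this might be done in PHP with two passes instead of a single lookaround expression: first isolate the text between the SESSION line and the next POST / API CURL CALL line, then pull the key/value pairs out of that text. The function name parseSessionBlock is hypothetical, the sample log is shortened from the one above, and the patterns assume every line starts with a [hh:mm:ss] timestamp and that keys contain no colons or whitespace.

```php
<?php
// Hypothetical helper (not part of any existing class): returns the SESSION
// key/value pairs from one log entry as an associative array.
function parseSessionBlock(string $log): array
{
    // Pass 1: capture everything between the line that mentions SESSION and
    // the next line that mentions POST or API CURL CALL.
    // [ #]* makes the surrounding # characters optional.
    $blockPattern = '/^\[[\d:]+\][ #]*SESSION[ #]*\R(.*?)^\[[\d:]+\][ #]*(?:POST|API CURL CALL)/ms';
    if (!preg_match($blockPattern, $log, $block)) {
        return []; // no SESSION block found
    }

    // Pass 2: on each line of the block, skip the timestamp, take the text up
    // to the first colon as the key, and the rest of the line as the value.
    $pairPattern = '/^\[[\d:]+\][ \t]*([^:\s]+):[ \t]*(.*?)\s*$/m';
    preg_match_all($pairPattern, $block[1], $pairs, PREG_SET_ORDER);

    $session = [];
    foreach ($pairs as $pair) {
        $session[$pair[1]] = $pair[2];
    }
    return $session;
}

// Shortened sample taken from the log above.
$log = <<<'LOG'
[05:40:40] #### Refer: http://app.com/zalo/vn/
[05:40:40] #### SESSION #####
[05:40:40] v: 1
[05:40:40] url: http://app.com/zalo/vn/
[05:40:40] Phone: 906346534
[05:40:40] #### POST ####
[05:40:40] action: subscribe
LOG;

$session = parseSessionBlock($log);
```

Splitting the work this way keeps each pattern small, and values that contain colons themselves (such as the url line) stay intact because only the first colon after the key is treated as the separator. Whether it holds up on the real logs would depend on how much the format varies beyond the missing # characters.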