duai0935 2015-11-12 20:57

Problem parsing logs in the combined log format

I have changed my nginx configuration to write a custom log format instead of the default one, adding two fields: $request_time and $upstream_response_time. I'm using PHP to parse these logs.

I'm not great with regexes, but I tried to modify one I picked up from the question "Parse Apache log in PHP using preg_match".

The regex there is:

$regex = '/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] \"(\S+) (.*?) (\S+)\" (\S+) (\S+) "([^"]*)" "([^"]*)"$/';

This is my attempt at extending it to capture the two new fields:

$pattern = '/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] \"(\S+) (.*?) (\S+)\" (\S+) (\S+) "([^"]*)" "([^"]*)"$ ^(\S+) ^(\S+) /';

Where my input looks something like this:

$line = "127.0.0.1 - - [12/Nov/2015:13:39:19 -0500] \"GET /mj/feed/ HTTP/1.1\" 200 3276 \"-\" \"rogerbot/1.0 (http://www.moz.com/dp/rogerbot, rogerbot-crawler@moz.com)\" 0.254 0.254";

The two extra fields are 0.254 and 0.254 above.

So I'm trying to obtain [14] = 0.254 and [15] = 0.254.

I've tried playing around with the regex through live online regex tools without any luck.

Any help would be appreciated.

1 answer

  • dongren5293 2015-11-12 21:13

    Outside a character class, ^ anchors the match to the start of the string (or of a line if the m modifier is used). As the first character inside a character class, it negates the class instead. So

    ^(\S+) ^(\S+)
    

    doesn't work in the middle of your regex.
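
To make the two meanings of ^ concrete, here is a minimal PHP sketch. The short patterns and the variable names are made up for illustration and are not part of the log regex:

```php
<?php
// Outside a character class, ^ asserts the start of the subject, so it
// cannot succeed once earlier characters have already been consumed.
$anchorInMiddle = preg_match('/a ^b/', 'a b');

// As the first character inside [...], ^ negates the class instead:
// [^b] means "any single character except b".
$negatedClass = preg_match('/a [^b]/', 'a c');

// With the m modifier, ^ also matches immediately after a newline.
$multiline = preg_match('/a\n^b/m', "a\nb");

var_dump($anchorInMiddle, $negatedClass, $multiline);
```

The first call finds no match, which is the same reason the trailing ^(\S+) ^(\S+) in the question's pattern can never match.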

    Give this a try:

    ^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] \"(\S+) (.*?) (\S+)\" (\S+) (\S+) "([^"]*)" "([^"]*)" (\S+) (\S+)$
    

    Regex101 Demo: https://regex101.com/r/lQ6zX9/1

    Or, equivalently, using a negated character class ([^\s] matches the same characters as \S):

    ^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] \"(\S+) (.*?) (\S+)\" (\S+) (\S+) "([^"]*)" "([^"]*)" ([^\s]+) ([^\s]+)$
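
As a sketch of how the first pattern might be used from PHP, the snippet below wraps it in / delimiters (an assumption, following the question's own $pattern) and runs it against the sample line from the question. The variable names $requestTime and $upstreamTime are illustrative only:

```php
<?php
// The first pattern from the answer, wrapped in / delimiters for preg_match.
// Single quotes keep the backslashes intact for the regex engine.
$pattern = '/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] \"(\S+) (.*?) (\S+)\" (\S+) (\S+) "([^"]*)" "([^"]*)" (\S+) (\S+)$/';

// Sample log line from the question.
$line = '127.0.0.1 - - [12/Nov/2015:13:39:19 -0500] "GET /mj/feed/ HTTP/1.1" 200 3276 "-" "rogerbot/1.0 (http://www.moz.com/dp/rogerbot, rogerbot-crawler@moz.com)" 0.254 0.254';

if (preg_match($pattern, $line, $matches) === 1) {
    // Groups 1-13 are the combined-format fields; 14 and 15 are the new ones.
    $requestTime  = $matches[14]; // $request_time
    $upstreamTime = $matches[15]; // $upstream_response_time
    echo "request_time=$requestTime upstream_response_time=$upstreamTime\n";
} else {
    echo "Line did not match\n";
}
```

One caveat: each trailing (\S+) captures a single whitespace-free token. nginx can log $upstream_response_time as - or as several comma-separated times, so lines of the latter kind would probably need a looser final group.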
    
    This answer was accepted by the asker as the best answer.
