drgc9632 2019-09-18 05:20

RegEx: match everything up to two consecutive special characters (]])

I am trying to work out either one regex (multiline.pattern) or two (multiline.pattern and exclude_lines) in order to ship log information from Filebeat to Logstash. The system that writes the logs uses a standardized log format, which looks as follows:

[2019-08-28 10:38:57 +0200][0000000000][Info][User][OLS][201][Some Logging Information]

To match this I have built the following regex (it may well need some improvements :-)):

^\[(\d{4})-(\d{2})-(\d{2})\s(\d{2}):(\d{2}):(\d{2})\s\+(\d{4})\]\[\d{10}\]\[[^\]]*\]\[[^\]]*\]\[[^\]]*\]\[[\d]*\]\[[^\]]*\]$
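A quick way to check the pattern is to run it against sample lines. The sketch below is only an illustration in Python (the post itself does not use Python): the helper name `is_normal_entry` is made up, and the debug-mode line is one of the samples quoted further down in this question.

```python
import re

# Pattern copied unchanged from the post, split over two string literals.
NORMAL_ENTRY = re.compile(
    r"^\[(\d{4})-(\d{2})-(\d{2})\s(\d{2}):(\d{2}):(\d{2})\s\+(\d{4})\]"
    r"\[\d{10}\]\[[^\]]*\]\[[^\]]*\]\[[^\]]*\]\[[\d]*\]\[[^\]]*\]$"
)

# Sample lines taken from the question.
normal_line = (
    "[2019-08-28 10:38:57 +0200][0000000000][Info][User][OLS][201]"
    "[Some Logging Information]"
)
debug_line = (
    "[2019-05-24 09:58:39 +0200][0000000000][Debug][External][RESTLM]"
    "[HTDOC_REQUEST][Some Debug Loginformation]"
)


def is_normal_entry(line: str) -> bool:
    """Hypothetical helper: True if the line has the normal seven-field layout."""
    return NORMAL_ENTRY.match(line) is not None


print(is_normal_entry(normal_line))  # True
print(is_normal_entry(debug_line))   # False: the 6th field is not numeric
```

Note that the `\+` in the timestamp only accepts positive UTC offsets, which is all the samples show.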

Unfortunately, the log structure changes when the system runs in debug mode:

[2019-05-24 09:58:39 +0200][0000000000][Debug][External][RESTLM][HTDOC_REQUEST][Some Debug Loginformation]
[2019-05-24 09:58:39 +0200][0000000000][Debug][External][RESTLM[HTDOC_REQUEST][Some Debug Loginformation]
[2019-05-24 09:58:34 +0200][0000000026][Debug][External][RESTLM][REST_RESPONSE][[45][HTTP/1.0 201 Created
    Server: Test/2019.3
    Pragma: no-cache
    Cache-control: no-cache
    Content-Type: text/xml
    Content-Length: 255

    <?xml version="1.0" encoding="utf-8"?>
    <Status><Repository><Path>D:/repository/tabfiles</Path><Version>4_0</Version><Fingerprint>p12uqocQM0gtaRieBldCix/CSSs=</Fingerprint></Repository><System>Running</System></Status>]]
[2019-05-24 09:58:34 +0200][0000000000][Debug][External][RESTLM][REST_REQUEST][[45][POST / HTTP/1.1
    Content-Type: text/xml; charset=utf-8
    Cache-Control: no-cache
    Pragma: no-cache
    User-Agent: Java/11.0.2
    Host: serverxyz:24821
    Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
    Connection: keep-alive
    Content-Length: 10

    <Status />]]

I want to exclude the (multiline) log entries that contain "Debug" in the 3rd field. As far as I can see, the main difference between normal and debug entries is that the 6th field of a debug entry is not purely numeric, so it does not match \[[\d]*\]. In addition, and I think this is my actual problem, the last field (the log information) sometimes contains further bracketed fields of its own, which looks like [[45][some text][other text]].

What I am looking for is either a single regex that matches one complete log entry, whether debug or normal, or two expressions: the first matching normal entries, the second matching debug entries (so that they can be excluded).
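The two-expression variant can be sketched outside Filebeat to see whether the idea holds. The Python below is only an illustration under assumptions: `ENTRY_START`, `DEBUG_ENTRY`, `group_entries` and `drop_debug` are made-up names, and the grouping imitates what a "new entry starts at a timestamp, everything else is a continuation" rule would do. As far as I understand, Filebeat tests multiline.pattern against individual lines, so a start-of-entry pattern is closer to how it works than a regex spanning a whole entry.

```python
import re

# Assumption: every entry starts with a bracketed timestamp like the samples.
# [+-] also allows negative offsets, which the samples do not show.
ENTRY_START = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4}\]")
# Third bracketed field equals "Debug".
DEBUG_ENTRY = re.compile(r"^\[[^\]]*\]\[[^\]]*\]\[Debug\]")


def group_entries(lines):
    """Join continuation lines onto the preceding line that starts an entry."""
    entries = []
    for line in lines:
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += "\n" + line
    return entries


def drop_debug(entries):
    """Keep only entries whose third field is not Debug."""
    return [e for e in entries if not DEBUG_ENTRY.match(e)]


# Lines taken (abbreviated) from the samples in the question.
sample_lines = [
    "[2019-08-28 10:38:57 +0200][0000000000][Info][User][OLS][201][Some Logging Information]",
    "[2019-05-24 09:58:39 +0200][0000000000][Debug][External][RESTLM][HTDOC_REQUEST][Some Debug Loginformation]",
    "[2019-05-24 09:58:34 +0200][0000000000][Debug][External][RESTLM][REST_REQUEST][[45][POST / HTTP/1.1",
    "    Content-Type: text/xml; charset=utf-8",
    "    Content-Length: 10",
    "",
    "    <Status />]]",
]

entries = group_entries(sample_lines)
kept = drop_debug(entries)
print(len(entries), len(kept))  # 3 1
```

With this split, the nested brackets in the last field no longer matter, because neither pattern looks past the third field.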


3 answers

  • dotdx80642 2019-09-18 05:28

    Since all you want to do is match the log entries, and not capture any info, use this:

    ^\[\d{4}-\d{2}-\d{2}[\s\S]+?\]\]?$ /gm

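    The idea is to match lazily (the ? after + makes the quantifier lazy) until one or two ] characters are encountered at the end of a line. The m flag makes ^ and $ match at line boundaries, and the g flag finds every entry rather than only the first.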
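To see how the expression behaves on the samples from the question, here is a small sketch in Python (an assumption for illustration; the /gm flags in the answer are JavaScript-style). In Python the m flag corresponds to `re.MULTILINE` and the g flag to iterating with `finditer()`. The sample text is abbreviated from the question.

```python
import re

# Regex from the answer, with re.MULTILINE standing in for the m flag.
ENTRY = re.compile(r"^\[\d{4}-\d{2}-\d{2}[\s\S]+?\]\]?$", re.MULTILINE)

# Lines taken (abbreviated) from the samples in the question.
sample_log = "\n".join([
    "[2019-08-28 10:38:57 +0200][0000000000][Info][User][OLS][201][Some Logging Information]",
    "[2019-05-24 09:58:39 +0200][0000000000][Debug][External][RESTLM][HTDOC_REQUEST][Some Debug Loginformation]",
    "[2019-05-24 09:58:34 +0200][0000000000][Debug][External][RESTLM][REST_REQUEST][[45][POST / HTTP/1.1",
    "    Content-Type: text/xml; charset=utf-8",
    "    Content-Length: 10",
    "",
    "    <Status />]]",
])

# finditer() plays the role of the g flag: one match per log entry.
entries = [m.group(0) for m in ENTRY.finditer(sample_log)]
print(len(entries))  # 3
```

One limitation to keep in mind: the lazy match stops at the first line that ends in ], so a continuation line inside a debug entry that happens to end with ] would cut the entry short.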

