dongyan7876 2017-07-12 09:37
Accepted

Splitting a string into a key and values using a regular expression

I'm having some trouble parsing plain text output from samtools stats.

Example output:

45205768 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
5203838 + 0 duplicates
44647359 + 0 mapped (98.76% : N/A)
0 + 0 paired in sequencing
0 + 0 read1
0 + 0 read2
0 + 0 properly paired (N/A : N/A)
0 + 0 with itself and mate mapped
0 + 0 singletons (N/A : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)

I'd like to parse the file line-by-line and get the following output in a PHP array like this:

Array(
 "in total" => [45205768,0],
 ...
)

In short, I'd like to get the numeric values at the start of each line as an array of integers, and the text that follows (without the parenthesized part) as the key.


3 answers

  • dou4381 2017-07-12 09:48
    ^(\d+)\s\+\s(\d+)\s([a-zA-Z0-9 ]+).*$
    

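    This regex puts the first value, the second value, and the text that follows (up to, but not including, the parenthesized part) in match groups 1, 2 and 3 respectively. Group 3 keeps the space that precedes the opening parenthesis, so trim it before using it as a key.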
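To make the group numbering concrete, here is a minimal sketch that applies the pattern to one line of the sample output with `preg_match`. The `/` delimiters and the variable names are my own additions for illustration.

```php
<?php
// Pattern from the answer, wrapped in PHP regex delimiters.
$pattern = '/^(\d+)\s\+\s(\d+)\s([a-zA-Z0-9 ]+).*$/';
$line = '44647359 + 0 mapped (98.76% : N/A)';

if (preg_match($pattern, $line, $m)) {
    // $m[0] is the whole match; $m[1], $m[2], $m[3] are the capture groups.
    // Here: $m[1] is "44647359", $m[2] is "0", $m[3] is "mapped " (trailing space).
    var_dump($m[1], $m[2], $m[3]);
}
```

The captured values are strings, so they still need an `(int)` cast, and group 3 needs a `trim()`, before they match the array shape asked for in the question.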

    Regex101 demo
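Putting it together, one possible way to build the array from the question is sketched below. `parseCounts()` is a hypothetical helper name, not something from the original answer. It also assumes that when two lines reduce to the same label (as the last two sample lines do once the parenthesized part is dropped), only the first one is kept; adjust that if you need both.

```php
<?php
/**
 * Parse "N + M label (details)" lines into ['label' => [N, M], ...].
 * parseCounts() is a hypothetical helper, shown only as a sketch.
 */
function parseCounts(string $text): array
{
    $pattern = '/^(\d+)\s\+\s(\d+)\s([a-zA-Z0-9 ]+).*$/';
    $result = [];

    // \R matches any line-ending sequence (\n, \r\n, ...).
    foreach (preg_split('/\R/', $text) as $line) {
        if (!preg_match($pattern, trim($line), $m)) {
            continue; // skip blank or unexpected lines
        }
        $key = trim($m[3]); // drop the space left before "("
        // Assumption: keep the first occurrence when two lines share a label.
        if (!array_key_exists($key, $result)) {
            $result[$key] = [(int) $m[1], (int) $m[2]];
        }
    }
    return $result;
}

$sample = <<<'TXT'
45205768 + 0 in total (QC-passed reads + QC-failed reads)
5203838 + 0 duplicates
44647359 + 0 mapped (98.76% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
TXT;

$counts = parseCounts($sample);
print_r($counts['in total']); // the two integers from the first line
```

To read from a file instead, pass the result of `file_get_contents()` to the helper.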

    Selected by the asker as the best answer.