douxin5953 2018-07-24 14:11

Regular expression: empty lines not followed by a time row

I need to perform a preg_replace on every empty line that is not followed by a row like this:

00:00:02.800 --> 00:00:04.800

Its format is:

any 2 digits:any 2 digits:any 2 digits.any 3 digits --> any 2 digits:any 2 digits:any 2 digits.any 3 digits

I know how to search for an empty line:

"/(^[\r\n]*|[\r\n]+)[\s\t]*[\r\n]+/"

And for the time row:

[0-9]{1,2}[:.,-]?[:][0-9]{1,2}[:.,-]?[:][0-9]{1,2}[:.,-]?[.][0-9]{1,3}[:.,-]?[\s][-][-][>][\s][0-9]{1,2}[:.,-]?[:][0-9]{1,2}[:.,-]?[:][0-9]{1,2}[:.,-]?[.][0-9]{1,3}[:.,-]?

But I wasn't able to create a regex that finds only the lines that are not followed by the time row.
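Since the time row has a fixed shape, a much stricter pattern than the one above should be enough to recognise it. The following is only a sketch: `isTimeRow` is a hypothetical helper name, and accepting a comma as well as a dot before the milliseconds is my assumption (some subtitle formats, such as SRT, use a comma).

```php
<?php
// Hypothetical helper (not from the original post): returns true when $line
// is a time row such as "00:00:02.800 --> 00:00:04.800".
// Assumption: [.,] accepts either a dot or a comma before the milliseconds.
function isTimeRow(string $line): bool
{
    // Group 1 describes one timestamp; (?1) reuses that sub-pattern for the end time.
    return preg_match('/^(\d{2}:\d{2}:\d{2}[.,]\d{3}) --> (?1)$/', $line) === 1;
}

var_dump(isTimeRow('00:00:02.800 --> 00:00:04.800')); // bool(true)
var_dump(isTimeRow('00:00:00,300 --> 00:00:01,000')); // bool(true)
var_dump(isTimeRow('line1'));                          // bool(false)
```

The `(?1)` subroutine call keeps the start and end timestamps in sync, so the pattern only has to be written once.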

EDIT: OPTION 1 File Input:

WEBVTT

00:00:00.300 --> 00:00:01.000
line1

line2
line3

00:00:01.000 --> 00:00:02.800
line1

00:00:02.800 --> 00:00:04.800
line1
line2


line3

File desired output:

WEBVTT

00:00:00.300 --> 00:00:01.000
line1  
line2
line3

00:00:01.000 --> 00:00:02.800
line1

00:00:02.800 --> 00:00:04.800
line1
line2
line3

My function:

 $content = preg_replace("/regex expression/", "", $file_content);

EDIT 2:

I just found out I need to handle another format: OPTION 2 File Input:

1
00:00:00,300 --> 00:00:01,000
line 1 line 1

line 2

2
00:00:01,000 --> 00:00:02,800
line 1 line 1 line 1

line2


line 3 line 3

3
00:00:02,800 --> 00:00:04,800
line 1

File desired output:

1
00:00:00,300 --> 00:00:01,000
line 1 line 1
line 2

2
00:00:01,000 --> 00:00:02,800
line 1 line 1 line 1
line2
line 3 line 3

3
00:00:02,800 --> 00:00:04,800
line 1

Toto's answer worked great. I tried to adapt it to the second format but was unsuccessful. I tried:

/(\R){1,}(?!(\d\R\d\d:\d\d:\d\d\.\d{3}) --> (?2))
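One likely reason this attempt fails on the second format (besides the missing closing `/` delimiter) is that `\.` only matches a literal dot, while the OPTION 2 time rows use a comma before the milliseconds. A quick check, with variable names of my own choosing:

```php
<?php
$row = '00:00:00,300 --> 00:00:01,000';

// The timestamp part of the attempt: "\." requires a literal dot.
$dotOnly = preg_match('/\d\d:\d\d:\d\d\.\d{3}/', $row);

// A character class accepts either separator.
$dotOrComma = preg_match('/\d\d:\d\d:\d\d[.,]\d{3}/', $row);

var_dump($dotOnly);    // int(0)
var_dump($dotOrComma); // int(1)
```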

Solved

Solution:

option 1:

$regex = "/(\R){1,}(?=(\d\d:\d\d:\d\d\.\d{3}) --> (?2))/";

option 2:

$regex = "/(\R)(?!(\d\R\d\d:\d\d:\d\d\,\d{3}))/";

  • doushan5222 2018-07-24 15:53
    $str = <<<EOD
    1
    00:00:00,300 --> 00:00:01,000
    line 1 line 1
    
    line 2
    
    2
    00:00:01,000 --> 00:00:02,800
    line 1 line 1 line 1
    
    line2
    
    
    line 3 line 3
    
    3
    00:00:02,800 --> 00:00:04,800
    line 1
    EOD;
    
    $str = preg_replace('/(\R)+(?!\d)/', '$1', $str);
    echo $str, "\n";
    

    Output for given example:

    1
    00:00:00,300 --> 00:00:01,000
    line 1 line 1
    line 2
    
    2
    00:00:01,000 --> 00:00:02,800
    line 1 line 1 line 1
    line2
    line 3 line 3
    
    3
    00:00:02,800 --> 00:00:04,800
    line 1
    

    Explanation:

    (\R)+       : group 1, any kind of linebreak, 1 or more times
    (?!\d)      : negative lookahead, make sure a digit does not follow
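To see how the lookahead interacts with backtracking, here is a small sketch on a shortened sample (the sample strings are mine, not from the question). Before the cue counter `2`, the greedy `(\R)+` first takes both linebreaks, the lookahead rejects the digit, and the engine backs off to a single linebreak, so the blank line before the counter survives.

```php
<?php
$pattern = '/(\R)+(?!\d)/';

// A blank line inside a cue, then a blank line before counter "2".
$sample = "line 1\n\nline 2\n\n2\n00:00:01,000 --> 00:00:02,800\nline 1";
$result = preg_replace($pattern, '$1', $sample);
echo $result, "\n";
// line 1
// line 2
//
// 2
// 00:00:01,000 --> 00:00:02,800
// line 1

// Limitation: a text line that starts with a digit keeps its blank line.
$digitLed = preg_replace($pattern, '$1', "line a\n\n3 more words");
var_dump($digitLed === "line a\n\n3 more words"); // bool(true)
```

The second call shows why a stricter lookahead is needed when subtitle text can start with a digit.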
    

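    Or, if a text line could begin with a digit, keep the blank line only before a time row or before a counter that stands alone on its line:

    $str = preg_replace('/(\R){2,}(?!(\d\d:\d\d:\d\d\.\d{3}) --> (?2)|\d+\R)/', '$1', $str);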
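As a rough check of the digit-aware idea on both input formats, here is a sketch under two assumptions of mine: the final `\R` belongs inside the lookahead (so `\d+\R` means "a counter alone on its line"), and `[.,]` is used so the same pattern accepts the dot of OPTION 1 and the comma of OPTION 2. The sample strings are shortened versions of the inputs in the question.

```php
<?php
// Assumption: \d+\R sits inside the negative lookahead; [.,] accepts both separators.
$pattern = '/(\R){2,}(?!(\d\d:\d\d:\d\d[.,]\d{3}) --> (?2)|\d+\R)/';

$vtt = "WEBVTT\n\n00:00:00.300 --> 00:00:01.000\nline1\n\nline2\nline3\n\n"
     . "00:00:01.000 --> 00:00:02.800\nline1";

$srt = "1\n00:00:00,300 --> 00:00:01,000\nline 1 line 1\n\nline 2\n\n"
     . "2\n00:00:01,000 --> 00:00:02,800\nline 1\n\n\n3 is a digit-led text line";

// Runs of two or more linebreaks collapse to one, except before a time row
// or before a numeric counter on its own line.
$vttClean = preg_replace($pattern, '$1', $vtt);
$srtClean = preg_replace($pattern, '$1', $srt);

echo $vttClean, "\n---\n", $srtClean, "\n";
```

Single linebreaks are never touched because of `{2,}`, so only blank lines are candidates for removal.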
    