douwaif22244 2016-09-22 13:47
Accepted

Applying two preg_split calls with regular expressions to a text

Context: Every day I receive an email containing the reservation details of several customers, and I have to split it according to a set of rules. This is an example of the email:

A N K U N F T   11.08.15
*** NEUBUCHUNG ***
 11.08.15  xxx  xxx  X3 2830  14:25   17:50
 18.08.15  xxx  xxx  X3 2831  18:40
 F882129  dsdsaidsaia
 F882129  xxxyxyagydaysd
sadsdsdsdsadsadadssda
sadsdsdsdsadsadadssda
**«CUT HERE2»**


A N K U N F T   18.08.15
*** NEUBUCHUNG ***
 11.08.15  xxx  xxx  X3 2830  14:25   17:50
 18.08.15  xxx  xxx  X3 2831  18:40
 F881554  ZXCXZCXCXZCCXZ
 F881554  xcvcxvcxvcvxc
 F881554  xvcxvcxcvxxvccvxxcv

**«CUT HERE»**


11.08.15  xxx  xxx  X3 2830  14:25   17:50
 18.08.15  xxx  xxx  X3 2831  18:40
 F881605  xczxcdfsfdsdfs
 F881605  zxccxzxzdffdsfds

**«CUT HERE»**

So the text has to be cut after the last line that contains F999999 (where 9 can be any digit), because F999999 is the reservation code.* I inserted the «CUT HERE» markers only to show where the cuts belong.

*NOTE: reservation code may have the following formats: F999999, A999999, E999999 or 999999.
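
A minimal sketch of how the four code formats above could be recognised on a single line. The helper name `reservationCode` is hypothetical and not part of the original post; it assumes the code sits at the start of the line, possibly after some spaces, as in the sample email.

```php
<?php
// Hypothetical helper (an assumption for illustration): returns the
// reservation code a line starts with, or null if the line has none.
function reservationCode(string $line): ?string
{
    // ^\h*         optional leading horizontal whitespace
    // [FAE]?\d{6}  optional prefix letter F, A or E, then exactly six digits
    // (?!\d)       reject a seventh digit right after the code
    if (preg_match('~^\h*([FAE]?\d{6})(?!\d)~', $line, $m)) {
        return $m[1];
    }
    return null;
}

var_dump(reservationCode(' F882129  dsdsaidsaia'));        // string(7) "F882129"
var_dump(reservationCode('987025  some guest'));           // string(6) "987025"
var_dump(reservationCode(' 11.08.15  xxx  xxx  X3 2830')); // NULL (date line)
var_dump(reservationCode('FNC036    The Flame Tree'));     // NULL (hotel code)
```

Hotel codes such as FNC036 and date lines are not mistaken for reservation codes, because neither consists of an optional F, A or E followed directly by six digits.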

So I apply a working preg_split with the following regex:

Regex1 = "/(?:\\s(F|A|E)?\\d{6}\\s?+.*?
\\s?
)\\K/ms";

However, sometimes I have to cut where «CUT HERE2» appears, because extra text can follow the last reservation code line.

So I created this regex:

Regex2 = "/^\h*(F|A|E)?\d{6}.*?\R{2}\K/ms"

Yet sometimes the email has this format, with blank lines between F999999 lines of the same reservation, which makes Regex2 cut where it says «NOT CUT HERE»:

A N K U N F T   11.08.15
*** NEUBUCHUNG ***
 11.08.15  xxx  xxx  X3 2830  14:25   17:50
 18.08.15  xxx  xxx  X3 2831  18:40
 F882129  dsdsaidsaia

<<NOT CUT HERE>>

 F882129  xxxyxyagydaysd
sadsdsdsdsadsadadssda
sadsdsdsdsadsadadssda
**«CUT HERE»**


A N K U N F T   18.08.15
*** NEUBUCHUNG ***
 11.08.15  xxx  xxx  X3 2830  14:25   17:50
 18.08.15  xxx  xxx  X3 2831  18:40
 F881554  ZXCXZCXCXZCCXZ

<<NOT CUT HERE>>

 F881554  xcvcxvcxvcvxc
 F881554  xvcxvcxcvxxvccvxxcv

**«CUT HERE»**


11.08.15  xxx  xxx  X3 2830  14:25   17:50
 18.08.15  xxx  xxx  X3 2831  18:40
 F881605  xczxcdfsfdsdfs
 F881605  zxccxzxzdffdsfds

**«CUT HERE»**

I just want it to cut where «CUT HERE» appears.

For example, the error occurs with this input:

***NEUBUCHUNG ***
 23.02.17  DUS  FNC  DE 1414  12:05   15:10
 09.03.17  FNC  DUS  DE 1415  16:40
 FNC011  Enotel Baia                  9360-215 Ponta do Sol
  1  DZ Typ I Meerblick 2Erw.         Frühstück
 am 03.10.16  CRS: MX  - PNR: 1290689
 Fluggeber: Condor Flugdienst / PNR: 1290689  Frühbucher 10%  inkl. Reiseleitung  und Transfer ab/bis   
 A025808  HERR Berg, Ulrich               62


<<NOT CUT HERE>>

Anfrage.
 A025808  FRAU Berghaus, Petra            58

 **«CUT HERE»**

***S T O R N O **
 04.10.16 STR  X3 2810
 11.10.16 FNC  STR  X3 2811  18:15
FNC036    The Flame Tree               Funchal
 1  DZ Meerblick 2Erw.                 H
A987025  FRAU  BURG, GERTRUD          *** STORNO ***              O


<<NOT CUT HERE>>


A987025  HERR  BURG, WALTER           *** STORNO ***              O

**«CUT HERE»**

***ÄNDERUNG ***
NEU:01.11.16 FRA  X3 2806  13:35   16:50
08.11.16 FNC  FRA  X3 2807  17:40
   FNC813    Golden Residence/Wanderk. 9000-105 Funchal
 1  Suite seitl. Meerblick 3Erw.       F
A982512 FRAU   KROST, SIMONE
Frühbucher 15%


<<NOT CUT HERE>>

inkl. Reiseleitung
und Transfer ab/bis 
Im Reisepreis bereits enthalten: Drei
geführte Wanderungen (1 Ganztags- und 2
Halbtagswanderungen) inkl. aller
Transfers.

**«SHOULD CUT HERE»**

***ÄNDERUNG ***
ALT:01.11.16 FRA  X3 2806  13:35   16:50
08.11.16 FNC  FRA  X3 2807  17:40
FNC813   Golden Residence/Wanderk. 9000-105 Funchal
 1  Suite seitl. Meerblick 3Erw.   F
   A982512      HERR KROST, SIMONE 

**«CUT HERE»**


 25.04.17  DRS  FNC  ST 1602  13:25   17:15
 09.05.17  FNC  DRS  ST 1607  00:00
 FNC076  Baia Azul                    9004-530 Funchal
  1  DZ Typ I Meerblick 2Erw.         Halbpension
 am 03.10.16  CRS: MX  - PNR: 15326821
 Fluggeber: alltours / PNR: 15326821
 inkl. Reiseleitung
 und Transfer ab/bis Flughafen
 A025986  HERR Schulze, Steffen           55
 A025986  FRAU Schulze, Kerstin           54

**«CUT HERE»**

***S T O R N O **
 13.11.16 FRA  X3 2806
 20.11.16 FNC  FRA  X3 2807  17:35
FNC096    Pestana Village & Miramar    Funchal
 1  Studio 2Erw.                       H
A976918  FRAU  HEBING, BETTINA        *** STORNO ***              O

<<NOT CUT HERE>> 

A976918  HERR  HEBING, LUDGER         *** STORNO ***              O

  **«CUT HERE»**

I put «NOT CUT HERE» where it splits but shouldn't, «SHOULD CUT HERE» where it should cut, and «CUT HERE» where it cuts correctly.

1 answer

  • doulue1949 2016-09-22 14:13

    You may use

    '~^\h*F\d{6}.*?\R{2}\K~sm'
    

    See the regex demo

    Details:

    • ^ - start of a line
    • \h* - 0+ horizontal whitespaces
    • F\d{6} - F + 6 digits
    • .*? - any 0+ chars up to the first
    • \R{2} - 2 linebreaks
    • \K - match reset operator: discards the text matched so far, so the split happens at this position and nothing is removed.
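
To illustrate the effect of \K described above, here is a small sketch on a made-up string with two reservations (the sample text is an assumption, not taken from the email). Without \K, preg_split treats the matched lines as the delimiter and drops them; with \K, the cut lands in the same place but the text is kept.

```php
<?php
// Made-up sample: two reservations separated by one blank line.
$text = "F111111 a\n\nF222222 b\n";

// Without \K: the matched text "F111111 a\n\n" is the delimiter and is lost.
$withoutK = preg_split('~^\h*F\d{6}.*?\R{2}~ms', $text);

// With \K: the reported match is empty, so the split keeps every character.
$withK = preg_split('~^\h*F\d{6}.*?\R{2}\K~ms', $text);

var_dump($withoutK); // two pieces: "" and "F222222 b\n"
var_dump($withK);    // two pieces: "F111111 a\n\n" and "F222222 b\n"
```

Joining the pieces of the second result with implode('', $withK) gives back the original string, which is a quick way to check that nothing was removed.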

    See PHP demo:

    $re = '~^\h*F\d{6}.*?\R{2}\K~ms'; 
    $str = "A N K U N F T   11.08.15
    *** NEUBUCHUNG ***
     11.08.15  xxx  xxx  X3 2830  14:25   17:50
     18.08.15  xxx  xxx  X3 2831  18:40
     F882129  dsdsaidsaia
     F882129  xxxyxyagydaysd
    sadsdsdsdsadsadadssda
    sadsdsdsdsadsadadssda
    
    A N K U N F T   18.08.15
    *** NEUBUCHUNG ***
     11.08.15  xxx  xxx  X3 2830  14:25   17:50
     18.08.15  xxx  xxx  X3 2831  18:40
     F881554  ZXCXZCXCXZCCXZ
     F881554  xcvcxvcxvcvxc
     F881554  xvcxvcxcvxxvccvxxcv
    
    
    11.08.15  xxx  xxx  X3 2830  14:25   17:50
     18.08.15  xxx  xxx  X3 2831  18:40
     F881605  xczxcdfsfdsdfs
     F881605  zxccxzxzdffdsfds
    
    "; 
    print_r(preg_split($re, $str));
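
The pattern above only covers the F prefix. As a hedged sketch, it could be extended to the other formats from the question (A999999, E999999, 999999) and told not to cut when the line right after the blank lines starts with the same code. The sample string and the variable names below are assumptions made for illustration; this variant was not part of the original answer.

```php
<?php
// ([FAE]?\d{6})(?!\d)  capture the reservation code (optional prefix + 6 digits)
// \R{2,}+              possessive run of line breaks, so it is never shortened
// (?!\h*\1(?!\d))      refuse the cut if the next line starts with the same code
$re = '~^\h*([FAE]?\d{6})(?!\d).*?\R{2,}+\K(?!\h*\1(?!\d))~ms';

// Made-up sample: the first reservation has a blank line between its two guests.
$str = "HEADER 1\n"
     . " A025808  first guest\n"
     . "\n"
     . " A025808  second guest\n"
     . "\n"
     . "HEADER 2\n"
     . "987025  only guest\n"
     . "\n"
     . "HEADER 3\n"
     . " E976918  last guest\n";

$parts = preg_split($re, $str);
print_r($parts); // three pieces, one per HEADER block
```

This only handles the case where the repeated code line follows the blank lines directly. If free text sits between the blank lines and the repeated code (the "Anfrage." case in the question), or if remarks follow the last code line after a blank line, the pattern still cuts in the wrong place and a line-by-line parser is probably the more robust choice.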
    
    This answer was selected by the asker as the best answer.
