I've been trying to extract this data from a file, but at the point where I'm stuck the next token can be one of two things: the start of a whole new record (which begins with a date), or a complement to the route (which does not start with a digit).
I'm having trouble telling whether the next digit starts a new record or belongs to a complement. I also haven't been able to simplify the part of the pattern that follows the EQPT marker, as you can see below.
Examples of strings to match:
291011 311011 1234560 AZU4059 E190/M SBKP1513 N0458 350 DCT BGC DCT TRIVI DCT CNF UW58 SBRF0249 EQPT/WRG PBN/D1O1 EET/SBRE0107 SAGAZ/N0454F370 UW58 GEBIT UW10
271011 UFN 1230060 AZU4062 E190/M SBPA2140 N0460 350 UM540 OSAMU DCT NEGUS UW47 SBKP0120 EQPT/WRG PBN/D1O1 EET/SBBS0106
My regex so far:
preg_match_all('/([0-3][0-9][01][0-9][0-9]{2})\s*(UFN|[0-3][0-9][01][0-9][0-9]{2})\s*([0-7]{7})\s*(AZU[0-9]{4})\s*([A-Z0-9]{4})\/([LMH])\s*([A-Z0-9]{8})\s*(N[0-9]{4})\s*([0-9]{3})\s*([\S\s]{1,40})\s*([A-Z0-9]{8})\s*(EQPT\/WR?G?\s?P?B?N?\/?D?1?O?1?\s?E?E?T?\/?([A-Z0-9]{8})?)\s*/', $result, $match);
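One direction that might work is to stop trying to match everything in a single pass: first split the text wherever a new record begins, using a lookahead for "date, then UFN or a date, then seven day digits", and only then match the fields of each record. The sketch below is a rough illustration, not a tested solution for the full file. It assumes that the two 8-character fields are always four letters followed by a four-digit time, that every record contains an EQPT/ section, and that everything from EQPT/ to the end of the record can be kept as a single field. The names `$recordStart`, `$records`, `$pattern`, and `$rows` are made up for this example, and `$result` here just holds the two sample lines from above.

```php
<?php
// Sample input: the two records from the question, separated by whitespace.
$result = '291011 311011 1234560 AZU4059 E190/M SBKP1513 N0458 350 DCT BGC DCT TRIVI DCT CNF UW58 SBRF0249 EQPT/WRG PBN/D1O1 EET/SBRE0107 SAGAZ/N0454F370 UW58 GEBIT UW10 '
        . '271011 UFN 1230060 AZU4062 E190/M SBPA2140 N0460 350 UM540 OSAMU DCT NEGUS UW47 SBKP0120 EQPT/WRG PBN/D1O1 EET/SBBS0106';

// Assumed signature of a record start: 6-digit date, UFN or a second date, 7 day digits.
$recordStart = '\d{6}\s+(?:UFN|\d{6})\s+[0-7]{7}\s';

// Split on whitespace that is followed by a record start.
// The lookahead consumes nothing, so each record keeps its leading date.
$records = preg_split('/\s+(?=' . $recordStart . ')/', trim($result));

// Per-record pattern. The route (.+?) is lazy and stops at the first
// "4 letters + 4 digits" token that is followed by EQPT/ (an assumption).
$pattern = '/^(\d{6})\s+(UFN|\d{6})\s+([0-7]{7})\s+(AZU\d{4})\s+([A-Z0-9]{4})\/([LMH])\s+'
         . '([A-Z]{4}\d{4})\s+(N\d{4})\s+(\d{3})\s+(.+?)\s+([A-Z]{4}\d{4})\s+(EQPT\/.*)$/s';

$rows = [];
foreach ($records as $record) {
    if (preg_match($pattern, $record, $m)) {
        $rows[] = $m; // $m[10] is the route, $m[12] is everything from EQPT/ onward
    }
}
```

With this split, a trailing complement such as `SAGAZ/N0454F370 UW58 GEBIT UW10` stays attached to its own record, because it is never followed by the full three-token record start. If some records have no EQPT/ section, the last group would need to be made optional.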