I have asked several questions about this and tried many different approaches, but I am still not completely happy with the result. I have a lot of data in the following format:
3*O#AA6160 F7 A7 P7 J7 R7 D7 I7 Y7 LHRMIA 1040 1455 * 744 0E
B7 H0 W0 K0 M0 L0 V0 G0 S0 Q0 N0 O0
The spaces you see on the second row are there by default. Essentially, from that string I am trying to get the following:
$flightNumber = "AA6160";
$from = "LHR";
$to = "MIA";
$other = "1040 1455 * 744 0E";
$seats = array(
"F" => 7,
"A" => 7,
"P" => 7,
"J" => 7,
"R" => 7,
"D" => 7,
"I" => 7,
"Y" => 7,
"B" => 7,
"H" => 0,
"W" => 0,
"K" => 0,
"M" => 0,
"L" => 0,
"V" => 0,
"G" => 0,
"S" => 0,
"Q" => 0,
"N" => 0,
"O" => 0,
);
The rules are as follows.
Each record starts with a digit (3 in the example above). The second row is a continuation of the seats from the first row. If I were to post the full data, the third row would start with 4, which means it is not related to the two rows above.
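The row rule above could be sketched roughly as follows. This is only an illustration: `groupRows` is a hypothetical helper name, and it assumes the rows are already available as an array of strings and that a continuation row never starts with a digit.

```php
<?php
// Hypothetical helper (not part of my existing code): groups raw rows into
// records. A row whose first character is a digit opens a new record; any
// other non-empty row is treated as a continuation of the current record.
function groupRows(array $lines): array
{
    $records = [];
    foreach ($lines as $line) {
        $line = trim($line); // continuation rows are padded with spaces
        if ($line === '') {
            continue;
        }
        if (preg_match('/^\d/', $line) === 1) {
            $records[] = [$line];                    // start a new record
        } elseif ($records !== []) {
            $records[count($records) - 1][] = $line; // attach to the last record
        }
    }
    return $records;
}

$records = groupRows([
    '3*O#AA6160 F7 A7 P7 J7 R7 D7 I7 Y7 LHRMIA 1040 1455 * 744 0E',
    '           B7 H0 W0 K0 M0 L0 V0 G0 S0 Q0 N0 O0',
    '5*S#DL4386 J9 C9 D9 I9 Z9 W9 Y9 B9 LHRMIA 1235 1705 * 744 0E',
    '           M9 S9 H9 Q9 K9 L9 U9 T9 X9 V9',
]);
```

Each entry of `$records` then holds the main row followed by its continuation row, so the two can be parsed together.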
A flight number always starts with a # and is followed by two letters and 1-4 digits. Sometimes there are spaces between the letters and the digits. These are all the forms of flight number I have come across:
#AA6160
#AA 57
#BA 207
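As a rough sketch of the flight number rule on its own, something like the following might work. `extractFlightNumber` is a hypothetical name, and it assumes the two letters are plain uppercase ASCII and that the optional gap consists only of whitespace.

```php
<?php
// Hypothetical helper: finds a flight number such as "#AA6160" or "#AA 57"
// and returns it without the "#" and without the optional inner spaces.
// Returns null when the row contains no flight number in that form.
function extractFlightNumber(string $row): ?string
{
    // "#", two uppercase letters, optional whitespace, then 1-4 digits
    // that must end at a word boundary.
    if (preg_match('/#([A-Z]{2})\s*(\d{1,4})\b/', $row, $m) === 1) {
        return $m[1] . $m[2];
    }
    return null;
}

$examples = [
    extractFlightNumber('3*O#AA6160 F7 A7'),
    extractFlightNumber('#AA 57'),
    extractFlightNumber('#BA 207'),
];
```

Normalising away the spaces keeps keys such as `AA57` consistent no matter how the number was padded in the source row.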
The second row will only contain a continuation of seats, nothing else. This is what I have come up with so far:
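while ($elNum < $elements->length) {
    $flightInfo = $elements->item($elNum)->nodeValue;
    // Only rows that start with a digit open a new flight record.
    if (preg_match('/^\d/', $flightInfo) === 1) {
        // Groups: 1 = row number, 2 = airline code, 3 = flight digits,
        // 4 = first-row seats (currently unused), 5 = origin, 6 = destination, 7 = rest.
        // Note (\d+) rather than (\d)+, which would keep only the last digit of the row number.
        if (preg_match('/(\d+)[^#]*?\#(\p{Lu}{2})\s*(\d{1,4})\b\s*([\w. ]+?)(?=\s+\p{Lu}{6})\s([A-Z]{3})([A-Z]{3})(.+)/', $flightInfo, $matches) === 1) {
            $row = $matches[1];
            $fltcode = $matches[2] . $matches[3];
            $ffrom = $matches[5];
            $fto = $matches[6];
            $other = $matches[7];
            $this->flights[$fltcode] = array(
                "command" => $terminal_command,
                "row" => $row,
                "flightNumber" => $fltcode,
                "from" => $ffrom,
                "to" => $fto,
                "other" => $other
            );
        }
    }
    ++$elNum;
}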
The main thing I am struggling with is the seats. I am not sure how to extract the ones from the first row and combine them with the ones from the second row in the output format shown above.
I am not even sure whether regex is the best option here, or whether I should explode everything on spaces and sort through the pieces.
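To make the "explode on spaces" idea concrete, here is a minimal sketch of how the seats from both rows might be merged. `parseSeats` is a hypothetical name, and the sketch assumes every seat token is one uppercase letter followed by a single digit or a dot. Because I do not know what the dot means, it is kept as the string "." rather than converted to a number.

```php
<?php
// Hypothetical helper: takes any number of seat strings (first-row seats,
// continuation row, ...) and merges them into one class => availability map.
function parseSeats(string ...$chunks): array
{
    $seats = [];
    foreach ($chunks as $chunk) {
        // Split on runs of whitespace so the padding of the second row is ignored.
        $tokens = preg_split('/\s+/', trim($chunk), -1, PREG_SPLIT_NO_EMPTY);
        foreach ($tokens as $token) {
            // Assumed token shape: one class letter plus one digit or ".".
            if (preg_match('/^([A-Z])([0-9.])$/', $token, $m) === 1) {
                $seats[$m[1]] = ($m[2] === '.') ? '.' : (int) $m[2];
            }
        }
    }
    return $seats;
}

$seats = parseSeats(
    'F7 A7 P7 J7 R7 D7 I7 Y7',            // seat part of the first row
    '   B7 H0 W0 K0 M0 L0 V0 G0 S0 Q0 N0 O0' // the continuation row
);
```

The first argument could be the seat portion already captured by the existing pattern (`$matches[4]`), and the second the whole continuation row, so the regex and the splitting approach do not have to exclude each other.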
Any advice is appreciated. Here is some additional data:
5*S#DL4386 J9 C9 D9 I9 Z9 W9 Y9 B9 LHRMIA 1235 1705 * 744 0E
M9 S9 H9 Q9 K9 L9 U9 T9 X9 V9
6 #VS 5 J9 C9 D9 I9 Z9 W9 S9 H9 LHRMIA 1235 1705 744 0E
K9 Y9 B9 R9 L9 U9 M9 E9 Q9 X9 N9 O9
7 #IB4637 F9 A9 J9 C9 D9 R9 I. W9 LHRMIA 1415 1825 * 744 0E
Z. Y9 B9 H9 K. M. L. V. S. N. Q. O.
Thanks