douzhong3887 2013-04-12 05:33

An efficient way to parse an array in PHP?

Background

I have an array which I create by splitting a string based on every occurrence of 0d0a using preg_split('/(?<=0d0a)(?!$)/').

For example:

$string = "78781110d0a78782220d0a";

will be split into:

Array ( [0] => 78781110d0a [1] => 78782220d0a )  

A valid array element has to start with 7878 and end with 0d0a.
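
As a rough sketch of that rule, a small predicate could look like the following. The function name isValidPacket is hypothetical and not part of the code in this question, and it assumes the data is a lowercase hex string.

```php
<?php
// Hypothetical helper: true if $packet starts with 7878 and ends with 0d0a.
function isValidPacket(string $packet): bool
{
    return substr($packet, 0, 4) === '7878'
        && substr($packet, -4) === '0d0a';
}

var_dump(isValidPacket('78781110d0a')); // bool(true)
var_dump(isValidPacket('2220d0a'));     // bool(false)
```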

The Problem

Sometimes, however, the string contains an additional 0d0a, which produces an extra, invalid array element, i.e. one that doesn't begin with 7878.

Take this string for example:

$string = "78781110d0a2220d0a78783330d0a";

This is split into:

Array ( [0] => 78781110d0a [1] => 2220d0a [2] => 78783330d0a )

But it should actually be:

Array ( [0] => 78781110d0a2220d0a [1] => 78783330d0a)

My Solution

I've written the following (messy) code to get around this:

    $data = array('78781110d0a', '2220d0a', '78783330d0a');
    $i = 0; // index into $data
    $j = 0; // number of elements removed from $dataFixed so far
    $dataFixed = $data;

    foreach ($data as $packet) {
        // A packet that doesn't start with 7878 is a fragment that needs fixing.
        // The first packet is skipped: it has no previous packet to merge into.
        if (substr($packet, 0, 4) != "7878" && $i != 0) {
            $j++;

            // Append the fragment to the previous packet only if it ends with 0d0a;
            // otherwise it is 'mostly' not valid, so it is simply discarded.
            if (substr(strtolower($packet), -4) == "0d0a") {
                $dataFixed[$i - $j] = $dataFixed[$i - $j] . $packet;
            }
            unset($dataFixed[$i - $j + 1]);
            $dataFixed = array_values($dataFixed);
        }
        $i++;
    }

Description

I first copy the array to another array, $dataFixed. In a foreach loop over the $data array, I check whether each element starts with 7878. If it doesn't, I append it to the previous element in $dataFixed (provided it ends with 0d0a). I then unset the current element in $dataFixed and reindex the array with array_values.
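
The same idea can be sketched in a single pass that builds a new array instead of editing a copy in place. This is only an illustrative sketch: the function name mergeFragments is hypothetical, and it assumes the same rules as the loop above (a fragment that ends with 0d0a is appended to the previous element, any other fragment is dropped, and a leading fragment is kept as-is).

```php
<?php
// Hypothetical helper: merge fragments that don't start with 7878
// into the previous element.
function mergeFragments(array $data): array
{
    $fixed = array();
    foreach ($data as $packet) {
        if (substr($packet, 0, 4) === '7878' || count($fixed) === 0) {
            // A proper packet start (or the very first element): new element.
            $fixed[] = $packet;
        } elseif (substr(strtolower($packet), -4) === '0d0a') {
            // A fragment ending in 0d0a belongs to the previous element.
            $fixed[count($fixed) - 1] .= $packet;
        }
        // Any other fragment is discarded.
    }
    return $fixed;
}

print_r(mergeFragments(array('78781110d0a', '2220d0a', '78783330d0a')));
// elements: 78781110d0a2220d0a, 78783330d0a
```

Building the result from scratch avoids the index bookkeeping ($i, $j) and the repeated array_values calls.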

But I'm not very confident about this solution. Is there a better, more efficient way?

UPDATE

What if the input string doesn't end in 0d0a like it's supposed to? The trailing characters will stick to the previous array element.

For example, in the string 78781110d0a2220d0a78783330d0a0000, the trailing 0000 should be split off as a separate array element.


  • dousui6488 2013-04-12 05:39

    Add a positive lookahead, (?=7878), to form:

    preg_split('/(?<=0d0a)(?=7878)/',$string)
    

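    Note: I removed (?!$) because I wasn't sure what it was for, based on your example data. With (?=7878) in place it is redundant anyway: a position that is followed by 7878 can never be the end of the string.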

    For example, this code:

    $string = "78781110d0a2220d0a78783330d0a";
    $array  = preg_split('/(?<=0d0a)(?=7878)/',$string);
    print_r($array);
    

    Results in:

    Array ( [0] => 78781110d0a2220d0a [1] => 78783330d0a )
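
    To see what the extra lookahead changes, the two patterns can be run side by side on the same input. A minimal sketch; the variable names $before and $after are only illustrative.

    ```php
    <?php
    $string = "78781110d0a2220d0a78783330d0a";

    // Original pattern: split after every 0d0a, except at the end of the string.
    $before = preg_split('/(?<=0d0a)(?!$)/', $string);

    // With the lookahead: split after 0d0a only when 7878 follows.
    $after = preg_split('/(?<=0d0a)(?=7878)/', $string);

    print_r($before); // elements: 78781110d0a, 2220d0a, 78783330d0a
    print_r($after);  // elements: 78781110d0a2220d0a, 78783330d0a
    ```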

    UPDATE:

    Based on your revised question of having possible random characters at the end of the input string, you can add three lines to make a complete program of:

    $string = "78781110d0a2220d0a787830d0a330d0a0000";
    $array  = preg_split('/(?<=0d0a)(?=7878)/',$string);
    // -1 means "no limit"; passing null for the limit is deprecated as of PHP 8.1
    $temp = preg_split('/(7878.*0d0a)/',$array[count($array)-1],-1,PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
    $array[count($array)-1] = $temp[0];
    if(count($temp)>1) { $array[] = $temp[1]; }
    print_r($array);
    

    We basically do the initial splitting, then split the last element of the resulting array by the expected data format, keeping the delimiter using PREG_SPLIT_DELIM_CAPTURE. The PREG_SPLIT_NO_EMPTY ensures we won't get an empty array element if the input string doesn't end in random characters.
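
    To make the effect of the two flags concrete, here is a small sketch of just the second split, run on a last element with and without trailing characters. The helper name splitTail is hypothetical; it only wraps the preg_split call and passes -1 for the limit, which means "no limit".

    ```php
    <?php
    // Hypothetical wrapper around the second split described above.
    function splitTail(string $last): array
    {
        return preg_split(
            '/(7878.*0d0a)/',
            $last,
            -1,
            PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE
        );
    }

    print_r(splitTail('787830d0a330d0a0000')); // elements: 787830d0a330d0a, 0000
    print_r(splitTail('787830d0a330d0a'));     // one element: 787830d0a330d0a
    ```

    Without PREG_SPLIT_DELIM_CAPTURE the matched packet itself would be dropped, since it acts as the delimiter; without PREG_SPLIT_NO_EMPTY the empty strings around it would show up as extra elements.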

    UPDATE 2:

    Based on your comment below where it seems you're implying there might be random characters between any of the desired matches, and you want these random characters preserved, you could do this:

    $string = "0078781110d0a2220d0a2220d0a0000787830d0a330d0a000078781110d0a2220d0a0000787830d0a330d0a0000";
    // -1 means "no limit"; passing null for the limit is deprecated as of PHP 8.1
    $split1 = preg_split('/(7878.*?0d0a)/',$string,-1,PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
    $result = array();
    foreach($split1 as $e){
      $split2 = preg_split('/(.*0d0a)/',$e,-1,PREG_SPLIT_NO_EMPTY|PREG_SPLIT_DELIM_CAPTURE);
      foreach($split2 as $el){
        // test if $el doesn't start with 7878 and ends with 0d0a
        // (and that there is a previous element to append it to)
        if($result && strpos($el,'7878') !== 0 && substr($el,-4) == '0d0a'){
        //if($result && preg_match('/^(?!7878).*0d0a$/',$el) === 1){
          $result[ count($result)-1 ] .= $el;
        } else {
          $result[] = $el;
        }
      }
    }
    print_r($result);
    

    The strategy here differs from the one above. First we split the input string using a delimiter that matches your desired data, with the non-greedy regex .*? so that each match stops at the first "0d0a". At this point some of the strings contain the tail of a desired value followed by garbage, so we split each one again at the last occurrence of "0d0a" with the greedy regex .*0d0a. We then append any resulting value that doesn't start with "7878" but ends with "0d0a" to the previous value, which rejoins the two halves of a value that was split because it contained an extra "0d0a".
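
    The two stages are easier to follow on a shorter input. The sketch below only prints the intermediate results; the input string is a made-up example in the same format.

    ```php
    <?php
    $string = "0078781110d0a2220d0a000078783330d0a";
    $flags  = PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE;

    // Stage 1: split on the shortest 7878...0d0a runs, keeping them as elements.
    $split1 = preg_split('/(7878.*?0d0a)/', $string, -1, $flags);
    print_r($split1);
    // elements: 00, 78781110d0a, 2220d0a0000, 78783330d0a

    // Stage 2: split one leftover piece at its last 0d0a.
    $split2 = preg_split('/(.*0d0a)/', $split1[2], -1, $flags);
    print_r($split2);
    // elements: 2220d0a, 0000
    ```

    Here 2220d0a would then be appended to 78781110d0a, while 00 and 0000 stay as separate elements.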

    I provided two methods for the innermost if statement, one using regular expressions. The regex one is marginally slower in my testing, so I've left that one commented out.
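
    As a quick sanity check that the two conditions agree, they can be compared on a few sample values. This is a sketch with made-up inputs, not a benchmark; the variable names are illustrative.

    ```php
    <?php
    $samples    = array('2220d0a', '78781110d0a', '0000', '330d0a');
    $mismatches = array();

    foreach ($samples as $el) {
        $viaString = strpos($el, '7878') !== 0 && substr($el, -4) == '0d0a';
        $viaRegex  = preg_match('/^(?!7878).*0d0a$/', $el) === 1;
        if ($viaString !== $viaRegex) {
            $mismatches[] = $el;
        }
        // Both columns should show the same value for every sample.
        printf("%-12s %s %s\n", $el, var_export($viaString, true), var_export($viaRegex, true));
    }

    echo count($mismatches) === 0 ? "conditions agree\n" : "conditions differ\n";
    ```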

    I might still not have your full requirements, so you'll have to let me know whether it works, and perhaps provide your full dataset.
