dongying3830 2010-10-24 14:47

php, preg_match, regex, extracting specific text

I have a very large .txt file containing our clients' orders, and I need to move it into a MySQL database. However, I don't know what kind of regex to use, because the fields are not clearly distinguishable from one another.

-----------------------
4046904


KKKKKKKKKKK
Laura Meyer
MassMutual Life Insurance
153 Vadnais Street

Chicopee, MA 01020
US
413-744-5452
lmeyer@massmutual.co...


KKKKKKKKKKK
373074210772222 02/12 6213 NA
-----------------------
4046907


KKKKKKKKKKK
Venkat Talladivedula

6105 West 68th Street

Tulsa, OK 74131
US
9184472611
venkat.talladivedula...


KKKKKKKKKKK
373022121440000 06/11 9344 NA
-----------------------

I tried something, but I couldn't even extract the name. Here is a sample of my unsuccessful attempt:

$htmlContent = file_get_contents("orders.txt");

//print_r($htmlContent);

$pattern = "/KKKKKKKKKKK(.*)\n/s";
preg_match_all($pattern, $htmlContent, $matches);
print_r($matches);
$name = $matches[1][0];
echo $name;


4 answers

  • doutonghang2761 2010-10-24 14:57

    You may want to avoid regexes for something like this. Since the data is clearly organized by line, you could repeatedly read lines with fgets() and parse the data that way.

  • douzi2333 2010-10-24 15:01

    You could read this file with a regex, but it may be quite complicated to create one that captures all the fields.

    I recommend that you read this file line by line, and parse each one, detecting which kind of data it contains.
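
A minimal sketch of that idea, assuming the file looks exactly like the two sample records in the question. The helper names `classifyLine()` and `parseOrders()` are hypothetical, and every pattern is a guess based on the sample only (for example, that a card line is "digits, MM/YY, code, flag"). The sample is built in memory here; with a real file, the same loop body could run on each line returned by fgets().

```php
<?php
// Hypothetical helper: guess what kind of data one line holds.
// Every pattern is an assumption drawn from the two sample records.
function classifyLine(string $line): string
{
    if ($line === '') {
        return 'blank';
    }
    if (preg_match('/^-{5,}$/', $line)) {
        return 'separator';   // row of dashes between orders
    }
    if (preg_match('/^K+$/', $line)) {
        return 'marker';      // the KKKKKKKKKKK delimiter rows
    }
    if (strpos($line, '@') !== false) {
        return 'email';
    }
    if (preg_match('/^\d{13,16} \d{2}\/\d{2} \d+ \S+$/', $line)) {
        return 'card';        // number, expiry, code, trailing flag
    }
    if (preg_match('/^.+, [A-Z]{2} \d{5}$/', $line)) {
        return 'city';        // "City, ST 12345"
    }
    if (preg_match('/^[\d-]{7,}$/', $line)) {
        return 'digits';      // order id or phone; order of appearance decides
    }
    return 'text';            // name, company, street, country
}

// Walk the lines and collect one associative array per order.
function parseOrders(string $text): array
{
    $orders = [];
    $current = [];
    foreach (preg_split('/\R/', $text) as $rawLine) {
        $line = trim($rawLine);
        $kind = classifyLine($line);
        switch ($kind) {
            case 'separator':
                if (isset($current['id'])) {
                    $orders[] = $current;      // close the previous order
                }
                $current = [];
                break;
            case 'digits':
                // The first all-digit line of an order is its id, a later one the phone.
                $current[isset($current['id']) ? 'phone' : 'id'] = $line;
                break;
            case 'email':
            case 'card':
            case 'city':
                $current[$kind] = $line;
                break;
            case 'text':
                if (!isset($current['name'])) {
                    $current['name'] = $line;  // first free-text line is the name
                }
                break;
        }
    }
    if (isset($current['id'])) {
        $orders[] = $current;                  // file without a closing separator
    }
    return $orders;
}

// The two records from the question, joined with "\n".
$sample = implode("\n", [
    '-----------------------', '4046904', '', '', 'KKKKKKKKKKK',
    'Laura Meyer', 'MassMutual Life Insurance', '153 Vadnais Street', '',
    'Chicopee, MA 01020', 'US', '413-744-5452', 'lmeyer@massmutual.co...',
    '', '', 'KKKKKKKKKKK', '373074210772222 02/12 6213 NA',
    '-----------------------', '4046907', '', '', 'KKKKKKKKKKK',
    'Venkat Talladivedula', '', '6105 West 68th Street', '',
    'Tulsa, OK 74131', 'US', '9184472611', 'venkat.talladivedula...',
    '', '', 'KKKKKKKKKKK', '373022121440000 06/11 9344 NA',
    '-----------------------', '',
]);

$orders = parseOrders($sample);
print_r($orders);
```

Company, street, and country are deliberately ignored in this sketch, because free-text lines cannot be told apart by content alone. Those fields would need position-based handling.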

  • dsjz1119 2010-10-24 15:10

    Since you know exactly where your data is (i.e. which line it's on), why not just get it that way?

    For example, something like:

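    $htmlContent = file_get_contents("orders.txt");

    $arrayofclients = explode("-----------------------", $htmlContent);
    $newlinesep = "\n"; //use "\r\n" if the file has Windows line endings
    for ($i = 0; $i < count($arrayofclients); $i++)
    {
        //trim() drops the line break left after each separator, so index 0 is the id
        $block = trim($arrayofclients[$i]);
        if ($block === "") continue; //skip the empty pieces around the first and last separator
        $temp = explode($newlinesep, $block);
        $idnum = $temp[0];
        $name = $temp[4];
        $houseandstreet = $temp[6];
        //etc
    }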
    

    or simply read the file line by line using fgets() - something like:

    $i = 0; $j = 0;
    $file = fopen("orders.txt", "r");
    $clients = [];
    while (($line = fgets($file)) !== false)
    {
        $line = rtrim($line, "\r\n"); //drop the trailing line break
        $i++;
        switch ($i)
        {
        case 2:
            $clients[$j]["idnum"] = $line;
            break;
        case 6:
            $clients[$j]["name"] = $line;
            break;
        //add more cases here for each line up to:
        case 17:
            $j++;
            $i = 0;
            break;
        //the sample shows 17 lines per client, separator row included (adjust if your file differs), so increment $j and reset $i.
        }
    }
    fclose($file);
    

    You could use regexes, but they are a bit awkward for this situation.

    Nico

  • doushan7077 2017-05-03 06:10

    For the record, here is a regex that will capture the names for you. (Granted, speed may well be an issue on a very large file.)

    (?<=K{10}\s{2})\K[^\r\n]++(?!\s{2}-)
    

    Explanation:

    (?<=K{10}\s{2})  #Positive lookbehind for KKKKKKKKKK then 2 return/newline characters
    \K[^\r\n]++      #Possessively match 1 or more non-return/newline characters (greedy, no backtracking)
    (?!\s{2}-)       #Negative lookahead for 2 return/newline characters then a dash
    

    Here is a Regex Demo.

    You will notice that my regex pattern changes slightly between the Regex Demo and my PHP Demo. Slight tweaking depending on environment may be required to match the return / newline characters.
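
One way to reduce that tweaking, sketched here as an assumption rather than as the pattern from the demos, is PCRE's `\R` escape, which matches `\n`, `\r\n`, or `\r`. The helper names `sampleOrders()` and `extractNames()` are hypothetical, and the sample is the first order from the question rebuilt with a configurable line break. Note that this variant anchors on the marker row with `^K+` instead of a lookbehind.

```php
<?php
// Assumed sample: the first order from the question, joined with a
// configurable line break so Unix and Windows endings can both be tried.
function sampleOrders(string $eol): string
{
    return implode($eol, [
        '-----------------------',
        '4046904', '', '',
        'KKKKKKKKKKK',
        'Laura Meyer',
        'MassMutual Life Insurance',
        '153 Vadnais Street',
        '',
        'Chicopee, MA 01020',
        'US',
        '413-744-5452',
        'lmeyer@massmutual.co...',
        '', '',
        'KKKKKKKKKKK',
        '373074210772222 02/12 6213 NA',
        '-----------------------',
        '',
    ]);
}

// Return the line that follows a marker row, unless that line is
// itself followed by a separator row (which would make it the card line).
function extractNames(string $text): array
{
    // ^K+      a marker row at the start of a line (m flag)
    // \R\K     any line break, then restart the reported match here
    // [^\r\n]++ the rest of the line, possessive so it cannot be shortened
    // (?!\R-)  reject lines directly followed by a row of dashes
    $pattern = '/^K+\R\K[^\r\n]++(?!\R-)/m';
    preg_match_all($pattern, $text, $matches);
    return $matches[0];
}

foreach (["\n", "\r\n"] as $eol) {
    var_export(extractNames(sampleOrders($eol)));
    echo "\n";
}
```

The possessive `++` matters here: with a plain `+`, the engine could give back the last character of the card line and slip past the negative lookahead.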

    Here is the php implementation (Demo):

    if(preg_match_all("/(?<=K{10}\s{2})\K[^\r\n]++(?!\s{2}-)/",$htmlContent,$matches)){
        var_export($matches[0]);   
    }else{
        echo "no matches";
    }
    

    By using \K in my pattern I avoid actually having to capture with parentheses. This cuts down array size by 50% and is a useful trick for many projects. The \K basically says "start the fullstring match from this point", so the matches go in the first subarray (fullstrings, key=0) of $matches instead of generating a fullstring match in 0 and the capture in 1.
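
To make the difference concrete, here is a small comparison on a deliberately simplified input (a three-letter `KKK` marker and `\n` line breaks, both assumptions for brevity). The variable names `$withGroup` and `$withK` are illustrative only.

```php
<?php
// Simplified stand-in for the order file: marker row, then a name.
$text = "KKK\nLaura Meyer\nKKK\nVenkat Talladivedula\n";

// Capture group: $withGroup[0] holds the full matches (marker included),
// $withGroup[1] holds the captured names.
preg_match_all('/^KKK\n([^\n]+)/m', $text, $withGroup);

// \K: everything matched before it is dropped from the reported match,
// so the names land directly in $withK[0] and there is no second subarray.
preg_match_all('/^KKK\n\K[^\n]+/m', $text, $withK);

echo count($withGroup), " subarrays with a capture group\n";
echo count($withK), " subarray with \\K\n";
var_export($withK[0]);
```

Both patterns find the same names; only the shape of the result array differs.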

    Output:

    array (
      0 => 'Laura Meyer',
      1 => 'Venkat Talladivedula',
    )
    