douyi6290
2014-02-14 13:19

Regular expression for the content between <td> and </td>


I need a regular expression that finds the content between <td> and </td> tags, for use in PHP. I have tried...

preg_split("<td>([^\"]*)</td>", $table[0]);

But that gives me the PHP error...

Warning: preg_split(): Unknown modifier '(' in C:\xampp\htdocs\.....

Can anyone tell me what I am doing wrong?


4 answers

  • duanpie4763 7 years ago

    Try this:

    preg_match("/<td>([^\"]*)<\/td>/", $table[0], $matches);
    

    But, as a general rule, please, do not try to parse HTML with regexes... :-)
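As one alternative to regexes, here is a minimal sketch that reads the cells with PHP's DOM extension. The sample markup in $html is hypothetical and stands in for $table[0] from the question; the approach assumes the dom extension is enabled.

```php
<?php
// Hypothetical sample input; in the question this would be $table[0].
$html = '<table><tr><td>first</td><td class="x">second <b>bold</b></td></tr></table>';

$doc = new DOMDocument();
// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$cells = [];
foreach ($doc->getElementsByTagName('td') as $td) {
    // textContent is the cell's text with any nested tags stripped.
    $cells[] = $td->textContent;
}

print_r($cells);
```

Unlike the <td> pattern above, this also picks up cells whose opening tag carries attributes, such as <td class="x">.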

  • douan1893 7 years ago

    First of all, you forgot to wrap the regex in delimiters. Also, when the pattern is used with preg_split, it should match only the opening <td> tag, not the closing one.

    Try the following code. Assuming $table[0] contains the HTML between the <table> and </table> tags, it extracts the content (including any nested HTML) of each table cell:

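    $a_result = array_map(
        // Drop the closing </td> and anything after it (e.g. </tr><tr>).
        function($v) { return preg_replace('/<\/td\s*>.*/is', '', $v); },
        // Split on opening <td ...> tags; skip the chunk before the first cell.
        array_slice(preg_split('/<td[^>]*>/i', $table[0]), 1)
    );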
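On the missing delimiters: PHP accepts bracket-style delimiter pairs, so in the question's pattern the leading < and the first > are taken as the delimiters. That leaves td as the whole pattern, and the ( that follows is read as a modifier, hence "Unknown modifier '('". A small sketch of the difference, with a hypothetical one-cell input in $html:

```php
<?php
// Hypothetical sample input.
$html = '<td>a</td>';

// No explicit delimiters: "<" and ">" are taken as a bracket-style pair,
// so the pattern is just "td" and "(" is parsed as a modifier.
// The @ only hides the warning; the call still fails.
$bad = @preg_match("<td>([^\"]*)</td>", $html, $m1);

// Same pattern wrapped in "/" delimiters, with the inner "/" escaped.
$good = preg_match('/<td>([^"]*)<\/td>/', $html, $m2);

var_dump($bad);    // bool(false)
var_dump($good);   // int(1)
echo $m2[1], "\n"; // a
```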
    
  • douzhuang2016 7 years ago

    Keep in mind that you need to do some extra work to make sure that the * between <td> and </td> in your regular expression doesn't slurp up several <td>some text</td> cells at once. That's because * is greedy: it matches as much as it can.

    To turn off the greediness of *, put a ? after it. This tells it to match only up to the first point where whatever follows the * can match. So, the regular expression you're looking for is something like:

    /<td>(.*?)<\/td>/
    

    Remember, since the regular expression starts and ends with a /, you have to be careful about any / that is inside your regular expression - they have to be escaped. Hence, the \/.

    From your regular expression, it looks like you're also trying to exclude any " character that might be between a <td> and </td> - is that correct? If that were the case, you would change the regular expression to use the following:

    /<td>([^\"]*?)<\/td>/
    

    But, assuming you don't want to exclude the " character in your matches, your PHP code could look like this, using preg_match_all instead of preg_match.

    preg_match_all("/<td>(.*?)<\/td>/", $str, $matches);
    print_r($matches);
    

    What you're looking for is in $matches[1].
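To make the greedy versus non-greedy difference concrete, here is a short sketch that runs both patterns over the same string. The sample row in $str is hypothetical.

```php
<?php
// Hypothetical sample row with two cells on one line.
$str = '<tr><td>one</td><td>two</td></tr>';

// Greedy: .* runs on to the last </td>, so both cells end up in one match.
preg_match_all('/<td>(.*)<\/td>/', $str, $greedy);

// Non-greedy: .*? stops at the first </td>, giving one match per cell.
preg_match_all('/<td>(.*?)<\/td>/', $str, $lazy);

print_r($greedy[1]); // one element: one</td><td>two
print_r($lazy[1]);   // two elements: one, two
```

Note that . does not match a newline by default, so a cell whose content spans several lines would need the s modifier, as in /<td>(.*?)<\/td>/s.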

  • doy57007 7 years ago

    Use preg_match instead of preg_split:

    preg_match("|<td>([^<]*)</td>|", $table[0], $m);
    print_r($m);
    