2013-02-04 09:46


I'm trying to match the following:

  • This:

    HIGH SCHOOL WRESTLING NOTEBOOK: A surge at Delaware Valley, team rankings shakeup and more.
  • With this:

      <div class="sum">
        <div class="photo_gutter">
          <div class="photo">
            <a href="http://media.lehighvalleylive.com/brad-wilson/photo/jaryd-flank-b30e919c41bc86b2.jpg">
              <img src="http://media.lehighvalleylive.com/brad-wilson/photo/jaryd-flank-b30e919c41bc86b2.jpg" alt="" title="" width="200" border="0"/>
            </a>
          </div>
        </div>
      </div>
      HIGH SCHOOL WRESTLING NOTEBOOK: A surge at Delaware Valley, team rankings shakeup and more.

What I have so far is /<.*>\s/i, but I need the opposite of that. Can someone help me?



2 answers

  • doulei6330 2013-02-04 09:51

    Do not use regex to parse HTML; use PHP's DOMDocument instead.
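
    A minimal sketch of that suggestion, assuming the markup from the question with its closing tags in place and a placeholder image URL (example.com is not from the original post). It collects every non-empty text node with DOMXPath, which is one way among several to get at the text:

```php
<?php
// Sample input modeled on the question; the image URL is a placeholder.
$html = '<div class="sum"><div class="photo_gutter"><div class="photo">'
      . '<a href="http://example.com/photo.jpg">'
      . '<img src="http://example.com/photo.jpg" alt="" title="" width="200" border="0"/>'
      . '</a></div></div></div>' . "\n"
      . 'HIGH SCHOOL WRESTLING NOTEBOOK: A surge at Delaware Valley, team rankings shakeup and more.';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Gather the trimmed content of every non-empty text node.
$xpath = new DOMXPath($doc);
$texts = [];
foreach ($xpath->query('//text()') as $node) {
    $text = trim($node->nodeValue);
    if ($text !== '') {
        $texts[] = $text;
    }
}

echo implode("\n", $texts), "\n";
```

    Because the parser understands tag boundaries, attribute values and line breaks in the markup do not affect the result.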

  • doudiejian5827 2013-02-04 12:09

    It is not recommended to use regex to parse HTML, but since this is a simple task (and probably an exercise in learning regex):

    You have this: /<.*>\s/i

    1- The i modifier does nothing here, since the pattern contains no characters that could be case-sensitive. E.g. /apple/i makes sense because you also want it to match Apple; /\w+/i does nothing, since \w already includes both lowercase and uppercase characters.

    2- If you are parsing HTML, it's better not to assume or require any \s unless you are inside a tag; whitespace between tags and text may or may not be there.

    3- If you want to capture part of the match into a variable, you have to wrap that part in ( and ). E.g. matching /(\w+) Apple/ against Red Apple gives you Red in $1, or in the array of matches filled by the preg_match() function.
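
    Points 1 and 3 can be checked with a short sketch. The variable names are illustrative only; the patterns and the Red Apple input are the ones used in the points above:

```php
<?php
// 1) The i modifier only matters when the pattern contains letters.
$caseSensitive   = preg_match('/apple/', 'Red Apple');   // 0: "apple" does not match "Apple"
$caseInsensitive = preg_match('/apple/i', 'Red Apple');  // 1: case is ignored

// \w already covers both cases, so adding i changes nothing.
$plain = preg_match('/\w+/', 'Apple', $m1);
$withI = preg_match('/\w+/i', 'Apple', $m2);

// 3) Parentheses capture part of the match.
preg_match('/(\w+) Apple/', 'Red Apple', $matches);

// $matches[0] holds the whole match, $matches[1] the first group.
echo $matches[1], "\n";
```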

    Now, here is how I would do this:

    First of all, I would remove any \r or \n from the input string. Regex works much better with only one line of text. You can do this with str_replace().
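
    A minimal sketch of that flattening step, assuming the characters to remove are carriage returns and line feeds. The input string and the variable names are hypothetical:

```php
<?php
// Hypothetical multi-line input with Windows-style line endings.
$input = "<div>\r\n  <a href=\"#\">link</a>\r\n</div>\r\nSome text";

// str_replace() accepts an array of search strings; each one is replaced by ''.
$oneLine = str_replace(["\r", "\n"], '', $input);

echo $oneLine, "\n";
```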

    If you want to get anything that is not inside <>:
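
    The pattern for this case is not preserved in the post. One possible sketch, which is an assumption rather than the answerer's exact regex, deletes every <...> run and keeps what is left. The input is a shortened, single-line version of the question's markup with placeholder attribute values:

```php
<?php
// Shortened single-line input; href and src values are placeholders.
$oneLine = '<div class="sum"><div class="photo"><a href="#"><img src="#" alt=""/></a></div></div> '
         . 'HIGH SCHOOL WRESTLING NOTEBOOK: A surge at Delaware Valley, team rankings shakeup and more.';

// <[^>]*> matches one whole tag: "<", then anything except ">", then ">".
$text = trim(preg_replace('/<[^>]*>/', '', $oneLine));

echo $text, "\n";
```

    This breaks if an attribute value contains a literal > character, which is one reason a real parser is the safer choice.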


    If you want to get the text inside of a certain tag, for example <div>this one</div>:
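
    The pattern for this case is also missing from the post. Going by the remark about ? and .* in this answer, it was presumably close to the sketch below; the exact regex and the two-div input are assumptions. The greedy variant is included only for comparison:

```php
<?php
// Hypothetical input with two sibling divs.
$html = '<div>this one</div><div>that one</div>';

// Non-greedy: .*? stops at the first closing tag it can reach.
preg_match('/<div>(.*?)<\/div>/', $html, $lazy);

// Greedy: .* runs on to the last closing tag in the string.
preg_match('/<div>(.*)<\/div>/', $html, $greedy);

echo $lazy[1], "\n";
echo $greedy[1], "\n";
```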


    The ? character makes the .* match non-greedy, so it takes the fewest characters that still satisfy the pattern.

    Hope it helped.

