doushang7209 2014-01-02 19:37

Regex to extract the first 3 matching instances in PHP [closed]

I have a full HTML file in a PHP variable and would like to extract the first 3 values in the HTML that are formatted as q?s=XXX, q?s=XX or q?s=XXXX (where the X's stand for a stock symbol).

$html variable contains:

<a name='mkt-movers' class='anchor'><\/a><h2 class='Fz-l Fw-200 Mend-4 D-i'>Market Movers<\/h2><\/div><div class=\"bd\">\t<div class=\"dropdown rapid-nf Fw-200 Bdrs\">
            <form class=\"SelectBox SelectBoxNoBorder\">
                <div class=\"SelectBox-Pick\">
                    <span class=\"SelectBox-Text\">U.S. Composite<\/span>
\t\t    <i class='Icon'>&#xe002;<\/i>
                <\/div>

                <select data-plugin=\"selectbox\"  class='Start-0' name='selectBox' >
\t\t    <option value=\"0\" selected=\"selected\" class=\"Selected\">U.S. Composite<\/option><option value=\"1\" >Nasdaq<\/option><option value=\"2\" >NYSE Market<\/option><option value=\"3\" >NYSE<\/option>
                <\/select>
                <noscript>
                    <Btn type=\"submit\" class=\"Hidden\">Select<\/Btn>
                <\/noscript>
            <\/form>
\t<\/div><div class=\"content\"><div class=\"mod-85ac7b2b-640f-323f-a1c1-00b2f4865d18 mod active\"><div id=\"mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18\" class=\"yom-mod yom-app yom-data yfi-table wp yfi-mmovers fin-glass-disabled\">
\t<a name=\"mkt-movers\" class=\"anchor\"><\/a>
    <div class=\"hd\">
        <h2 class=\"Fw-200 Fz-l M-0\"><\/h2>
    <\/div>
    <div class=\"bd yom-tabview\">
            <ul role=\"tablist\" data-plugin='tabpanel' class='FinTabs Mb-10'>
                <li class=\"Grid-U Mend-8 FinTab-Item Selected rmp-0\" role=\"presentation\">
                    <a href=\"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab1\"  role = \"tab\"  class = \"FinTab-Label no-pjax\"  data-tabpanel-target = \"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab1\" >Most Actives<\/a>
                <\/li>
                <li class=\"Grid-U Mend-8 FinTab-Item rmp-0\" role=\"presentation\">
                    <a href=\"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab2\"  role = \"tab\"  class = \"FinTab-Label no-pjax\"  data-tabpanel-target = \"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab2\" >% Gainers<\/a>
                <\/li>
                <li class=\"Grid-U Mend-8 FinTab-Item rmp-0\" role=\"presentation\">
                    <a href=\"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab3\"  role = \"tab\"  class = \"FinTab-Label no-pjax\"  data-tabpanel-target = \"#mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab3\" >% Losers<\/a>
                <\/li>
            <\/ul>
\t<div class=\"yfi-panelcontainer yui3-tabview-panel\">
            <div role=\"tabpanel\" id=\"mod_85ac7b2b_640f_323f_a1c1_00b2f4865d18-tab1\" class=\" Selected\" data-start=\"0\" data-count=\"10\" data-content=\"mostactive\" >
        \t<div class=\"original\">
                
        <table summary=\"1\" class=\"yom-data col-8 phatable\" >
          <caption><\/caption>
          <colgroup><col><col><col><col><col><col><col><col><\/colgroup>
          <thead>
            <tr>
                <th id=\"table-31-0-0\" class=\"symbol  txt-color\" scope=\"col\"><span>Symbol<\/span><\/th>
                <th id=\"table-31-0-1\" class=\"name  txt-color\" scope=\"col\"><span>Company Name<\/span><\/th>
          

I want to extract the first 3 stock symbols in the large HTML string above, i.e. output = "BAC", "GE", "MSFT".

Note: stock symbols can be 1, 2, 3 or 4 characters long.

Any ideas on how to do this would be appreciated. Thanks!

2 answers

  • doushou8730 2014-01-02 20:21

    This should work, try:

    if (preg_match_all('~(?<=q\?s=)[-A-Z.]{1,5}~', $html, $out))
    {
        // The matches for the whole pattern are in $out[0]
        echo "<pre>"; print_r($out[0]); echo "</pre>";

        // If you need only the first 3:
        #$out[0] = array_slice($out[0], 0, 3);
        #echo "<pre>"; print_r($out[0]); echo "</pre>";

        // If you need them unique:
        $out[0] = array_unique($out[0]);
        echo "<pre>"; print_r($out[0]); echo "</pre>";

    } else {
        echo "FAIL";
    }
    

    I changed the pattern a bit, to ~(?<=q\?s=)[-A-Z.]{1,5}~, so that it also matches stock symbols like the ones in this list.

    • It looks behind for q?s=
    • If found, it matches 1 to 5 of these characters: A-Z, ".", "-"
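
    To see how the lookbehind and the character class combine with the "first 3" and "unique" steps, here is a minimal sketch. The helper name firstSymbols and the sample string are made up for illustration and are not part of the original answer; the real page markup may differ. It deduplicates before slicing, on the assumption that the same symbol can be linked more than once on the page.

```php
<?php
// Hypothetical helper (not from the original answer): returns the first
// $limit distinct symbols that follow "q?s=" in $html.
function firstSymbols(string $html, int $limit = 3): array
{
    // (?<=q\?s=)   lookbehind: the match must be preceded by the literal "q?s="
    // [-A-Z.]{1,5} then 1 to 5 characters from A-Z, "." and "-"
    if (!preg_match_all('~(?<=q\?s=)[-A-Z.]{1,5}~', $html, $matches)) {
        return []; // no match (or a regex error)
    }

    // Deduplicate first, reindex from 0, then keep the first $limit entries.
    return array_slice(array_values(array_unique($matches[0])), 0, $limit);
}

// Made-up sample input for illustration only.
$sample = '<a href="/q?s=BAC">BAC</a> <a href="/q?s=BAC">Bank of America</a> '
        . '<a href="/q?s=GE">GE</a> <a href="/q?s=MSFT">MSFT</a> <a href="/q?s=F">F</a>';

$symbols = firstSymbols($sample);
print_r($symbols); // BAC, GE, MSFT
```

    Slicing before deduplicating would instead give only BAC and GE for this sample, because the first three raw matches are BAC, BAC, GE. Note also that a symbol longer than 5 characters would be cut off at 5 by the {1,5} quantifier.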
    This answer was selected as the best answer by the asker.