douruocai4111 2010-11-13 12:53

How to extract values from an HTML page stored as a string using the curl functions

I am using PHP / cURL to fetch an HTML page into a string. I then need to extract the data below and plot a graph from it.

The data I want looks like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta name="generator" content=
  "HTML Tidy for Linux (vers 25 March 2009), see www.w3.org" />

  <title></title>
</head>

<body>
  <table>
    <tbody>
      <tr>
        <td>
          <h3>Income</h3>
        </td>
      </tr>

      <tr>
        <td>Operating income</td>

        <td class="numericalColumn">22,922.00</td>

        <td class="numericalColumn">21,507.30</td>

        <td class="numericalColumn">17,492.60</td>

        <td class="numericalColumn">13,683.90</td>

        <td class="numericalColumn">10,227.12</td>
      </tr>

      <tr>
        <td>
          <h3>Expenses</h3>
        </td>
      </tr>

      <tr>
        <td>Material consumed</td>

        <td class="numericalColumn">4,029.40</td>

        <td class="numericalColumn">3,442.60</td>

        <td class="numericalColumn">2,952.30</td>

        <td class="numericalColumn">1,889.00</td>

        <td class="numericalColumn">1,367.67</td>
      </tr>

      <tr>
        <td>Manufacturing expenses&nbsp;</td>

        <td class="numericalColumn">2,213.20</td>

        <td class="numericalColumn">1,841.80</td>

        <td class="numericalColumn">299.80</td>

        <td class="numericalColumn">120.50</td>

        <td class="numericalColumn">1,020.70</td>
      </tr>

      <tr>
        <td>Personnel expenses</td>

        <td class="numericalColumn">9,062.80</td>

        <td class="numericalColumn">9,249.80</td>

        <td class="numericalColumn">7,409.10</td>

        <td class="numericalColumn">5,768.20</td>

        <td class="numericalColumn">4,279.03</td>
      </tr>

      <tr>
        <td>Selling expenses</td>

        <td class="numericalColumn">378.10</td>

        <td class="numericalColumn">308.40</td>

        <td class="numericalColumn">532.10</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">171.05</td>
      </tr>

      <tr>
        <td>Adminstrative expenses</td>

        <td class="numericalColumn">1,737.00</td>

        <td class="numericalColumn">1,906.00</td>

        <td class="numericalColumn">2,583.70</td>

        <td class="numericalColumn">2,651.70</td>

        <td class="numericalColumn">904.78</td>
      </tr>

      <tr>
        <td>Expenses capitalised</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">-</td>
      </tr>

      <tr>
        <td>Cost of sales</td>

        <td class="numericalColumn">17,420.50</td>

        <td class="numericalColumn">16,748.60</td>

        <td class="numericalColumn">13,777.00</td>

        <td class="numericalColumn">10,429.40</td>

        <td class="numericalColumn">7,743.22</td>
      </tr>

      <tr>
        <td>Operating profit</td>

        <td class="numericalColumn">5,501.50</td>

        <td class="numericalColumn">4,758.70</td>

        <td class="numericalColumn">3,715.60</td>

        <td class="numericalColumn">3,254.50</td>

        <td class="numericalColumn">2,483.90</td>
      </tr>

      <tr>
        <td>Other recurring income</td>

        <td class="numericalColumn">434.20</td>

        <td class="numericalColumn">468.20</td>

        <td class="numericalColumn">326.90</td>

        <td class="numericalColumn">288.70</td>

        <td class="numericalColumn">113.59</td>
      </tr>

      <tr>
        <td>Adjusted PBDIT</td>

        <td class="numericalColumn">5,935.70</td>

        <td class="numericalColumn">5,226.90</td>

        <td class="numericalColumn">4,042.50</td>

        <td class="numericalColumn">3,543.20</td>

        <td class="numericalColumn">2,597.49</td>
      </tr>

      <tr>
        <td>Financial expenses</td>

        <td class="numericalColumn">108.40</td>

        <td class="numericalColumn">196.80</td>

        <td class="numericalColumn">116.80</td>

        <td class="numericalColumn">7.20</td>

        <td class="numericalColumn">3.13</td>
      </tr>

      <tr>
        <td>Depreciation&nbsp;</td>

        <td class="numericalColumn">579.60</td>

        <td class="numericalColumn">533.60</td>

        <td class="numericalColumn">456.00</td>

        <td class="numericalColumn">359.80</td>

        <td class="numericalColumn">292.26</td>
      </tr>

      <tr>
        <td>Other write offs</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">-</td>
      </tr>

      <tr>
        <td>Adjusted PBT</td>

        <td class="numericalColumn">5,247.70</td>

        <td class="numericalColumn">4,496.50</td>

        <td class="numericalColumn">3,469.70</td>

        <td class="numericalColumn">3,176.20</td>

        <td class="numericalColumn">2,302.10</td>
      </tr>

      <tr>
        <td>Tax charges&nbsp;</td>

        <td class="numericalColumn">790.80</td>

        <td class="numericalColumn">574.10</td>

        <td class="numericalColumn">406.40</td>

        <td class="numericalColumn">334.10</td>

        <td class="numericalColumn">286.10</td>
      </tr>

      <tr>
        <td>Adjusted PAT</td>

        <td class="numericalColumn">4,456.90</td>

        <td class="numericalColumn">3,922.40</td>

        <td class="numericalColumn">3,063.30</td>

        <td class="numericalColumn">2,842.10</td>

        <td class="numericalColumn">2,016.00</td>
      </tr>

      <tr>
        <td>Non recurring items</td>

        <td class="numericalColumn">441.10</td>

        <td class="numericalColumn">-948.60</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">38.33</td>
      </tr>

      <tr>
        <td>Other non cash adjustments</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">-33.85</td>
      </tr>

      <tr>
        <td>Reported net profit</td>

        <td class="numericalColumn">4,898.00</td>

        <td class="numericalColumn">2,973.80</td>

        <td class="numericalColumn">3,063.30</td>

        <td class="numericalColumn">2,842.10</td>

        <td class="numericalColumn">2,020.48</td>
      </tr>

      <tr>
        <td>Earnigs before appropriation</td>

        <td class="numericalColumn">4,898.00</td>

        <td class="numericalColumn">2,973.80</td>

        <td class="numericalColumn">3,063.30</td>

        <td class="numericalColumn">2,842.10</td>

        <td class="numericalColumn">2,020.48</td>
      </tr>

      <tr>
        <td>Equity dividend</td>

        <td class="numericalColumn">880.90</td>

        <td class="numericalColumn">586.00</td>

        <td class="numericalColumn">876.50</td>

        <td class="numericalColumn">873.70</td>

        <td class="numericalColumn">712.88</td>
      </tr>

      <tr>
        <td>Preference dividend</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">-</td>

        <td class="numericalColumn">-</td>
      </tr>

      <tr>
        <td>Dividend tax</td>

        <td class="numericalColumn">128.30</td>

        <td class="numericalColumn">99.60</td>

        <td class="numericalColumn">148.90</td>

        <td class="numericalColumn">126.80</td>

        <td class="numericalColumn">99.98</td>
      </tr>

      <tr>
        <td>Retained earnings</td>

        <td class="numericalColumn">3,888.80</td>

        <td class="numericalColumn">2,288.20</td>

        <td class="numericalColumn">2,037.90</td>

        <td class="numericalColumn">1,841.60</td>

        <td class="numericalColumn">1,207.62</td>
      </tr>
    </tbody>
  </table>
</body>
</html>

I want to extract each row label (for example, "Manufacturing expenses") together with the values for all the years listed in that row. How do I go about this?

I found something like preg_match('#<tr><th>(.*)</th> <td><b>price</b></td></tr>#', $content, $match); but that doesn't get the values I want.

2 answers

  • dougong8012 2010-11-13 13:13

    If I understood your question correctly, you want something like this. I wrote it myself, so if you need any clarification I'd be glad to help.
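The code this answer refers to is not preserved here. What follows is a minimal sketch of one possible approach, not the answerer's own code. It assumes PHP 7+ with the DOM extension enabled and that the page has already been fetched into a string with cURL. The function name `extractRows` is hypothetical. It uses `DOMDocument` and `DOMXPath` rather than `preg_match`, because the label and the year values sit in separate `<td>` cells spread over several lines, which a single-line pattern will not match.

```php
<?php
// Hypothetical helper: maps each row label to its list of yearly figures.
function extractRows(string $html): array
{
    // Collect parser warnings internally instead of printing them;
    // real-world markup is rarely perfectly valid.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = [];

    foreach ($xpath->query('//tr') as $tr) {
        // Direct <td> children of this row.
        $cells = $xpath->query('td', $tr);
        if ($cells->length < 2) {
            continue; // heading rows such as "Income" have a single cell
        }

        // The first cell is the label; &nbsp; arrives as U+00A0, so turn it
        // into a plain space before trimming.
        $label = trim(str_replace("\u{00A0}", ' ', $cells->item(0)->textContent));

        $values = [];
        for ($i = 1; $i < $cells->length; $i++) {
            $text = trim($cells->item($i)->textContent);
            // "-" marks a missing figure; otherwise strip the thousands
            // separators so the string casts cleanly to a float.
            $values[] = ($text === '-') ? null : (float) str_replace(',', '', $text);
        }

        $rows[$label] = $values;
    }

    return $rows;
}

// Usage, assuming $html holds the page already fetched with cURL:
// $rows = extractRows($html);
// $rows['Manufacturing expenses'] then holds that row's figures, one per year.
```

Missing figures are returned as `null` rather than `0`, so the graphing code can decide whether to skip them or draw a gap.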

    Cheers!

    This answer was selected by the asker as the best answer.
