duanliujie8639 2018-03-28 01:36

The best way to get certain information from a website using PHP

I want to get a certain piece of information from a website. The problem I'm facing is that this information changes, maybe a few times a day, because the content is dynamic.

The goal of my PHP script is to get the content (dynamic content from a database) in a PHP variable.

I've set up a codepen to show you what I mean: https://codepen.io/anon/pen/XEVpBo
The HTML from the codepen:

<div class="wrapper">
  <div class="some_useless_div">
    <p>Some useless text paragraph.</p>
    <div id="another_useless_div">
      <p>The actual important part is: SOME_DYNAMIC_TEXT what I want to put into a variable. The text around that dynamic text is static text and will not change.</p>
    </div>
  </div>
</div>

Currently, what I do to capture the information is to explode around the dynamic information:

$content = file_get_contents('https://codepen.io/anon/pen/XEVpBo');
$parts = explode('The actual important part is: ', $content); // some text that is left of the information.
$parts2 = explode(' what I want to put into a variable.', $parts[1]); // some text that is right of the information.
$information = $parts2[0]; // AHA! Now we have the information!

However, this really feels like spaghetti code. Isn't there a function that searches for a string and returns the matching value, something like:
$information = search_string('The actual important part is: %s what I want to put into a variable.'); where %s marks the information that would be put into the $information variable.

Again, the code I use (above) works, but it really feels like bad code. I'm looking for a clean PHP function.

1 answer

  • duanjiu1003 2018-03-28 09:49

    Maybe you're looking for preg_match?

    Test code seems to work fine: https://3v4l.org/6YeSh

    <?php
    $html=<<<'HTML'
    <div class="wrapper">
      <div class="some_useless_div">
        <p>Some useless text paragraph.</p>
        <div id="another_useless_div">
          <p>The actual important part is: SOME_DYNAMIC_TEXT what I want to put into a variable. The text around that dynamic text is static text and will not change.</p>
        </div>
      </div>
    </div>
    HTML;
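    // Capture whatever sits between the two static phrases; (.*?) is a lazy match.
    if (preg_match('/The actual important part is: (.*?) what I want to put into a variable\./', $html, $matches)) {
        $str = $matches[1]; // the dynamic text
        var_dump($str);
    }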
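
The search_string function imagined in the question is not a PHP built-in, but something close to it can be sketched on top of preg_match. The helper below is hypothetical: its name comes from the question, and the second parameter (the text to search in) is an assumption added here. It splits the template at the first %s, escapes the static text with preg_quote, and captures whatever sits in between.

```php
<?php
// Hypothetical helper (not a built-in PHP function): returns the text that
// sits where %s appears in $template, or null if the template does not match.
function search_string(string $template, string $subject): ?string
{
    // Split the template around the first %s placeholder.
    $parts = explode('%s', $template, 2);
    if (count($parts) !== 2) {
        return null; // no placeholder in the template
    }

    // preg_quote() escapes regex metacharacters in the static text;
    // the s modifier lets the captured part span multiple lines.
    $pattern = '/' . preg_quote($parts[0], '/') . '(.*?)' . preg_quote($parts[1], '/') . '/s';

    return preg_match($pattern, $subject, $matches) === 1 ? $matches[1] : null;
}

$html = '<p>The actual important part is: SOME_DYNAMIC_TEXT what I want to put into a variable. More static text.</p>';
$information = search_string('The actual important part is: %s what I want to put into a variable.', $html);
var_dump($information); // string(17) "SOME_DYNAMIC_TEXT"
```

Returning null when nothing matches lets the caller tell "not found" apart from an empty capture.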
    

    Also, when you're talking about the "best" way, it's definitely not file_get_contents, for at least 2 reasons:

    1. file_get_contents keeps reading until the target server closes the socket, when it should stop reading once Content-Length bytes have been read. Depending on the server, stopping at Content-Length might finish much sooner.

    2. file_get_contents does not handle compressed transfers by default.

    curl reads until Content-Length bytes have been read and then returns, and it also supports compressed transfers, so curl should run significantly faster than file_get_contents.
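
A minimal sketch of the curl approach described above, assuming the PHP curl extension is installed. The function name fetch_url, the 10-second timeout, and the decision to follow redirects are illustrative choices, not part of the original answer.

```php
<?php
// Hypothetical helper: fetches a URL with cURL and returns the response body,
// or null on failure. Requires the PHP curl extension.
function fetch_url(string $url): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }

    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_ENCODING       => '',   // empty string: accept every encoding cURL supports (e.g. gzip)
        CURLOPT_FOLLOWLOCATION => true, // follow redirects (assumed to be wanted here)
        CURLOPT_TIMEOUT        => 10,   // assumed timeout, in seconds
    ]);

    $body = curl_exec($ch); // false on failure

    return is_string($body) ? $body : null;
}
```

It would replace the file_get_contents call from the question, for example $content = fetch_url('https://codepen.io/anon/pen/XEVpBo');, after which $content can be passed to preg_match once it has been checked against null.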

    (And I disagree: your code is not spaghetti code. I don't think it's good code, because you should have been using preg_match instead of explode(); preg_match is probably faster, uses less memory, and is easier to write and maintain than your explode code. But your explode code is not spaghetti.)

    This answer was selected by the asker as the best answer.