doucong6884 2014-10-09 19:36
Views: 91
Accepted

strpos does not find the string correctly

I am trying to write a simple script that fetches the contents of a page and, when the Order button for a new server appears, sends an email to a specified address. Since I am having trouble with it, for now I am just echoing the result.

This is the code I have at the moment:

<?php
$site = file_get_contents('http://www.soyoustart.com/en/offers/sys-ip-2.xml');

$needle = '<class="order-button"';

if (strpos($site, $needle) !== FALSE)
{
  echo 'Found';
}
else
{
  echo 'Not Found';
}

Currently I get 'Not Found' even though that string exists in the contents of the file. What am I doing wrong?

2 answers

  • duanbichou4942 2014-10-09 19:40

    You assume that the page contains <class="order-button". But it doesn't; what it does contain is

    <div class="zone-dedicated-availability button" 
         data-actions="orderButton"
         data-ref="142sys5"
         data-cgi="order"></div>
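
To see the mismatch concretely, here is a minimal sketch that runs strpos against a hard-coded copy of the tag above. The $html string is an illustrative stand-in, not the live page, and the second needle is just one substring that happens to occur in that tag.

```php
<?php
// Illustrative stand-in for the fetched page (assumption: not the real download).
$html = '<div class="zone-dedicated-availability button" '
      . 'data-actions="orderButton" data-ref="142sys5" data-cgi="order"></div>';

// The needle from the question never occurs, so strpos returns false.
$original = strpos($html, '<class="order-button"');

// A needle that is literally present in the tag is found.
$adjusted = strpos($html, 'data-actions="orderButton"');

var_dump($original);            // bool(false)
var_dump($adjusted !== false);  // bool(true)
```

strpos only matches literal substrings, so any change in attribute order or spacing on the page would break even the adjusted needle.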
    

    You possibly need a more powerful tool than strpos (no, not regexps).

    If you really are sure the structure of the page/CSS is not going to change too much, you can try to extract all "<div>" tags (recognizable with an easy and reasonable regexp: "<div[^>]+>"), and then check all of them until you find one that contains "orderButton" or something like that. preg_match_all() and array_filter() are probably your friends.

    Another very promising possibility is to use an XML library - the URL extension seems to indicate it's possible to access a reasonably structured and well-formed entity tree behind that page. If so, XPath is your friend.

    Update

    The XML you indicated is not very well formed: it has the tags header, footer, and nav, which older HTML/XHTML doctypes do not define, and it has the Italian flag erroneously declared as Flagz/fi instead of Flagz/it, colliding with the Finland flag. That says the file was not validated and therefore cannot be trusted to work reliably, so

    simplexml_load_file($address)
          ->xpath('//div[contains(@class, "button")][@data-actions="orderButton"]');
    

    or something like that (e.g. DOMDocument/DOMXPath), while the correct approach, is nonetheless not going to work off the shelf. A more permissive XML library is needed; you can try SimpleDOM.
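
As one possible permissive route, PHP's bundled DOM extension can parse the markup as HTML rather than strict XML. The sketch below is an alternative to SimpleDOM, not what the answer above used; the $html string is a hard-coded stand-in for the downloaded page, and the nav element is included only to mimic the tags mentioned earlier.

```php
<?php
// Illustrative stand-in for the downloaded page (assumption: not the real markup).
$html = '<html><body><nav>menu</nav>'
      . '<div class="zone-dedicated-availability button" data-actions="orderButton" '
      . 'data-ref="142sys5" data-cgi="order"></div>'
      . '</body></html>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);   // HTML parser: tolerant of markup that is not valid XML
libxml_clear_errors();

// Select every div whose data-actions attribute is exactly "orderButton".
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//div[@data-actions="orderButton"]');

$refs = array();
foreach ($nodes as $node) {
    $refs[] = $node->getAttribute('data-ref');
}

print_r($refs);
```

Because the query keys on the data-actions attribute, it does not depend on the order of the attributes or on the exact class list.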

    The DOM approach is usually much better because it is far more flexible and does not need awkward 'fixes' to cope with things such as attributes changing their order. Also, several tools work with the DOM - for example, with Firefox's Firebug extension you can simply grab the XPath off the object. If they change their page layout, then instead of guessing how to extract the data you need, you can just open up the page, copy and paste the new XPath, and Bob's your uncle.

    Otherwise, the brute force solution described above:

    $xml = file_get_contents($url);
    
    // Extract all DIVs with a `class` attribute (maybe `data-actions` would be better?)
    preg_match_all('#<div[^>]+class[^>]+>#', $xml, $gregs);
    
    // Accept only those with the appropriate data action
    $btns = array_values(
        array_filter(
            $gregs[0],
            function($div) {
                return preg_match('#data-actions="orderButton"#', $div);
            }
        )
    );
    
    print_r($btns);
    

    will print (unless $btns is empty, of course)

    Array
    (
        [0] => <div class="zone-dedicated-availability button" data-actions="orderButton" data-ref="142sys5" data-cgi="order">
    )
    

    You can then parse it (with XML too - just add '</div>') to access the attributes such as data-ref:

    if (count($btns) != 1) {
        die("No button, or too many buttons");
    }
    
    $xml = simplexml_load_string($btns[0] . '</div>');
    $attrs = array();
    foreach ($xml->attributes() as $key => $value) {
        $attrs[$key] = (string)$value;
    }
    
    $ref = $attrs['data-ref'];
    
    print $ref;
    

    This will assign to $ref the value '142sys5'. You can var_dump the $attrs array and see the other attributes, if needed.
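
If the check is going to drive a notification, it may help to wrap both steps in a function that returns null instead of calling die(). The following is only a sketch: extractOrderRef is a hypothetical helper name, and the sample strings passed to it are illustrative.

```php
<?php
// Hypothetical helper: returns the data-ref of the single order button, or null
// when there is no button, more than one button, or the tag cannot be parsed.
function extractOrderRef($html)
{
    // Find opening div tags that carry the order action.
    if (!preg_match_all('#<div[^>]+data-actions="orderButton"[^>]*>#', $html, $matches)) {
        return null;
    }
    if (count($matches[0]) !== 1) {
        return null;
    }

    // Close the tag so the fragment can be parsed as XML on its own.
    $div = @simplexml_load_string($matches[0][0] . '</div>');
    if ($div === false) {
        return null;
    }

    $ref = (string) $div['data-ref'];
    return $ref === '' ? null : $ref;
}

// Illustrative inputs (assumption: not the real page).
$found = extractOrderRef(
    '<div class="zone-dedicated-availability button" data-actions="orderButton" '
    . 'data-ref="142sys5" data-cgi="order"></div>'
);
$missing = extractOrderRef('<p>Currently unavailable</p>');

var_dump($found);
var_dump($missing);
```

A caller could then send the email (for example with PHP's mail()) only when the result is not null.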
