dongxietao0263 2018-06-15 16:50
Views: 105

Using the result of one cURL request as a parameter for another cURL request

Dear StackOverflow Community,

I was tasked with creating a PHP script that uses cURL to fetch Craigslist advertisements and post them on another page.

The way I thought of doing it:

  • run cURL once over the webpage containing the advertisements "https://amsterdam.craigslist.org/d/flats-housing-for-rent/search/apa?lang=en&cc=gb"
  • extract the link of each advertisement and store it in an array (e.g. $listings['links'])
  • then run another cURL on each of the links in the array, extract all the elements of the advertisement, and place them in $listings = array() as follows: $listings['title'], $listings['price'], $listings['description']
  • finally, display the information scraped with cURL, but this I can manage myself

My only question is: how can I run cURL on the results of the first cURL? I would like to run a foreach loop over $listings['links'] and scrape the information from each page.

Below is the code I wrote. It only scrapes the $listings['links']; the second cURL within the foreach loop does not work.

Could you please advise on how I should proceed to get it working?

Thank you for the support!

$url = "https://amsterdam.craigslist.org/d/flats-housing-for-rent/search/apa?lang=en&cc=gb";

$ch1 = curl_init();

curl_setopt($ch1, CURLOPT_URL, $url);
curl_setopt($ch1, CURLOPT_RETURNTRANSFER, true);

$result = curl_exec($ch1);

$listings = array();


//Match listing link
preg_match_all(
    "!<a href=\"https:\/\/amsterdam\.craigslist\.org\/apa\/d\/.*\/.*\.html\?lang=en&amp;cc=gb\" class=\"result-image gallery\" data-ids=\".*\">
            <span class=\"result-price\">.*<\/span>
    <\/a>!", $result, $match);
$listings['link'] = $match[0];

foreach($listings['link'] as $link){

    $ch2 = curl_init();

    curl_setopt($ch2, CURLOPT_URL, $link);
    curl_setopt($ch2, CURLOPT_RETURNTRANSFER, false);

    $result_meta = curl_exec($ch2);

    $meta = array();

    preg_match_all("!<title>(.*)</title>!", $result_meta, $match);
    $listings['title'] = $match[0];


}


// Return results
echo "<pre>";
print_r($listings['title']); // no title is stored in this array :(
die;
echo "</pre>";

1 answer

  • dousong9729 2018-06-15 18:18
    Dude, you're parsing HTML wrong, wrong, wrong. Do not parse HTML with regex; use an HTML parser instead, for example DOMDocument/DOMXPath. As for "My only question is, how can I run cURL on the results of the first cURL?": until you fix your parsing, you shouldn't. Note also that your second handle sets CURLOPT_RETURNTRANSFER to false, so curl_exec prints the page and returns true instead of the HTML. Once that is fixed, just keep changing CURLOPT_URL between each invocation of curl_exec. It should look something like:

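    <?php
    declare(strict_types = 1);
    $url = "https://amsterdam.craigslist.org/d/flats-housing-for-rent/search/apa?lang=en&cc=gb";
    
    $ch1 = curl_init ();
    
    curl_setopt ( $ch1, CURLOPT_URL, $url );
    // return the response body as a string instead of printing it
    curl_setopt ( $ch1, CURLOPT_RETURNTRANSFER, true );
    
    $result = curl_exec ( $ch1 );
    if ($result === false) {
        die ( "curl error: " . curl_error ( $ch1 ) );
    }
    // loadHTML() is an instance method; calling it statically is an error in PHP 8
    $domd = new DOMDocument ();
    @$domd->loadHTML ( $result );
    $xp = new DOMXPath ( $domd );
    foreach ( $xp->query ( "//a[contains(@class,'result-title')]" ) as $anchor ) {
        $url = $anchor->getAttribute ( "href" );
        echo "scraping '{$anchor->textContent}': $url\n";
        // reuse the same handle: just point it at the next URL
        curl_setopt ( $ch1, CURLOPT_URL, $url );
        $html = curl_exec ( $ch1 );
        // ...
    }
    curl_close ( $ch1 );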
    

    which currently prints this:

    scraping 'Modern furnished 2 bedroom apartment on the Govert Flinckstraat': https://amsterdam.craigslist.org/apa/d/modern-furnished-2-bedroom/6617858191.html?lang=en&cc=gb
    scraping 'Luxurious 2 bedroom apartment on the Van Nijenrodeweg in Buitenveldert': https://amsterdam.craigslist.org/apa/d/luxurious-2-bedroom-apartment/6604340163.html?lang=en&cc=gb
    scraping 'Beautiful furnished 2 bedroom apartment on the Panamalaan': https://amsterdam.craigslist.org/apa/d/beautiful-furnished-2-bedroom/6604327875.html?lang=en&cc=gb
    scraping 'Beautiful 4 bedroom apartment on the Govert Flinckstraat in De Pijp': https://amsterdam.craigslist.org/apa/d/beautiful-4-bedroom-apartment/6604367785.html?lang=en&cc=gb
    scraping 'Renovated 2 bedroom apartment on the Eerste Oosterparkstraat in Oost': https://amsterdam.craigslist.org/apa/d/renovated-2-bedroom-apartment/6604314398.html?lang=en&cc=gb
    scraping 'Furnished modern studio on the Borneolaan in Zeeburg': https://amsterdam.craigslist.org/apa/d/furnished-modern-studio-on/6604281673.html?lang=en&cc=gb
    scraping 'Furnished one bedroom apartment on the Van Baerlestraat': https://amsterdam.craigslist.org/apa/d/furnished-one-bedroom/6604292644.html?lang=en&cc=gb
    scraping 'Renovated furnished 2 bedrooom apartment on the Van Leijenberghlaan': https://amsterdam.craigslist.org/apa/d/renovated-furnished-2/6614934522.html?lang=en&cc=gb
    scraping 'Available furnished one bedroom apartment beside Center': https://amsterdam.craigslist.org/apa/d/available-furnished-one/6617750497.html?lang=en&cc=gb
    scraping 'Beautiful renovated apartment in Almere-buiten (sharing is possible)': https://amsterdam.craigslist.org/apa/d/beautiful-renovated-apartment/6617746303.html?lang=en&cc=gb
    scraping 'Furnished 1-BR apartment available for 6 months in Amsterdam Oud West': https://amsterdam.craigslist.org/apa/d/furnished-1-br-apartment/6604971876.html?lang=en&cc=gb
    scraping 'Furnished 3 bedrooms apartment with lift and lake view': https://amsterdam.craigslist.org/apa/d/furnished-3-bedrooms/6617572711.html?lang=en&cc=gb
    scraping 'CITY CENTER-FURNISHED 3 ROOMS APARTMENT': https://amsterdam.craigslist.org/apa/d/city-center-furnished-3-rooms/6617558121.html?lang=en&cc=gb
    scraping '1750 euros 3 bedr apt. furn/unfurn Hoogoord': https://amsterdam.craigslist.org/apa/d/1750-euros-3-bedr-apt-furn/6617550824.html?lang=en&cc=gb
    scraping '2200 euros 2 bedr luxe apt with large balcony, etc. TweedeJanvdHeijden': https://amsterdam.craigslist.org/apa/d/2200-euros-2-bedr-luxe-apt/6597678297.html?lang=en&cc=gb
    scraping '2250 euros 1 furn bedr luxe apt. Herengracht': https://amsterdam.craigslist.org/apa/d/2250-euros-1-furn-bedr-luxe/6597752437.html?lang=en&cc=gb
    scraping '1850 euros 2 bedr furn, luxe apt. Van Woustraat': https://amsterdam.craigslist.org/apa/d/1850-euros-2-bedr-furn-luxe/6597716061.html?lang=en&cc=gb
    scraping '1700 euros 2 bedr furn apt patio, Uranusstraat': https://amsterdam.craigslist.org/apa/d/1700-euros-2-bedr-furn-apt/6600293763.html?lang=en&cc=gb
    scraping '1 bedr semi-furn apt. Loenermark': https://amsterdam.craigslist.org/apa/d/1-bedr-semi-furn-apt/6615099202.html?lang=en&cc=gb
    scraping '2050 euros 2 bedr furn, spacious apt. roof terrace. Valckenierstraat': https://amsterdam.craigslist.org/apa/d/2050-euros-2-bedr-furn/6597768160.html?lang=en&cc=gb
    scraping '2000 euros 2 bedr furn apt. Duke Ellingtonstraat': https://amsterdam.craigslist.org/apa/d/2000-euros-2-bedr-furn-apt/6597764353.html?lang=en&cc=gb
    scraping '1750 euros 2 bedr semi-furn apt. Gerard Schaepstraat': https://amsterdam.craigslist.org/apa/d/1750-euros-2-bedr-semi-furn/6595976016.html?lang=en&cc=gb
    scraping '1500 euros excl. 1 bedr furn apt. Tweede Atjehstraat': https://amsterdam.craigslist.org/apa/d/1500-euros-excl-1-bedr-furn/6602844754.html?lang=en&cc=gb
    scraping '1650 euros 2 bedr furn apt. Korte Leidsedwarsstraat': https://amsterdam.craigslist.org/apa/d/1650-euros-2-bedr-furn-apt/6615165183.html?lang=en&cc=gb
    scraping 'New fully furnished 50m² 1 bedroom apartment with lift and garage': https://amsterdam.craigslist.org/apa/d/new-fully-furnished-50m-1/6617317703.html?lang=en&cc=gb
    scraping 'Available now unfurnished 2 bedrooms apartment with terrace': https://amsterdam.craigslist.org/apa/d/available-now-unfurnished-2/6617299772.html?lang=en&cc=gb
    scraping '1350,- ALL INCLUSIVE - Fully furnished cozy studio in Old West': https://amsterdam.craigslist.org/apa/d/1350-all-inclusive-fully/6612567692.html?lang=en&cc=gb
    scraping 'Spacious Jordaan duplex with 3 bedrooms 2 bathrooms and roof terrace': https://amsterdam.craigslist.org/apa/d/spacious-jordaan-duplex-with/6608058332.html?lang=en&cc=gb
    scraping 'Available  furnished house 3 double bedrooms and garden': https://amsterdam.craigslist.org/apa/d/available-furnished-house-3/6617299424.html?lang=en&cc=gb
    scraping 'Available modern unfurnished 3 bedrooms apartment with lift': https://amsterdam.craigslist.org/apa/d/available-modern-unfurnished/6617315497.html?lang=en&cc=gb
    scraping '1000,- ALL INCLUSIVE - Furnished cozy studio available in Zuidoost': https://amsterdam.craigslist.org/apa/d/1000-all-inclusive-furnished/6613790116.html?lang=en&cc=gb
    scraping 'Fully  furnished modern 2 bedroom apartment beside Centraal Station': https://amsterdam.craigslist.org/apa/d/fully-furnished-modern-2/6613775710.html?lang=en&cc=gb
    scraping 'Spacious furnished 3 double bedrooms house with garden and garage': https://amsterdam.craigslist.org/apa/d/spacious-furnished-3-double/6613775633.html?lang=en&cc=gb
    scraping 'Spacious furnished 2 bedrooms apartment with terrace and lift': https://amsterdam.craigslist.org/apa/d/spacious-furnished-2-bedrooms/6605279190.html?lang=en&cc=gb
    scraping 'New 2 double bedrooms apartment with lift and terrace near De Pijp': https://amsterdam.craigslist.org/apa/d/new-2-double-bedrooms/6617314884.html?lang=en&cc=gb
    scraping 'Furnished beautiful big apartment 2 bedroom lift terrace and parking': https://amsterdam.craigslist.org/apa/d/furnished-beautiful-big/6596164877.html?lang=en&cc=gb
    scraping 'New 2 double bedroom apartment with lift and terrace near De Pijp': https://amsterdam.craigslist.org/apa/d/new-2-double-bedroom/6613777474.html?lang=en&cc=gb
    scraping 'Duplex ground floor house with two bedrooms garden and parking': https://amsterdam.craigslist.org/apa/d/duplex-ground-floor-house/6598363327.html?lang=en&cc=gb
    scraping 'Living and workspace beside center with 3 bedrooms and garden': https://amsterdam.craigslist.org/apa/d/living-and-workspace-beside/6613775908.html?lang=en&cc=gb
    scraping 'Cozy fully furnished 1 bedroom apartment with lift and parking': https://amsterdam.craigslist.org/apa/d/cozy-fully-furnished-1/6612147393.html?lang=en&cc=gb
    scraping 'Cozy and nicely furnished 1 bedroom apartment in De Pijp': https://amsterdam.craigslist.org/apa/d/cozy-and-nicely-furnished-1/6612016108.html?lang=en&cc=gb
    scraping 'Fully furnished spacious 2 double bedrooms apartment with lift': https://amsterdam.craigslist.org/apa/d/fully-furnished-spacious-2/6611297183.html?lang=en&cc=gb
    scraping 'Fully furnished 1 bedroom apartment in the Jordaan': https://amsterdam.craigslist.org/apa/d/fully-furnished-1-bedroom/6613772277.html?lang=en&cc=gb
    scraping 'Unfurnished huge 4 bedrooms duplex just beside Vondelpark': https://amsterdam.craigslist.org/apa/d/unfurnished-huge-4-bedrooms/6611295848.html?lang=en&cc=gb
    scraping '1350,- ALL INCLUSIVE - Fully furnished spacious studio in Old West': https://amsterdam.craigslist.org/apa/d/1350-all-inclusive-fully/6617312889.html?lang=en&cc=gb
    scraping 'Furnished apartment beside center 2 bedrooms parking and terrace': https://amsterdam.craigslist.org/apa/d/furnished-apartment-beside/6611292171.html?lang=en&cc=gb
    scraping 'Beautiful apartment near Vondelpark 2 bedrooms 2 bathrooms and terrace': https://amsterdam.craigslist.org/apa/d/beautiful-apartment-near/6613771364.html?lang=en&cc=gb
    scraping 'Available fully furnished 2 double bedrooms apartment with lift': https://amsterdam.craigslist.org/apa/d/available-fully-furnished-2/6611290058.html?lang=en&cc=gb
    scraping 'Available spacious unfurnished house 4 bedrooms and terrace': https://amsterdam.craigslist.org/apa/d/available-spacious/6611250253.html?lang=en&cc=gb
    scraping 'Modem luxu furnished two bedroom modern apartment with lift and garage': https://amsterdam.craigslist.org/apa/d/modem-luxu-furnished-two/6611251916.html?lang=en&cc=gb
    

    (and more, but the list was getting too long so I capped it; there are 120 results per page)
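
    The "// ..." step in the loop is where each listing page would be parsed. A minimal sketch of that step follows, assuming the detail page exposes the title, price and description in elements like the ones in the sample markup. The function name parseListing, the three XPath expressions and the sample HTML are illustrative assumptions, not verified Craigslist selectors, so check them against the real page source. Appending one row per listing also avoids overwriting $listings['title'] on every iteration.

    ```php
    <?php
    // Hypothetical helper: pull a few fields out of one listing page.
    // The XPath expressions are assumptions about the markup, not verified selectors.
    function parseListing(string $html): array
    {
        $dom = new DOMDocument();
        // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
        $previous = libxml_use_internal_errors(true);
        $dom->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        $xp = new DOMXPath($dom);
        // string(...) yields '' when the expression matches nothing.
        $text = static function (string $expr) use ($xp): string {
            return trim((string) $xp->evaluate("string($expr)"));
        };

        return [
            'title'       => $text("//span[@id='titletextonly']"),
            'price'       => $text("//span[contains(@class,'price')]"),
            'description' => $text("//section[@id='postingbody']"),
        ];
    }

    // Stand-in for the HTML that curl_exec() would return for one listing URL.
    $sampleHtml = '<html><head><title>Sample listing</title></head><body>'
        . '<span id="titletextonly">Furnished studio</span>'
        . '<span class="price">EUR 1350</span>'
        . '<section id="postingbody">Bright studio near the park.</section>'
        . '</body></html>';

    $listings = [];
    // Inside the scraping loop this would be:
    //   $listings[] = parseListing($html) + ['link' => $url];
    $listings[] = parseListing($sampleHtml) + ['link' => 'https://example.org/listing.html'];

    print_r($listings);
    ```

    Each element of $listings is then one advertisement with its own title, price, description and link, which is easier to render than four parallel arrays.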
