duanjianfu1398
2018-05-19 20:25

Accessing child paragraph content with XPath

HTML:

<div class="b-list-fact__item-explanation js-fact-explanation">
    <p>Text 1 Text 1 Text 1 Text 1 Text 1 Text 1</p>
    <p>Text 2 Text 2 Text 2 Text 2 Text 2 Text 2 </p>
</div>

I'm trying to access the text inside the paragraphs and combine all the p elements into one string.

I tried a bunch of variations, such as:

PHP (running on 7.1.11):

    $html = file_get_contents('https://...');
    $html = mb_convert_encoding($html, 'HTML-ENTITIES', 'UTF-8');
    $dom = new DOMDocument;
    @$dom->loadHTML($html);

    $finder = new DomXPath($dom);
    $facts = $finder->query("//a[contains(@class, normalize-space('b-list-fact__item-text'))]");
    $long_fact = $finder->query("//*[contains(@class, 'b-list-fact__item-explanation js-fact-explanation')]/p");

    foreach ($facts as $key => $fact) {
            $fact_description = $long_fact[$key]->textContent;
            $fact = trim($fact->textContent);
            $dataArr[] = str_replace("\n", " ", $fact);
            array_push($dataArr, $fact_description);
    }

$long_fact = $finder->query("//*[contains(@class, 'b-list-fact__item-explanation js-fact-explanation')]/p");

$long_fact = $finder->query("//*[contains(@class, 'b-list-fact__item-explanation js-fact-explanation')]//p[1]");

$long_fact = $finder->query("//*[contains(@class, 'b-list-fact__item-explanation js-fact-explanation')]/p/text()");

if($long_fact->length)
        {
            var_dump($long_fact[0]->textContent);
        }

if($$long_fact->length)
        {
            var_dump($long_fact->textContent);
        }

if($$long_fact->length)
        {
            var_dump($long_fact->nodeValue);
        }

And about 30 other variations...

I'm totally lost as to why this happens; other variations that don't involve p tags work just fine.



1 answer

  • dqed19166 2018-05-19 21:24
    Accepted
    $ptext = $finder->query('//div[contains(@class, "b-list-fact__item-explanation js-fact-explanation")]/p');
    $paragraphs = [];
    foreach ($ptext as $paragraph) {
        $paragraphs[] = $paragraph->textContent;
    }
    $combined = implode("\n", $paragraphs);
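
A note on the loop in the question: a query ending in `/p` returns the paragraphs of every matching block as one flat `DOMNodeList`, so indexing that list with `$key` picks the n-th paragraph on the page rather than the n-th block. The list itself also has no `textContent`; only the nodes inside it do. Below is a minimal sketch of grouping paragraphs per block, assuming the page holds several explanation blocks. The sample markup, its texts, and the shortened class test are hypothetical, not taken from the real page.

```php
<?php
// Hypothetical page fragment: two explanation blocks (an assumption for illustration).
$html = <<<HTML
<div class="b-list-fact__item-explanation js-fact-explanation">
    <p>First A</p>
    <p>First B</p>
</div>
<div class="b-list-fact__item-explanation js-fact-explanation">
    <p>Second A</p>
</div>
HTML;

$dom = new DOMDocument;
libxml_use_internal_errors(true);   // collect parser warnings instead of silencing with @
$dom->loadHTML($html);
libxml_clear_errors();

$finder = new DOMXPath($dom);

// Select the containers, not the paragraphs, so there is one node per block.
$divs = $finder->query('//div[contains(@class, "b-list-fact__item-explanation")]');

$descriptions = [];
foreach ($divs as $div) {
    $parts = [];
    // The second argument makes "./p" relative to the current container.
    foreach ($finder->query('./p', $div) as $p) {
        $parts[] = trim($p->textContent);
    }
    $descriptions[] = implode(' ', $parts);
}

print_r($descriptions);
```

With one entry per block, `$descriptions[$key]` can be paired with the n-th fact, provided the page really lists facts and explanations in the same order.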
    

    Alternatively just:

    $ptext = $finder->query('//div[contains(@class, "b-list-fact__item-explanation js-fact-explanation")]')
        ->item(0)->textContent;
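
Reading `textContent` from the container also keeps the line breaks and indentation that sit between the `<p>` tags. If a single clean line is wanted, one option (a sketch, not part of the original answer) is to let XPath do the cleanup with `normalize-space()` via `DOMXPath::evaluate()`. The inline `$html` is the sample markup from the question standing in for the fetched page, and the shortened class test is an assumption.

```php
<?php
// Sample markup copied from the question; stands in for the downloaded page.
$html = <<<HTML
<div class="b-list-fact__item-explanation js-fact-explanation">
    <p>Text 1 Text 1 Text 1 Text 1 Text 1 Text 1</p>
    <p>Text 2 Text 2 Text 2 Text 2 Text 2 Text 2 </p>
</div>
HTML;

$dom = new DOMDocument;
libxml_use_internal_errors(true);   // collect parser warnings instead of silencing with @
$dom->loadHTML($html);
libxml_clear_errors();

$finder = new DOMXPath($dom);

// evaluate() returns a PHP string for string-valued XPath expressions.
// normalize-space() uses the first matching node, trims it, and collapses
// every run of whitespace into a single space.
$text = $finder->evaluate(
    'normalize-space(//div[contains(@class, "b-list-fact__item-explanation")])'
);

echo $text, PHP_EOL;
```

Like `->item(0)`, this only covers the first matching block; for several blocks, loop over the containers and pass each one as the context node.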
    