dongmin1166 2017-07-10 11:01

PHP: convert HTML links to text while keeping the same HTML structure

I am struggling to convert HTML links into text while keeping the same HTML structure.

I need to convert this part of an HTML page

<div>
    <p>text text bla blah</p>
    <p><a href="https://google.com" rel="nofollow" target="_blank" title="google">Cool website</a></p>
    <p><a href="https://google.com" rel="nofollow" target="_blank" title="google">Cool website</a></p>
</div>

into this

<div>
    <p>text text bla blah</p>
    <p>Cool website https://google.com</p>
    <p>Cool website https://google.com</p>
</div>

I found a useful script in "PHP regex: How to convert HTML string with links into plain text that shows URL after text in brackets". It collects the HTML links and converts them into plain text, which is only part of the job.

This is what I have so far:

$htmlString = '
<div>
    <p>text text bla blah</p>
    <p><a href="https://google.com" rel="nofollow" target="_blank" title="google">Cool website</a></p>
    <p><a href="https://google.com" rel="nofollow" target="_blank" title="google">Cool website</a></p>
</div>
';

libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($htmlString);
$xpath = new DOMXPath($dom);

$links = [];
$linksAsString = '';

foreach ($xpath->query('//a') as $linkElement)
{
    $link = [
        'href' => $linkElement->getAttribute('href'),
        'text' => $linkElement->textContent
    ];
    $links[] = $link;

    $linksAsString .= $link['text'] . " {$link['href']}<br/>";
}
libxml_clear_errors();

echo $linksAsString;

The current code outputs only the converted links:

Cool website https://google.com
Cool website https://google.com

I would appreciate some help.

2 answers

  • doufusi2013 2017-07-10 11:27

    You could use str_replace with the full element.

    <?php
    $htmlString = '
    <div>
        <p>text text bla blah</p>
        <p><a href="https://google.com" rel="nofollow" target="_blank" title="google">Cool website</a></p>
        <p><a href="https://google.com" rel="nofollow" target="_blank" title="google">Cool website</a></p>
    </div>
    ';
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($htmlString);
    $xpath = new DOMXPath($dom);
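    // For each <a>, serialize the element and replace that exact markup
    // in the source string with "link text URL".
    foreach ($xpath->query('//a') as $linkElement)
    {
        $htmlString = str_replace($dom->saveHTML($linkElement), $linkElement->textContent . ' ' . $linkElement->getAttribute('href'), $htmlString);
    }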
    libxml_clear_errors();
    
    echo $htmlString;
    

    Output:

    <div>
        <p>text text bla blah</p>
        <p>Cool website https://google.com</p>
        <p>Cool website https://google.com</p>
    </div>
    

    Demo: https://eval.in/830127
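
The str_replace approach above relies on `$dom->saveHTML($linkElement)` reproducing the link markup exactly as it appears in the source string. That holds for the sample input, but serialization can differ from the source (for example, single-quoted attributes are written back with double quotes), in which case nothing is replaced. A possible alternative, sketched below, edits the DOM tree itself and serializes the result. The function name `linksToText` is hypothetical and not part of the original answer, and the sketch assumes the input is an HTML fragment in plain ASCII (non-ASCII input may need extra encoding handling).

```php
<?php
// Hypothetical helper (not from the original answer): replaces every
// <a href="..."> element with a text node "link text URL" and returns
// the modified fragment.
function linksToText(string $html): string
{
    if (trim($html) === '') {
        return $html; // nothing to parse
    }

    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html); // the parser wraps the fragment in <html><body>
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);

    // Take a snapshot of the matches before modifying the tree.
    $links = iterator_to_array($xpath->query('//a[@href]'));

    foreach ($links as $a) {
        $text = $dom->createTextNode(
            $a->textContent . ' ' . $a->getAttribute('href')
        );
        // replaceChild(new node, old node)
        $a->parentNode->replaceChild($text, $a);
    }

    // Serialize only the children of <body>, so the wrapper elements
    // added by the parser are not part of the result.
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return $html;
    }

    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }

    return $out;
}
```

Calling `echo linksToText($htmlString);` with the string from the question should print the same div and p structure with each link replaced by its text followed by the URL. Because the output is re-serialized by the parser, whitespace around the outermost element and attribute quoting may differ slightly from the input.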

