dongshenjie3055 2018-10-16 19:32
Views: 406
Accepted

Keep existing HTML entities unchanged, but convert double and single quotes

I'm using PHP code to generate my meta description tag, like so:

<meta name="description" content="<?php
echo $this->utf->clean_string(word_limiter(strip_tags(trim($paperResult['file_content'])),27));
?>" />


Here's an example of the meta description output:

<meta name="description" content="blah blah &#182; &#8230; blah blah "words in quotation marks" blah blah "more words in quotation marks" blah blah" />

The two HTML entities in that example meta description are a paragraph sign (&#182;) followed by an ellipsis (&#8230;). They are already in HTML entity form in the source text, so I want them to remain unchanged. The problem is that I also need the quotation marks within the description to convert to &quot; in order to prevent the meta tag from breaking. Every combination/configuration that I try either does not work or breaks my site because I'm getting the code wrong. For example, when I try the following code, the quotation marks convert to their HTML entity, as desired, but the paragraph symbol and ellipsis entities break because the ampersand character at the beginning of the existing HTML entities gets converted to &amp;. That leaves me with a broken &#182; (&amp;#182;) and a broken &#8230; (&amp;#8230;) :

 echo $this->utf->clean_string(word_limiter(htmlspecialchars(strip_tags(trim($paperResult['file_content']))),27));

I've been trying—literally, for days—to figure this out. I've searched extensively in Stack Overflow, to no avail. I just need the existing HTML entities to remain unchanged and quotation marks to be converted to their HTML entity (&quot;). I have studied the ENT_QUOTES option and I know that the solution probably exists therein, but I can't figure out how to incorporate it into my particular line of code. I'm hoping that you PHP gurus will have mercy on this tortured soul! I'd truly appreciate your help.

Thank you!


2 answers

  • douyou9923 2018-10-16 19:43

    If it's the contents of the "content" attribute, you can do this:

    $str = 'blah blah &#182; &#8230; blah blah "words in quotation marks" blah blah "more words in quotation marks" blah blah';
    echo htmlentities($str, ENT_QUOTES, "UTF-8", false);
    

    Output

    blah blah &#182; &#8230; blah blah &quot;words in quotation marks&quot; blah blah &quot;more words in quotation marks&quot; blah blah
    


    The key thing here is the 4th argument

    string htmlentities ( string $string [, int $flags = ENT_COMPAT | ENT_HTML401 [, string $encoding = ini_get("default_charset") [, bool $double_encode = TRUE ]]] )

    Specifically

    double_encode: When double_encode is turned off, PHP will not encode existing HTML entities. The default is to convert everything.

    That way it doesn't double encode the ampersand.
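To make the effect of the fourth argument concrete, here is a minimal sketch that runs the same string through htmlentities with double_encode left at its default and then switched off. The sample string is made up for illustration; it adds a bare ampersand to show that a raw & is still encoded even when existing entities are preserved.

```php
<?php
// Illustrative input (an assumption, not from the question): a bare ampersand,
// an existing numeric entity, and raw double quotes.
$str = 'Tom & Jerry &#182; "quoted"';

// Default (double_encode = true): the & that starts the existing entity is re-encoded.
$doubleEncoded = htmlentities($str, ENT_QUOTES, 'UTF-8');

// double_encode = false: the existing entity is left alone, while the bare &
// and the raw quotes are still converted.
$preserved = htmlentities($str, ENT_QUOTES, 'UTF-8', false);

echo $doubleEncoded, "\n"; // Tom &amp; Jerry &amp;#182; &quot;quoted&quot;
echo $preserved, "\n";     // Tom &amp; Jerry &#182; &quot;quoted&quot;
```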

    htmlspecialchars also has a double encode argument.

    htmlspecialchars ( string $string [, int $flags = ENT_COMPAT | ENT_HTML401 [, string $encoding = ini_get("default_charset") [, bool $double_encode = TRUE ]]] )

    $str = 'blah blah &#182; &#8230; blah blah "words in quotation marks" blah blah "more words in quotation marks" blah blah';
    echo htmlspecialchars($str, ENT_QUOTES, "UTF-8", false);
    

    Output

    blah blah &#182; &#8230; blah blah &quot;words in quotation marks&quot; blah blah &quot;more words in quotation marks&quot; blah blah
    


    If it's the whole tag, then you'll have to pull out the contents and modify it and then replace it so as to preserve the < and >, but it's not clear in the question if that is the case.
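If the input really were the whole tag, one way to pull out the contents, modify them and put them back could look like the sketch below. The helper name encode_meta_content is hypothetical, and the regex assumes that content is the last attribute of a tag shaped like the one in the question; a tag with a different layout would need a different pattern.

```php
<?php
// Hypothetical helper (not part of the original answer): re-encodes only the
// value of the content attribute and leaves the surrounding markup untouched.
function encode_meta_content(string $tag): string
{
    // Assumption: content="..." is the last attribute. The greedy (.*) runs up
    // to the final double quote that is followed by optional space and "/>" or ">".
    $result = preg_replace_callback(
        '/(content=")(.*)("\s*\/?>)/s',
        function (array $m): string {
            return $m[1] . htmlspecialchars($m[2], ENT_QUOTES, 'UTF-8', false) . $m[3];
        },
        $tag
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $tag;
}

$tag = '<meta name="description" content="blah &#182; "quoted" blah" />';
echo encode_meta_content($tag), "\n";
// <meta name="description" content="blah &#182; &quot;quoted&quot; blah" />
```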

    PS: there is not a whole lot of difference between htmlspecialchars and htmlentities. htmlspecialchars converts only &, ", ', < and >, while htmlentities also converts every other character that has a named entity, such as é (e acute) and other accented letters.
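A small comparison, using a made-up string, shows that difference. The é is written as the escape "\u{e9}" (PHP 7 or later) so the example does not depend on how the source file is saved.

```php
<?php
// Illustrative input (an assumption): an accented letter plus raw double quotes.
$str = "caf\u{e9} \"menu\"";

// htmlspecialchars only touches &, ", ', < and >, so the accented letter stays as-is.
$special = htmlspecialchars($str, ENT_QUOTES, 'UTF-8', false);

// htmlentities also converts characters that have a named entity, such as é.
$entities = htmlentities($str, ENT_QUOTES, 'UTF-8', false);

echo $special, "\n";  // café &quot;menu&quot;
echo $entities, "\n"; // caf&eacute; &quot;menu&quot;
```

For a UTF-8 page either function keeps the meta tag intact; the choice mostly affects how readable the page source is.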

    UPDATE

    I need the solution to be incorporated into my particular format of PHP code (i.e., a single line of PHP that maintains my existing functions/functionality), as miken32 brilliantly did above

    To put it in your code,

    <meta name="description" content="<?=htmlspecialchars(word_limiter(trim($paperResult['file_content']),27),ENT_QUOTES,"UTF-8",false);?>"/>
    

    UPDATE2

    preg_replace('/[\r\n]+/', ' ', $string) replaces any run of one or more (+) \r or \n characters with a single space. But it may be better to do it this way: preg_replace(['/[\r\n]+/', '/\s+/'], ' ', $string), which would collapse run-on spaces too.

     <meta name="description" content="<?=htmlspecialchars(word_limiter(preg_replace('/[\r\n]+/', ' ', trim($paperResult['file_content'])),27),ENT_QUOTES,"UTF-8",false);?>"/>
    

    Basically what it amounts to is that anything that makes the text shorter you probably want to do before word_limiter (whatever that is), and anything that makes it longer, like changing " to &quot;, you probably want to do after (maybe). It just seems more logical to me.
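That ordering can be sketched as a small pipeline. Both function names are hypothetical: limit_words is only a stand-in for word_limiter, whose real behaviour may differ, and meta_description just strings the steps together in the order described.

```php
<?php
// Hypothetical stand-in for word_limiter(): keeps the first $limit
// whitespace-separated words.
function limit_words(string $text, int $limit): string
{
    $words = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    return implode(' ', array_slice($words, 0, $limit));
}

// Hypothetical wrapper: shortening steps first, encoding last.
function meta_description(string $raw, int $limit = 27): string
{
    $text = strip_tags($raw);                        // shortens: drop markup
    $text = preg_replace('/\s+/', ' ', trim($text)); // shortens: collapse spaces and line breaks
    $text = limit_words($text, $limit);              // truncate to the word budget
    // Lengthens: encode quotes at the end, leaving existing entities alone.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8', false);
}

echo meta_description("<p>blah &#182;\r\n  \"quoted   words\" and more text</p>", 4), "\n";
// blah &#182; &quot;quoted words&quot;
```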

    Cheers!

    This answer was selected as the best answer by the asker.