douchengchen7959 2015-03-31 10:35

PHP - Improving a text capitalization function

I have a function that capitalizes the first letter of each sentence in a string:

function capitalize_sentence($text)
{
    $output = preg_replace_callback('/([.!?])\s*(\w)/', function ($matches) {
        return strtoupper($matches[1] . ' ' . $matches[2]);
    }, ucfirst(strtolower($text)));
    return $output;
}

When I have a simple string like this:

$text = 'hello.this works !';
var_dump($text);

$text = capitalize_sentence($text);
var_dump($text);die;

it works nicely:

string 'hello.this works !' (length=18) 

string 'Hello. This works !' (length=19)

But in my code, the string sometimes contains tags, like this:

$text = '<span>hello.</span> this <b>works</b> !';
var_dump($text);

$text = capitalize_sentence($text);
var_dump($text);die;

This gives me the following (as you can see, the first words are not capitalized):

string '<span>hello.</span> this <b>works</b> !' (length=39)

string '<span>hello.</span> this <b>works</b> !' (length=39)

How can I improve my code? I need to skip over the <tags> without deleting them, while still capitalizing the first word of each sentence as in the first example.

I need output like this:

string '<span>Hello.</span> This <b>works</b> !' (length=39)

Thank you!


  • duanbin198788 2015-03-31 11:07

    Try this:

    function ucSentence($str) {
        $len = strlen($str);
        $flagNeedUC = TRUE; // start of sentence flag
        $flagTag = FALSE;   // inside tag flag
        $endOfSentence = array('.', '!', '?');
        // Note: $str[$ix] replaces the old $str{$ix} syntax, which was removed in PHP 8.
        for ($ix = 0; $ix < $len; $ix += 1) {
            if ($flagTag) {
                if ('>' === $str[$ix]) { // resolve end tag
                    $flagTag = FALSE;
                }
            } else {
                if (in_array($str[$ix], $endOfSentence)) { // resolve end sentence
                    $flagNeedUC = TRUE;
                } elseif ('<' === $str[$ix]) { // resolve start tag
                    $flagTag = TRUE;
                } elseif (ctype_alpha($str[$ix]) && $flagNeedUC) { // resolve first char after sentence end
                    $flagNeedUC = FALSE;
                    $str[$ix] = strtoupper($str[$ix]);
                }
            }
        }
        return $str;
    }
    echo ucSentence('<span><b>hello. </b></span> this <b>works</b> !');
    

    It prints <span><b>Hello. </b></span> This <b>works</b> !
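    The same idea (leave tags alone, carry a "sentence start" flag across the text between them) can also be sketched with preg_split instead of a character loop. This is only an illustrative variant, not part of the original answer: the name ucSentenceSplit is hypothetical, and the pattern assumes that no attribute value contains a '>' character, so it shares the limitation of the simple version above.

```php
<?php
// Hypothetical helper (not from the original answer): capitalize the first
// letter of each sentence while copying anything matching <...> unchanged.
// Assumption: attribute values never contain '>'.
function ucSentenceSplit(string $str): string
{
    // With PREG_SPLIT_DELIM_CAPTURE the result alternates:
    // even indexes are text, odd indexes are the captured tags.
    $parts = preg_split('/(<[^>]*>)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
    $needUC = true; // true while the next letter starts a sentence

    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            continue; // a tag: leave it as it is
        }
        // Visit sentence terminators and ASCII letters from left to right.
        $parts[$i] = preg_replace_callback(
            '/[.!?]|[a-zA-Z]/',
            function ($m) use (&$needUC) {
                $ch = $m[0];
                if (strpos('.!?', $ch) !== false) {
                    $needUC = true; // a sentence just ended
                    return $ch;
                }
                if ($needUC) {
                    $needUC = false; // first letter of a new sentence
                    return strtoupper($ch);
                }
                return $ch;
            },
            $part
        );
    }
    return implode('', $parts);
}

echo ucSentenceSplit('<span>hello.</span> this <b>works</b> !'), "\n";
// prints: <span>Hello.</span> This <b>works</b> !
```

    Because the flag lives outside the callback, a sentence that ends inside one tag pair still capitalizes the next word after the closing tag, which is the case the question asks about.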

    UPDATE especially for @w35l3y :)

    I have added handling for attribute values, so that characters inside them are skipped. It recognizes the forms of attribute value that occur in the wild: <tag attr="value">, <tag attr='value'> and <tag attr=value attr=value>

    function ucSentence($str) {
        $len = strlen($str);
        $flagNeedUC = TRUE; // start of sentence flag
        $flagTag = FALSE;   // inside tag flag
        $stageAttr = FALSE; // inside attribute value: FALSE, '=' or the set of closing characters
        $endOfSentence = array('.', '!', '?');
        // Note: $str[$ix] replaces the old $str{$ix} syntax, which was removed in PHP 8.
        for ($ix = 0; $ix < $len; $ix += 1) {
            if ($flagTag) {
                if ($stageAttr) {
                    if ('=' === $stageAttr) { // first char after '=' decides how the value ends
                        if ('"' === $str[$ix]) {
                            $stageAttr = '"';
                        } elseif ('\'' === $str[$ix]) {
                            $stageAttr = '\'';
                        } else {
                            $stageAttr = ' >'; // unquoted value ends at space or '>'
                        }
                    } elseif (strpos($stageAttr, $str[$ix]) !== FALSE) { // resolve end of value
                        if ('>' === $str[$ix]) {
                            $flagTag = FALSE;
                        }
                        $stageAttr = FALSE;
                    }
                } else {
                    if ('>' === $str[$ix]) { // resolve end tag
                        $flagTag = FALSE;
                    } elseif ('=' === $str[$ix]) {
                        $stageAttr = '=';
                    }
                }
            } else {
                if (in_array($str[$ix], $endOfSentence)) { // resolve end sentence
                    $flagNeedUC = TRUE;
                } elseif ('<' === $str[$ix]) { // resolve start tag
                    $flagTag = TRUE;
                } elseif (ctype_alpha($str[$ix]) && $flagNeedUC) { // resolve first char after sentence end
                    $flagNeedUC = FALSE;
                    $str[$ix] = strtoupper($str[$ix]);
                }
            }
        }
        return $str;
    }
    
    $testArr = array(
        '<span><b>hello. </b></span> this <b>works</b> !',
        'test. <span title="jane <3 john"> <b>hello. </b></span> this <b>works</b> !',
        'test! <span title="hover -> here"> <b>hello. </b></span> this <b>works</b> !',
        'test <span title="jane <3 john"> <b>hello. </b></span> this <b>works</b> !',
        'test? <span title="hover -> here"> <b>hello. </b></span> this <b>works</b> !',
        'test <span title="hover -> here"> <b>hello. </b></span> this <b>works</b> !',
        'test. <span title=\'hover -> here\'> <b>hello. </b></span> this <b>works</b> !',
        'test. <span title=jane<3john data=jane> <b>hello. </b></span> this <b>works</b> !',
    );
    foreach ($testArr as $num => $testStr) {
        printf("[%d] %s\n", $num, ucSentence($testStr));
    }
    

    It prints:

    [0] <span><b>Hello. </b></span> This <b>works</b> !
    [1] Test. <span title="jane <3 john"> <b>Hello. </b></span> This <b>works</b> !
    [2] Test! <span title="hover -> here"> <b>Hello. </b></span> This <b>works</b> !
    [3] Test <span title="jane <3 john"> <b>hello. </b></span> This <b>works</b> !
    [4] Test? <span title="hover -> here"> <b>Hello. </b></span> This <b>works</b> !
    [5] Test <span title="hover -> here"> <b>hello. </b></span> This <b>works</b> !
    [6] Test. <span title='hover -> here'> <b>Hello. </b></span> This <b>works</b> !
    [7] Test. <span title=jane<3john data=jane> <b>Hello. </b></span> This <b>works</b> !
    
    This answer was selected by the asker as the best answer.