dongrunying7537 2012-01-30 17:23

Replacing multiple words in a string with PHP

I need a systematic way of replacing each word in a string separately by providing my own input for each word. I want to do this on the command line.

So the program reads in a string, and asks me what I want to replace the first word with, and then the second word, and then the third word, and so on, until all words have been processed.

The sentences in the string have to remain well-formed, so the algorithm should take care not to mess up punctuation and spacing.

Is there a proper way to do this?

2 answers

  • duanjian3920 2012-01-30 18:09

    Given some text

    $subject = <<<TEXT
    I need a systematic way of replacing each word in a string separately by providing my own input for each word. I want to do this on the command line.
    
    So the program reads in a string, and asks me what I want to replace the first word with, and then the second word, and then the third word, and so on, until all words have been processed.
    
    The sentences in the string have to remain well-formed, so the algorithm should take care not to mess up punctuation and spacing.
    
    Is there a proper way to do this?
    TEXT;
    

    First, tokenize the string into word tokens and "everything else" tokens (punctuation and whitespace; call them fill). A regular expression does this well:

    $pattern = '/(?P<fill>\W+)?(?P<word>\w+)?/';
    $r = preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
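
    To see what the pattern produces, here is a small sketch on a short sentence. The sentence and the `$sample` and `$seen` names are made up for illustration; the pattern and flags are the ones used above. It lists every named group that actually took part in a match, together with its byte offset.

    ```php
    <?php
    // Hypothetical short input, not the text from the question.
    $sample  = 'Hello, world! Bye.';
    $pattern = '/(?P<fill>\W+)?(?P<word>\w+)?/';

    preg_match_all($pattern, $sample, $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);

    $seen = array(); // "type:string@offset" entries in document order
    foreach ($matches as $matched) {
        foreach ($matched as $type => $match) {
            if (is_numeric($type)) continue;  // skip numeric duplicates of the named groups
            list($string, $offset) = $match;
            if ($offset < 0) continue;        // group did not take part in this match
            $seen[] = sprintf('%s:%s@%d', $type, $string, $offset);
        }
    }

    echo implode("\n", $seen), "\n";
    // word:Hello@0
    // fill:, @5
    // word:world@7
    // fill:! @12
    // word:Bye@14
    // fill:.@17
    ```

    Because both groups are optional, the pattern also produces an empty match at the end of the string; it contributes no named group and is skipped by the two `continue` checks.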
    

    The next job is to convert the matches into a more useful data structure: an array of tokens plus an index of all words used:

    $tokens = array(); # token stream
    $tokenIndex = 0;
    $words = array(); # index of words
    foreach($matches as $matched)
    {
        foreach($matched as $type => $match)
        {
            if (is_numeric($type)) continue;
            list($string, $offset) = $match;
            if ($offset < 0) continue;
    
    
            $token = new stdClass;
            $token->type = $type;
            $token->offset = $offset;
            $token->length = strlen($string);
    
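            if ($token->type === 'word')
            {
                if (!isset($words[$string]))
                {
                    $words[$string] = array('string' => $string, 'tokens' => array());
                }
                # all occurrences of a word share one string by reference, so
                # changing $words[...]['string'] later changes every such token
                $words[$string]['tokens'][] = &$token;
                $token->string = &$words[$string]['string'];
            } else {
                $token->string = $string;
            }
    
            $tokens[$tokenIndex] = &$token;
            $tokenIndex++;
            unset($token); # break the reference before the next iteration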
        }
    }
    

    As an example, you can then output all words:

    # list all words
    
    foreach($words as $word)
    {
        printf("Word '%s' used %d time(s)\n", $word['string'], count($word['tokens']));
    }
    

    With the sample text, this prints:

    Word 'I' used 3 time(s)
    Word 'need' used 1 time(s)
    Word 'a' used 4 time(s)
    Word 'systematic' used 1 time(s)
    Word 'way' used 2 time(s)
    Word 'of' used 1 time(s)
    Word 'replacing' used 1 time(s)
    Word 'each' used 2 time(s)
    Word 'word' used 5 time(s)
    Word 'in' used 3 time(s)
    Word 'string' used 3 time(s)
    Word 'separately' used 1 time(s)
    Word 'by' used 1 time(s)
    Word 'providing' used 1 time(s)
    Word 'my' used 1 time(s)
    Word 'own' used 1 time(s)
    Word 'input' used 1 time(s)
    Word 'for' used 1 time(s)
    Word 'want' used 2 time(s)
    Word 'to' used 5 time(s)
    Word 'do' used 2 time(s)
    Word 'this' used 2 time(s)
    Word 'on' used 2 time(s)
    Word 'the' used 7 time(s)
    Word 'command' used 1 time(s)
    Word 'line' used 1 time(s)
    Word 'So' used 1 time(s)
    Word 'program' used 1 time(s)
    Word 'reads' used 1 time(s)
    Word 'and' used 5 time(s)
    ... (and so on)
    

    Then do the actual work on the word tokens only, for example replacing one word with another:

    # change one word (and to AND)
    
    $words['and']['string'] = 'AND';
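
    The reason a single assignment is enough is that every token of a word holds a reference to the same string. A minimal sketch of just that mechanism, assuming a hand-built token list (the pieces and the `$out` variable are made up for illustration):

    ```php
    <?php
    // One shared entry per distinct word; word tokens bind to it by reference.
    $words  = array('and' => array('string' => 'and'));
    $tokens = array();

    foreach (array('salt', ' ', 'and', ' ', 'pepper', ' ', 'and', ' ', 'oil') as $piece) {
        $token = new stdClass;
        if ($piece === 'and') {
            $token->string = &$words['and']['string']; // shared, not copied
        } else {
            $token->string = $piece;                   // plain copy
        }
        $tokens[] = $token;
    }

    // A single assignment on the index changes both 'and' tokens.
    $words['and']['string'] = 'AND';

    $out = '';
    foreach ($tokens as $token) $out .= $token->string;
    echo $out, "\n"; // salt AND pepper AND oil
    ```

    Note that the index is keyed by the exact string, so the lookup is case-sensitive: `$words['and']` and `$words['And']` are different entries.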
    

    Finally you concatenate the tokens into a single string:

    # output the whole text
    
    foreach($tokens as $token) echo $token->string;
    

    With the sample text, this gives:

    I need a systematic way of replacing each word in a string separately by providing my own input for each word. I want to do this on the command line.
    
    So the program reads in a string, AND asks me what I want to replace the first word with, AND then the second word, AND then the third word, AND so on, until all words have been processed.
    
    The sentences in the string have to remain well-formed, so the algorithm should take care not to mess up punctuation AND spacing.
    
    Is there a proper way to do this?
    

    Job done. Make sure that word tokens are only replaced with valid word tokens: tokenize the user input as well, and report an error if it is not a single word token (that is, if it does not match the word pattern).
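
    One way to combine that check with the word-by-word prompting from the question is sketched below. The helpers `isSingleWord()` and `replaceWords()` are hypothetical names, not part of the code above, and the rule that an empty answer keeps the word is an assumption. The input source is passed in as a callable so the same function works with `STDIN` or with scripted answers.

    ```php
    <?php
    // Hypothetical helper: true only if $input is exactly one word token.
    function isSingleWord($input)
    {
        // \w+ is the word pattern of the tokenizer; the D modifier keeps $
        // from also matching before a trailing newline.
        return preg_match('/^\w+$/D', $input) === 1;
    }

    // Hypothetical helper: ask for a replacement for each word token in turn.
    // $ask receives the current word and returns the raw user input.
    function replaceWords($text, callable $ask)
    {
        preg_match_all('/(?P<fill>\W+)?(?P<word>\w+)?/', $text, $matches, PREG_SET_ORDER);

        $out = '';
        foreach ($matches as $m) {
            $out .= isset($m['fill']) ? $m['fill'] : '';  // fill is copied unchanged
            if (!isset($m['word']) || $m['word'] === '') continue;

            do {
                $input = trim((string) $ask($m['word']));
                if ($input === '') $input = $m['word'];   // assumption: empty answer keeps the word
            } while (!isSingleWord($input));              // ask again on invalid input
            $out .= $input;
        }
        return $out;
    }

    // Scripted answers for demonstration; the second one is rejected.
    $answers = array('Hi', 'two words', 'there');
    $result  = replaceWords('Hello, world!', function ($word) use (&$answers) {
        return array_shift($answers);
    });
    echo $result, "\n"; // Hi, there!

    // On the command line the callable could read from STDIN instead:
    // echo replaceWords($subject, function ($word) {
    //     echo "Replace '$word' with: ";
    //     return fgets(STDIN);
    // });
    ```

    Since only word tokens are ever replaced and fill tokens are copied through untouched, punctuation and spacing stay exactly as they were in the input.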

    Code/Demo
