dongrunying7537 2012-01-30 17:23
Answer accepted

Replacing multiple words in a string with PHP

I need a systematic way of replacing each word in a string separately by providing my own input for each word. I want to do this on the command line.

So the program reads in a string, and asks me what I want to replace the first word with, and then the second word, and then the third word, and so on, until all words have been processed.

The sentences in the string have to remain well-formed, so the algorithm should take care not to mess up punctuation and spacing.

Is there a proper way to do this?

2 answers

  • duanjian3920 2012-01-30 18:09

    Given some text:

    $subject = <<<TEXT
    I need a systematic way of replacing each word in a string separately by providing my own input for each word. I want to do this on the command line.
    
    So the program reads in a string, and asks me what I want to replace the first word with, and then the second word, and then the third word, and so on, until all words have been processed.
    
    The sentences in the string have to remain well-formed, so the algorithm should take care not to mess up punctuation and spacing.
    
    Is there a proper way to do this?
    TEXT;
    

    First, tokenize the string into word tokens and "everything else" tokens (call the latter fill). Regular expressions are helpful for that:

    $pattern = '/(?P<fill>\W+)?(?P<word>\w+)?/';
    $r = preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);
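
To see what this pattern produces, here is a minimal sketch that runs it on a short input and flattens the matches into a list of tokens. The input string and the `$flat` variable are hypothetical and chosen only for illustration. Note that `\w` by default covers ASCII letters, digits and the underscore, so this sketch assumes plain ASCII text.

```php
<?php
// Hypothetical short input, used only to illustrate the tokenizer.
$subject = "Hello, world! Bye.";

$pattern = '/(?P<fill>\W+)?(?P<word>\w+)?/';
preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE | PREG_SET_ORDER);

// Flatten the match sets into [type, string, offset] triples.
$flat = array();
foreach ($matches as $matched) {
    foreach (array('fill', 'word') as $type) {
        // A group that took no part in the match is either missing
        // (trailing group) or reported with offset -1.
        if (!isset($matched[$type]) || $matched[$type][1] < 0) {
            continue;
        }
        $flat[] = array($type, $matched[$type][0], $matched[$type][1]);
    }
}

foreach ($flat as $token) {
    list($type, $string, $offset) = $token;
    printf("%-4s offset %2d: '%s'\n", $type, $offset, $string);
}
```

For this input the loop reports six tokens, alternating between word and fill, and concatenating their strings in order reproduces the input exactly. That property is what keeps punctuation and spacing intact later on.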
    

    The job is now to convert the matches in $matches into a more useful data structure: an array of tokens in document order, plus an index of all distinct words that points back to the tokens using them:

    $tokens = array(); # token stream
    $tokenIndex = 0;
    $words = array(); # index of words
    foreach($matches as $matched)
    {
        foreach($matched as $type => $match)
        {
            if (is_numeric($type)) continue;
            list($string, $offset) = $match;
            if ($offset < 0) continue;
    
    
            $token = new stdClass;
            $token->type = $type;
            $token->offset = $offset;
            $token->length = strlen($string);
    
            if ($token->type === 'word')
            {
                if (!isset($words[$string]))
                {
                    $words[$string] = array('string' => $string, 'tokens' => array());
                }
                $words[$string]['tokens'][] = &$token;
                $token->string = &$words[$string]['string'];
            } else {
                $token->string = $string;
            }
    
    
            $tokens[$tokenIndex] = &$token;
            $tokenIndex++;
            unset($token);
        }
    }
    

    As an example, you can then list all words:

    # list all words
    
    foreach($words as $word)
    {
        printf("Word '%s' used %d time(s)\n", $word['string'], count($word['tokens']));
    }
    

    With the sample text, this gives:

    Word 'I' used 3 time(s)
    Word 'need' used 1 time(s)
    Word 'a' used 4 time(s)
    Word 'systematic' used 1 time(s)
    Word 'way' used 2 time(s)
    Word 'of' used 1 time(s)
    Word 'replacing' used 1 time(s)
    Word 'each' used 2 time(s)
    Word 'word' used 5 time(s)
    Word 'in' used 3 time(s)
    Word 'string' used 3 time(s)
    Word 'separately' used 1 time(s)
    Word 'by' used 1 time(s)
    Word 'providing' used 1 time(s)
    Word 'my' used 1 time(s)
    Word 'own' used 1 time(s)
    Word 'input' used 1 time(s)
    Word 'for' used 1 time(s)
    Word 'want' used 2 time(s)
    Word 'to' used 5 time(s)
    Word 'do' used 2 time(s)
    Word 'this' used 2 time(s)
    Word 'on' used 2 time(s)
    Word 'the' used 7 time(s)
    Word 'command' used 1 time(s)
    Word 'line' used 1 time(s)
    Word 'So' used 1 time(s)
    Word 'program' used 1 time(s)
    Word 'reads' used 1 time(s)
    Word 'and' used 5 time(s)
    ... (and so on)
    

    Then you do the actual work on the word tokens only. For example, replacing one word with another:

    # change one word (and to AND)
    
    $words['and']['string'] = 'AND';
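
This single assignment reaches every occurrence because each word token's string property was bound by reference to the entry in the word index. A minimal standalone sketch of that mechanism, using hypothetical variable names `$first` and `$second` for two tokens:

```php
<?php
// Hypothetical miniature of the word index: one shared string per distinct word.
$words = array('and' => array('string' => 'and'));

// Two tokens that both stand for the word "and".
$first = new stdClass;
$second = new stdClass;

// Bind each token's string property by reference to the index entry,
// mirroring `$token->string = &$words[$string]['string'];` from the loop.
$first->string = &$words['and']['string'];
$second->string = &$words['and']['string'];

// Changing the index entry once is visible through every bound token.
$words['and']['string'] = 'AND';

echo $first->string, ' ', $second->string, "\n"; // prints "AND AND"
```

The trade-off of this design is that a word can only be replaced everywhere at once. Replacing a single occurrence would mean breaking that token's reference first.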
    

    Finally you concatenate the tokens into a single string:

    # output the whole text
    
    foreach($tokens as $token) echo $token->string;
    

    With the sample text, this gives:

    I need a systematic way of replacing each word in a string separately by providing my own input for each word. I want to do this on the command line.
    
    So the program reads in a string, AND asks me what I want to replace the first word with, AND then the second word, AND then the third word, AND so on, until all words have been processed.
    
    The sentences in the string have to remain well-formed, so the algorithm should take care not to mess up punctuation AND spacing.
    
    Is there a proper way to do this?
    

    Job done. When the replacements come from the user, make sure word tokens are only replaced with valid word tokens: tokenize the user input as well and report an error if it is not a single word token (that is, if it does not match the word pattern).
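
The validation and the command-line prompting described above could be sketched roughly as follows. Both function names, `isSingleWord` and `promptReplacements`, are hypothetical, and the sketch assumes the `$words` index built earlier, so it asks once per distinct word rather than once per occurrence.

```php
<?php
// Hypothetical helper: true if $input is exactly one word token, i.e. it
// matches the \w+ part of the tokenizer pattern and nothing else.
function isSingleWord($input)
{
    // \A and \z anchor at the absolute start and end, so a trailing
    // newline or embedded whitespace is rejected.
    return preg_match('/\A\w+\z/', $input) === 1;
}

// Hypothetical prompt loop: asks for a replacement for every distinct word.
// $words is the word index; $in is a readable stream such as STDIN.
function promptReplacements(array &$words, $in)
{
    foreach (array_keys($words) as $key) {
        while (true) {
            printf("Replace '%s' with (empty keeps it): ", $words[$key]['string']);
            $line = fgets($in);
            if ($line === false) {
                return; // end of input: leave the remaining words untouched
            }
            $line = rtrim($line, "\r\n");
            if ($line === '') {
                break; // keep the word as it is
            }
            if (isSingleWord($line)) {
                // Assign through the index so tokens bound by reference follow.
                $words[$key]['string'] = $line;
                break;
            }
            echo "Not a single word, please try again.\n";
        }
    }
}
```

In a command-line script one would call `promptReplacements($words, STDIN);` after building the index and before concatenating the tokens. Rejecting anything but a single word keeps the fill tokens, and with them punctuation and spacing, untouched.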

    Code/Demo

    This answer was selected by the asker as the best answer.