dongou2019 2014-01-06 18:29
Views: 28
Accepted

Counting the occurrences of words in a text

I have a text in which I would like to count occurrences of the phrase "lorem ipsum dolor".

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ipsum lorem dolor Curabitur ac risus nunc. Dolor ipsum lorem.

The algorithm should count an occurrence even if the words of the search phrase appear in a different order. The expected matches are listed below. Is there a better way to achieve this than a regular expression that spells out every possible combination?

In this case the result should be 3:

  • Lorem ipsum dolor
  • Ipsum lorem dolor
  • Dolor ipsum lorem

The phrase will have about 3-4 words, and the string will be the content of a web page.

4 answers

  • dqqn32019 2014-01-06 19:10
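    $haystack = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ipsum lorem dolor Curabitur ac risus nunc. Dolor ipsum lorem.';
    $needle = 'Lorem ipsum dolor';

    // Split both strings into lowercase words (format 1 returns the words as an array).
    $hayWords = str_word_count(strtolower($haystack), 1);
    $needleWords = str_word_count(strtolower($needle), 1);
    $needleWordsCount = count($needleWords);

    // Keep only the haystack words that also occur in the phrase.
    // array_intersect() preserves keys, so each key is still the word's position in the text.
    $foundWords = array_intersect($hayWords, $needleWords);

    // Count every position that starts a run of $needleWordsCount consecutive phrase words.
    $count = array_reduce(
        array_keys($foundWords),
        function ($counter, $item) use ($foundWords, $needleWordsCount) {
            for ($i = $item; $i < $item + $needleWordsCount; ++$i) {
                if (!isset($foundWords[$i])) {
                    return $counter; // gap in the run: no match starts here
                }
            }
            return $counter + 1;
        },
        0
    );

    var_dump($count); // int(3)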
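
One caveat with the snippet above: it only checks that a run of consecutive words all belong to the phrase, so a text such as "lorem lorem lorem" would also be counted once. If that matters, a stricter variant could compare each window of words against the phrase as a sorted list. The following is a minimal sketch under that assumption, not part of the original answer; the function name `countPhrasePermutations` is hypothetical, and overlapping windows are still counted separately.

```php
<?php
// Hypothetical helper (an illustration, not the original answer's code):
// counts windows of consecutive words that are a reordering of the phrase.
function countPhrasePermutations(string $haystack, string $needle): int
{
    $hayWords = str_word_count(strtolower($haystack), 1);
    $needleWords = str_word_count(strtolower($needle), 1);
    $n = count($needleWords);
    if ($n === 0) {
        return 0; // an empty phrase never matches
    }
    sort($needleWords); // canonical order for comparison

    $count = 0;
    $last = count($hayWords) - $n;
    for ($i = 0; $i <= $last; ++$i) {
        // Take $n consecutive words and sort them the same way.
        $window = array_slice($hayWords, $i, $n);
        sort($window);
        if ($window === $needleWords) {
            ++$count; // same words, any order
        }
    }
    return $count;
}

$text = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. '
      . 'Ipsum lorem dolor Curabitur ac risus nunc. Dolor ipsum lorem.';
var_dump(countPhrasePermutations($text, 'Lorem ipsum dolor')); // int(3)
```

Sorting each window costs a little more than the key lookup in the snippet above, but for a phrase of 3-4 words the difference should be negligible.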
    
    This answer was selected by the asker as the best answer.
