dounai6613 2014-04-09 11:38

Get all sentences containing specific words

I am trying to get all the sentences from a text that contain any of a given set of words.

Here is my code and a demo:

http://ideone.com/fork/O9XtOY

<?php
$var = array('one','of','here','Another');
$str = 'Start of sentence one. This is a wordmatch one two three four! Another, sentence here.';
foreach ($var as $val)
{
    $m =$val; // word 
    $regex = '/[A-Z][^\.\!\;]*('.$m.')[^\.;!]*/';
    //
    if (preg_match($regex, $str, $match))
    {
        echo $match[0];     
        echo "\n";
    }
}
  1. Why did it not print the last sentence twice, even though both "here" and "Another" appear in it?
  2. How can I skip a sentence if it is already in the list? I want to remove the redundancy and store the sentences in some data structure or variable so that I can use them later.

  • dongye1942 2014-04-09 13:31

    I'd say your approach is a bit too convoluted. It's easier to:

    1. first get all sentences,
    2. and then filter this set by your criteria.

    E.g.:

    // keywords to search for
    $needles = array('one', 'of', 'here', 'Another');
    
    // input text
    $text = 'Start of sentence one. This is a wordmatch one two three four! Another, sentence here.';
    
    // get all sentences (the pattern could be too simple though)
    if (preg_match_all('/.+?[!.]\s*/', $text, $match)) {
    
      // select only those fitting the criteria
      $hits = array_filter($match[0], function ($sentence) use($needles) {
    
        // check each keyword
        foreach ($needles as $needle) {
          // return early on first hit (or-condition)
          if (false !== strpos($sentence, $needle)) {
            return true;
          }
        }
    
        return false;
      });
    
      // log output
      print_r($hits);
    }
    

    demo: http://ideone.com/pZfOb5


    Notes regarding:

    if (preg_match_all('/.+?[!.]\s*/', $text, $match)) {
    

    About the pattern:

    .+?   // select at least one char, ungreedy
    [!.]  // until one of the given sentence
          // delimiters is found (could/should be extended as needed)
    \s*   // add all following whitespace
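
    To make the effect of the pattern concrete, here is a minimal sketch that runs it on the sample text from the question. The `trim` step and the `$sentences` variable are additions for illustration only; they are not part of the code above.

    ```php
    <?php
    // Same sample text as in the question.
    $text = 'Start of sentence one. This is a wordmatch one two three four! Another, sentence here.';

    // Split the text into sentences; each match keeps its trailing whitespace.
    preg_match_all('/.+?[!.]\s*/', $text, $match);

    // Illustrative extra step: strip the trailing whitespace captured by \s*.
    $sentences = array_map('trim', $match[0]);

    print_r($sentences);
    ```

    With this input, `$sentences` holds three entries, one per sentence. Text that does not end in `!` or `.` (for example a trailing fragment or a sentence ending in `?`) would not be matched unless the delimiter class is extended.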
    

    array_filter($match[0], function ($sentence) use($needles) {
    

    array_filter does just what its name suggests: it returns a filtered version of the input array (here $match[0]). The supplied callback (the inline function) gets called for each element of the array and should return true or false to indicate whether the current element should be part of the new array. The use syntax gives the callback access to the $needles array, which is needed inside the function.
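
    One detail that may matter when storing the result for later use: array_filter preserves the original array keys. The sketch below uses a narrower, hypothetical keyword list (only 'here' and 'Another') so that some sentences are actually filtered out, and reindexes the result with array_values. The `$filtered` name and the hard-coded `$sentences` array are assumptions made for this example.

    ```php
    <?php
    // Sentences as they would come out of the splitting step (trimmed for readability).
    $sentences = array(
        'Start of sentence one.',
        'This is a wordmatch one two three four!',
        'Another, sentence here.',
    );

    // Hypothetical, narrower keyword list chosen so that filtering is visible.
    $needles = array('here', 'Another');

    $filtered = array_filter($sentences, function ($sentence) use ($needles) {
        foreach ($needles as $needle) {
            // Return on the first keyword found, so a sentence is kept only once
            // even if it contains several keywords.
            if (false !== strpos($sentence, $needle)) {
                return true;
            }
        }
        return false;
    });

    // array_filter keeps the original keys; array_values reindexes from 0.
    $hits = array_values($filtered);

    print_r($hits);
    ```

    Because each sentence is tested exactly once, a sentence containing both 'here' and 'Another' ends up in `$hits` a single time, which covers the redundancy concern from the question. Note that strpos does a plain substring check, so a keyword such as 'of' would also match inside a longer word.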
