dtwy2858
2011-12-15 16:44
Views: 54
Accepted

Wikipedia-style includes - cycle detection in PHP

I have a somewhat interesting question.

I may have oversimplified the example, but I'll do my best to describe my problem.

I am building a very simple wiki implementation from scratch. Everything was going well until I realized I need cycle detection to prevent endless loops of data populating the page and overflowing the stack.

The database structure is basic. It is more intricate than what is shown, but for the purposes of this post these two columns are all we need.

The Content field is straightforward: it stores the content of the page or WikiPart, which may contain links and includes. Links to another part are written as [[n]], and includes are written as {{n}}.

+---------------------------+
| id    |  Content          |
+---------------------------+
|  1    | see {{2}} here    | 
+---------------------------+
|  2    | {{1}} here [[4]]  | 
+---------------------------+
|  4    | {{1}}             | 
+---------------------------+



$html_for_screen = readData($this->Content);

function readData($wikipage) {

    $str = "";

    //Convert any wiki links to HTML Links
    $wikipage = Converter::convertWikink($wikipage);

    //Get ALL Include Link matches into array
    $wiki_inc = RegEx::getMatches($wikipage);

    //Iterate through the Matches
    foreach($wiki_inc as $wiki) {
         //traverse through each match.
         //but I assume here is where I would eventually have the trouble
         //with infinite loops
         $str .= readData($wiki);
    }

    return $str;

}

The question: how do I prevent wiki parts from endlessly including each other? For example, WikiPart 1 includes WikiPart 2, but WikiPart 2 includes WikiPart 1.

The parse or readData() function would just keep looping.

Regards

2 answers

  • douche3791 2011-12-15 17:21
    Accepted

    If you encounter a cycle, the include can no longer be resolved. Example:

    1: {{2}}
    2: {{1}}
    

    This will create an endless loop:

    1 -> 2 -> 1 -> 2 -> ...
    

    Since any computer's resources are limited, an endless loop will result in a crash.

    So what can you do? You could detect that and then error out by using a stack:

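    function readData($wikipage)
    {
        // Pages currently being expanded; static, so it persists across recursive calls.
        static $stack = array();
        if (in_array($wikipage, $stack))
        {
            throw new Exception(sprintf('Circular reference detected: %s -> %s', implode(' -> ', $stack), $wikipage));
        }
        $stack[] = $wikipage;
    
        ... (your existing code)
    
        // Pop before returning, so the same page can be included again elsewhere.
        array_pop($stack);
        return $str;
    }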
    

    Additionally, you can enforce a recursion limit by using count($stack) to determine the nesting level.

    Throwing an exception might not be the proper reaction to a cyclic reference, but it shows how the detection works. You can decide for yourself how you'd like to deal with the case, e.g. by returning FALSE or by leaving the include unresolved.
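
    As a rough illustration of the "not resolving" option combined with a count($stack) depth limit, here is a minimal self-contained sketch. Everything in it is an assumption made for the example: the in-memory $pages array stands in for the database table, and renderPage(), its $maxDepth parameter, and the marker strings are hypothetical names, not part of the original code. The stack is passed as an argument rather than kept static, so each branch of the recursion works on its own copy.

    ```php
    <?php
    declare(strict_types=1);

    // Hypothetical in-memory stand-in for the wiki table (id => Content).
    $pages = [
        1 => 'see {{2}} here',
        2 => '{{1}} here [[4]]',
        4 => '{{1}}',
    ];

    /**
     * Expand {{n}} includes recursively. $stack holds the ids that are
     * currently being expanded, so meeting an id twice means a cycle.
     */
    function renderPage(int $id, array $pages, array $stack = [], int $maxDepth = 30): string
    {
        if (in_array($id, $stack, true)) {
            // Cycle: do not resolve the include, emit a visible marker instead.
            return sprintf('[cycle: %s -> %d]', implode(' -> ', $stack), $id);
        }
        if (count($stack) >= $maxDepth) {
            // Nesting level is simply the size of the stack.
            return sprintf('[include depth limit reached at %d]', $id);
        }
        if (!array_key_exists($id, $pages)) {
            return sprintf('[missing page %d]', $id);
        }

        // Arrays are passed by value, so the caller's stack stays unchanged.
        $stack[] = $id;

        return preg_replace_callback(
            '/\{\{(\d+)\}\}/',
            function (array $m) use ($pages, $stack, $maxDepth): string {
                return renderPage((int) $m[1], $pages, $stack, $maxDepth);
            },
            $pages[$id]
        );
    }

    echo renderPage(1, $pages), PHP_EOL;
    // prints: see [cycle: 1 -> 2 -> 1] here [[4]] here
    ```

    With the table from the question, page 1 still renders, and the spot where page 2 tries to include page 1 again is replaced by a marker showing the path of the cycle. The [[4]] link is left alone because this sketch only handles includes.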

    Edit: Getting creative here:

    If the output is HTML, you could also let the user resolve the problem. When a cyclic reference is detected, an AJAX marker could be inserted that requests, in some form of overlay within the browser, the snippet that could not be resolved on the server side. That overlay would then contain the cyclic reference again (and could open another overlay), so the user would be able to explore the cyclic reference interactively.

  • dongyan5641 2011-12-15 17:05

    You could track your inclusions with a stack (or a set). If the page you are about to include is already somewhere in the stack, you stop.

    You could also just set a recursion limit, such as 30. This is not very clean, but it works.
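
    A set-based variant with a hard limit might look roughly like the sketch below. The names expandIncludes(), MAX_INCLUDE_DEPTH, and the sample $pages array are made up for this example. The set is an associative array keyed by page id, and an id is removed again once its page has been expanded, so the same page may appear on two separate branches without being mistaken for a cycle.

    ```php
    <?php
    declare(strict_types=1);

    const MAX_INCLUDE_DEPTH = 30; // arbitrary cap, as suggested above

    /**
     * $seen is the set of ids on the current include path (id => true),
     * shared by reference between all recursive calls.
     */
    function expandIncludes(int $id, array $pages, array &$seen, int $depth = 0): string
    {
        if (isset($seen[$id]) || $depth > MAX_INCLUDE_DEPTH || !isset($pages[$id])) {
            return ''; // stop: cycle, nesting too deep, or unknown page
        }
        $seen[$id] = true;
        try {
            return preg_replace_callback(
                '/\{\{(\d+)\}\}/',
                function (array $m) use ($pages, &$seen, $depth): string {
                    return expandIncludes((int) $m[1], $pages, $seen, $depth + 1);
                },
                $pages[$id]
            );
        } finally {
            // Leave the set clean so a sibling include may use this page again.
            unset($seen[$id]);
        }
    }

    // Hypothetical sample data: 1 includes 2 and 3, 2 includes 3, 3 includes 1.
    $pages = [1 => 'A{{2}}{{3}}', 2 => 'B{{3}}', 3 => 'C{{1}}'];
    $seen = [];
    echo expandIncludes(1, $pages, $seen), PHP_EOL; // prints: ABCC
    ```

    Here page 3 is expanded twice (once through page 2 and once directly), while each attempt to include page 1 again is silently dropped.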
