douxingmou4533 2011-10-11 04:19
Views: 132
Accepted

A problem with strtok()

I have been wrestling with this for a while. I know it's a lot of code to look at, but I have no idea where the problem lies and can't seem to narrow it down. I will bounty it.

I wrote this class to parse bbcodes. It uses strtok() primarily, and the class works great unless you put two tags right next to each other, and I can't for the life of me figure out why.

For instance [b] [i]test1[/i] [/b] results in <strong> <em>test1</em> </strong>. Yet [b][i]test1[/i][/b] results in <strong>i]test1/b]</strong>. The last </strong> tag is only in there because the parser automatically closes tags it could not find a closing tag for in the string. It somehow misses the [i] and [/b] tags completely.

Here's the class as well as the one subclass it uses for setting up the various bbcodes. The subclass is basically just a data structure with no behaviours.

<?php
    // beware images can contain any url/any get request. beware of csrf
    class Lev_TextProcessor_Extension_BbCode {

        protected $elements = array();
        protected $openTags = array();

        public function __construct() {
            $this->elements['b'] = new Lev_TextProcessor_Extension_BbCode_Element('<strong>', '</strong>');
            $this->elements['i'] = new Lev_TextProcessor_Extension_BbCode_Element('<em>', '</em>');
            $this->elements['u'] = new Lev_TextProcessor_Extension_BbCode_Element('<span style="text-decoration: underline;">', '</span>');
            $this->elements['s'] = new Lev_TextProcessor_Extension_BbCode_Element('<span style="text-decoration: line-through;">', '</span>');
            $this->elements['size'] = new Lev_TextProcessor_Extension_BbCode_Element('<span style="font-size: ', '</span>', 'px;">');
            $this->elements['color'] = new Lev_TextProcessor_Extension_BbCode_Element('<span style="color: ', '</span>', ';">');
            $this->elements['center'] = new Lev_TextProcessor_Extension_BbCode_Element('<div style="text-align: center;">', '</div>', '', true, true, false);
            $this->elements['url'] = new Lev_TextProcessor_Extension_BbCode_Element('<a href="', '</a>', '">');
            $this->elements['email'] = new Lev_TextProcessor_Extension_BbCode_Element('<a href="mailto:', '</a>', '">');
            $this->elements['img'] = new Lev_TextProcessor_Extension_BbCode_Element('<img src="', '" alt="" />', '', false, false, true);
            $this->elements['youtube'] = new Lev_TextProcessor_Extension_BbCode_Element('<object width="400" height="325"><param name="movie" value="http://www.youtube.com/v/{param}"></param><embed src="http://www.youtube.com/v/', '" type="application/x-shockwave-flash" width="400" height="325"></embed></object>', '', false, false, false);
            $this->elements['code'] = new Lev_TextProcessor_Extension_BbCode_Element('<pre><code>', '</code></pre>', '', true, false, false);
        }

        public function processText($input) {
            // pre processing
            $input = htmlspecialchars($input, ENT_NOQUOTES);
            $input = nl2br($input);
            $input = str_replace(array("\n", "\r"), '', $input);
            // start main processing
            $output = '';
            $allow_child_tags = true;
            $allow_child_quotes = true;

            $string_segment = strtok($input, '[');

            do {
                // check content for quotes
                if ($allow_child_quotes === false) {
                    if (strpos($string_segment, '"') === false) {
                        $output .= $string_segment;
                    }
                } else {
                    // add content to output
                    $output .= $string_segment;
                }

                $tag_contents = strtok(']');

                if (strpos($tag_contents, '/') === 0) {
                    // closing tag
                    $tag = substr($tag_contents, 1);
                    if (isset($this->elements[$tag]) === true && array_search($tag, $this->openTags) !== false) {
                        // tag found
                        do {
                            // close tags till matching tag found
                            $last_open_tag = array_pop($this->openTags);
                            $output .= $this->elements[$last_open_tag]->htmlAfter;
                        } while ($last_open_tag !== $tag);
                        $allow_child_tags = true;
                        $allow_child_quotes = true;
                    }
                } else {
                    // opening tag
                    // separate tag name from argument if there is one
                    $equal_pos = strpos($tag_contents, '=');
                    if ($equal_pos === false) {
                        $tag_name = $tag_contents;
                    } else {
                        $tag_name = substr($tag_contents, 0, $equal_pos);
                        $tag_argument = substr($tag_contents, $equal_pos + 1);
                    }
                    if (isset($this->elements[$tag_name]) === true) {
                        // tag found
                        if (($this->elements[$tag_name]->allowParentTags === true || count($this->openTags) === 0) && $allow_child_tags === true) {
                            // add tag to open tag list and set flags
                            $this->openTags[] = $tag_name;
                            $allow_child_tags = $this->elements[$tag_name]->allowChildTags;
                            $allow_child_quotes = $this->elements[$tag_name]->allowChildQuotes;
                            $output .= $this->elements[$tag_name]->htmlBefore;
                            // if argument exists
                            if ($equal_pos !== false) {
                                if (strpos($tag_argument, '"') === false) {
                                    $output .= $tag_argument;
                                }
                                $output .= $this->elements[$tag_name]->htmlCenter;
                            }
                        }
                    }
                }

                $string_segment = strtok('[');
            } while ($string_segment !== false);
            // close left over tags
            while ($tag = array_pop($this->openTags)) {
                $output .= $this->elements[$tag]->htmlAfter;
            }
            return $output;
        }
    }
?>

<?php

    class Lev_TextProcessor_Extension_BbCode_Element {

        public $htmlBefore;
        public $htmlAfter;
        public $htmlCenter;
        public $allowChildQuotes;
        public $allowChildTags;
        public $allowParentTags;

        public function __construct($html_before, $html_after, $html_center = '', $allow_child_quotes = true, $allow_child_tags = true, $allow_parent_tags = true) {
            if ($allow_child_quotes === false && $allow_child_tags === true) throw new Lev_TextProcessor_Exception('You may not allow child tags if you do not allow child quotes.');
            $this->htmlBefore = $html_before;
            $this->htmlAfter = $html_after;
            $this->htmlCenter = $html_center;
            $this->allowChildQuotes = $allow_child_quotes;
            $this->allowChildTags = $allow_child_tags;
            $this->allowParentTags = $allow_parent_tags;
        }
    }
?>

edit

Fixed by creating the following class for tokenizing.

<?php

    // unlike PHP's strtok() function, this class will not skip over empty tokens.
    class Lev_TextProcessor_Tokenizer {

        protected $string;

        public function __construct($string) {
            $this->string = $string;
        }

        public function getToken($token) {
            $segment_length = strcspn($this->string, $token);
            $token = substr($this->string, 0, $segment_length);
            $this->string = substr($this->string, $segment_length + 1);
            return $token;
        }
    }
?>
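
To make the difference from strtok() concrete, here is a minimal sketch that walks the same getToken() logic over the failing input. DemoTokenizer is a hypothetical stand-in for the class above, and both the isDone() helper and the alternating '[' / ']' loop are assumptions about how the parser would drive it, not code from the original class.

```php
<?php
// Hypothetical stand-in that copies the getToken() logic of
// Lev_TextProcessor_Tokenizer; isDone() is an added helper for this demo.
class DemoTokenizer {
    protected $string;

    public function __construct($string) {
        $this->string = $string;
    }

    public function getToken($delimiters) {
        // Length of the leading run that contains no delimiter character.
        $segment_length = strcspn($this->string, $delimiters);
        $token = substr($this->string, 0, $segment_length);
        // Drop the token plus one delimiter. The cast covers older PHP
        // versions where substr() can return false at the end of the string.
        $this->string = (string) substr($this->string, $segment_length + 1);
        return $token;
    }

    public function isDone() {
        return $this->string === '';
    }
}

$tokenizer = new DemoTokenizer('[b][i]test1[/i][/b]');
$tokens = array();
while (!$tokenizer->isDone()) {
    $tokens[] = $tokenizer->getToken('['); // text before the next tag, may be ''
    $tokens[] = $tokenizer->getToken(']'); // tag contents
}
var_dump($tokens);
```

Here the text segments between adjacent tags come back as empty strings instead of being skipped, so the calls stay in step with the '[' / ']' alternation and every tag name ("b", "i", "/i", "/b") lands in a tag slot.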

1 answer

  • dongpai2468 2011-10-11 19:19

    Although I don't think this is really the solution, it seems this is the only way I'm going to get my point across.

    It could be something about the way strtok() works that keeps you from getting the results you want.

    Although not perfect, I was able to obtain results close to what you were expecting with this:

     <?php
     // Alternate the delimiter between '[' and ']' on successive calls.
     $data1 = strtok('[b][i]test1[/i][/b]', '[');
     $data2 = strtok(']');
     $data3 = strtok('[');
     $data4 = strtok(']');
     $data5 = strtok('[');
     $data6 = strtok(']');
     var_dump($data1, $data2, $data3, $data4, $data5, $data6);
     /*
      OUTPUT
        string(2) "b]"
        string(1) "i"
        string(5) "test1"
        string(2) "/i"
        string(3) "/b]"
        bool(false)
     */
     ?>
    

    As I said, it isn't perfect, but maybe seeing this will help you on your way to a solution. I personally have never handled BBCode with this type of parsing; I have used preg_match() instead.
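
    For comparison, a regex-based tokenizer might look like the sketch below. This is only an assumption about what such an approach could be, not the code I actually used: the tag pattern, the $parts / $tokens names, and the array layout of each token are all made up for illustration. It uses preg_split() to cut the input into tags and text, then preg_match() to classify each piece.

```php
<?php
$input = '[b][i]test1[/i][/b]';

// Split on anything shaped like [tag], [tag=arg] or [/tag] and keep the tags
// themselves (PREG_SPLIT_DELIM_CAPTURE); drop empty pieces between them.
$parts = preg_split(
    '/(\[\/?[a-z]+(?:=[^\]"]*)?\])/i',
    $input,
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);

// Classify each piece as an opening tag, a closing tag, or plain text.
$tokens = array();
foreach ($parts as $part) {
    if (preg_match('/^\[(\/?)([a-z]+)(?:=([^\]"]*))?\]$/i', $part, $m)) {
        $kind = ($m[1] === '/') ? 'close' : 'open';
        $argument = isset($m[3]) ? $m[3] : null; // only set for [tag=arg]
        $tokens[] = array($kind, strtolower($m[2]), $argument);
    } else {
        $tokens[] = array('text', $part, null);
    }
}
var_dump($tokens);
```

    Because the tags are matched as whole units, two tags sitting next to each other do not need an empty text token between them, which is the case that trips up strtok().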

    This answer was accepted by the asker as the best answer.
