douyou1901 2015-12-16 10:50
39 views
Accepted

Using preg_split to split chords and words

I'm working on a little piece of code for handling song tabs, but I'm stuck on a problem.

I need to parse each song tab line and split it into chunks, with chords on the one hand and words on the other.

Each chunk would look like this:

$line_chunk = array(
    0 => //part of line containing one or several chords
    1 => //part of line containing words
);

They should stay "grouped": by this I mean that the split should happen only where the function reaches the boundary between chords and words.

I guess I should use preg_split to achieve this. I ran some tests, but I've only been able to split on individual chords, not on "groups" of chords:

$line_chunks = preg_split('/(\[[^]]*\])/', $line, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

These examples show what I would like to get:

On a line containing no chords:

$input = '{intro}';

$results = array(
    array(
        0 => null,
        1 => '{intro}'
    )
);

On a line containing only chords:

$input = '[C#] [Fm] [C#] [Fm] [C#] [Fm]';

$results = array(
    array(
        0 => '[C#] [Fm] [C#] [Fm] [C#] [Fm]',
        1 => null
    )
);

On a line containing both:

$input = '[C#]I’m looking for [Fm]you [G#]';

$results = array(
    array(
        0 => '[C#]',
        1 => 'I’m looking for'
    ),
    array(
        0 => '[Fm]',
        1 => 'you '
    ),
    array(
        0 => '[G#]',
        1 => null
    ),
);

Any ideas on how to do this?

Thanks!

2 answers

  • duanmu1736 2015-12-16 11:09

    preg_split isn't the way to go. Most of the time, when you have a complicated split task, it's easier to match what you are interested in than to split on a separator that is hard to define.

    A preg_match_all approach:

    $pattern = '~ \h*
    (?|        # open a "branch reset group"
        ( \[ [^]]+ ] (?: \h* \[ [^]]+ ] )*+ ) # one or more chords in capture group 1
        \h*
        ( [^[\n]* (?<=\S) )  # optional lyrics (group 2)
      |                      # OR
        ()                   # no chords (group 1)
        ( [^[\n]* [^\s[] )   # lyrics (group 2)
    )          # close the "branch reset group"
    ~x';
    
    if (preg_match_all($pattern, $input, $matches, PREG_SET_ORDER)) {
        $result = array_map(function($i) { return [$i[1], $i[2]]; }, $matches);
        print_r($result);
    }
    

    A branch reset group preserves the same group numbering for each branch.
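
    To illustrate the numbering, here is a minimal sketch that compares a plain alternation with a branch reset group. The one-letter patterns and the variable names are made up for the illustration and are not part of the pattern above.

    ```php
    <?php
    // Plain alternation: each alternative gets its own group number.
    preg_match('~(a)|(b)~', 'b', $plain);

    // Branch reset group (?|...): every alternative captures into group 1.
    preg_match('~(?|(a)|(b))~', 'b', $reset);

    var_dump($plain); // 'b' ends up in group 2, group 1 is an empty string
    var_dump($reset); // 'b' ends up in group 1
    ```

    This is why the callback can always read $i[1] for chords and $i[2] for lyrics, whichever branch matched.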

    Note: feel free to add:

    if (empty($i[1])) $i[1] = null;    
    if (empty($i[2])) $i[2] = null;
    

    in the map function if you want to obtain null items instead of empty items.

    Note2: if you work line by line, you can remove the \n from the pattern.
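
    Putting the pattern and the null conversion together, a sketch could look like the following. The function name parse_tab_line is hypothetical, and the strict comparison with '' is used here instead of empty() on the assumption that a lyric consisting of just "0" should be kept.

    ```php
    <?php
    // Hypothetical helper (name not from the answer): parses one tab line
    // into [chords, lyrics] pairs, with null standing in for a missing part.
    function parse_tab_line(string $line): array
    {
        $pattern = '~ \h*
        (?|
            ( \[ [^]]+ ] (?: \h* \[ [^]]+ ] )*+ )  # chords (group 1)
            \h*
            ( [^[\n]* (?<=\S) )                     # optional lyrics (group 2)
          |
            ()                                      # no chords (group 1)
            ( [^[\n]* [^\s[] )                      # lyrics (group 2)
        )
        ~x';

        if (!preg_match_all($pattern, $line, $matches, PREG_SET_ORDER)) {
            return []; // nothing recognisable on this line
        }

        return array_map(function ($m) {
            // Turn empty captures into null, as the question asks.
            return [
                $m[1] === '' ? null : $m[1],
                $m[2] === '' ? null : $m[2],
            ];
        }, $matches);
    }

    print_r(parse_tab_line('[C#]I’m looking for [Fm]you [G#]'));
    ```

    For the three inputs in the question this gives the requested grouping, with one difference: the (?<=\S) lookbehind trims trailing whitespace from the lyrics, so the second chunk holds 'you' rather than 'you '.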
