duanmao1975 2012-12-10 15:30

RegEx syntax for splitting the elements of a recipe ingredient list

I am processing a list of recipe ingredients, an example of which looks like this:

Peanuts, Wheat Starch, Vegetable Oil, Modified Starch, Sugar, Mumbai Spice Flavour [Onion Powder, Herbs and Spices (Cumin, Curry Powder, Chilli Powder, Coriander), Garlic Powder, Potassium Chloride, Yeast Extract, Yeast Powder (contains Gluten and Barley), Citric Acid, Flavouring (contains Barley, Soya, Wheat, Celery)], Rice Flour, Salt, Colours (Concentrated Beetroot Juice, Curcumin, Paprika Extract).

I wish to explode each ingredient into an array (using PHP), separated by commas. The problem is that some ingredients are subdivided. In this example, the components of 'Mumbai Spice Flavour' are delimited by square brackets, and some of those components have sub-ingredients of their own, which are in turn delimited by regular brackets (parentheses).

A standard:

explode(",", $recipeStr) 

will give me a very messy result, so I'm looking for a Regular Expression statement that will explode each distinct element into an array, to take account of the optional square brackets, and optional sub-brackets. It also needs to be able to handle brackets that are not nested within square brackets.
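To make the "messy result" concrete, here is a small sketch using a shortened, made-up version of the ingredient list (the shortened string is an assumption for illustration, not the real data):

```php
<?php
// Shortened sample in the same shape as the ingredient list above (illustrative only).
$recipeStr = 'Sugar, Mumbai Spice Flavour [Onion Powder, Herbs and Spices (Cumin, Coriander)], Salt.';

// Naive split: every comma is treated as a delimiter, including those inside brackets.
$parts = array_map('trim', explode(',', $recipeStr));

print_r($parts);
// The bracketed ingredient is torn into three fragments:
//   "Mumbai Spice Flavour [Onion Powder", "Herbs and Spices (Cumin", "Coriander)]"
```
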

The desired result would be an array list that looks like:

-Peanuts
-Wheat Starch
-Vegetable Oil
-Modified Starch
-Sugar
-Mumbai Spice Flavour [Onion Powder, Herbs and Spices (Cumin, Curry Powder, Chilli Powder, Coriander), Garlic Powder, Potassium Chloride, Yeast Extract, Yeast Powder (contains Gluten and Barley), Citric Acid, Flavouring (contains Barley, Soya, Wheat, Celery)]
-Rice Flour
-Salt
-Colours (Concentrated Beetroot Juice, Curcumin, Paprika Extract)

I am not very good at RegEx syntax, and so if any answer could also explain the syntax logic that would be greatly appreciated.


  • dongxuan2015 2012-12-10 15:43

    This seems to work (but maybe it's not the best solution) :)

    preg_match_all('/\w[\w\s-]*(?:\[.*?\]|\(.*?\))?/', $string, $matches);
    

    It matches a word character followed by zero or more word characters, spaces, or dashes (add anything else you want to capture to this character class), optionally followed by either [...] or (...). Brackets of the same type cannot be nested.
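Since the question asks for the syntax to be explained, the same pattern can be written with the `x` (extended) flag so each piece carries a comment. This is only a sketch: the sample string is a shortened, made-up version of the list in the question, and the `trim` call is an extra tidy-up step, not part of the original one-liner.

```php
<?php
// Shortened sample in the same shape as the question's list (illustrative only).
$string = 'Peanuts, Wheat Starch, Mumbai Spice Flavour [Onion Powder, Herbs and Spices (Cumin, Coriander), Citric Acid], Salt, Colours (Curcumin, Paprika Extract).';

// Same pattern as above; the x flag makes PCRE ignore layout whitespace and "#" comments.
$pattern = '/
    \w              # an item starts with a word character (letter, digit or underscore)
    [\w\s-]*        # then any run of word characters, whitespace or hyphens
    (?:             # optional non-capturing group for a bracketed suffix:
        \[ .*? \]   #   a square-bracket block, matched lazily up to the first "]"
      |             #   or
        \( .*? \)   #   a parenthesised block, matched lazily up to the first ")"
    )?              # the whole suffix may be absent
/x';

preg_match_all($pattern, $string, $matches);

// $matches[0] holds the full matches; trim is a precaution against stray spaces.
$items = array_map('trim', $matches[0]);

print_r($items);
```

The commas and the trailing full stop never appear in the result because nothing in the pattern can match them, so the regex engine simply skips over them while looking for the next word character.
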

    So you can have:

    - something
    - anything [...]
    - something different (...)
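The nesting caveat can be seen on a made-up input where parentheses sit inside parentheses. If such input is possible, one alternative (not part of this answer) is to skip the regex and split on commas only at bracket depth zero. The sketch below assumes brackets are balanced and that a trailing full stop should be dropped; `splitTopLevel` is a hypothetical helper name.

```php
<?php
// Hypothetical helper, not from the answer: split on commas that are outside all brackets.
function splitTopLevel(string $list): array
{
    // Assumption: a trailing full stop is punctuation, not part of the last ingredient.
    $list = rtrim(trim($list), '.');

    $items = [];
    $current = '';
    $depth = 0; // how many brackets are currently open

    foreach (str_split($list) as $char) {
        if ($char === '[' || $char === '(') {
            $depth++;
        } elseif ($char === ']' || $char === ')') {
            $depth = max(0, $depth - 1);
        }

        if ($char === ',' && $depth === 0) {
            // Top-level comma: close the current item.
            $items[] = trim($current);
            $current = '';
        } else {
            $current .= $char;
        }
    }

    $last = trim($current);
    if ($last !== '') {
        $items[] = $last;
    }

    return $items;
}

// Made-up input with parentheses nested inside parentheses.
$nested = 'Sauce (Tomato, Spice Mix (Cumin, Chilli), Salt), Rice Flour.';

// The regex from this answer stops at the first ")" and cuts the item short.
preg_match_all('/\w[\w\s-]*(?:\[.*?\]|\(.*?\))?/', $nested, $matches);
$byRegex = $matches[0];

// The depth counter keeps the nested item whole.
$byDepth = splitTopLevel($nested);

print_r($byRegex);
print_r($byDepth);
```

For the list in the question, where square brackets only ever contain parentheses and never the other way round, the regex is enough; the depth counter only matters once same-type nesting shows up.
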
    
    This answer was selected as the best answer by the asker.