doumingchen3628 2014-05-15 13:55

Getting part of a string after a space

I'm receiving a string from the Wikipedia API which looks like this:

{{Wikibooks|Wikijunior:Countries A-Z|France}} {{Sister project links|France}} * [http://www.bbc.co.uk/news/world-europe-17298730 France] from the [[BBC News]] * [http://ucblibraries.colorado.edu/govpubs/for/france.htm France] at ''UCB Libraries GovPubs'' *{{dmoz|Regional/Europe/France}} * [http://www.britannica.com/EBchecked/topic/215768/France France] ''Encyclopædia Britannica'' entry * [http://europa.eu/about-eu/countries/member-countries/france/index_en.htm France] at the [[European Union|EU]] *{{Wikiatlas|France}} *{{osmrelation-inline|1403916}} * [http://www.ifs.du.edu/ifs/frm_CountryProfile.aspx?Country=FR Key Development Forecasts for France] from [[International Futures]] ;Economy *{{INSEE|National Institute of Statistics and Economic Studies}} * [http://stats.oecd.org/Index.aspx?QueryId=14594 OECD France statistics] 

I need both the actual URLs and the description of each URL. So, for example, for [http://www.bbc.co.uk/news/world-europe-17298730 France] from the [[BBC News]] I need "http://www.bbc.co.uk/news/world-europe-17298730" and also "France] from the [[BBC News]] ", but without the brackets, like so: "France from the BBC News".

I managed to get the first part by doing the following:

if (preg_match_all('/\[(http.*?)\s/', $result, $extmatch)) {
    // $extmatch[1] holds the URLs, including the leading "http"
    $mt = str_replace("[[", "", $extmatch[1]);
}

But I don't know how to go about getting the second part (I'm quite weak at regex, unfortunately :-( ).

Any ideas?


  • doucong8553 2014-05-15 15:08

    A solution not using regex:

    1. Explode the string at '*'
    2. Ditch the parts that do not start with '[' (for example, those starting with '{')
    3. Remove all the square brackets
    4. Explode the string at the space character
    5. The first part is the link
    6. Glue the rest back together for the description

    The code:

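    // $str holds the wikitext received from the API
    $parts = explode('*', $str);
    $links = array();
    foreach ($parts as $k => $v) {
        $parts[$k] = ltrim($v);
        // Keep only the parts that begin with an external link, i.e. with '['
        if (substr($parts[$k], 0, 1) !== '[') {
            unset($parts[$k]);
            continue;
        }
        // Strip every square bracket
        $parts[$k] = preg_replace('/\[|\]/', '', $parts[$k]);
        $subparts = explode(' ', $parts[$k]);
        $links[$k][0] = $subparts[0];            // the URL
        unset($subparts[0]);
        $links[$k][1] = implode(' ', $subparts); // the description
    }

    echo '<pre>' . print_r($links, true) . '</pre>';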
    

    The result:

    Array
    (
        [1] => Array
            (
                [0] => http://www.bbc.co.uk/news/world-europe-17298730
                [1] => France from the BBC News 
            )
    
        [2] => Array
            (
                [0] => http://ucblibraries.colorado.edu/govpubs/for/france.htm
                [1] => France at ''UCB Libraries GovPubs'' 
            )
    
        [4] => Array
            (
                [0] => http://www.britannica.com/EBchecked/topic/215768/France
                [1] => France ''Encyclopædia Britannica'' entry 
            )
    
        [5] => Array
            (
                [0] => http://europa.eu/about-eu/countries/member-countries/france/index_en.htm
                [1] => France at the European Union|EU 
            )
    
        [8] => Array
            (
                [0] => http://www.ifs.du.edu/ifs/frm_CountryProfile.aspx?Country=FR
                [1] => Key Development Forecasts for France from International Futures ;Economy 
            )
    
        [10] => Array
            (
                [0] => http://stats.oecd.org/Index.aspx?QueryId=14594
                [1] => OECD France statistics 
            )
    
    )
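
The result above still contains some wiki markup (the '' italic markers, the piped link "European Union|EU", and the trailing ";Economy"). Since the question asked about regex, here is a minimal regex-based sketch of the same idea that also cleans those up. The function name extractExternalLinks is hypothetical, and treating " ;" as the start of a new section heading is an assumption based on this one sample, not a general wikitext parser.

```php
<?php
// Hypothetical helper, not part of the original answer.
function extractExternalLinks($wikitext)
{
    $links = array();
    // Split into list items at '*', as in the steps above.
    foreach (explode('*', $wikitext) as $item) {
        // Match "[URL label] rest of item"; items without an external link are skipped.
        if (!preg_match('~^\s*\[(https?://\S+)\s+([^\]]*)\](.*)$~s', $item, $m)) {
            continue;
        }
        $text = $m[2] . $m[3];
        // Assumption: " ;Something" begins a new section heading, so cut it off.
        $text = preg_replace('/\s;.*$/s', '', $text);
        // [[Target|Label]] becomes Label, [[Target]] becomes Target.
        $text = preg_replace('/\[\[(?:[^\]|]*\|)?([^\]]*)\]\]/', '$1', $text);
        // Drop ''italic'' and '''bold''' quote markers.
        $text = preg_replace("/'{2,}/", '', $text);
        $links[] = array($m[1], trim($text));
    }
    return $links;
}

// Usage with two items taken from the sample string in the question.
$sample = "* [http://www.bbc.co.uk/news/world-europe-17298730 France] from the [[BBC News]] "
        . "* [http://europa.eu/about-eu/countries/member-countries/france/index_en.htm France] at the [[European Union|EU]] ";
print_r(extractExternalLinks($sample));
```

For the two sample items this yields the descriptions "France from the BBC News" and "France at the EU". Unlike the explode-based version, the array is re-indexed from 0 and the descriptions are trimmed.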
    
    Accepted by the asker as the best answer.