drgzmmy6379
2017-04-05 12:51 Views: 111
Accepted

Extract all file link URLs from a MySQL database string field into a list

I need to get a list of all the file URLs stored in one of my database fields.

MySQL database, article table:

`id` | `subject` | `content`

The value of content is HTML text containing one or more file URLs, for example:

<p>this is the answer for ..., you can refer to below screenshot:</p>
<img src="http://the_url_of_image_here/imagename.jpg"/>

<p>or refer to below document</p>

<a href="http://the_url_of_doc_here/guide.ppt">guide</a>
<a href="http://the_url_of_doc_here/sample.docx">sample</a>

There are 2 types of files:

  1. image, with extension jpg, jpeg, png, bmp, gif
  2. document, with extension doc, docx, ppt, pptx, xls, xlsx, pdf, xps

I googled a lot, and it looks hard to do with MySQL alone. PHP should make it easy, so I wrote some code, but it does not work.

Thanks cars10, I solved it.

function export_articles_link()
{
    global $date_from, $date_to;
    $filename = "kb_articles_link_".$date_from."_".$date_to.".xlsx";
    header('Content-Disposition: attachment; filename="'.XLSXWriter::sanitize_filename($filename).'"');
    header("Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
    header('Content-Transfer-Encoding: binary');
    header('Cache-Control: must-revalidate');
    header('Pragma: public');
    $query = 'SELECT `content` FROM `kb_articles` WHERE ((DATE(`dt`) BETWEEN \'' . $date_from . '\' AND \'' . $date_to . '\') AND (`content` LIKE \'%<img src=%\' or `content` LIKE \'%<a href="http:%\')) order by id asc';
    $result = mysql_query($query);
    $writer = new XLSXWriter(); 
    $img_list = array();
    while ($row=mysql_fetch_array($result))
    {
        $text = $row['content'];
        preg_match_all('!http://.+\.(?:jpe?g|png|gif|ppt?|xls?|doc?|pdf|xdw)!Ui', $text, $matches);
        $img_list = $matches[0];
        foreach ($img_list as $url)
        {
            $writer->writeSheetRow('Sheet1', array($url)); // one row per URL, always written to the first column
        }
    }
    $writer->writeToStdOut();
    exit(0);
}

Sharing this for anyone who needs a working sample. I hope it saves you some time.
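
For readers adapting this sample: the pattern above does not cover every extension listed in the question (bmp and xps are missing). The sketch below builds a pattern from those two lists instead. The helper name `extract_file_urls`, the `example.com` URLs, and the choice to accept `https://` as well as `http://` are assumptions made for illustration, not part of the original code.

```php
<?php
// Extension lists copied from the question.
$image_ext = array('jpg', 'jpeg', 'png', 'bmp', 'gif');
$doc_ext   = array('doc', 'docx', 'ppt', 'pptx', 'xls', 'xlsx', 'pdf', 'xps');

// Hypothetical helper: returns every URL in $html that ends in one of $extensions.
function extract_file_urls($html, array $extensions)
{
    $alternatives = implode('|', array_map('preg_quote', $extensions));
    // [^"'\s<>]+? keeps the match inside a single attribute value;
    // \b makes "doc" fail on "docx" so the longer alternative is tried.
    $pattern = '!https?://[^"\'\s<>]+?\.(?:' . $alternatives . ')\b!i';
    preg_match_all($pattern, $html, $matches);
    return $matches[0];
}

// Placeholder HTML, not real data.
$html = '<img src="http://example.com/a/shot.bmp"/>'
      . '<a href="https://example.com/d/sample.docx">sample</a>';

$urls = extract_file_urls($html, array_merge($image_ext, $doc_ext));
print_r($urls);
```

Passing only `$image_ext` or only `$doc_ext` would give separate lists for the two file types.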


1 answer

  • Accepted
    dongwei3866 dongwei3866 2017-04-05 13:45

    You should change your central loop to something like

    $img_list = array(); // prepare an empty array for collection
    while ($row = mysql_fetch_array($result))
    {
        $text = $row['content'];
        preg_match_all('!http://.+?\.(?:jpe?g|png|gif|pptx?|xlsx?|docx?|pdf|xdw)!i', $text, $matches);
        $img_list = array_merge($img_list, $matches[0]); // append this row's URLs to the array
    }
    $writer->writeSheetRow('Sheet1', $img_list); // all collected URLs go into a single row
    

    Since you did not clearly specify what was wrong, I just guessed and went ahead: the regular expression is slightly different from your original, and so is the way I structured the loop (yes, only one is needed). preg_match_all needs to be called only once for each $text, and then you merge the results from $matches[0] into your $img_list array.
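
    The collection step can be tried without a database. This is a minimal sketch in which a hypothetical `$rows` array with placeholder `example.com` URLs stands in for the query result:

    ```php
    <?php
    // Hypothetical rows standing in for the `content` column of the query result.
    $rows = array(
        array('content' => '<p>see</p><img src="http://example.com/img/shot.jpg"/>'),
        array('content' => '<a href="http://example.com/doc/guide.ppt">guide</a> '
                         . '<a href="http://example.com/doc/sample.docx">sample</a>'),
        array('content' => '<p>no links here</p>'),
    );

    $pattern = '!http://.+?\.(?:jpe?g|png|gif|pptx?|xlsx?|docx?|pdf|xdw)!i';

    $img_list = array();
    foreach ($rows as $row) {
        // One call per row; $matches[0] holds every full match found in that row.
        preg_match_all($pattern, $row['content'], $matches);
        // A row without links contributes an empty array, so the merge is a no-op.
        $img_list = array_merge($img_list, $matches[0]);
    }

    print_r($img_list);
    ```

    After the loop, `$img_list` holds the three URLs in the order they were encountered.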

    I also removed your U-modifier, which was inverting the "greediness" of the whole regexp. Instead I added a ? after the + to make this one quantifier "non-greedy".
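
    To see the difference, here is a small sketch that runs both patterns on the same text. The URL is a placeholder and the variable names are only illustrative:

    ```php
    <?php
    // Placeholder text, not real data.
    $text = '<a href="http://example.com/files/guide.pptx">guide</a>';

    // Original pattern: U flips every quantifier, so `.+` becomes lazy,
    // but the optional `t?` becomes lazy too and matches nothing when it can.
    $with_u = '!http://.+\.(?:jpe?g|png|gif|ppt?|xls?|doc?|pdf|xdw)!Ui';

    // Revised pattern: only `.+?` is lazy; `pptx?` stays greedy.
    $lazy = '!http://.+?\.(?:jpe?g|png|gif|pptx?|xlsx?|docx?|pdf|xdw)!i';

    preg_match($with_u, $text, $m1);
    preg_match($lazy, $text, $m2);

    echo $m1[0], "\n"; // truncated match
    echo $m2[0], "\n"; // full URL
    ```

    With the U modifier the match stops at `guide.pp`, while the revised pattern returns the full URL ending in `guide.pptx`.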

    I prepared a little minimalistic demo here: http://rextester.com/JDVMS87065
