dousi2013 2017-08-09 21:04
浏览 92

从MYSQL数据列中提取URL有效或无效

I have a column within my MYSQL database headline TEXT NOT NULL I would like to go through this column for each of my records and extract the URL if it has one. The URL will be stored in another column article_url VARCHAR (225) NULL. If there is no URL then it would just put NULL in the column.

Lastly, I would like this to update each time a new record is inserted into the database. Below is what I have thus far.

UPDATE: here is an example of the headline column of my data Drastic decline in Social Sciences intake *: http:// bit.ly/2vXzPhQ pic.twitter.com/PAZvG3tX17 as you can see there is a URL in the data set however the URL is broken, all URLs are broken with a space between http:// and the rest of the URL. So I need to get a function in PHP that can find the Valid or Invalid URL when it gets the Invalid URL it will fix it.

this is what I have thus far.

$result = $conn->query($sql);

$reg_exp = "/^(http(s?): \/\/)?(www\.)+[a-zA-Z0-9\.\-\_]+(\.[a-zA-Z]{2,3})+(\/[a-zA-Z0-9\_\-\s\.\/\?\%\#\&\=]*)?$/";

if ($result->num_rows > 0) 
{
    // output data of each row
    while($row = $result->fetch_assoc()) 
    {
        echo "id: " . $row["id"]. " headline: " . $row["headline"]. "<br>";
        if(preg_match($reg_exp, $row["headline"]) == TRUE)
        {
            echo "URL is valid format";
        }
        else
        {
            echo "URL is invalid format";
        }
    }
} 
else 
{
    echo "0 results";
}
  • 写回答

1条回答 默认 最新

  • dqqn32019 2017-08-10 19:56
    关注

    For a MySQL only solution you have four parts to parts solve:

    1. one to run as update on existing data entries

    2. one to run as AFTER INSERT Trigger

    3. For both 1+2 you need the correct REGEX-Syntax of MySQL. The php-regex might be a good starting point to do a SELECT headline, Headline REGEXP '(insert your regex here)' FROM table;. with that SELECT you can iterative check on your database, if your regex is correct.

    4. In this answer you find a version of regexp_replace, which you can add to your database. This can be used in the UPDATE table SET artice_url = IF(...REGEXP...) regexp_replace(headline,...) (the ... needs also to be filled correctly) as well as in the Trigger.

    评论

报告相同问题?

悬赏问题

  • ¥15 写一个方法checkPerson,入参实体类Person,出参布尔值
  • ¥15 我想咨询一下路面纹理三维点云数据处理的一些问题,上传的坐标文件里是怎么对无序点进行编号的,以及xy坐标在处理的时候是进行整体模型分片处理的吗
  • ¥15 CSAPPattacklab
  • ¥15 一直显示正在等待HID—ISP
  • ¥15 Python turtle 画图
  • ¥15 关于大棚监测的pcb板设计
  • ¥15 stm32开发clion时遇到的编译问题
  • ¥15 lna设计 源简并电感型共源放大器
  • ¥15 如何用Labview在myRIO上做LCD显示?(语言-开发语言)
  • ¥15 Vue3地图和异步函数使用