dongxiaowei_1234 2014-07-31 08:18 Acceptance rate: 0%
Views: 15
Accepted

Strip links in a string

I'm trying to create a strict chat filter for those that choose the strict version. I would like to block all URLs except a few whitelisted ones (youtube, prntscr, facebook, etc) to prevent people from sending porn, IP grabbers, virus downloads, etc.

I know I could do this with a few extra lines of code; however, is there a way to do it with a regular expression? I would like it to check whether the string contains a URL and that URL is not a whitelisted one (youtube.com, for example).

I'm looking to implement this in both Python and PHP, but I only need the regex, since I already know how to use regular expressions in both languages.

Thanks

Edit: to be clear - this is for a strict mode on a chat system. The messages a user sends could be anything from "Hello" to "http://unsafelink.com go there!!"


1 answer

  • dsa89029 2014-07-31 08:30

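    Check the snippet below. It first finds a URL-like substring in each message, then tests that substring against a whitelist: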

    $message = array(
      "Dude I saw her on youtube",
      "I just opened an account on youtuber.com",
      "I'm watching an amazing prank, check this out youtube.com/gfsddfh784",
      "Dude, isn't this girl forbidden.com/hot-chick/123 Mery from our school?",
      "Take a look google.com?search=how%20to%20hack%20a%20wireless",
      "Ask someone on stackoverflow.com :p",
      "I found this great snippet on stackoverflow!",
      "He's all day on xxx.net"
      );
    
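    // loose URL-like pattern: optional "www" (optionally preceded by a scheme), then name.tld plus path/query characters
    $url = '/(((https?:\/\/)?www)?\.?[a-z0-9]+\.[a-z0-9]+[a-z0-9\-\/?&#%=]+)/';
    // whitelisted site names, matched as whole words inside the URL found above
    $whitelist = "/\b(youtube|stackoverflow|google|twitter|facebook|prntscr)\b/";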
    
    // check messages like this
    foreach ($message as $line){
      if(preg_match($url, $line, $match)){
        echo $match[0] , preg_match($whitelist, $match[0]) ? " -> Safe" : " -> Unsafe" , '<br />';
      }
    }
    
    echo "<hr />";
    
    // or like this
    foreach ($message as $line){
      if(preg_match($url, $line, $match) && !preg_match($whitelist, $match[0])){
        echo $match[0]  . " -> Unsafe" . '<br />';
      } 
    }
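
Since the question also asks for Python, here is a rough port of the same two-pattern check. Treat it as a sketch: the name `classify` is my own and not part of the PHP snippet, and the patterns are copied as-is, so they have the same limitations. For the sample messages it should give the same verdicts as the PHP version.

```python
import re

# Same two patterns as the PHP snippet, written as Python raw strings.
URL = re.compile(r'(((https?://)?www)?\.?[a-z0-9]+\.[a-z0-9]+[a-z0-9\-/?&#%=]+)')
WHITELIST = re.compile(r'\b(youtube|stackoverflow|google|twitter|facebook|prntscr)\b')


def classify(message):
    """Return (url, "Safe" or "Unsafe") for the first URL-like match, or None."""
    match = URL.search(message)
    if match is None:
        return None  # nothing that looks like a URL
    url = match.group(0)
    verdict = "Safe" if WHITELIST.search(url) else "Unsafe"
    return url, verdict


if __name__ == "__main__":
    for text in ["Hello", "He's all day on xxx.net", "Ask someone on stackoverflow.com :p"]:
        print(text, "=>", classify(text))
```

Like `preg_match`, `re.search` only reports the first match in a message, so a message with several links would need `finditer` instead.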
    

    Output:

    youtuber.com -> Unsafe
    youtube.com/gfsddfh784 -> Safe
    forbidden.com/hot-chick/123 -> Unsafe
    google.com?search=how%20to%20hack%20a%20wireless -> Safe
    stackoverflow.com -> Safe
    xxx.net -> Unsafe
    ------------------------------------------------------------
    youtuber.com -> Unsafe
    forbidden.com/hot-chick/123 -> Unsafe
    xxx.net -> Unsafe
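
One caveat: the whitelist pattern is tested against the whole match, so a whitelisted word anywhere in it passes. For example, `forbidden.com/youtube` would come out as Safe. If that matters for a strict mode, a possible variation is to capture only the host name and compare it against a set of allowed domains. The sketch below is in Python and is not the method from the snippet above: `ALLOWED_DOMAINS`, `is_allowed_host` and `unsafe_links` are hypothetical names, the domain list is a guess based on the sites named in the question, and the URL pattern is still a heuristic that can misfire on things like file names.

```python
import re

# Hypothetical whitelist of domains (an assumption; adjust as needed).
ALLOWED_DOMAINS = {
    "youtube.com", "stackoverflow.com", "google.com",
    "twitter.com", "facebook.com", "prntscr.com",
}

# Optional scheme, a dotted host name (captured), then an optional path/query tail.
URL = re.compile(
    r'(?:https?://)?((?:[a-z0-9-]+\.)+[a-z]{2,})(?:[/?#]\S*)?',
    re.IGNORECASE,
)


def is_allowed_host(host):
    """True if host is a whitelisted domain or a subdomain of one."""
    host = host.lower()
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)


def unsafe_links(message):
    """Return every URL-like match whose host is not whitelisted."""
    return [m.group(0) for m in URL.finditer(message)
            if not is_allowed_host(m.group(1))]


if __name__ == "__main__":
    print(unsafe_links("see forbidden.com/youtube now"))
    print(unsafe_links("check this out youtube.com/gfsddfh784"))
```

A message would then be blocked in strict mode whenever `unsafe_links` returns a non-empty list.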
    
    This answer was selected by the asker as the best answer.
