doutui7955 2014-05-16 06:33

Check whether a string contains a URL and fetch the URL's content in PHP

Here is a fairly presentable example of what I want to do dynamically.

Suppose someone enters a string in a textarea like this:

"The best search engine is www.google.com."

or maybe

"The best search engine is https://www.google.co.in/?gfe_rd=cr&ei=FLB1U4HHG6aJ8Qfc1YHIBA."

Then I want to highlight the link the way Stack Overflow does. I also want to use file_get_contents to fetch one image, a short description, and the title of the page.

I will most likely need to check whether the string contains a URL twice:

  • On keyup in the textarea, using jQuery, and then fetching the page with file_get_contents
  • When the string is received by PHP.

How can I do this?

UPDATE

function parseHyperlinks($text) {
    // The regular expression filters
    $reg_exUrl1 = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";
    $reg_exUrl2 = "/[\w\d\.]+\.(com|org|ca|net|uk)/";

    // Check if there is a URL in the text
    if (preg_match($reg_exUrl1, $text, $url)) {
        // make the URLs hyperlinks
        return preg_replace($reg_exUrl1, "<a class=\"content-link link\" href=\"{$url[0]}\">{$url[0]}</a> ", $text);
    } else if (preg_match($reg_exUrl2, $text, $url)) {
        return preg_replace($reg_exUrl2, "<a class=\"content-link link\" href=\"{$url[0]}\">{$url[0]}</a> ", $text);
    } else {
        // if there are no URLs in the text, just return the text
        return $text;
    }
}
  • This works only if $str = 'www.google.com is the best' or $str = 'http://www.google.com is best', but not if $str = 'http://stackoverflow.com/ and www.google.com is the best'.

1 answer

  • drgdn82648 2014-05-16 12:29

    First create the HTML, then send an AJAX request to the server. Consider this sample code:

    HTML/jQuery:

    <!-- instead of a textarea, you could use an editable div so that highlights can be styled, or just use a plugin -->
    <div id="textarea" 
        style="
        font-family: monospace;
        white-space: pre;
        width: 300px;
        height: 200px;
        border: 1px solid #ccc;
        padding: 5px;">For more tech stuff, check out http://www.tomshardware.com/ for news and updates.</div><br/>
    <button type="button" id="scrape_site">Scrape</button><br/><br/>
    <!-- I just used a button to trigger the scraping; you could bind it to keyup/keydown instead. -->
    
    <div id="site_output" style="width: 500px;">
        <label>Site: <p id="site" style="background-color: gray;"></p></label>
        <label>Title: <p id="title" style="background-color: gray;"></p></label>
        <label>Description: <p id="description" style="background-color: gray;"></p></label>
        <label>Image: <div id="site_image"></div></label>
    </div>
    
    <script type="text/javascript" src="jquery.min.js"></script>
    <script type="text/javascript">
    $(document).ready(function(){
    
        $('#scrape_site').on('click', function(){
            var value = $.trim($('#textarea').text());
            $('#site, #title, #description').text('');
            $('#site_image').empty();
            $.ajax({
                url: 'index.php', // or whichever PHP script will process the text
                type: 'POST',
                data: {scrape: true, text: value},
                dataType: 'JSON',
                success: function(response) {
                    $('#site').text(response.url);
                    $('#title').text(response.title);
                    $('#description').text(response.description);
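                    if (response.src) {
                        // build the element rather than concatenating HTML; this also avoids a second "site_image" id
                        $('<img>', {src: response.src}).appendTo('#site_image');
                    }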
                }
            });
        });
    
        // you can use an editable div so that it can be styled;
        // there's too much code in this answer already, so just grab a highlighter plugin to ease your pain
        $('#textarea').each(function(){
            this.contentEditable = true;
        });
    
    });
    </script>
    

    And in the PHP script that processes the request, in this case index.php:

    <?php
    if (isset($_POST['scrape'])) {

        $text = isset($_POST['text']) ? $_POST['text'] : '';
        // Default response, so the client always receives every key
        $data = array('url' => '', 'title' => '', 'description' => '', 'src' => '');

        // EXTRACT URL
        $reg_exurl = "/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/";
        preg_match_all($reg_exurl, $text, $matches);
        $usedPatterns = array();
        $url = '';
        // Skip duplicates; if the text has several distinct URLs, the last one is kept
        foreach ($matches[0] as $pattern) {
            if (!array_key_exists($pattern, $usedPatterns)) {
                $usedPatterns[$pattern] = true;
                $url = $pattern;
            }
        }

        // EXTRACT VALUES (scraping of title and description)

        // Real-world HTML is rarely valid; keep parser warnings out of the JSON output
        libxml_use_internal_errors(true);
        $doc = new DOMDocument();
        if ($url === '' || !@$doc->loadHTMLFile($url)) {
            // No URL in the text, or the page could not be loaded
            echo json_encode($data);
            exit;
        }
        $xpath = new DOMXPath($doc);

        // A page may have no <title> at all
        $titleNodes = $xpath->query('//title');
        $title = $titleNodes->length > 0 ? trim($titleNodes->item(0)->nodeValue) : '';

        $descriptionNodes = $xpath->query('/html/head/meta[@name="description"]/@content');
        if ($descriptionNodes->length == 0) {
            $description = "No description meta tag :(";
        } else {
            // Found one or more descriptions, loop over them (the last one wins)
            foreach ($descriptionNodes as $info) {
                $description = $info->value . PHP_EOL;
            }
        }

        $data['description'] = $description;
        $data['title'] = $title;
        $data['url'] = $url;

        // SCRAPING OF IMAGE (the weirdest part)
        $image_found = false;
        $images = array();

        // Getting a usable image is a little bit tough.
        // Check for og:image (Facebook) first; some sites provide this meta tag
        $facebook_ogimage = $xpath->query("/html/head/meta[@property='og:image']/@content");
        foreach ($facebook_ogimage as $ogimage) {
            $data['src'] = $ogimage->nodeValue;
            $image_found = true;
        }

        // desperation search (get images)
        if (!$image_found) {
            $image_list = $xpath->query("//img[@src]");
            for ($i = 0; $i < $image_list->length; $i++) {
                // crude ad filter: skips any src containing "ad", which also drops names like "header.png"
                if (strpos($image_list->item($i)->getAttribute("src"), 'ad') === false) {
                    $images[] = $image_list->item($i)->getAttribute("src");
                }
            }

            if (count($images) > 0) {
                // if there is at least one, take the first
                $data['src'] = $images[0];
            }
        }

        echo json_encode($data);
        exit;

    }
    ?>
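
One gap in the image step: the `src` values are taken straight from the page's markup, so they can be relative (`/img/logo.png`, `img/a.png`, `//cdn...`) and will not load when dropped into an `<img>` on your own page. Below is a rough sketch of how such a value might be made absolute before it goes into `$data['src']`. The helper name `absoluteUrl` and the example.com URLs are hypothetical, and the sketch deliberately ignores `<base href>` and does not collapse `../` segments.

```php
<?php
/**
 * Turn an image "src" found on a page into an absolute URL.
 * Simplified on purpose: no <base href> support, no "../" normalisation.
 * (Hypothetical helper, not part of the code above.)
 */
function absoluteUrl(string $src, string $pageUrl): string
{
    if ($src === '') {
        return '';
    }
    // Already absolute: starts with a scheme such as "http:" or "data:".
    if (preg_match('~^[a-z][a-z0-9+.-]*:~i', $src)) {
        return $src;
    }

    $base = parse_url($pageUrl);
    if ($base === false || !isset($base['scheme'], $base['host'])) {
        return $src; // no usable base URL, leave the value untouched
    }

    // Protocol-relative: "//cdn.example.com/a.png" reuses the page's scheme.
    if (strpos($src, '//') === 0) {
        return $base['scheme'] . ':' . $src;
    }

    $origin = $base['scheme'] . '://' . $base['host']
            . (isset($base['port']) ? ':' . $base['port'] : '');

    // Root-relative: "/img/logo.png".
    if (strpos($src, '/') === 0) {
        return $origin . $src;
    }

    // Path-relative: resolve against the directory of the page's path.
    $path = isset($base['path']) ? $base['path'] : '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);

    return $origin . $dir . $src;
}
```

With this in place, the last step before `json_encode` could be something like `$data['src'] = absoluteUrl($data['src'], $url);`.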
    

    Note: This is not perfect, but you can use it as a reference, improve on it, and make it more dynamic as needed.
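
As for the highlighting part: the function in the question's UPDATE handles only one URL because `preg_match` stores a single match in `$url[0]`, and that same string is then substituted for every match. The `else if` also means the second pattern is never tried once the first one has matched. Here is a minimal sketch of one way around both problems, assuming a single combined pattern and `preg_split`, so each URL is wrapped with its own text. The function name `linkifyUrls`, the pattern, and the trailing-punctuation trimming are illustrative choices, not something from the original code.

```php
<?php
/**
 * Wrap every URL in $text in an <a> tag and HTML-escape everything else.
 * (Hypothetical helper; the pattern is a simple heuristic, not a full URL grammar.)
 */
function linkifyUrls(string $text): string
{
    // Matches scheme-prefixed URLs and bare "www." hosts in one pass.
    $pattern = '~\b((?:(?:https?|ftps?)://|www\.)[^\s<>"\']+)~i';

    // With DELIM_CAPTURE the result alternates: text, URL, text, URL, ..., text.
    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    $html = '';
    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Plain text between URLs: escape it so user input cannot inject markup.
            $html .= htmlspecialchars($part, ENT_QUOTES, 'UTF-8');
            continue;
        }

        // Sentence punctuation right after a URL is almost never part of it.
        $url      = rtrim($part, '.,;:!?)');
        $trailing = substr($part, strlen($url));

        // Bare "www." hosts need a scheme to be usable as an href.
        $href = stripos($url, 'www.') === 0 ? 'http://' . $url : $url;

        $html .= '<a class="content-link link" href="'
               . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '">'
               . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '</a>'
               . htmlspecialchars($trailing, ENT_QUOTES, 'UTF-8');
    }

    return $html;
}

echo linkifyUrls('http://stackoverflow.com/ and www.google.com is the best'), PHP_EOL;
```

Because the text is split rather than replaced with one fixed string, the failing case from the question gets two separate links, and the trailing full stop in "The best search engine is www.google.com." stays outside the anchor.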
