doushi9780 2015-09-03 16:01

How to read a .vcf file from the browser?

I am trying to retrieve the email addresses of all the exhibitors at IFA Berlin. The exhibitor list itself is pretty easy to crawl.

The tricky part is that the site only lets us download a .vcf file or send an email (through their server, I guess). I would like to find the email address without downloading that .vcf file. Otherwise I could download it and read it easily with PHP (my crawler is also written in PHP).

This is also my first question here after lurking for years! Nice meeting you guys.


1 answer

  • donglan9517 2015-09-03 16:56


    A .vcf file is normally served as a file download rather than displayed in the browser. One way around that is to set up a custom browser extension that temporarily stores the file, parses the vCard data, and displays the information.

    PHP scraping approach

    There are vCard parsers out there, for example https://github.com/nuovo/vCard-parser, but I think you could base this on a regular expression instead: /EMAIL;INTERNET:(.*)/.
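
    As a rough illustration of the regex idea, the sketch below runs a slightly more tolerant pattern over a made-up vCard string. The sample data and the helper name extractEmails are assumptions for illustration and are not taken from the IFA site. Real vCards may label the property differently (for example EMAIL;TYPE=WORK), so the pattern accepts any parameters between EMAIL and the colon.

    ```php
    <?php

    // Hypothetical helper: returns every EMAIL value found in a vCard string.
    function extractEmails(string $vcard): array
    {
        // ^EMAIL, an optional ";PARAMS" part, then ":" and the value up to end of line.
        preg_match_all('/^EMAIL(?:;[^:\r\n]*)?:(.*)$/mi', $vcard, $matches);
        // vCard lines end in CRLF, so trim the trailing "\r" and drop empty values.
        return array_values(array_filter(array_map('trim', $matches[1]), 'strlen'));
    }

    // Made-up sample data, not a real exhibitor.
    $sample = "BEGIN:VCARD\r\nVERSION:2.1\r\nN:Doe;Jane;;;\r\n"
            . "EMAIL;INTERNET:jane.doe@example.com\r\n"
            . "EMAIL;TYPE=WORK:sales@example.com\r\nEND:VCARD\r\n";

    $emails = extractEmails($sample);
    echo implode(', ', $emails), "\n";
    ```

    Using preg_match_all rather than preg_match keeps every address when a card lists more than one.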

    Let's say your first scraping run gives you a list of attendee IDs. Your second (vCard) scraping run could then fetch each card by ID and extract the name and email:

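    <?php

    // Download the vCard for one attendee; returns null if the request fails.
    function getVcard($id)
    {
        $vcard = file_get_contents('http://www.virtualmarket.ifa-berlin.de/?Action=attendeeVcard&id=' . $id);
        return $vcard === false ? null : $vcard;
    }

    // Extract the first email address, or null if the vCard has none.
    function getEmailFromVcard($vcard)
    {
        if (preg_match('/^EMAIL;INTERNET:(.*)$/m', (string) $vcard, $matches)) {
            return trim($matches[1]); // vCard lines end in CRLF, so strip the "\r"
        }
        return null;
    }

    // Build "First Last" from the N property (Last;First;...), or null if absent.
    function getNameFromVcard($vcard)
    {
        if (preg_match('/^N:(.*)$/m', (string) $vcard, $matches)) {
            $parts = explode(';', $matches[1]);
            $first = isset($parts[1]) ? trim($parts[1]) : '';
            return trim($first . ' ' . trim($parts[0]));
        }
        return null;
    }

    $id = 1775586;

    $vcard = getVcard($id);
    $email = getEmailFromVcard($vcard);
    $name = getNameFromVcard($vcard);

    echo $name . ' ' . $email;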
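
    The two-pass idea (collect the IDs first, then fetch one vCard per ID) could be wired together roughly as follows. This is only a sketch under assumptions: collectContacts and the stub fetcher are hypothetical names, and the IDs and vCard text are made up so the example runs without network access. In a real run the fetcher would wrap the getVcard() call.

    ```php
    <?php

    // Hypothetical helper: maps attendee IDs to name/email pairs.
    // $fetch is any callable returning vCard text for an ID, or null on failure.
    function collectContacts(array $ids, callable $fetch): array
    {
        $rows = [];
        foreach ($ids as $id) {
            $vcard = $fetch($id);
            if ($vcard === null) {
                continue; // skip IDs whose download failed
            }
            // Patterns adapted from the answer's, anchored to line starts.
            $email = preg_match('/^EMAIL;INTERNET:(.*)$/m', $vcard, $m) ? trim($m[1]) : null;
            if ($email === null || $email === '') {
                continue; // no address in this card
            }
            $name = '';
            if (preg_match('/^N:(.*)$/m', $vcard, $m)) {
                $parts = explode(';', trim($m[1]));
                $name = trim(($parts[1] ?? '') . ' ' . $parts[0]);
            }
            $rows[$id] = ['name' => $name, 'email' => $email];
        }
        return $rows;
    }

    // Stub fetcher with made-up data standing in for the HTTP request.
    $fake = [
        101 => "BEGIN:VCARD\r\nN:Doe;Jane;;;\r\nEMAIL;INTERNET:jane@example.com\r\nEND:VCARD\r\n",
        102 => "BEGIN:VCARD\r\nN:Roe;Rick;;;\r\nEND:VCARD\r\n", // card without an email
    ];
    $fetch = function ($id) use ($fake) {
        return $fake[$id] ?? null;
    };

    $contacts = collectContacts([101, 102, 103], $fetch);
    foreach ($contacts as $id => $c) {
        echo $id, ': ', $c['name'], ' <', $c['email'], ">\n";
    }
    ```

    Passing the fetcher in as a callable keeps the parsing testable offline, and cards that fail to download or carry no address are skipped instead of stopping the run.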
    