PHP: get certain information from a website, but from all pages

I want to extract the href attribute of links, specifically the ones that use mailto:. I want to do this not just for one link but for all such links on the main web page.

I tried this:

<?php

$url = "https://www.omurcanozcan.com";

$html = file_get_contents( $url);

libxml_use_internal_errors( true);
$doc = new DOMDocument;
$doc->loadHTML( $html);
$xpath = new DOMXpath( $doc);
$node = $xpath->query( "//a[@href='mailto:']")->item(0);


echo $node->textContent; // This will print **GET THIS TEXT**

 ?>

For instance, if the page contains

<a href='mailto:omurcan@omurcanozcan.com'>omurcan@omurcanozcan.com</a>

I want to echo

<p>omurcan@omurcanozcan.com</p>


1 answer

duanlu1950 2019-06-03 14:31
The main problem is that in your XPath, you are checking for

//a[@href='mailto:']

This looks for an href attribute whose value is exactly mailto:. What you want is an href that starts with mailto:, which you can test with starts-with():

$node = $xpath->query( "//a[starts-with(@href,'mailto:')]")->item(0);
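Since the question asks for every mailto: link rather than only the first, the same query can be looped over instead of calling item(0). The following is a minimal sketch under some assumptions: extractMailtoAddresses() is a hypothetical helper name, the sample markup is made up for illustration, and stripping a "?subject=..." suffix is an extra step not mentioned in the original post.

```php
<?php
// Sketch: collect the address from every mailto: link in an HTML string.
// extractMailtoAddresses() is a hypothetical helper, not part of the original post.
function extractMailtoAddresses(string $html): array
{
    libxml_use_internal_errors(true);   // silence warnings from imperfect real-world HTML
    $doc = new DOMDocument;
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $addresses = [];
    // starts-with() is case-sensitive, so "MAILTO:" would not match here.
    foreach ($xpath->query("//a[starts-with(@href, 'mailto:')]") as $node) {
        // Drop the "mailto:" scheme, then any "?subject=..." query part.
        $address = substr($node->getAttribute('href'), strlen('mailto:'));
        $address = explode('?', $address, 2)[0];
        if ($address !== '') {
            $addresses[] = $address;
        }
    }
    // Remove duplicates and reindex from 0.
    return array_values(array_unique($addresses));
}

// Made-up sample markup; with a live page, pass the fetched HTML instead.
$sample = "<p><a href='mailto:omurcan@omurcanozcan.com'>omurcan@omurcanozcan.com</a></p>"
        . "<p><a href='/contact'>Contact</a></p>";

foreach (extractMailtoAddresses($sample) as $address) {
    echo '<p>' . htmlspecialchars($address) . '</p>', PHP_EOL;
}
// prints: <p>omurcan@omurcanozcan.com</p>
```

Reading the address from the href rather than from textContent is a design choice: the link text is often something like "Email me" instead of the address itself.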

The second thing is that I don't think your page is fully loaded when you get the content. A common test I do is to save the HTML once I've loaded it, so I can inspect it first:

$url = "https://www.omurcanozcan.com";
$html = file_get_contents( $url);
file_put_contents("a.html", $html);

If you then open a.html, you can see the HTML your script is actually working with. In that content I cannot see any mailto: links.
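The manual check can also be scripted as a quick sanity test before running any XPath query. This is only a sketch: dumpAndCheckForMailto() is a hypothetical helper name, and the sample string stands in for the result of file_get_contents( $url), which should be checked for false before use.

```php
<?php
// Sketch: save the HTML for inspection and report whether it mentions mailto: at all.
// dumpAndCheckForMailto() is a hypothetical helper, not part of the original answer.
function dumpAndCheckForMailto(string $html, string $path): bool
{
    file_put_contents($path, $html);             // keep a copy to open in an editor
    return stripos($html, 'mailto:') !== false;  // case-insensitive raw text search
}

// Made-up sample standing in for the fetched page.
$sample = '<html><body><a href="/contact">Contact</a></body></html>';
$path = sys_get_temp_dir() . '/a.html';

if (!dumpAndCheckForMailto($sample, $path)) {
    echo "No mailto: text in the raw HTML; the links may be added by JavaScript or live on other pages.", PHP_EOL;
}
```

If the check fails on the raw HTML, no XPath expression will find the links either, because DOMDocument only sees what the server sent and does not run JavaScript.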