Extracting data from HTML using Simple HTML DOM Parser

For a college project, I am creating a website with some back-end algorithms, and to test these in a demo environment I need a lot of fake data. To get this data I intend to scrape some sites, one of which is freelancer.com. To extract the data I am using the Simple HTML DOM Parser, but so far I have been unable to get the data I need.

Here is an example of the HTML layout of the page I intend to scrape. The red boxes mark the required data.

Screenshot of HTML Code on Freelance.com

Here is the code I have written so far after following some tutorials.

<?php
include "simple_html_dom.php";
// Create DOM from URL
$html = file_get_html('http://www.freelancer.com/jobs/Website-Design/1/');

//Get all data inside the <tr> of <table id="project_table">
foreach($html->find('table[id=project_table] tr') as $tr) {

    foreach($tr->find('td[class=title-col]') as $t) {
        // Get the cell's outer HTML (the <td> tag itself plus its contents)
        $data = $t->outertext;
        echo $data;
    }
}

?>

Hopefully someone can point me in the right direction as to how I can get this working.

Thanks.

1 answer

dongxizhe9755 2013-11-07 22:19
The raw source code is different, that's why you're not getting the expected results...
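
One way to confirm which tables the server actually sends is to list the table ids found in the fetched markup. The following is only a minimal sketch: it uses PHP's built-in DOMDocument rather than Simple HTML DOM, the helper name listTableIds is made up for illustration, and the inline $rawHtml is a hypothetical stand-in for the real page (which could be fetched with file_get_contents).

```php
<?php
// Hypothetical stand-in for the raw markup returned by the server.
// For the real page, something like file_get_contents($url) would go here.
$rawHtml = <<<HTML
<html><body>
<table id="project_table_static">
  <tbody>
    <tr><td></td></tr>
    <tr><td><a href="/projects/example-1.html">Example 1</a></td></tr>
  </tbody>
</table>
</body></html>
HTML;

/**
 * Return the id attribute of every <table> in the given HTML string.
 * (Illustrative helper, not part of any library.)
 */
function listTableIds(string $html): array
{
    // Collect parse warnings internally instead of printing them,
    // since real-world markup is rarely fully valid.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $ids = [];
    foreach ($doc->getElementsByTagName('table') as $table) {
        if ($table->hasAttribute('id')) {
            $ids[] = $table->getAttribute('id');
        }
    }
    return $ids;
}

print_r(listTableIds($rawHtml));
```

If the id used in a selector does not appear in this list, the selector is targeting markup that is not in the raw source.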

You can check the raw source code with Ctrl+U. The data is in table[id=project_table_static], and its td cells have no attributes. Here is working code that gets all the URLs from the table:

include "simple_html_dom.php";

$url = 'http://www.freelancer.com/jobs/Website-Design/1/';

// Create DOM from URL
$html = file_get_html($url);

// Get every <tr> inside <table id="project_table_static">
foreach ($html->find('table#project_table_static tbody tr') as $i => $tr) {
    // Skip the first empty element
    if ($i == 0) {
        continue;
    }

    echo "<br/>\$i=" . $i;

    // Get the first anchor in the row
    $anchor = $tr->find('a', 0);
    echo " => " . $anchor->href;
}

// Clear the DOM object to free memory
$html->clear();
unset($html);
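
The loop above assumes every row contains an anchor. If a row has none, reading href from the missing element would likely fail. As a rough sketch of the same row walk with that guard added, and with the link text collected alongside the URL, here is a version that swaps in PHP's built-in DOMDocument and DOMXPath instead of Simple HTML DOM. The function name extractRowLinks and the sample fragment are hypothetical, chosen only to mirror the table described in this answer.

```php
<?php
// Hypothetical fragment shaped like the table described in the answer.
$rawHtml = <<<HTML
<table id="project_table_static">
  <tbody>
    <tr><td></td></tr>
    <tr><td><a href="/projects/example-1.html">Example 1</a></td></tr>
    <tr><td>No link in this row</td></tr>
    <tr><td><a href="/projects/example-2.html">Example 2</a></td></tr>
  </tbody>
</table>
HTML;

/**
 * Collect the text and href of the first anchor in each row of the
 * table with the given id. Rows without an anchor are skipped.
 * (Illustrative helper; $tableId is inserted into the XPath unescaped,
 * so it should be a trusted, simple identifier.)
 */
function extractRowLinks(string $html, string $tableId): array
{
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = $xpath->query("//table[@id='" . $tableId . "']//tr");

    $links = [];
    foreach ($rows as $row) {
        // First anchor inside this row, or null if the row has none.
        $anchor = $xpath->query('.//a', $row)->item(0);
        if ($anchor === null) {
            continue;
        }
        $links[] = [
            'title' => trim($anchor->textContent),
            'url'   => $anchor->getAttribute('href'),
        ];
    }
    return $links;
}

foreach (extractRowLinks($rawHtml, 'project_table_static') as $link) {
    echo $link['title'] . ' => ' . $link['url'] . PHP_EOL;
}
```

Returning an array of title/URL pairs rather than echoing inside the loop makes it easier to store the scraped rows as test data afterwards.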

This answer was selected by the asker as the best answer.
