Extracting data from specific lines of an HTML page using PHP

This was my original question. I got stuck, tried something to solve the problem, and got stuck again.

I need to extract a candidate's name and ID from a PDF. After extracting the text with pdfparser, I downloaded the output as an HTML page using PHP:

<?php
$filename = 'filename.html';
header('Content-disposition: attachment; filename=' . $filename);
header('Content-type: text/html');
// ... the rest of your file
?>
<?php

// Include Composer autoloader if not already done.
include 'C:\Users\amite\Downloads\pdfparser-master (1)\pdfparser-master\vendor\autoload.php';

// Parse pdf file and build necessary objects.
$parser = new  \Smalot\PdfParser\Parser();
$pdf    = $parser->parseFile('C:\Users\amite\Desktop\Data\001.ApplicationForm-CSE-2015-1-omokop (3).pdf');

$text = $pdf->getText();
echo $text;


?>

I did this because the information I need is on lines 12 and 13 of the page source, and this is the case for all the PDFs I need to process. After downloading the HTML file, I used the code below to view its source:

<?php
show_source("filename.html");
?>

When I run the above program I get the source of the HTML file I downloaded. Now I need to extract the data from lines 12 and 13. The output of the program looks like this:

<html>
 text
 text
text
text 
text 
text

There are no tags except the html tag, and the information I need is on lines 12 and 13. How should I extract the text from lines 12 and 13? If there is another way, please tell me. I am stuck again; if the question is vague I will clarify or improve it. Please help.


duangou1551 2016-08-05 08:55
Read the file into an array with $source = file('filename.html'); and extract lines 12 and 13 via array indexes 11 and 12, like this: echo $source[11]; // line 12
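A slightly fuller sketch of the same idea, in case it helps. The helper name extractLines is hypothetical (it is not part of PHP or pdfparser), and it assumes the file is plain text with one value per line, as in the question. file() returns the file as an array of lines, so line n is found at index n - 1.

```php
<?php
// Hypothetical helper (not a built-in): return the requested 1-based
// lines of a text file, keyed by line number.
function extractLines(string $path, array $lineNumbers): array
{
    // file() reads the whole file into an array, one element per line.
    // FILE_IGNORE_NEW_LINES drops the trailing newline of each element.
    $source = file($path, FILE_IGNORE_NEW_LINES);
    if ($source === false) {
        throw new RuntimeException("Cannot read $path");
    }

    $result = [];
    foreach ($lineNumbers as $n) {
        // Line n sits at index n - 1; a line that does not exist yields null.
        // trim() removes surrounding whitespace, including a stray "\r".
        $result[$n] = isset($source[$n - 1]) ? trim($source[$n - 1]) : null;
    }
    return $result;
}

// Usage, assuming filename.html is the downloaded file from the question.
if (is_readable('filename.html')) {
    $lines = extractLines('filename.html', [12, 13]);
    echo $lines[12], PHP_EOL; // line 12
    echo $lines[13], PHP_EOL; // line 13
}
```

Returning null for a missing line, rather than triggering an undefined-index notice, makes it easier to detect a PDF whose text turned out shorter than expected.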

This answer was selected as the best answer by the asker.
