duandongji2231
2016-08-05 08:42

Extracting data from specific lines of an HTML page using PHP

Accepted

This follows on from my original question. I got stuck, tried a workaround, and am now stuck again.

I need to extract a candidate's name and ID from a PDF. After extracting the text with pdfparser, I downloaded the output as an HTML page using PHP:

<?php
$filename = 'filename.html';
header('Content-disposition: attachment; filename=' . $filename);
header('Content-type: text/html');
// ... the rest of your file
?>
<?php
// Include Composer autoloader if not already done.
include 'C:\Users\amite\Downloads\pdfparser-master (1)\pdfparser-master\vendor\autoload.php';

// Parse pdf file and build necessary objects.
$parser = new \Smalot\PdfParser\Parser();
$pdf    = $parser->parseFile('C:\Users\amite\Desktop\Data\001.ApplicationForm-CSE-2015-1-omokop (3).pdf');

// Output the extracted text; the headers sent earlier turn it into a download.
$text = $pdf->getText();
echo $text;
?>

I did this because the information I need is on lines 12 and 13 of the page source, and this is the case for all the PDFs I need to process. After downloading the HTML file, I used the code below to view its source:

<?php
show_source("filename.html");
?> 

Running the program above shows the source of the downloaded HTML file. Now I need to extract the data from lines 12 and 13. The output looks like this:

<html>
 text
 text
text
text 
text 
text   

There are no tags other than the html tag, and the information I need is on lines 12 and 13. How should I extract the text from those two lines? If there is another way, please tell me. If the question is vague, I will clarify or improve it.


2 answers

  • duangou1551, 5 years ago

    Read the file into an array with $source = file('filename.html'); lines 12 and 13 are then at array indexes 11 and 12, for example echo $source[11]; // line 12
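
A minimal sketch of the file() approach described above, assuming the downloaded file is plain text with one record per line. The helper name linesFromFile is hypothetical and not part of PHP or pdfparser; trimming is added here because the sample output shows stray spaces around some lines.

```php
<?php
// Hypothetical helper: returns lines $from..$to (1-based, inclusive) of a file.
function linesFromFile(string $path, int $from, int $to): array
{
    // FILE_IGNORE_NEW_LINES strips the trailing newline from each element.
    $lines = file($path, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        throw new RuntimeException("Cannot read $path");
    }
    // array_slice is zero-based, so line 12 lives at offset 11.
    $wanted = array_slice($lines, $from - 1, $to - $from + 1);
    // Remove leading and trailing whitespace from each extracted line.
    return array_map('trim', $wanted);
}

// Example call (assumes filename.html exists in the working directory):
// [$line12, $line13] = linesFromFile('filename.html', 12, 13);
```

If the file has fewer than 13 lines, the returned array is simply shorter, so it may be worth checking count() before destructuring.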

  • dongyongyin5339, 5 years ago

    Is this what you need?

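    <?php
    // Sample input: one entry per line, as in the page source.
    $str = "1text
     2text
    3text
    4text 
    5text 
    6text
    7text 
    8text 
    9text
    10text 
    11text 
    12text
    13text
    ";
    // Split on newlines and trim the stray whitespace around each line.
    $lines = array_map('trim', explode("\n", $str));
    // Offset 11 is line 12; a length of 2 also returns line 13.
    $k = array_slice($lines, 11, 2);
    print_r($k);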
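
The same string-splitting idea could also be applied directly to the text returned by $pdf->getText(), which would skip the downloaded HTML file altogether. This is only a sketch: pickLines is a hypothetical helper, the sample string stands in for real parser output, and it assumes the lines of the raw text are numbered the same way as in the page source.

```php
<?php
// Hypothetical helper: picks $count lines starting at 1-based line $first.
function pickLines(string $text, int $first, int $count): array
{
    // \R matches \n, \r\n and \r, so the split works with any line-ending style.
    $lines = preg_split('/\R/', $text);
    // array_slice is zero-based, hence $first - 1; trim removes stray spaces.
    return array_map('trim', array_slice($lines, $first - 1, $count));
}

// Sample input standing in for $pdf->getText(); the real text will differ.
$rows = array_map(function ($i) { return "text$i"; }, range(2, 15));
$sample = "<html>\n" . implode("\n", $rows);

// Prints the 12th and 13th lines of the sample.
print_r(pickLines($sample, 12, 2));
```

In the original script, the call would look like pickLines($text, 12, 2) placed after $text = $pdf->getText();.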
    