This follows on from my original question: I was stuck, tried a workaround, and got stuck again.
I need to extract a candidate's name and ID from a PDF. Using pdfparser, I extracted the text and had PHP send it to the browser as a downloadable HTML file:
<?php
$filename = 'filename.html';
header('Content-disposition: attachment; filename=' . $filename);
header('Content-type: text/html');
// ... the rest of your file
?>
<?php
// Include Composer autoloader if not already done.
include 'C:\Users\amite\Downloads\pdfparser-master (1)\pdfparser-master\vendor\autoload.php';

// Parse pdf file and build necessary objects.
$parser = new \Smalot\PdfParser\Parser();
$pdf = $parser->parseFile('C:\Users\amite\Desktop\Data\001.ApplicationForm-CSE-2015-1-omokop (3).pdf');

$text = $pdf->getText();
echo $text;
?>
I did this because the information I need is on lines 12 and 13 of the page's view-source output, and this is the case for all the PDFs I need to process. After downloading the HTML file, I used the code below to display its source:
<?php show_source("filename.html"); ?>
When I run this, I get the source of the HTML file I downloaded. Now I need to extract the data on lines 12 and 13. The output of the program looks like this:
<html> text text text text text text
There are no tags apart from the html tag, and the information I need is on lines 12 and 13. How should I extract the text from those two lines? If there is a better way to do this, please tell me. If the question is vague, ask and I will clarify or improve it.
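To make the goal concrete, here is a minimal sketch of what I am trying to achieve: pick specific lines out of the extracted text by number. It assumes the text returned by getText() has the same line breaks as the view-source output, which I have not verified. extractLines is a hypothetical helper name of my own, not part of pdfparser, and the sample string is a placeholder rather than real PDF output.

```php
<?php
// Hypothetical helper (not part of pdfparser): return the requested
// 1-based lines from a block of text, keyed by line number.
function extractLines(string $text, array $lineNumbers): array
{
    // \R matches any line-break sequence (\n, \r\n or \r).
    $lines = preg_split('/\R/', $text);

    $result = [];
    foreach ($lineNumbers as $n) {
        // Arrays are 0-based, line numbers are 1-based.
        // A missing line is reported as null instead of raising a notice.
        $result[$n] = isset($lines[$n - 1]) ? trim($lines[$n - 1]) : null;
    }
    return $result;
}

// Placeholder text standing in for the output of $pdf->getText().
$sample = "<html>\nsecond line\nthird line";
$wanted = extractLines($sample, [2, 3]);

echo $wanted[2], PHP_EOL;
echo $wanted[3], PHP_EOL;
```

In my case the call would presumably be extractLines($text, [12, 13]) on the text from the parser, which would skip the download step entirely. If the downloaded file has to be used instead, file('filename.html', FILE_IGNORE_NEW_LINES) returns its lines as an array, so lines 12 and 13 would be at indexes 11 and 12.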