Similar questions have probably been asked many times, but mine is a bit more complex.
I know that when we want to extract only the text inside the <title> tag in this scenario,
<title>My work</title>
<p>This is my work.</p> <p>Learning regex.</p>
we can write a regex like this:
>([^<]*)<
But that works only because the <title> tag comes first. If it were the second tag, the pattern would match the text of whichever tag came before it.
My scenario is this:
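To illustrate what I mean, here is a minimal sketch in Python (picked only for illustration; the sample strings and the variable names are made up). It shows the generic pattern returning the first tag's text whatever that tag is, and a pattern tied to the tag name, which does not depend on position:

```python
import re

# Hypothetical samples: <title> first, then <title> second.
title_first = '<title>My work</title>\n<p>This is my work.</p> <p>Learning regex.</p>'
title_second = '<p>This is my work.</p>\n<title>My work</title>'

# The generic pattern: any run of non-"<" characters between ">" and "<".
generic = re.compile(r'>([^<]*)<')
# A pattern that names the tag it wants.
anchored = re.compile(r'<title>([^<]*)</title>')

generic_first = generic.search(title_first).group(1)      # text of <title>
generic_second = generic.search(title_second).group(1)    # text of <p>, not <title>
anchored_second = anchored.search(title_second).group(1)  # text of <title>

print(generic_first, '|', generic_second, '|', anchored_second)
```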
<td class="td1" headers="searchth1">JAVA1</td>
<td class="td2" headers="searchth2">JAVA2</td>
<td class="td3" headers="searchth3">JAVA3</td>
<td class="td1" headers="searchth1">PHP1</td>
<td class="td2" headers="searchth2">PHP2</td>
<td class="td3" headers="searchth3">PHP3</td>
There are many similar tags in the file, and I want to retrieve only the text between the <td class="td1" headers="searchth1"> and </td> tags.
I've used '#<td class="td1" headers="searchth1">(.*)</td>#', which does match. But the captured output also includes all the other <td> tags that follow, which I don't want.
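The behaviour can be reproduced with the minimal sketch below, written in Python for illustration. It assumes the cells sit on a single line (otherwise `.` would stop at each newline), and `html`, `OPEN_TAG`, `greedy` and `lazy` are hypothetical names. As far as I can tell, the greedy `(.*)` runs on to the last `</td>`, while a lazy `(.*?)` stops at the first one:

```python
import re

# Assumption: all cells are on one line, so "." can run across them.
html = (
    '<td class="td1" headers="searchth1">JAVA1</td>'
    '<td class="td2" headers="searchth2">JAVA2</td>'
    '<td class="td3" headers="searchth3">JAVA3</td>'
    '<td class="td1" headers="searchth1">PHP1</td>'
    '<td class="td2" headers="searchth2">PHP2</td>'
    '<td class="td3" headers="searchth3">PHP3</td>'
)

# Escape the literal opening tag so it is matched character for character.
OPEN_TAG = re.escape('<td class="td1" headers="searchth1">')

# Greedy: (.*) takes as much as possible, up to the LAST </td>.
greedy = re.findall(OPEN_TAG + r'(.*)</td>', html)

# Lazy: (.*?) takes as little as possible, up to the FIRST </td>.
lazy = re.findall(OPEN_TAG + r'(.*?)</td>', html)

print(len(greedy), lazy)
```

The same lazy quantifier should also work inside the `#`-delimited pattern above, and `([^<]*)` in place of `(.*?)` would be an alternative as long as the cell text contains no nested tags.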
I want only the texts JAVA1 and PHP1, and I guess that if I could retrieve the text between the tags while excluding the tags themselves, I might achieve it.
Am I right or wrong? Either way, how can I get what I want?
Thanks in advance!!