I've seen a good number of similar questions. I know regex or DOM can be used, but I can't find any good examples of DOM, and regex makes me pull my hair out. In addition, I need to pull multiple values out of the HTML source: some are simply element contents, some are attributes.
Here is an example of the HTML I need to get info from:
<div class="log">
    <div class="message">
        <abbr class="dt" title="time string">
            DATA_1
        </abbr>
        :
        <cite class="user">
            <a class="tel" href="tel:+xxxx">
                <abbr class="fn" title="DATA_2">
                    Me
                </abbr>
            </a>
        </cite>
        :
        <q>
            DATA_3
        </q>
    </div>
</div>
The "message" block may occur once or hundreds of times. I am trying to end up with data like this:
array(4) {
    [0] => array(3) {
        ["time"] => "DATA_1"
        ["name"] => "DATA_2"
        ["message"] => "DATA_3"
    }
    [1] => array(3) {
        ["time"] => "DATA_1"
        ["name"] => "DATA_2"
        ["message"] => "DATA_3"
    }
    [2] => array(3) {
        ["time"] => "DATA_1"
        ["name"] => "DATA_2"
        ["message"] => "DATA_3"
    }
    [3] => array(3) {
        ["time"] => "DATA_1"
        ["name"] => "DATA_2"
        ["message"] => "DATA_3"
    }
}
I tried using SimpleXML, but it only seems to work on very simple HTML pages. Could someone link me to some examples? I get really confused because I need to get DATA_2 from a title attribute rather than from element content. What do you think is the best way to extract this data? It seems very similar to XML extraction, which I have done before, but I need to use some other method.
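To make the goal concrete, here is a minimal sketch of the kind of DOM-based extraction I am after, assuming PHP's built-in DOMDocument and DOMXPath classes are the right tools. The function name extractMessages and the variable names are hypothetical, and the class-based XPath expressions assume each element carries exactly the single class shown in the sample above. It is an illustration of the approach, not a tested solution for real pages.

```php
<?php
// Sample input copied from the question; real input would come from a file or an HTTP response.
$html = <<<'HTML'
<div class="log">
<div class="message">
<abbr class="dt" title="time string">
DATA_1
</abbr>
:
<cite class="user">
<a class="tel" href="tel:+xxxx">
<abbr class="fn" title="DATA_2">
Me
</abbr>
</a>
</cite>
:
<q>
DATA_3
</q>
</div>
</div>
HTML;

// Hypothetical helper: returns one associative array per "message" block.
function extractMessages(string $html): array
{
    // Collect libxml parse warnings internally instead of printing them,
    // since real-world HTML is often not well-formed.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $rows = [];

    // Assumes class="message" is the only class on the div; an element with
    // several classes would need a contains()-style test instead.
    foreach ($xpath->query('//div[@class="message"]') as $message) {
        $rows[] = [
            // Text content of the first <abbr class="dt"> inside this block.
            'time' => trim($xpath->evaluate('string(.//abbr[@class="dt"])', $message)),
            // Value of the title attribute, not the element text.
            'name' => trim($xpath->evaluate('string(.//abbr[@class="fn"]/@title)', $message)),
            // Text content of the first <q> inside this block.
            'message' => trim($xpath->evaluate('string(.//q)', $message)),
        ];
    }

    return $rows;
}

var_dump(extractMessages($html));
```

For the sample above this should produce a single row with "time", "name" and "message" set to DATA_1, DATA_2 and DATA_3. Each relative expression starts with ".//" so that it is evaluated against the current message block rather than the whole document, which is what keeps the values from different blocks from mixing.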