2013-04-03 05:05


I am using Simple HTML DOM to scrape an old website for a client who no longer has access to the DB.

The site is a coding nightmare: there are almost no usable classes, and it is laid out mostly with tables.

(To make things a bit more complicated, the site is in a foreign language that uses characters I don't recognize, but that's a side note, not a real problem.)

Anyhow, everything is fine and working, except that on ONE category of pages there is an anchor with a JS onclick handler that reveals a phone number through some old-fashioned Ajax/JS.

The link looks like this:

<a class="icon1 clickTip" onclick="showIndexPhone(1517);showIndexMobile(1517);" title="">Phone</a>

So of course I went looking for the showIndexPhone() function, which is:

function showIndexPhone(id) {
    /*var oContent = oValue = null;
    if ((oContent = document.getElementById("IndexPhone_" + id)) && (oValue = document.getElementById("IndexPhoneValue_" + id)))
        oContent.innerHTML = oValue.innerHTML;*/
    /*var oCounter = new Image();
    oCounter.src = "./ctr.asp?id=" + id + "&type=2";*/
    var xmlDoc = dbsRequest("./ctr.asp?id=" + id + "&type=2");
}

So apparently showIndexPhone() just constructs a simple URL of the form ./ctr.asp?id=1517&type=2 (relative to the page), which also shows up in the Firebug console. Here id is the article id and type is the info type (phone, fax, name, etc.).

It then passes that URL to yet another function, dbsRequest(URL), which looks like this:

function dbsRequest(URL) {
    try {
        var xmlHTTP;
        if (dbsBrowserType == "ie") { // code for IE
            if (window.XMLHttpRequest) {
                xmlHTTP = new XMLHttpRequest();
                xmlHTTP.open("GET", URL, false); // synchronous request
                xmlHTTP.send(null);
                return xmlHTTP.responseXML.documentElement;
            } else if (window.ActiveXObject) {
                xmlHTTP = new ActiveXObject("Microsoft.XMLHTTP");
                xmlHTTP.open("GET", URL, false);
                xmlHTTP.send(null);
                return xmlHTTP.responseXML.documentElement;
            }
        } else if (dbsBrowserType == "ns" || dbsBrowserType == "op") { // code for Mozilla, Opera
            xmlHTTP = new XMLHttpRequest();
            xmlHTTP.open("GET", URL, false);
            xmlHTTP.send(null);
            return xmlHTTP.responseXML.documentElement;
        }
    } catch (e) {
        //if (dbsBrowserType == "ie")
        //    alert("error: " + e.description);
        //else if (dbsBrowserType == "ns" || dbsBrowserType == "op")
        //    alert("error: " + e);
        return null;
    }
}

And then I just got stuck.

I am extremely bad at JS, AJAX and the like, and when I saw that the response is actually some form of XML, I had no idea how to proceed. I have about 13,000 entries, and manually clicking each one is not really an option.

Is there any way of getting the fields that are triggered by that JS?

Or alternatively, is there a way to find or construct the URL of the response and parse it in PHP?


A real live page example can be found here :

I have no idea what language it is, but anyhow you need to click on the green phone icon on the right to see the resulting phone number.



  • doutuo4285, 8 years ago

    The response to that AJAX request is just raw XML data. showIndexPhone() retrieves the data and assigns it to the variable xmlDoc, but does nothing else with it. It doesn't actually show it anywhere, despite the function name.

    Anyway, you can use file_get_contents() in PHP to download the XML data, and then use one of its several XML parsers to deconstruct it. There's a whole chapter in the Programming PHP book explaining how to process XML.
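
As a rough illustration of that approach, here is a minimal PHP sketch, not a tested scraper for this particular site. It assumes that ./ctr.asp resolves against the directory of the page being scraped, that allow_url_fopen is enabled, and that the response is well-formed XML whose root element carries the value as text. The real XML structure is not shown in the question, so inspect one response first. The function names, the example.com base URL, and the `<phone>` sample are hypothetical; only ctr.asp, id, type=2 and showIndexPhone come from the question.

```php
<?php
// Sketch only: function names, base URL and sample XML are hypothetical.

/** Collect the numeric ids passed to showIndexPhone(...) in onclick attributes. */
function extractPhoneIds(string $html): array
{
    preg_match_all('/showIndexPhone\((\d+)\)/', $html, $matches);
    // Deduplicate, convert to integers, and reindex from 0.
    return array_values(array_unique(array_map('intval', $matches[1])));
}

/** Build the URL that showIndexPhone(id) requests, relative to a base directory URL. */
function buildCtrUrl(string $baseUrl, int $id, int $type = 2): string
{
    $query = http_build_query(['id' => $id, 'type' => $type]);
    return rtrim($baseUrl, '/') . '/ctr.asp?' . $query;
}

/** Parse an XML body; returns the root element's name and text, or null if it is not well-formed. */
function parseCtrResponse(string $xml): ?array
{
    $previous = libxml_use_internal_errors(true); // collect parse errors instead of emitting warnings
    $root = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if ($root === false) {
        return null;
    }
    // Assumption: the value is the root element's own text.
    return ['name' => $root->getName(), 'text' => trim((string) $root)];
}

/** Download and parse one entry; returns null on a network or parse failure. */
function fetchPhone(string $baseUrl, int $id): ?array
{
    $body = @file_get_contents(buildCtrUrl($baseUrl, $id));
    return $body === false ? null : parseCtrResponse($body);
}

// Offline demo using the anchor from the question (no network access).
$html = '<a class="icon1 clickTip" onclick="showIndexPhone(1517);showIndexMobile(1517);" title="">Phone</a>';
foreach (extractPhoneIds($html) as $id) {
    echo buildCtrUrl('http://example.com/some/dir', $id), PHP_EOL;
}
```

With around 13,000 entries, you would loop fetchPhone() over the ids collected from the scraped pages, ideally with a short pause between requests. showIndexMobile() presumably requests a different type value, which would need to be read from its own source.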
