dongya4089 2015-11-04 13:15
浏览 84

如何从多个cURL调用生成XML(使用PHP)?

guys.

I'm with serious trouble trying to solve this.

The scenario:

Here at work we use the Vulnerability Management tool QualysGuard. Skipping all technical details, this tool basically detects vulnerabilities in all servers and for each vulnerability in each server it creates a Ticket Number. From the UI I can access all these tickets and download a CSV file with all of them. The other way of doing it is by using the API. The API uses some cURL calls to access the database and retrieve the info that I specify in the parameters.

The method:

I'm using a script like this to get the data:

<?php
$username="myUserName"; 
$password="myPassword"; 
$proxy= "myProxy";
$proxyauth = 'myProxyUser:myProxyPassword';


$url="https://qualysapi.qualys.com/msp/ticket_list.php?"; //This is the official script, provided by Qualys, for doing this task.

$postdata = "show_vuln_details=0&SINCE_TICKET_NUMBER=1&CURRENT_STATE=Open&ASSET_GROUPS=All"; 

$ch = curl_init(); 
curl_setopt ($ch, CURLOPT_URL, $url); 
curl_setopt ($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($ch, CURLOPT_PROXY, $proxy);
curl_setopt($ch, CURLOPT_PROXYUSERPWD, $proxyauth);
curl_setopt ($ch, CURLOPT_TIMEOUT, 60); 
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 0); 
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1); 
curl_setopt ($ch, CURLOPT_REFERER, $url); 
curl_setopt($ch, CURLOPT_USERPWD, $username . ":" . $password);
curl_setopt ($ch, CURLOPT_POSTFIELDS, $postdata); 
curl_setopt ($ch, CURLOPT_POST, 1); 
$result = curl_exec ($ch); 

$xml = simplexml_load_string($result);
?>

The script above works fine. It connects to the API, pass some parameters to it and the ticket_list.php file generates an XML file with all I need.

The Problems:

1-) This script only allows a limit of 1000 results in the XML file it returns. If my request has generated more than 1000 results, the script creates a TAG like this, at the end of the XML:

<TRUNCATION last="5066">Truncated after 1000 records</TRUNCATION>

In this case, I would need to execute anoter cURL call, with the parameters bellow:

$postdata = "show_vuln_details=0&SINCE_TICKET_NUMBER=5066&CURRENT_STATE=Open&ASSET_GROUPS=All";

2-) There are approximately 300,000 tickets in Qualys' database (cloud), and I need to download all of them and insert in MY database, which is used by an application that I'm creating. This application has some forms, which are filled by the user and a bunch of queries are run against the database.

The doubt: What would be the best way for me to do the task above? I've got some ideas, but I'm at a complete loss. I thought:

**1-)**Create a function that does the call above, parses the xml and if the tag TRUNCATION exists, it gets its value and call itself again, doing it recursively until a result without the tag TRUNCATIONcomes. The problem with this one is that I weren't able to merge the XML results of each call, and I'm not sure if it would cause memory issues, since it would be needed nearly 300 cURL calls. This script would be executed automatically by using the server's cronTab in a non-business period.

2-) Instead of retrieving all the data, I make the forms that I've mentioned post the data to the script and make the cURL calls with the parameters that the user POSTed. But again I'm not sure if that would be good, since I would still need to do multiple calls, depending on the parameters that the user sends.

3-) This is a crazy one: Use some sort of Macro software to record me while I log in the UI, go to the page where the tickets are located, click the download button, check the CSV option and click to download again. Then, export this script to some language like python or java, create a task in the cronTab and create a script that parses the CSV downloaded and inserts the data to the database. (Crazy or not? =P )

Any help is very welcome, maybe the answer is right before my eyes and I haven't gotten yet. Thanks in advance!

  • 写回答

1条回答 默认 最新

  • dongwu3596 2015-11-04 13:27
    关注

    I believe the proper way would involve a queue worker, however, If I were you I'd make your script grab 5 of these XML files in one execution- grab 1, insert rows, remove from memory, repeat. Then, I'd test it by running it a few times manually to see what sort of execution time and memory it requires. Once you've got a good idea of the execution time and you can see memory will not be a problem, schedule a cron for a little under double that time. If all goes well it should be about a minute between runs and you can have it all in your DB within an hour.

    评论

报告相同问题?

悬赏问题

  • ¥15 基于卷积神经网络的声纹识别
  • ¥15 Python中的request,如何使用ssr节点,通过代理requests网页。本人在泰国,需要用大陆ip才能玩网页游戏,合法合规。
  • ¥100 为什么这个恒流源电路不能恒流?
  • ¥15 有偿求跨组件数据流路径图
  • ¥15 写一个方法checkPerson,入参实体类Person,出参布尔值
  • ¥15 我想咨询一下路面纹理三维点云数据处理的一些问题,上传的坐标文件里是怎么对无序点进行编号的,以及xy坐标在处理的时候是进行整体模型分片处理的吗
  • ¥15 CSAPPattacklab
  • ¥15 一直显示正在等待HID—ISP
  • ¥15 Python turtle 画图
  • ¥15 stm32开发clion时遇到的编译问题