dtkz3186 2015-06-23 15:05

Extracting information from XML with PHP

I am trying to write a PHP script that pulls information from an XML file and places it into a database. I have set up a LAMP stack on CentOS 6.6. The script below works in the sense that it recognizes the total number of inputs in the XML file, but no information is extracted because each section has sub tags. Is there something I can add to my code to write every sub tag within each entry, along with its text, into the database?

#!/usr/bin/php
<?php
// sample XML data
$data = <<<XML
<entry type="CVE" name="CVE-2003-0002" seq="2003-0002"
published="2003-01-17" modified="2015-04-14" severity="Medium"
CVSS_version="2.0 incomplete approximation" CVSS_score="5.0"
CVSS_base_score="5.0" CVSS_impact_subscore="2.9"
CVSS_exploit_subscore="10.0" CVSS_vector="(AV:N/AC:L/Au:N/C:P/I:N/A:N)" />
<desc>
<descript source="cve">Multiple ethernet Network Interface 'Card' (NIC)   device drivers do not pad frames with null bytes, which allows remote attackers to obtain information from previous packets or kernel memory by using malformed packets, as demonstrated by Etherleak.
</descript>
</desc>
<loss_types>
<conf/>
</loss_types>
<vuln_types>
<design/>
</vuln_types>
<range>
<network/>
</range>
<refs>
<ref source="CERT-VN" url="http://www.kb.cert.org/vuls/id/412115" adv="1">VU#412115</ref>
<ref source="BUGTRAQ" url="http://www.securityfocus.com/archive/1/archive/1/535181/100/0/threaded">20150402 NEW : VMSA-2015-0003 VMware product updates address critical information disclosure issue in JRE</ref>
<ref source="REDHAT" url="http://www.redhat.com/support/errata/RHSA-2003-025.html">RHSA-2003:025</ref>
<ref source="CONFIRM" url="http://www.oracle.com/technetwork/topics/security/cpujan2015-1972971.html">http://www.oracle.com/technetwork/topics/security/cpujan2015-1972971.html</ref>
<ref source="MISC" url="http://www.atstake.com/research/advisories/2003/atstake_etherleak_report.pdf">http://www.atstake.com/research/advisories/2003/atstake_etherleak_report.pdf</ref>
<ref source="ATSTAKE" url="http://www.atstake.com/research/advisories/2003/a010603-1.txt" adv="1">A010603-1</ref><ref source="FULLDISC" url="http://seclists.org/fulldisclosure/2015/Apr/5">20150402 NEW : VMSA-2015-0003 VMware product updates address critical information disclosure issue in JRE</ref>
<ref source="MISC" url="http://packetstormsecurity.com/files/131271/VMware-Security-Advisory-2015-0003.html">http://packetstormsecurity.com/files/131271/VMware-Security-Advisory-2015-0003.html</ref><ref source="BUGTRAQ" url="http://marc.theaimsgroup.com/?l=bugtraq&m=104222046632243&w=2" adv="1">20030110 More information regarding Etherleak</ref>
<ref source="VULNWATCH" url="http://archives.neohapsis.com/archives/vulnwatch/2003-q1/0016.html">20030110 More information regarding Etherleak</ref>
<ref source="BUGTRAQ" url="http://www.securityfocus.com/archive/1/archive/1/307564/30/26270/threaded">20030117 Re: More information regarding Etherleak</ref>
<ref source="BUGTRAQ" url="http://www.securityfocus.com/archive/1/archive/1/305335/30/26420/threaded">20030106 Etherleak: Ethernet frame padding information leakage (A010603-1)</ref>
<ref source="REDHAT" url="http://www.redhat.com/support/errata/RHSA-2003-088.html">RHSA-2003:088</ref><ref source="OSVDB" url="http://www.osvdb.org/9962">9962</ref>
<ref source="OVAL" url="http://oval.mitre.org/repository/data/getDef?id=oval:org.mitre.oval:def:2665" sig="1">oval:org.mitre.oval:def:2665</ref>
</refs>
<vuln_soft>
<prod name="freebsd" vendor="freebsd">
<vers num="4.2"/>
<vers num="4.3"/>
<vers num="4.4"/>
<vers num="4.5"/>
<vers num="4.6"/>
<vers num="4.7"/>
</prod>
<prod name="linux_kernel" vendor="linux">
<vers num="2.4.1"/>
<vers num="2.4.10"/>
<vers num="2.4.11"/>
<vers num="2.4.12"/>
<vers num="2.4.13"/>
<vers num="2.4.14"/>
<vers num="2.4.15"/>
<vers num="2.4.16"/>
<vers num="2.4.17"/>
<vers num="2.4.18"/>
<vers num="2.4.19"/>
<vers num="2.4.2"/>
<vers num="2.4.20"/>
<vers num="2.4.3"/>
<vers num="2.4.4"/>
<vers num="2.4.5"/>
<vers num="2.4.6"/>
<vers num="2.4.7"/>
<vers num="2.4.8"/>
<vers num="2.4.9"/>
</prod>
<prod name="windows_2000" vendor="microsoft">
<vers num="" edition=":advanced_server"/> 
<vers num="" edition=":server"/>
<vers num="" edition=":professional"/>
<vers num="" edition=":datacenter_server"/>
<vers num="" edition="sp1:datacenter_server"/>
<vers num="" edition="sp1:advanced_server"/>
<vers num="" edition="sp1:professional"/>
<vers num="" edition="sp1:server"/>
<vers num="" edition="sp2:datacenter_server"/>
<vers num="" edition="sp2:advanced_server"/>
<vers num="" edition="sp2:professional"/>
<vers num="" edition="sp2:server"/>
</prod>
<prod name="windows_2000_terminal_services" vendor="microsoft">
<vers num="" edition="sp1"/>
<vers num="" edition="sp2"/>
</prod>
<prod name="netbsd" vendor="netbsd">
<vers num="1.5"/>
<vers num="1.5.1"/>
<vers num="1.5.2"/>
<vers num="1.5.3"/>
<vers num="1.6"/>
</prod>
</vuln_soft>
</entry>
XML;

// gather XML data

// database connection settings
$host = 'localhost';
$database = 'cve';
$user = 'admin';
$pass = 'admin';
$table = 'vulnerabilities';

try {
// connect to database
$dbh = new PDO('mysql:host=' . $host . ';dbname=' . $database, $user, $pass);

// prepare xml and iterator
$xml = new SimpleXMLIterator($data);
$itr = new RecursiveIteratorIterator($xml);
// loop through XML data
foreach ($itr as $key => $value) {

    // prepare an insert statement
    $statement = $dbh->prepare("INSERT INTO $table (identifier,seq,published,modified,severity,cvss_verison,cvss_score,cvss_base_score,cvss_impact_subscore,cvss_exploit_subscore,cvs_vector,information,loss_types,vuln_types,impact_area,refs,vuln_soft) VALUES (':name',':seq',':published',':modified',':severity',':CVSS_verison',':CVSS_score',':CVSS_base_score',':CVSS_impact_subscore',':CVSS_exploit_subscore',':CVSS_vector',':desc',':loss_types',':vuln_types',':range',':ref',':vuln_soft')");

    // bind your XML data to named parameters for the insert statement
    $statement->bindParam(':name', $value->attributes()->identifier);
    $statement->bindParam(':seq', $value->attributes()->seq);
    $statement->bindParam(':published', $value->attributes()->published);
    $statement->bindParam(':modified', $value->attributes()->modified);
    $statement->bindParam(':severity', $value->attributes()->severity);
    $statement->bindParam(':CVSS_version', $value->attributes()->cvss_verison);
    $statement->bindParam(':CVSS_score', $value->attributes()->cvss_score);
    $statement->bindParam(':CVSS_base_score', $value->attributes()->cvss_base_score);
    $statement->bindParam(':CVSS_impact_subscore', $value->attributes()->cvss_impact_subscore);
    $statement->bindParam(':CVSS_exploit_subscore', $value->attributes()->cvss_exploit_subscore);
    $statement->bindParam(':CVS_vector', $value->attributes()->cvs_vector);
    $statement->bindParam(':desc',$value->attributes()->information);
    $statement->bindParam(':loss_types',$value->attributes()->loss_types);
    $statement->bindParam(':vuln_types',$value->attributes()->vuln_types);
    $statement->bindParam(':range',$value->attributes()->impact_area);
    $statement->bindParam(':refs',$value->attributes()->refs);
    $statement->bindParam(':vuln_soft',$value->attributes()->vuln_soft);


    // insert XML data into database table
    $statement->execute();
}

$dbh = null;
} catch (PDOException $e) {
print "There was an error: " . $e->getMessage() . "\n";
die();
}

?>

I need to collect all of the data from the entry tag and place it into the database. Example of XML with the information in the tag:

<entry type="CVE" name="CVE-2003-0001" seq="2003-0001"
 published="2003-01-17" modified="2015-04-14" severity="Medium"
 CVSS_version="2.0 incomplete approximation" CVSS_score="5.0"
 CVSS_base_score="5.0" CVSS_impact_subscore="2.9"
 CVSS_exploit_subscore="10.0" CVSS_vector="(AV:N/AC:L/Au:N/C:P/I:N/A:N)">

I then need to collect all of the data within the entry tag by recording the sub tags and their text. Example of XML with sub tags:

<refs>
<ref source="reference information">Reference information</ref>
<ref source="reference information">Reference information</ref>
<ref source="reference information">Reference information</ref>
<ref source="reference information">Reference information</ref>
</refs>
</entry>

The current script as detailed above returns the following warnings and one fatal error:

PHP Warning: SimpleXMLElement::__construct(): Entity: line 6: parser error : Extra content at the end of the document in /home/ant244/Documents/extract.php on line 112

PHP Warning:  SimpleXMLElement::__construct(): <desc> in /home/ant244/Documents/extract.php on line 112

PHP Warning:  SimpleXMLElement::__construct(): ^ in /home/ant244/Documents/extract.php on line 112

PHP Fatal error:  Uncaught exception 'Exception' with message 'String could not be parsed as XML' in /home/ant244/Documents/extract.php:112

Stack trace:
#0 /home/ant244/Documents/extract.php(112):  SimpleXMLElement->__construct('<entry type="CV...')

#1 {main}
  thrown in /home/ant244/Documents/extract.php on line 112

1 answer

  • dro44817 2015-06-23 18:41

    Background

    If I'm understanding your question correctly, one approach is to loop through your XML entry data and use the values you find as named parameters in prepared SQL statements. Prepared statements keep the data separate from the SQL text you write, which helps protect your queries against SQL injection (see the "How do I make my database queries secure from SQL injection?" section on the PHP tag wiki page for more information).

    This approach might look something like the example below, which shows how prepared statements can be used for your database work and how XML data can be accessed with a $value->attributes()->name format inside a foreach loop (where name matches an individual attribute from your XML entries):

    Code Example 1 (prepared statements)

    <?php
    
    // sample XML data
    $data = <<<XML
    <root>
    <entry type="CVE" name="CVE-2003-0001" seq="2003-0001"
    published="2003-01-17" modified="2015-04-14" severity="Medium"
    CVSS_version="2.0 incomplete approximation" CVSS_score="5.0"
    CVSS_base_score="5.0" CVSS_impact_subscore="2.9"
    CVSS_exploit_subscore="10.0" CVSS_vector="(AV:N/AC:L/Au:N/C:P/I:N/A:N)" />
    <entry type="CVE" name="CVE-2003-0002" seq="2003-0002"
    published="2003-01-17" modified="2015-04-14" severity="Medium"
    CVSS_version="2.0 incomplete approximation" CVSS_score="5.0"
    CVSS_base_score="5.0" CVSS_impact_subscore="2.9"
    CVSS_exploit_subscore="10.0" CVSS_vector="(AV:N/AC:L/Au:N/C:P/I:N/A:N)" />
    </root>
    XML;
    
    // gather XML data
    $xml = simplexml_load_string($data);
    
    // database connection settings
    $host = 'localhost';
    $database = 'your_database';
    $user = 'your_username';
    $pass = 'your_password';
    $table = 'your_database_table';
    
    try {
        // connect to database and make PDO throw exceptions on SQL errors
        $dbh = new PDO('mysql:host=' . $host . ';dbname=' . $database, $user, $pass);
        $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    
        // prepare the insert statement once, before the loop
        $statement = $dbh->prepare("INSERT INTO $table (name, seq) VALUES (:name, :seq)");
    
        // loop through XML data
        foreach ($xml->entry as $key => $value) {
    
            // bind each attribute, cast to a string, to its named parameter
            $statement->bindValue(':name', (string) $value->attributes()->name);
            $statement->bindValue(':seq', (string) $value->attributes()->seq);
    
            // insert XML data into database table
            $statement->execute();
        }
    
        $dbh = null;
    } catch (PDOException $e) {
        print "There was an error: " . $e->getMessage();
        die();
    }
    
    ?>
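
    A detail worth double-checking in the script from the question: the named placeholders there are wrapped in single quotes (':name'), so PDO treats them as literal strings rather than parameters, and a few placeholder names do not match their bindParam() calls (for example :CVSS_verison versus :CVSS_version). The sketch below shows the unquoted form. It is only an illustration under assumptions: it uses an in-memory SQLite database (which requires the pdo_sqlite extension) in place of MySQL, and the demo table is hypothetical.

```php
<?php

// Assumption: pdo_sqlite is available; the in-memory database stands in
// for the MySQL database used in the question.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical demo table with two of the columns from the question.
$dbh->exec('CREATE TABLE demo (name TEXT, seq TEXT)');

// Placeholders are written without quotes; PDO takes care of the quoting.
$statement = $dbh->prepare('INSERT INTO demo (name, seq) VALUES (:name, :seq)');

// Every placeholder in the SQL needs a value under exactly the same name.
$statement->execute([':name' => 'CVE-2003-0002', ':seq' => '2003-0002']);

// Read the row back to confirm what was stored.
$row = $dbh->query('SELECT name, seq FROM demo')->fetch(PDO::FETCH_ASSOC);
echo $row['name'] . ' / ' . $row['seq'] . "\n";
```

    Run from the command line, this should print the stored row as "CVE-2003-0002 / 2003-0002".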
    

    When it comes to working with nested tags, however, it might be a good idea to use an iterator. In the case of your example XML (simplified for the code example below) using an iterator could look like this:

    Code Example 2 (iterators)

    <?php
    
    // sample XML data
    $data = <<<XML
    <root>
    <entry>
    <refs>
    <ref source="reference_information_1">Reference information 1</ref>
    <ref source="reference_information_2">Reference information 2</ref>
    </refs>
    </entry>
    </root>
    XML;
    
    // prepare XML data and iterator
    $xml = new SimpleXMLIterator($data);
    $itr = new RecursiveIteratorIterator($xml);
    
    // iterate over each relevant tag
    foreach ($itr as $key => $value) {
      echo $key . ": " . $value . "\n";
      echo "source attribute: " . $value->attributes()->source . "\n";
    }
    
    ?>
    

    This code produces the following output:

    ref: Reference information 1
    source attribute: reference_information_1
    ref: Reference information 2
    source attribute: reference_information_2
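
    Note that only the innermost ref tags show up in this output. By default, RecursiveIteratorIterator visits leaf elements only, so parent tags such as entry and refs are skipped. If the parent tags are needed as well, the iterator can be given the SELF_FIRST mode. The following is a small sketch of that idea; the sample XML (including the name attribute on entry) is a simplified assumption, not the full data from the question.

```php
<?php

// simplified sample XML (assumption: one entry with one nested ref)
$data = <<<XML
<root>
<entry name="CVE-2003-0002">
<refs>
<ref source="reference_information_1">Reference information 1</ref>
</refs>
</entry>
</root>
XML;

$xml = new SimpleXMLIterator($data);

// SELF_FIRST visits each parent element before its children;
// the default mode (LEAVES_ONLY) would skip <entry> and <refs>.
$itr = new RecursiveIteratorIterator($xml, RecursiveIteratorIterator::SELF_FIRST);

$visited = [];
foreach ($itr as $key => $value) {
    // getDepth() reports how deeply the current element is nested
    $visited[] = str_repeat('  ', $itr->getDepth()) . $key;
}

echo implode("\n", $visited) . "\n";
```

    With this mode the loop should list entry, then refs, then ref, each indented one level deeper than the last.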
    

    Code Example 3 (prepared statements + iterators)

    <?php
    
    // sample XML data
    $data = <<<XML
    <root>
    <entry type="CVE" name="CVE-2003-0001" seq="2003-0001"
    published="2003-01-17" modified="2015-04-14" severity="Medium"
    CVSS_version="2.0 incomplete approximation" CVSS_score="5.0"
    CVSS_base_score="5.0" CVSS_impact_subscore="2.9"
    CVSS_exploit_subscore="10.0" CVSS_vector="(AV:N/AC:L/Au:N/C:P/I:N/A:N)" />
    <entry type="CVE" name="CVE-2003-0002" seq="2003-0002"
    published="2003-01-17" modified="2015-04-14" severity="Medium"
    CVSS_version="2.0 incomplete approximation" CVSS_score="5.0"
    CVSS_base_score="5.0" CVSS_impact_subscore="2.9"
    CVSS_exploit_subscore="10.0" CVSS_vector="(AV:N/AC:L/Au:N/C:P/I:N/A:N)" />
    </root>
    XML;
    
    // database connection settings
    $host = 'localhost';
    $database = 'your_database';
    $user = 'your_username';
    $pass = 'your_password';
    $table = 'your_database_table';
    
    try {
        // connect to database and make PDO throw exceptions on SQL errors
        $dbh = new PDO('mysql:host=' . $host . ';dbname=' . $database, $user, $pass);
        $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    
        // prepare XML data and iterator
        $xml = new SimpleXMLIterator($data);
        $itr = new RecursiveIteratorIterator($xml);
    
        // prepare the insert statement once, before the loop
        $statement = $dbh->prepare("INSERT INTO $table (name, seq) VALUES (:name, :seq)");
    
        // iterate over each relevant tag
        foreach ($itr as $key => $value) {
    
            // bind each attribute, cast to a string, to its named parameter
            $statement->bindValue(':name', (string) $value->attributes()->name);
            $statement->bindValue(':seq', (string) $value->attributes()->seq);
    
            // insert XML data into database table
            $statement->execute();
        }
    
        $dbh = null;
    } catch (PDOException $e) {
        print "There was an error: " . $e->getMessage();
        die();
    }
    
    ?>
    

    Conclusion

    Prepared statements and iterators can each provide safe and convenient ways to work with XML and database-related applications. In the case of your program, it might be helpful to combine ideas from these two code examples (by using the iterators from the Second Code Example for the // loop through XML data section of the First Code Example) as shown in the Third Code Example.
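
    For the nested tags in the question, one possible way to combine these ideas is sketched below: loop over each entry, read its attributes, and then loop over its nested ref tags. Note that in the question's XML the opening entry tag ends with "/>", which closes the entry immediately; that is likely what triggers the "Extra content at the end of the document" parser error, so the sample here opens the tag with ">" instead. The sketch rests on assumptions: an in-memory SQLite database (pdo_sqlite) stands in for MySQL, the XML is trimmed down with a placeholder description, and the two tables and their columns are hypothetical.

```php
<?php

// Trimmed-down sample: <entry ...> is opened with ">" so the tags below are nested in it.
$data = <<<XML
<root>
<entry type="CVE" name="CVE-2003-0002" seq="2003-0002" severity="Medium">
<desc>
<descript source="cve">Example description text.</descript>
</desc>
<refs>
<ref source="CERT-VN" url="http://www.kb.cert.org/vuls/id/412115">VU#412115</ref>
<ref source="OSVDB" url="http://www.osvdb.org/9962">9962</ref>
</refs>
</entry>
</root>
XML;

$xml = simplexml_load_string($data);

// Assumption: in-memory SQLite replaces MySQL; table and column names are hypothetical.
$dbh = new PDO('sqlite::memory:');
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dbh->exec('CREATE TABLE entries (name TEXT, severity TEXT, information TEXT)');
$dbh->exec('CREATE TABLE refs (entry_name TEXT, source TEXT, url TEXT, label TEXT)');

// prepare both statements once, outside the loops
$insertEntry = $dbh->prepare(
    'INSERT INTO entries (name, severity, information) VALUES (:name, :severity, :information)'
);
$insertRef = $dbh->prepare(
    'INSERT INTO refs (entry_name, source, url, label) VALUES (:entry_name, :source, :url, :label)'
);

foreach ($xml->entry as $entry) {
    $name = (string) $entry['name'];

    // attributes come from the <entry> tag, the text from the nested <descript> tag
    $insertEntry->execute([
        ':name'        => $name,
        ':severity'    => (string) $entry['severity'],
        ':information' => trim((string) $entry->desc->descript),
    ]);

    // one row per nested <ref>, linked back to its entry by name
    foreach ($entry->refs->ref as $ref) {
        $insertRef->execute([
            ':entry_name' => $name,
            ':source'     => (string) $ref['source'],
            ':url'        => (string) $ref['url'],
            ':label'      => (string) $ref,
        ]);
    }
}

$refCount = (int) $dbh->query('SELECT COUNT(*) FROM refs')->fetchColumn();
echo "refs stored: $refCount\n";
```

    Keeping the refs in their own table is just one design option; it avoids squeezing a variable number of references into a single column. The same pattern could be repeated for the prod and vers tags under vuln_soft.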
