dsns47611 2013-04-10 12:59
Accepted

Text extractor in PHP

I have a page, test1.php. On another page, test.php, I am running this PHP code:

 <?php 
    libxml_use_internal_errors(true); 
    $doc = new DOMDocument(); 
    $doc->loadHTMLFile("http://inviatapenet.gethost.ro/sop/test1.php"); 
    $xpath = new DOMXpath($doc); 
    $elements = $xpath->query("//*[@type='text/javascript']/@fid");
        if (!is_null($elements)) {
            foreach ($elements as $element) {
                $nodes = $element->childNodes;
                foreach ($nodes as $node) {
                    echo $node->nodeValue . "\n";
                }
            }
        }
?>

But it shows nothing.

I'm trying to get only the value of fid, x8qfp3cvzbxng8e, from this line of that page:

<script type="text/javascript"> fid="x8qfp3cvzbxng8e"; v_width=640;
v_height=360; </script>

The output should be:

x8qfp3cvzbxng8e

What do I have to do?

1 answer

  • dtgta48604 2013-04-10 13:00

    If you only want the fid value, use this regex:

     preg_match_all('~fid="(.*?)"~si',$Text,$Match);
     print_r($Match);
    

    Output for your sample:

     Array
    (
       [0] => Array
        (
            [0] => fid="x8qfp3cvzbxng8e"
        )
    
       [1] => Array
        (
            [0] => x8qfp3cvzbxng8e
        )
    
    )
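
The XPath query in the question returns nothing because fid is not an attribute of the script element; it is a JavaScript variable assigned inside the script's text. A minimal sketch, assuming the page's markup matches the sample line in the question, keeps DOMDocument for locating the script elements and applies the regex only to their text. The function name findFidInScripts and the inline $html sample are illustrative and not part of the original code.

```php
<?php
// Hypothetical helper: returns the first fid value found inside any
// <script type="text/javascript"> element, or null when none is found.
function findFidInScripts(string $html): ?string
{
    if ($html === '') {
        return null;                      // loadHTML() rejects empty input
    }
    libxml_use_internal_errors(true);     // collect parser warnings silently
    $doc = new DOMDocument();
    if (!$doc->loadHTML($html)) {
        return null;
    }
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Select the script elements themselves; fid lives in their text content.
    $scripts = $xpath->query("//script[@type='text/javascript']");
    if ($scripts === false) {
        return null;
    }
    foreach ($scripts as $script) {
        if (preg_match('~fid\s*=\s*"([^"]*)"~', $script->textContent, $m)) {
            return $m[1];
        }
    }
    return null;
}

// Sample markup copied from the question (assumed to match the real page).
$html = '<html><body><script type="text/javascript"> fid="x8qfp3cvzbxng8e"; v_width=640;
v_height=360; </script></body></html>';

echo findFidInScripts($html), "\n";       // prints: x8qfp3cvzbxng8e
```

Restricting the regex to script text avoids accidental matches elsewhere on the page, at the cost of a full HTML parse.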
    

    To extract the text, try the function below. It skips any script content; remove the 'script' check if you want that content included.

     function extractText($node) {
         if ($node === null) return '';
         // Text and CDATA nodes carry the text directly.
         if (XML_TEXT_NODE === $node->nodeType || XML_CDATA_SECTION_NODE === $node->nodeType) {
             return $node->nodeValue;
         } else if (XML_ELEMENT_NODE === $node->nodeType || XML_DOCUMENT_NODE === $node->nodeType || XML_DOCUMENT_FRAG_NODE === $node->nodeType) {
             // Skip <script> elements; remove this check to include their content.
             if ('script' === $node->nodeName) return '';

             $text = '';
             foreach ($node->childNodes as $childNode) {
                 $text .= extractText($childNode);
             }
             return $text;
         }
         // Comments and other node types contribute no text.
         return '';
     }
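
To show how extractText might be called, here is a small self-contained sketch. It repeats the function in lightly tidied form (the type declarations and the empty-string fallback are additions), parses an inline HTML string that is made up for illustration, and prints the text with the script content skipped.

```php
<?php
// Tidied copy of the answer's function; the type declarations are an addition.
function extractText(?DOMNode $node): string
{
    if ($node === null) {
        return '';
    }
    if ($node->nodeType === XML_TEXT_NODE || $node->nodeType === XML_CDATA_SECTION_NODE) {
        return (string) $node->nodeValue;
    }
    if ($node->nodeType === XML_ELEMENT_NODE
        || $node->nodeType === XML_DOCUMENT_NODE
        || $node->nodeType === XML_DOCUMENT_FRAG_NODE) {
        if ($node->nodeName === 'script') {
            return '';              // skip script content
        }
        $text = '';
        foreach ($node->childNodes as $child) {
            $text .= extractText($child);
        }
        return $text;
    }
    return '';                      // comments and other node types
}

// Made-up sample markup, for illustration only.
$html = '<html><body><p>Hello <b>world</b></p>'
      . '<script type="text/javascript"> fid="x8qfp3cvzbxng8e"; </script>'
      . '<p>Bye</p></body></html>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);

// Start from the root element: a document loaded with loadHTML() may report
// a node type (XML_HTML_DOCUMENT_NODE) that the checks above do not cover.
echo extractText($doc->documentElement), "\n";   // prints: Hello worldBye
```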
    

    Sample:

     $Text=file_get_contents("http://inviatapenet.gethost.ro/sop/test1.php");
     preg_match_all('~fid="(.*?)"~si',$Text,$Match);
     $fid=$Match[1][0];  // capture group 1 of the first match
     echo $fid;
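
Since only one value is expected, preg_match may be simpler than preg_match_all, and both the download and the match can fail. The following is a hedged sketch with those checks added; extractFid and fetchFid are hypothetical names, and the offline $sample string is taken from the line shown in the question.

```php
<?php
// Hypothetical helper: first fid="..." value in $text, or null if absent.
function extractFid(string $text): ?string
{
    // [^"]* stops at the closing quote, so no lazy quantifier is needed.
    return preg_match('~fid\s*=\s*"([^"]*)"~i', $text, $match) === 1
        ? $match[1]
        : null;
}

// Hypothetical wrapper: downloads $url and extracts fid; null on any failure.
function fetchFid(string $url): ?string
{
    $text = @file_get_contents($url);    // false when the request fails
    return $text === false ? null : extractFid($text);
}

// Offline check with the sample line from the question.
$sample = '<script type="text/javascript"> fid="x8qfp3cvzbxng8e"; v_width=640;';
echo extractFid($sample), "\n";          // prints: x8qfp3cvzbxng8e
```

Against the live page this would be called as fetchFid("http://inviatapenet.gethost.ro/sop/test1.php"), assuming allow_url_fopen is enabled on the server.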
    
    This answer was selected by the asker as the best answer.
