You can use DOM and the following XPath for this:
/html/body//h1[contains(.,'Blue Violin')]
This would match all h1 elements inside the body element that contain the phrase "Blue Violin", either directly or in a descendant node. If the phrase should only occur in a direct text node of the h1, change the . to text(). The results are returned in a DOMNodeList.
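To make the difference between . and text() concrete, here is a minimal sketch. The markup is a made-up sample (not the asker's file), and the variable names are only for illustration. Note that in XPath 1.0, contains(text(), ...) only looks at the first direct text node of each h1.

```php
<?php
// Hypothetical sample markup: the second h1 holds the phrase only inside a child element.
$xml = '<html><body>'
     . '<h1>The Blue Violin</h1>'
     . '<div><h1>The <em>Blue Violin</em></h1></div>'
     . '</body></html>';

$dom = new DOMDocument;
$dom->loadXML($xml);
$xPath = new DOMXPath($dom);

// "." tests the full string value of each h1, including text in descendants.
$anywhere = $xPath->query('/html/body//h1[contains(., "Blue Violin")]');

// "text()" tests only the first text node that is a direct child of each h1.
$direct = $xPath->query('/html/body//h1[contains(text(), "Blue Violin")]');

// Both results are DOMNodeList objects; length gives the number of matches.
echo $anywhere->length, "\n"; // 2
echo $direct->length, "\n";   // 1
```

The second h1 drops out of the text() query because its first direct text node is just "The ".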
Since you only want to know if the phrase appears, you can use the following code:
$dom = new DOMDocument;
$dom->load('NewFile.xml');
$xPath = new DOMXPath($dom);
echo $xPath->evaluate('count(/html/body//h1[contains(.,"Blue Violin")])');
which will return the number of nodes matching this XPath. If your markup is not well-formed XML (for example, not valid XHTML), you will not be able to use load or loadXML. Use loadHTML or loadHTMLFile instead. In addition, the XPath will execute faster if you give it a direct path to the nodes. If you only have one h1, h2 and h3 anyway, substitute the //h1 with a direct path.
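As a rough sketch of both points, the following loads markup that an XML parser would reject and queries it with a direct path instead of //h1. The markup string is a made-up sample, and the path /html/body/h1 assumes the h1 is a direct child of body, which may not hold for your document.

```php
<?php
// Hypothetical tag soup: the <p> and <br> are never closed, so load()/loadXML() would fail.
$html = '<html><body><h1>The Blue Violin</h1><p>First<br>Second</body></html>';

// Collect libxml parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

$xPath = new DOMXPath($dom);

// Direct path: only h1 elements that are children of body are inspected.
$count = $xPath->evaluate('count(/html/body/h1[contains(., "Blue Violin")])');
echo $count, "\n"; // 1
```

For a file on disk, loadHTMLFile takes a filename in place of the string.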
Note that contains is case-sensitive, so the above will only match text that is capitalized exactly like the Mixed Case search phrase. Unfortunately, DOM (or rather the underlying libxml) only supports XPath 1.0, which has no lower-case() function. As of PHP 5.3, however, you can also use PHP functions inside an XPath, e.g.
$dom = new DOMDocument;
$dom->load('NewFile.xml');
$xpath = new DOMXPath($dom);
$xpath->registerNamespace("php", "http://php.net/xpath");
$xpath->registerPHPFunctions();
echo $xpath->evaluate('count(/html/body//h1[contains(php:functionString("strtolower", .),"blue violin")])');
so in case you need to match Mixed Case phrases or words, you can lowercase all text in the searched nodes before checking it with contains, or use any other PHP function you may find useful here.
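If you would rather stay in plain XPath 1.0, the translate() function can stand in for a lowercase function by mapping each uppercase letter to its lowercase counterpart. This is a sketch with made-up sample markup, and it only covers the ASCII letters listed in the two alphabet strings; accented or non-Latin characters would need to be added by hand.

```php
<?php
// Hypothetical sample markup with mixed capitalization.
$xml = '<html><body><h1>The BLUE violin</h1><h1>Red Cello</h1></body></html>';

$dom = new DOMDocument;
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// translate(string, from, to) replaces each character of "from" with the one
// at the same position in "to", which lowercases the ASCII letters here.
$upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
$lower = 'abcdefghijklmnopqrstuvwxyz';

$query = sprintf(
    'count(/html/body//h1[contains(translate(., "%s", "%s"), "blue violin")])',
    $upper,
    $lower
);

$count = $xpath->evaluate($query);
echo $count, "\n"; // 1
```

The search phrase itself has to be written in lowercase, just like in the strtolower variant.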