PHP: split a string of HTML into an array keyed by class name

I need to take a string of HTML like:

<p>This is a line with no spans<br>
This is a line <span class="second">This is secondary</span><br>  
This is another line <span class="third">And this is third</span> <span class="four">this is four</span></p>

And have it end up as an array in PHP like:

array(
    "This is a line with no spans",
    array(
      "This is a line",
      "second" => "This is secondary",
    ),
    array(
      "This is another line",
      "third" => "And this is third",
      "four" => "this is four"
    )
);

Getting each line into its own value was easy: I just split the text on <br>, and that works fine. What I can't quite get is splitting each line so that the span contents are keyed by class name. I feel like PHP's preg_split may hold the key, but I'm not good with regular expressions and can't figure it out.

Any ideas?

3 answers

doujia7517 2011-08-13 22:42
It's not a good idea to use regular expressions to parse HTML (cite). It's just not a suitable tool; see @JAAulde's answer.
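As a small illustration of why a pattern-based approach is fragile, the sketch below runs a naive span-matching regex against nested spans. The pattern and the sample markup are made up for this example and are not from the question; the point is only that the lazy match stops at the first closing tag it sees.

```php
<?php
// Hypothetical input with one span nested inside another.
$html = '<span class="outer">a <span class="inner">b</span> c</span>';

// Naive pattern: capture the class name and everything up to the next </span>.
preg_match_all('/<span class="(\w+)">(.*?)<\/span>/', $html, $m);

// Only the outer span is reported, and its content is cut off at the
// inner closing tag; the inner span never appears as its own match.
var_dump($m[1]); // class names found
var_dump($m[2]); // captured contents
```

A regex has no notion of nesting, so handling this case means adding more and more special cases to the pattern, whereas a DOM parser already tracks the tree.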

The best way is to do it purely with the DOM. Loop through all child nodes (including text nodes) to format the array the way you want. Like this:

// $p is assumed to be the <p> DOMElement, however you obtain it.
$lines = array();
$line = array(); // declared outside the loop so a line accumulates across nodes
$pChildren = $p->childNodes;
for ($i = 0; $i < $pChildren->length; $i++) {
    $child = $pChildren->item($i);
    if ($child instanceof DOMText) {
        $text = trim($child->wholeText);
        if ($text !== '') {
            $line[] = $text;
        }
    } elseif ($child instanceof DOMElement) {
        if (strtolower($child->tagName) == 'br') {
            // a <br> ends the current line
            $lines[] = $line;
            $line = array();
        } elseif (strtolower($child->tagName) == 'span' && $child->hasAttribute('class')) {
            $line[$child->getAttribute('class')] = $child->nodeValue;
        }
    }
}
if ($line) {
    $lines[] = $line; // the last line has no trailing <br>
}

Warning: treat the above as pseudo-code. It has not been tested at all; I'm just going from experience and the manual.
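For reference, here is a self-contained sketch of the same DOM walk that can be run as-is. It is one possible reading of the approach, not a tested drop-in: the function name splitLinesByClass is hypothetical, the HTML is loaded with DOMDocument::loadHTML, and the final step that turns a line containing only plain text into a bare string is an assumption made to match the array shape shown in the question.

```php
<?php
// Hypothetical helper; the name and the collapsing rule are assumptions.
function splitLinesByClass(string $html): array
{
    $doc = new DOMDocument();
    // Suppress libxml notices about the fragment not being a full document.
    $doc->loadHTML($html, LIBXML_NOERROR | LIBXML_NOWARNING);

    $p = $doc->getElementsByTagName('p')->item(0);
    if ($p === null) {
        return [];
    }

    $lines = [];
    $line = [];
    foreach ($p->childNodes as $child) {
        if ($child instanceof DOMText) {
            // Plain text between tags; skip whitespace-only nodes.
            $text = trim($child->textContent);
            if ($text !== '') {
                $line[] = $text;
            }
        } elseif ($child instanceof DOMElement) {
            $tag = strtolower($child->tagName);
            if ($tag === 'br') {
                // A <br> closes the current line.
                $lines[] = $line;
                $line = [];
            } elseif ($tag === 'span' && $child->hasAttribute('class')) {
                // Key the span's text by its class attribute.
                $line[$child->getAttribute('class')] = $child->textContent;
            }
        }
    }
    if ($line !== []) {
        $lines[] = $line; // last line has no trailing <br>
    }

    // Assumption: a line holding only plain text becomes a bare string.
    return array_map(function (array $l) {
        return (count($l) === 1 && isset($l[0])) ? $l[0] : $l;
    }, $lines);
}

$html = '<p>This is a line with no spans<br>' . "\n"
      . 'This is a line <span class="second">This is secondary</span><br>' . "\n"
      . 'This is another line <span class="third">And this is third</span> '
      . '<span class="four">this is four</span></p>';

$result = splitLinesByClass($html);
print_r($result);
```

Note that two spans with the same class on one line would overwrite each other, since the class name is used as the array key.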
This answer was accepted by the asker as the best answer.

Question asked 2011-08-13 22:06 · tag: php · 3 answers, 1 accepted