douwen8424 2012-06-07 00:08
Accepted

Splitting a string into HTML entities

I'm trying to use PHP to create a JSON representation of a paragraph of text, keeping information about links/formatting etc.

Essentially, I want to convert this string:

"Hello <a href='www.google.com'>World!</a>.  How are <b>you</b> today?"

Into these 7 JSON objects:

"1": {
    "_id": "1",
    "_type": "TEXT",
    "value": "Hello "
},
"2": {
    "_id": "2",
    "_type": "TEXT",
    "value": "World!",
    "_attributes": {
        "3": {
            "_id": "3",
            "_type": "LINK",
            "src": "www.google.com"
        }
    }
},
"4": {
    "_id": "4",
    "_type": "TEXT",
    "value": ".  How are "
},
"5": {
    "_id": "5",
    "_type": "TEXT",
    "value": "you",
    "_attributes": {
        "6": {
            "_id": "6",
            "_type": "FORMATTING",
            "bold": true
        }
    }
},
"7": {
    "_id": "7",
    "_type": "TEXT",
    "value": " today?"
}

I've hunted the internet/google and found plenty about splitting HTML, but can't seem to describe what I want. I need to separate the plain text from the link/formatting and create a single entity for each.

The "FORMATTING" attribute just adds "bold"/"underline"/"subscript" etc fields as appropriate.

Nested tags will simply create multiple attributes for their text entity.

I don't yet know how I'd handle a 2-word hyperlink that has one word bolded... perhaps it'll have to have 2 hyperlink attributes.

Any help MUCH appreciated!!

1 answer

  • douwen2072 2012-11-02 06:03

    DOMDocument is what you need. If you can live with slightly different names, you barely have to do any work.
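
A minimal sketch of that approach, assuming the exact field names from the question are wanted. It loads the fragment into a DOMDocument, visits each text node in document order, and turns every enclosing tag into an entry under `_attributes`. The function name `htmlToEntities` and the `FORMAT_TAGS` mapping are hypothetical, chosen for illustration; the `<?xml encoding="UTF-8">` prefix is a commonly used workaround to get libxml to read the input as UTF-8.

```php
<?php
declare(strict_types=1);

// Hypothetical mapping from inline tags to FORMATTING fields; extend as needed.
const FORMAT_TAGS = ['b' => 'bold', 'strong' => 'bold', 'u' => 'underline', 'sub' => 'subscript'];

function htmlToEntities(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // tolerate sloppy markup without emitting warnings
    // Wrap the fragment in a <div> so the parser does not invent a <p> around bare text.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();

    $entities = [];
    $nextId = 1;
    $xpath = new DOMXPath($doc);

    // "//text()" selects every text node in document order.
    foreach ($xpath->query('//text()') as $textNode) {
        if ($textNode->nodeValue === '') {
            continue;
        }
        $id = $nextId++;
        $entity = ['_id' => (string) $id, '_type' => 'TEXT', 'value' => $textNode->nodeValue];

        $attributes = [];
        $formatting = [];
        // Walk up the ancestors: each enclosing tag contributes to this text's attributes.
        for ($node = $textNode->parentNode; $node !== null; $node = $node->parentNode) {
            $tag = strtolower($node->nodeName);
            if ($tag === 'a') {
                $attrId = $nextId++;
                $attributes[$attrId] = [
                    '_id'   => (string) $attrId,
                    '_type' => 'LINK',
                    'src'   => $node->getAttribute('href'),
                ];
            } elseif (isset(FORMAT_TAGS[$tag])) {
                // All formatting tags are merged into a single FORMATTING attribute.
                $formatting[FORMAT_TAGS[$tag]] = true;
            }
        }
        if ($formatting !== []) {
            $attrId = $nextId++;
            $attributes[$attrId] = ['_id' => (string) $attrId, '_type' => 'FORMATTING'] + $formatting;
        }
        if ($attributes !== []) {
            $entity['_attributes'] = $attributes;
        }
        $entities[$id] = $entity;
    }

    return $entities;
}

$html = "Hello <a href='www.google.com'>World!</a>.  How are <b>you</b> today?";
echo json_encode(htmlToEntities($html), JSON_PRETTY_PRINT), "\n";
```

Because the ids start at 1 and skip numbers, `json_encode` emits objects keyed by id rather than JSON arrays. In this sketch a two-word link with one bold word produces two text entities, each with its own LINK attribute, and the bold word additionally gets a FORMATTING attribute. Note that the text after `</a>` keeps its leading period and both spaces, since the DOM preserves text nodes verbatim.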

    Selected by the asker as the best answer.