I'm trying to use PHP to build a JSON representation of a paragraph of text that preserves its links and formatting.
Essentially, I want to convert this string:
"Hello <a href='www.google.com'>World!</a>. How are <b>you</b> today?"
into these 7 JSON objects (5 text entities plus 2 nested attributes):
"1": {
    "_id": "1",
    "_type": "TEXT",
    "value": "Hello "
},
"2": {
    "_id": "2",
    "_type": "TEXT",
    "value": "World!",
    "_attributes": {
        "3": {
            "_id": "3",
            "_type": "LINK",
            "src": "www.google.com"
        }
    }
},
"4": {
    "_id": "4",
    "_type": "TEXT",
    "value": ". How are "
},
"5": {
    "_id": "5",
    "_type": "TEXT",
    "value": "you",
    "_attributes": {
        "6": {
            "_id": "6",
            "_type": "FORMATTING",
            "bold": true
        }
    }
},
"7": {
    "_id": "7",
    "_type": "TEXT",
    "value": " today?"
}
I've searched online and found plenty about splitting HTML, but nothing that matches what I'm after. I need to separate the plain text from the link/formatting markup and create a single entity for each piece of text.
The "FORMATTING" attribute just adds "bold"/"underline"/"subscript" etc. fields as appropriate.
Nested tags will simply create multiple attributes for their text entity.
I don't yet know how I'd handle a 2-word hyperlink that has one word bolded. Perhaps it would have to be split into 2 text entities, each with its own hyperlink attribute.
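To make the requirement concrete, here is a rough sketch of the kind of thing I imagine, using PHP's built-in DOMDocument rather than regex splitting. The names htmlToEntities, collectText, buildAttributes and FORMAT_TAGS are made up for illustration, and merging all formatting tags into one "FORMATTING" attribute per text entity is my assumption. Every text node becomes one entity and inherits attributes from all of its enclosing tags, so a 2-word link with one bolded word would come out as 2 text entities, each carrying its own LINK attribute.

```php
<?php
// Hypothetical mapping from formatting tags to the field names used above.
const FORMAT_TAGS = ['b' => 'bold', 'strong' => 'bold', 'u' => 'underline', 'sub' => 'subscript'];

// Build the "_attributes" map for one text node from its enclosing elements.
function buildAttributes(array $ancestors, int &$nextId): array
{
    $attrs = [];
    $formatting = [];
    foreach ($ancestors as $el) {
        $tag = strtolower($el->tagName);
        if ($tag === 'a') {
            // One LINK attribute per enclosing <a>.
            $id = (string) $nextId++;
            $attrs[$id] = ['_id' => $id, '_type' => 'LINK', 'src' => $el->getAttribute('href')];
        } elseif (isset(FORMAT_TAGS[$tag])) {
            $formatting[FORMAT_TAGS[$tag]] = true;
        }
    }
    if ($formatting) {
        // Assumption: all formatting flags share a single FORMATTING attribute.
        $id = (string) $nextId++;
        $attrs[$id] = ['_id' => $id, '_type' => 'FORMATTING'] + $formatting;
    }
    return $attrs;
}

// Depth-first walk: each text node becomes one TEXT entity.
function collectText(DOMNode $node, array $ancestors, array &$out, int &$nextId): void
{
    foreach ($node->childNodes as $child) {
        if ($child instanceof DOMText) {
            if ($child->nodeValue === '') {
                continue;
            }
            $id = (string) $nextId++;
            $entity = ['_id' => $id, '_type' => 'TEXT', 'value' => $child->nodeValue];
            $attrs = buildAttributes($ancestors, $nextId);
            if ($attrs) {
                $entity['_attributes'] = $attrs;
            }
            $out[$id] = $entity;
        } elseif ($child instanceof DOMElement) {
            collectText($child, array_merge($ancestors, [$child]), $out, $nextId);
        }
    }
}

function htmlToEntities(string $html): array
{
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    $doc = new DOMDocument();
    // The XML prefix hints UTF-8 to libxml; the wrapper <div> gives a known root.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $root = $doc->getElementsByTagName('div')->item(0);
    $out = [];
    $nextId = 1;
    collectText($root, [], $out, $nextId);
    return $out;
}

$input = "Hello <a href='www.google.com'>World!</a>. How are <b>you</b> today?";
echo json_encode(htmlToEntities($input), JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), "\n";
```

Because the keys start at 1 rather than 0, json_encode should emit objects keyed by id rather than JSON arrays. I don't know whether this is the right approach, or whether the ids should be assigned differently.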
Any help MUCH appreciated!!