I am parsing an HTML page. At one point I extract the text inside a div and pass it through html_entity_decode before printing it.
The problem is that the page contains characters such as this star ★
and other shapes such as ⬛︎, ◄ and ◉. I have checked the page source: these characters are not entity-encoded there, they appear literally, exactly as you see them here.
The page declares charset="UTF-8".
When I call
html_entity_decode($string, ENT_QUOTES, 'UTF-8');
the star, for example, is "decoded" to â˜
$string is obtained with
document.getElementById("id-of-div").innerText
I would like these characters to come out correctly. How do I do that in PHP?
NOTE: I have also tried htmlspecialchars_decode($string, ENT_QUOTES);
and it produces the same result.
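
To narrow the problem down, here is a minimal sketch that reproduces the symptom in isolation. It assumes PHP 7+ with the mbstring extension. The variable names ($star, $decoded, $misread) are made up for illustration, and the star is hard-coded rather than read from the real page. It suggests that html_entity_decode leaves a literal star untouched, and that the garbled text is what the same UTF-8 bytes look like when read as Windows-1252. Whether that is what actually happens in the real script is an assumption, not something confirmed.

```php
<?php
// Hypothetical stand-in for the text taken from the div: a literal star (U+2605).
$star = "\u{2605}";

// In UTF-8 the star is the three bytes E2 98 85.
echo bin2hex($star), PHP_EOL;

// The string contains no entity, so decoding should leave it unchanged.
$decoded = html_entity_decode($star, ENT_QUOTES, 'UTF-8');
var_dump($decoded === $star);

// Assumption: somewhere the output is being read as Windows-1252.
// Re-interpreting the same bytes that way yields the garbled text.
$misread = mb_convert_encoding($decoded, 'UTF-8', 'Windows-1252');
echo $misread, PHP_EOL;
```

If the last line prints the same garbage as the real script, the decoding step is probably fine, and the encoding used to display the output (for example a missing charset in the Content-Type header) would be the next thing to check.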