I have a PHP script that imports and parses XML files and saves the data into the database:
- Database collation: `utf8_general_ci`, charset: `utf8`
- Page's charset: `utf-8`
- XML files: `ANSI`, containing smart quotes (from MS Word)
So during import I call `utf8_encode()` on the text from the XML files before saving it to the database and subsequently displaying it on the page.
But after the data has been imported and saved to the database successfully:
- Database: smart quotes are saved as the `?` character (viewed from CMD)
- Page: smart quotes are displayed as boxes
Any ideas as to why the smart quotes are not being converted correctly, even when using `utf8_encode()`?
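A minimal sketch that reproduces the symptom outside the import script, assuming "ANSI" here means Windows-1252 (the sample string and variable names are made up for illustration and are not taken from the actual script). `utf8_encode()` treats its input as ISO-8859-1, so the Windows-1252 smart-quote bytes may be ending up as control characters rather than as curly quotes:

```php
<?php
// Assumption: the "ANSI" .txt files are Windows-1252, where the curly
// double quotes are the single bytes 0x93 and 0x94.
$raw = "\x93quoted\x94";

// utf8_encode() interprets input as ISO-8859-1, so 0x93/0x94 become
// U+0093/U+0094 (C1 control characters), not U+201C/U+201D.
// Note: utf8_encode() is deprecated as of PHP 8.2.
$viaUtf8Encode = utf8_encode($raw);

// For comparison: converting explicitly from Windows-1252 (CP1252).
$viaIconv = iconv('CP1252', 'UTF-8', $raw);

echo bin2hex($viaUtf8Encode), "\n"; // c29371756f746564c294
echo bin2hex($viaIconv), "\n";      // e2809c71756f746564e2809d
```

If the first hex dump matches what ends up in the database, that would be consistent with the `?` in CMD and the boxes on the page, since U+0093 and U+0094 have no visible glyph.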
EDIT:
@Tomalak: The XML files are actually `.txt` files, with no XML declaration (`<?xml ... ?>`) and no root element. My script actually adds a root element just so the parser works:
utf8_encode('<article>' . file_get_contents($xmlfile) . '</article>');
It seems like I need to add an XML declaration? If so, what should it look like?