I'm reading my music directory to populate a JSON for jPlayer, as follows:
<?php
// Tried utf-8, shift_jis, etc. No difference.
header('Content-Type: application/json; charset=SHIFT_JIS');
// The path can't be blank, so '.' makes the current file's directory the base.
$Directory = new RecursiveDirectoryIterator('.');
$Iterator = new RecursiveIteratorIterator($Directory);
$Regex = new RegexIterator($Iterator, '/^.+\.mp3$/i', RecursiveRegexIterator::GET_MATCH);
// The iterators above are used instead of glob('*/*.mp3') because glob() isn't recursive.
$filesJson = [];
foreach ($Regex as $key => $value) {
    // Strip the ".mp3" extension and the leading ".\" to get a display title.
    $whatever = str_ireplace(['.mp3', '.\\'], '', $key);
    $filesJson['mp3'][] = [
        'title' => htmlspecialchars($whatever),
        'mp3' => $key
    ];
}
echo json_encode($filesJson);
exit();
?>
The problem is with files whose names are not plain ASCII, such as Latin (accented), Japanese and Korean ones. Examples:
Japanese
Korean
Latin (pt-br)
These get converted into ?, or simply become null when parsing Latin names (Geração or 4º, for example).
So, how can I make the file names/paths be parsed correctly for these different languages? The header charset isn't helping.
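A minimal sketch of the null symptom on its own, without the filesystem. It assumes the Latin names reach PHP as Windows-1252 bytes rather than UTF-8, which is an assumption and not something confirmed above; $latin1Name is a made-up stand-in for a real directory entry.

```php
<?php
// Assumption: the name arrives as Windows-1252 bytes ("Geração.mp3"), not UTF-8.
$latin1Name = "Gera\xE7\xE3o.mp3";

// json_encode() only accepts UTF-8 strings, so this call reports a UTF-8 error.
$broken = json_encode(['title' => $latin1Name]);
$errorIsUtf8 = (json_last_error() === JSON_ERROR_UTF8);

// Converting to UTF-8 first (source encoding is assumed) lets the same call succeed.
$utf8Name = mb_convert_encoding($latin1Name, 'UTF-8', 'Windows-1252');
$fixed = json_encode(['title' => $utf8Name]);

var_dump($errorIsUtf8);
echo $fixed, PHP_EOL;
```

This only covers the null case for Latin names; it says nothing about the ? that shows up for Japanese and Korean names.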
Info:
XAMPP with Apache2 + PHP 5.4.2 on Win7 x86
Update #1:
Tried @infinity's answer but no changes. Still ? on JP, null on Latin.
<?php
header('Content-Type: application/json; charset=UTF-8');
mb_internal_encoding('UTF-8');
$Directory = new RecursiveDirectoryIterator('.');
$Iterator = new RecursiveIteratorIterator($Directory);
$Regex = new RegexIterator($Iterator, '/^.+\.mp3$/i', RecursiveRegexIterator::GET_MATCH);
$filesJson = [];
foreach ($Regex as $key => $value) {
    // 2 to remove .\ and -6 to remove .mp3 (-4 + -2)
    $whatever = mb_substr($key, 2, mb_strlen($key) - 6, "utf-8");
    $filesJson['mp3'][] = [
        'title' => $whatever, // tried with and without htmlspecialchars
        'mp3' => $key
    ];
}
echo json_encode($filesJson);
exit();
?>
If I use HTML-ENTITIES instead of utf-8 in mb_substr(), Latin characters work but Asian ones are still ?.