I have a MySQL table with words in Unicode using signs like ḥ, ḫ, š, etc.
The columns in the table use the utf8mb4_general_ci collation and store the above signs correctly.
In the header of the webpage I put
<meta http-equiv="Content-Type" content="text/html; charset=utf8mb4">
This webpage contains a form that sends data to a PHP page. At the beginning of the PHP page I put:
mysqli_set_charset($con,"utf8mb4");
On this page I run a MySQL query and get an array. This array ($result) must be sorted by its keys, using a lookup array of characters that I have produced, which includes both single-byte and multi-byte characters.
This is the array:
Array (
[nṯr] => Array ( [0] => Ka.C.Coptite.urkVIII,176b [1] => Ka.C.Coptite.urkVIII,177,1 )
[n] => Array ( [0] => Ka.C.Coptite.urkVIII,176c [1] => Ka.C.Coptite.urkVIII,177,1 [2] => Ka.C.Coptite.urkVIII,177,2 )
[nḫȝḫȝ] => Array ( [0] => Ka.C.Coptite.urkVIII,176c )
[nwj] => Array ( [0] => Ka.C.Coptite.urkVIII,176c )
[nfr] => Array ( [0] => Ka.C.Coptite.urkVIII,176c [1] => Ka.C.Coptite.urkVIII,177,2 )
[nḥḥ] => Array ( [0] => Ka.C.Coptite.urkVIII,176e [1] => Ka.C.Coptite.urkVIII,177,1 [2] => Ka.C.Coptite.urkVIII,177,1 )
[nḏ] => Array ( [0] => Ka.C.Coptite.urkVIII,177,1 )
)
What I do is:
uksort($result, 'compare_keys_by_alphabet');
This refers to the function:
function compare_keys_by_alphabet($a, $b)
{
static $alphabet = array( 1 => "-" , 2 => "," , 3 => ".", 4 => "ȝ", 5 => "j", 6 => "ʿ", 7 => "w", 8 => "b", 9 => "p", 10 => "f", 11 => "m", 12 => "n", 13 => "r", 14 => "h", 15 => "ḥ", 16 => "ḫ", 17 => "ẖ", 18 => "s", 19 => "š", 20 => "q", 21 => "k", 22 => "g", 23 => "t", 24 => "ṯ", 25 => "d", 26 => "ḏ", 27 => "⸗", 28 => "/", 29 => "(", 30 => ")", 31 => "[", 32 => "]", 33 => "<", 34 => ">", 35 => "{", 36 => "}", 37 => "'", 38 => "*", 39 => "#", 40 => "I", 41 => "0", 42 => "1", 43 => "2", 44 => "3", 45 => "4", 46 => "5", 47 => "6", 48 => "7", 49 => "8", 50 => "9", 51 => "&", 52 => "@", 53 => "%");
return compare_by_alphabet($alphabet, $a, $b);
}
using:
function compare_by_alphabet(array $alphabet, $str1, $str2) {
$c = max(strlen($str1), strlen($str2));
for ($i = 0; $i < $c; $i++) {
$s1 = $str1[$i];
$s2 = $str2[$i];
//if ($s1===$s2) continue;
$i1 = array_search($s1, $alphabet);
//if ($i1===false) continue;
$i2 = array_search($s2, $alphabet);
//if ($i2===false) continue;
if ($i2==$i1) continue;
if ($i1 < $i2) return -1;
else return 1;
}
return 0;
}
This worked perfectly with the non-Unicode alphabet:
static $alphabet2 = array( 1 => '-' , 2 => ',' , 3 => '.' , 4 => "A", 5 => "j", 6 => "a", 7 => "w", 8 => "b", 9 => "p", 10 => "f", 11 => "m", 12 => "n", 13 => "r", 14 => "h", 15 => "H", 16 => "x", 17 => "X", 18 => "s", 19 => "S", 20 => "q", 21 => "k", 22 => "g", 23 => "t", 24 => "T", 25 => "d", 26 => "D", 27 => "=", 28 => "/", 29 => "(", 30 => ")", 31 => "[", 32 => "]", 33 => "<", 34 => ">", 35 => "{", 36 => "}", 37 => "'", 38 => "*", 39 => "#", 40 => "I", 41 => "1", 42 => "2", 43 => "3", 44 => "4", 45 => "5", 46 => "6", 47 => "7", 48 => "8", 49 => "9", 50 => "0", 51 => "&", 52 => "@", 53 => "%");
but once I replaced, for example, H (no. 15) in alphabet2 with ḥ in alphabet1, it didn't work anymore.
I suppose it has to do with how the Unicode characters are recognized: as long as the words contain no special signs, the order is correct, but all words containing special signs are put at the beginning of the result.
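One way to check this suspicion is to look at what `strlen()` and `$str[$i]` return for one of the keys. The sketch below is only a diagnostic, assuming PHP 7 or later with the mbstring extension enabled; `$word` is a sample key written with `\u{...}` escapes so that it does not depend on how the file is saved.

```php
<?php
// Sample key "nḥḥ" (ḥ is U+1E25), written with escapes to guarantee UTF-8 bytes.
$word = "n\u{1E25}\u{1E25}";

// strlen() counts bytes; mb_strlen() counts characters (code points).
$byteLength = strlen($word);
$charLength = mb_strlen($word, 'UTF-8');

// $word[1] is only the first byte of "ḥ", not the whole character.
$firstByteOfSecondChar = $word[1];
$secondChar = mb_substr($word, 1, 1, 'UTF-8');

// A lone byte is never found in an alphabet of whole characters.
$position = array_search($firstByteOfSecondChar, [15 => "\u{1E25}"]);

var_dump($byteLength, $charLength);                // byte count vs. character count
var_dump($firstByteOfSecondChar === $secondChar);  // false: a byte is not the character
var_dump($position);                               // false: lookup fails
```

If the lookup returns `false`, the comparison `$i1 < $i2` treats it as smaller than any real position, which would be consistent with words containing special signs ending up at the beginning.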
I tried looking into Unicode normalization, but I'm really only an amateur, so this is quite difficult.
Is that the problem, or is it something else, and how can I fix it?
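For comparison, here is a minimal sketch of a comparison that walks whole characters instead of bytes. It is not a definitive fix: `compare_by_alphabet_mb` is a hypothetical name, it assumes UTF-8 input, PHP 7.4 or later (for `mb_str_split()` and arrow functions) and the mbstring extension, and it makes the assumption that characters missing from the alphabet should sort last. The demo alphabet is a small subset in the same relative order as above.

```php
<?php
// Hypothetical multibyte-aware variant of compare_by_alphabet().
function compare_by_alphabet_mb(array $alphabet, string $str1, string $str2): int
{
    // Map each character to its position in the alphabet.
    $rank = array_flip($alphabet);

    // Split into whole characters instead of indexing single bytes.
    $chars1 = mb_str_split($str1, 1, 'UTF-8');
    $chars2 = mb_str_split($str2, 1, 'UTF-8');

    $shared = min(count($chars1), count($chars2));
    for ($i = 0; $i < $shared; $i++) {
        // Assumption: characters not in the alphabet sort after all known ones.
        $r1 = $rank[$chars1[$i]] ?? PHP_INT_MAX;
        $r2 = $rank[$chars2[$i]] ?? PHP_INT_MAX;
        if ($r1 !== $r2) {
            return $r1 <=> $r2;
        }
    }
    // All shared positions are equal: the shorter string sorts first.
    return count($chars1) <=> count($chars2);
}

// Demo alphabet: n, r, ḥ (U+1E25), ḏ (U+1E0F).
$alphabet = [1 => "n", 2 => "r", 3 => "\u{1E25}", 4 => "\u{1E0F}"];
$result = ["n\u{1E0F}" => 1, "n\u{1E25}\u{1E25}" => 2, "n" => 3, "nr" => 4];

uksort($result, fn($a, $b) => compare_by_alphabet_mb($alphabet, $a, $b));
print_r(array_keys($result)); // keys in the order: n, nr, nḥḥ, nḏ
```

This only works if the keys and the alphabet use the same form of each sign. If some keys store a sign as a base letter plus a combining mark, they would split into two characters, and that is the case where normalization (for example `Normalizer::normalize()` from the intl extension) would be relevant.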