I have an issue with character sets and how they are encoded in a request I send. I have a test case where I want the code to end up with the exact same MD5 hash on both sides, while the character is still 'å' (not converted into some broken character or just '?').
The source input is UTF-8 and contains a Norwegian character, for example "båt".
This input will then be sent to an API that wants data to be Latin-1 / ISO-8859-1.
One goal is also to avoid having to add utf8_decode on the receiving end.
So this is the very simplified code of what I've been sending until now:
$password_send = 'båt';
echo "Test 1: " . md5($password_send) . "\n";
$params = array('password' => $password_send);
$request = xmlrpc_encode_request($module, $params);
And this is how the receiving end treats it. It basically just converts it into an md5 hash and sends it to another method. No other conversion of the incoming data has been done.
$_hash = md5($password_receive);
echo "Test 2: {$_hash}\n";
Member::updatePassword($member_id, $_hash);
When 'båt' is sent, I need $_hash to end up as 7e2cdd98fccee62723784a815a2ecdcb, since this is the MD5 hash that 'båt' resolves to when the password 'båt' is saved on the site itself (and not through the API).
When I send 'båt' in the API request, the receiving end instead ends up with fd9cac747daca144726dc579c32f48ae, which is wrong. When I check the md5() of 'båt' before I send it, it is also displayed as fd9cac747daca144726dc579c32f48ae.
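To make the difference between those two hashes concrete, here is a minimal sketch (not my real code) showing that md5() works on raw bytes, so the same visible string gives different hashes in UTF-8 and in ISO-8859-1. It assumes the mbstring extension is loaded and uses mb_convert_encoding() in place of utf8_decode(). The 'å' is written as explicit byte escapes so the result does not depend on how the file itself is saved.

```php
<?php
// 'båt' as UTF-8 bytes: 'å' is the two-byte sequence C3 A5.
$utf8 = "b\xC3\xA5t";

// The same text as ISO-8859-1: 'å' becomes the single byte E5.
// Assumption: mbstring is available; this stands in for utf8_decode().
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

echo bin2hex($utf8), "\n";   // 62c3a574
echo bin2hex($latin1), "\n"; // 62e574

// md5() hashes bytes, not characters, so these two hashes differ.
echo md5($utf8), "\n";
echo md5($latin1), "\n";
```

If I understand it correctly, the hash I need corresponds to the three-byte ISO-8859-1 form, and the one I currently get corresponds to the four-byte UTF-8 form.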
I guess this is expected, since I don't use utf8_decode yet. But if I change what I send, like so: $password_send = utf8_decode('båt');
then it still doesn't end up with the correct hash on the receiving end. Instead it ends up with: b865deb1e3b0891a41c5444c00893a0f
However, if I also add utf8_decode on the receiving end, like so: $_hash = md5(utf8_decode($password_receive)), then it ends up with the hash I need: 7e2cdd98fccee62723784a815a2ecdcb
But this seems very wrong: having to do utf8_decode on both sides. And while the hash is now correct on the receiving end too, the issue is that I don't want to change any code on the receiving end. It also doesn't work to just do utf8_decode twice before I send the value, because then I end up with the hash c2d1fbc45e123f65edd74401ef58dd6a on the receiving end (which is the equivalent of doing md5('b?t')). It only works when I do utf8_decode once before I send it, and once on the receiving end.
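The 'b?t' effect of decoding twice can be reproduced in isolation. This is only a sketch: it uses mb_convert_encoding() as a stand-in for utf8_decode() (an assumption, and mbstring must be loaded), and the exact replacement character may depend on the mbstring substitute-character setting.

```php
<?php
// 'båt' as UTF-8 bytes.
$utf8 = "b\xC3\xA5t";

// First conversion: valid UTF-8 in, ISO-8859-1 out (bytes 62 E5 74).
$once = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

// Second conversion: the input is now ISO-8859-1, but it is treated as UTF-8.
// A lone E5 byte is not valid UTF-8, so it cannot be mapped and the 'å' is lost.
$twice = mb_convert_encoding($once, 'ISO-8859-1', 'UTF-8');

echo bin2hex($once), "\n";
echo bin2hex($twice), "\n";
echo md5($once) === md5($twice) ? "same hash\n" : "different hash\n";
```

So a second decode on the sending side cannot replace the decode on the receiving side; it just destroys the character before the request is even built.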
So I started to suspect that xmlrpc_encode_request is the culprit, in that it may be doing some conversion of its own. First I checked what a var_dump of $request shows when the $password_send value has NOT been utf8_decoded. And that is:
<string>båt</string>
When I do utf8_decode on the value $password_send before it's made into an xmlrpc request, then it is:
<string>båt</string>
Then I read the documentation on xmlrpc_encode_request, and I've tried various combinations of output_options, but none of them seem to work. In every scenario I still have to do utf8_decode on the input data on the receiving end to end up with the exact MD5 hash that I need.
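For reference, this is the general shape of passing output_options. The option keys are the ones listed in the PHP documentation. The specific values and the method name 'member.updatePassword' are only illustrative assumptions, not a combination I am claiming works. The call is guarded because the xmlrpc extension is not bundled with newer PHP versions.

```php
<?php
// Illustrative values only; the keys follow the documented output_options.
$output_options = array(
    'output_type' => 'xml',
    'verbosity'   => 'pretty',
    'escaping'    => array('markup'), // request markup escaping only
    'version'     => 'xmlrpc',
    'encoding'    => 'iso-8859-1',
);

// 'båt' as ISO-8859-1 bytes, written with an escape so the file encoding does not matter.
$params = array('password' => "b\xE5t");

if (function_exists('xmlrpc_encode_request')) {
    // 'member.updatePassword' is a hypothetical method name.
    $request = xmlrpc_encode_request('member.updatePassword', $params, $output_options);
    var_dump($request);
} else {
    echo "xmlrpc extension not loaded\n";
}
```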
I realize this might be somewhat confusing. I would really appreciate it if someone could help me out here by giving me some pointers on what I should do or try, because I've gotten completely lost on this issue now :(