I have stored text in my database as UTF-8. The text may contain ASCII art using code points in the U+2500 to U+25FF range, and perhaps other ranges as well; I can't be sure, since the text is user-edited. The text is delivered to the client as JSON over REST, and somewhere along the way it gets distorted.
The characters I'm referring to are encoded as three bytes in UTF-8. For instance, the byte sequence 0xe2 0x96 0x93 is the UTF-8 encoding of U+2593, which should render as ▓. Instead of displaying this single character, the client displays each individual byte as its own character, which ends up as â–“
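To illustrate what I think is happening, here is a minimal JavaScript sketch. It is not my actual code: the `CP1252_HIGH` table and the `decodeCp1252` helper are made up for illustration, and it assumes the UTF-8 bytes are being read as Windows-1252 somewhere along the way. It decodes the bytes of ▓ (0xe2 0x96 0x93) and ░ (0xe2 0x96 0x91) one byte at a time, which gives â–“ and â–‘ respectively.

```javascript
// Windows-1252 code points for bytes 0x80..0x9F.
// Bytes that Windows-1252 leaves undefined fall back to the byte value itself.
const CP1252_HIGH = [
  0x20ac, 0x81, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021,
  0x02c6, 0x2030, 0x0160, 0x2039, 0x0152, 0x8d, 0x017d, 0x8f,
  0x90, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
  0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, 0x9d, 0x017e, 0x0178,
];

// Hypothetical helper: interpret every byte as one Windows-1252 character.
function decodeCp1252(bytes) {
  return Array.from(bytes, (b) =>
    String.fromCodePoint(b >= 0x80 && b <= 0x9f ? CP1252_HIGH[b - 0x80] : b)
  ).join('');
}

// UTF-8 bytes of U+2593 (dark shade): 0xe2 0x96 0x93
const utf8Bytes = new TextEncoder().encode('\u2593');

// Decoding the same bytes as UTF-8 gives back the single character.
const correct = new TextDecoder('utf-8').decode(utf8Bytes);

// Decoding them byte by byte gives three unrelated characters.
const garbled = decodeCp1252(utf8Bytes);

console.log(correct); // ▓
console.log(garbled); // â–“
console.log(decodeCp1252(new TextEncoder().encode('\u2591'))); // â–‘
```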
I'm not sure how to attack this. I've tried to find a way to convert the text server-side using PHP, but the UTF-8 table is just too large for me, and checking for every potential three-byte combination is overwhelming. This should be easy. I've seen it done before, but then the entire page was rendered server-side. This is an AngularJS page.
Can anybody give me pointers on how to solve this problem? Thank you.
Edit: stripped down code (php rendered)
<head>
<meta charset="utf-8">
...
</head>
<body>
...
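<?php
$link = new mysqli($mysql_host, $mysql_user, $mysql_pass, $mysql_db);
$result = $link->query('select descr from my_table where id = 1');
// query() returns a mysqli_result, so fetch the row before reading the column
$row = $result->fetch_assoc();
?>
<div class="well"><?= $row['descr'] ?></div>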
...
</body>
Would display:
░░ ░░
▒░ ░▒ ▓░ ░▓ ▓░ ░▓ █░ ░█ █ █ █▓ ▓█ ▓▓▌ ▐▓▓ ▀██▄ ▄██▀ ▄ ▄████▓▓▓█▄█▀▀ ▀▀ ▄ ▄▄▄▌▄▓████████████████████████▓▀▓████▄ ▄
The same database table queried using Laravel and sent over http async (also largely stripped down code):
class MyModel extends Eloquent {
public function getDescriptionAttribute($value) {
return $value;
}
}
class MyModelController extends Controller {
public function getModel ($modelId) {
return Response::json(MyModel::findOrFail($modelId));
}
}
angular.module(_SERVICES_).factory('MyModelService', ['Restangular', function (Restangular) {
'use strict';
return {
get: function (id) {
return Restangular.one('model', id).get();
}
};
}]);
angular.module(_CONTROLLERS_).controller('MyModelCtrl', ['$scope', '$routeParams', 'MyModelService',
function ($scope, $routeParams, MyModelService) {
MyModelService.get($routeParams.modelId).then(function (response) {
$scope.model = response;
});
}
]);
<head>
<meta charset="utf-8">
...
</head>
<body ng-controller="MyModelCtrl">
....
<div class="well" ng-bind-html="model.descr"></div>
...
</body>
Would display:
â–‘ â–‘ â–‘â–‘ â–‘â–‘ â–’â–‘ â–‘â–’ â–’â–‘ â–‘â–’ â–“â–‘ â–‘â–“ â–“â–‘ â–‘â–“ █░ â–‘â–ˆ â–ˆ â–ˆ █▓ â–“â–ˆ â–“â–“â–Œ â–â–“â–“ ▀██▄ ▄██▀ â–„ ▄████▓▓▓█▄█▀▀ ▀▀ â–„ ▄▄▄▌▄▓████████████████████████▓▀▓████▄ â–„
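One way I could narrow down where the distortion happens is to dump the code points of the string the client actually receives. This is a minimal sketch, assuming it would be called with `response.descr` inside the `.then` callback; `describeCodePoints` is a hypothetical helper, not part of Angular or Restangular.

```javascript
// Hypothetical debugging helper: list the Unicode code points of a string.
function describeCodePoints(text) {
  return Array.from(text, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

// A correctly decoded block character is a single code point.
console.log(describeCodePoints('\u2593')); // ['U+2593']

// The garbled form is three separate characters.
console.log(describeCodePoints('\u00e2\u2013\u201c')); // ['U+00E2', 'U+2013', 'U+201C']
```

If the received string already contains U+00E2 U+2013 U+201C, that would suggest the text was garbled before it reached Angular rather than during rendering.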
Edit2: Request/Response headers
PHP rendered page
Request Headers
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8,nb;q=0.6,nn;q=0.4,no;q=0.2
Cache-Control: no-cache
Connection: keep-alive
Host: example.com
Pragma: no-cache
Referer: http://example.com/mymodel.php?id=1
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36
Response Headers
Cache-control: private
Connection: Keep-Alive
Content-Encoding: gzip
Content-Type: text/html;charset=utf-8
Date: Tue, 29 Jul 2014 20:18:10 GMT
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Keep-Alive: timeout=5, max=100
Pragma: no-cache
Server: Apache/2.4.9 (Unix) OpenSSL/1.0.1e-fips mod_bwlimited/1.4
Transfer-Encoding: chunked
Vary: Accept-Encoding,User-Agent
X-Powered-By: PHP/5.4.28
Async request/response
Request Headers
Accept: application/json, text/plain, */*
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8,nb;q=0.6,nn;q=0.4,no;q=0.2
Cache-Control: no-cache
Connection: keep-alive
Host: example.com
Pragma: no-cache
Referer: http://www.example.com/modelview/1
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/36.0.1985.125 Safari/537.36
Response Headers
Cache-Control: max-age=0
Cache-Control: no-cache
Connection: Keep-Alive
Content-Encoding: gzip
Content-Length: 3669
Content-Type: application/json;charset=utf-8
Date: Wed, 30 Jul 2014 12:29:11 GMT
Expires: Wed, 30 Jul 2014 12:29:11 GMT
Keep-Alive: timeout=5, max=96
Server: Apache/2.4.9 (Unix) OpenSSL/1.0.1e-fips mod_bwlimited/1.4
Vary: Accept-Encoding,User-Agent
X-Frame-Options: SAMEORIGIN
X-Powered-By: PHP/5.4.28
X-UA-Compatible: IE=edge,chrome=1