dou4064 2014-08-22 05:54 Acceptance rate: 100%

FPDF Unicode support using UFPDF

I have been struggling with this for a long time, and I suspect other users have as well.

First, I have to say that I have no alternative to FPDF because I use a lot of other FPDF modules, so please do not recommend another library such as TCPDF.

I really need FPDF to handle UTF-8 characters in a stable way.

What I already found out:

There is an extension called UFPDF: http://acko.net/blog/ufpdf-unicode-utf-8-extension-for-fpdf/

The extension supports only TrueType fonts for now, but that should work for me. The .ttf file has to be converted with a tool called ttf2ufm, and the resulting .ufm and the source .ttf are then converted to font.php, font.z and font.ctg.z files using the provided tool makefontuni.php.

So far so good. I tried to convert the Arial fonts from my computer (arial.ttf, arialbd.ttf, arialbi.ttf, ariali.ttf).

It worked, and I was able to produce a test.pdf with Unicode characters. But Adobe Reader showed an error popup saying something like: Bad Parameter - the font ArialMT contains bad /Widths.

I noticed that all characters had the same width (I suspect the default width), so I tried to debug.

I found out that UFPDF adds the widths to the PDF like this:

charnumber [width] charnumber [width]

85 [276] (for the "u" character)

And I found out that some characters had a negative index value:

-70 [266]

The index values are created by ttf2ufm. Looking at the resulting arial.ufm, I found entries like this:

U -70 ; WX 450 ; N uni06BE ; G 1003 ; B -70 256 788 1136 ;

I suspected that U is the index in the UTF-8 table, so I modified makefontuni.php to ignore negative values for U. I created font.php, font.z and font.ctg.z again, and it worked: the error notice was no longer shown and the characters were displayed with the correct widths.
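
The filtering step described above could look roughly like the following. This is only a minimal sketch and not the actual makefontuni.php code: the function name parseUfmWidths is hypothetical, and it assumes the metric lines have the "U <index> ; WX <width> ; ..." shape of the entries quoted in this post.

```php
<?php
// Hypothetical helper, not part of makefontuni.php: read the character
// metric lines of a .ufm file and keep only non-negative U values.
function parseUfmWidths(array $lines): array
{
    $widths = [];
    foreach ($lines as $line) {
        // Matches lines such as "U 117 ; WX 611 ; N u ; G 88 ; B ... ;"
        if (!preg_match('/^U\s+(-?\d+)\s*;\s*WX\s+(-?\d+)\s*;/', trim($line), $m)) {
            continue; // not a character metric line
        }
        $index = (int) $m[1];
        if ($index < 0) {
            continue; // skip entries with a negative index
        }
        $widths[$index] = (int) $m[2];
    }
    return $widths;
}

// The two sample lines are the ones quoted in this post.
$sample = [
    'U 117 ; WX 611 ; N u ; G 88 ; B 141 -24 1107 1062 ;',
    'U -70 ; WX 450 ; N uni06BE ; G 1003 ; B -70 256 788 1136 ;',
];
print_r(parseUfmWidths($sample)); // only the entry for index 117 remains
```

Skipping the entry drops the width of that glyph entirely, so any text that uses it would fall back to whatever default width the library applies.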

So the first question is: why does ttf2ufm produce negative values for U? Is this correct? And if it is correct, why is Adobe Reader not able to handle it?

I hoped that was all, but it is not.

I did some more tests with the bold font, and the lowercase "u" character was shown as a strange sign when using Arial Bold.

I debugged again and found this line for the "u" character in arialbd.ufm:

U 117 ; WX 611 ; N u ; G 88 ; B 141 -24 1107 1062 ;

I searched for "U 117" in that file and found another character whose line begins with "U 117 ;". I have already removed it, so I cannot post the line here. However, this was the wrong character shown in the PDF, and after removing it the "u" was displayed correctly.

So the second question is: why does ttf2ufm produce a .ufm file with two characters that have the same index? This happens only for arialbd.ttf, not for arial.ttf.

I have solved it for now, hoping there are no other characters with duplicate indexes.
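
Rather than hoping, the .ufm file could be scanned for repeated U values. The sketch below is one possible way to do that. The function name findDuplicateUfmIndexes is hypothetical, and because the original duplicate line was deleted, the second "U 117" line in the sample (glyph name "glyphX") is invented purely for illustration, as is the "U 118" line.

```php
<?php
// Hypothetical helper: list every U value that appears on more than one
// metric line of a .ufm file, together with the glyph names that claim it.
function findDuplicateUfmIndexes(array $lines): array
{
    $namesByIndex = [];
    foreach ($lines as $line) {
        // Capture the U value and the glyph name that follows "N".
        if (!preg_match('/^U\s+(-?\d+)\s*;.*?\bN\s+(\S+)\s*;/', trim($line), $m)) {
            continue;
        }
        $namesByIndex[(int) $m[1]][] = $m[2];
    }
    // Keep only the indexes claimed by two or more glyphs.
    return array_filter($namesByIndex, fn(array $names) => count($names) > 1);
}

$sample = [
    'U 117 ; WX 611 ; N u ; G 88 ; B 141 -24 1107 1062 ;',
    'U 117 ; WX 500 ; N glyphX ; G 999 ; B 0 0 0 0 ;', // invented line
    'U 118 ; WX 556 ; N v ; G 89 ; B 0 0 0 0 ;',       // invented line
];
print_r(findDuplicateUfmIndexes($sample)); // reports index 117 with both glyph names
```

For a real file, the lines could be read with file('arialbd.ufm', FILE_IGNORE_NEW_LINES) and passed to the function.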

More issues:

I noticed that the resulting arial.php contains the character widths:

$cw=array(
    32=>278, 160=>278, 33=>278, 34=>355, 35=>556, 36=>556, 
    37=>889, 38=>667, 39=>191, 40=>333, 41=>333, 42=>389, 43=>584, 
    44=>278, 45=>333, 173=>333, [...]

The arial.php of the non-Unicode version contains the $cw array, too. But it uses the character itself as the key, not the index number:

$cw=array(  
    chr(0)=>750,chr(1)=>750,chr(2)=>750,chr(3)=>750,chr(4)=>750,
    chr(5)=>750,chr(6)=>750,chr(7)=>750,chr(8)=>750,chr(9)=>750,chr(10)=>750,
    chr(11)=>750,chr(12)=>750, [...]

fpdf.php sometimes accesses the $cw values, and some other modules do so too, in order to compute the width of a given string. All of this failed with UFPDF.
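
The mismatch between the two array layouts can be reproduced in isolation. In this small sketch the two tables and their width values are placeholders, not real font metrics:

```php
<?php
// Illustrative width tables; the numbers are placeholders, not real metrics.
$legacyCw  = ['u' => 556]; // classic FPDF layout: the key is the character
$unicodeCw = [117 => 556]; // UFPDF layout: the key is the index number

$char = 'u';
var_dump(isset($legacyCw[$char]));       // bool(true)
var_dump(isset($unicodeCw[$char]));      // bool(false): 'u' is not the integer 117
var_dump(isset($unicodeCw[ord($char)])); // bool(true), but ord() only looks at one byte
```

Code that indexes $cw with a character therefore finds nothing in the Unicode table, and ord() alone is only a fix for single-byte (ASCII) characters.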

I tried to fix it by modifying fpdf.php and all modules that access $cw, like this:

I created a method called charlength in the FPDF class:

function charlength($char) 
{
    $cw = &$this->CurrentFont['cw'];
    return $cw[$char];
}

And I made FPDF call charlength wherever it wants to access $this->CurrentFont['cw']:

function GetStringWidth($s)
{
    // Get width of a string in the current font
    $s = (string)$s;
    // $cw = &$this->CurrentFont['cw']; // Old FPDF-Code
    $w = 0;
    $l = strlen($s);
    for($i=0;$i<$l;$i++) {
        // $w += $cw[$s[$i]]; // Old FPDF-Code
        $w += $this->charlength($s[$i]); // My replacement
    }
    return $w*$this->FontSize/1000;
}

In ufpdf.php I override the charlength method like this:

function charlength($char) {
    $cw = &$this->CurrentFont['cw'];
    $offset = 0; // ordutf8 takes the offset by reference, so start at the first byte
    $utf8dec = $this->ordutf8($char, $offset);
    if(!isset($cw[$utf8dec])) {
        return 0;
    }
    return $cw[$utf8dec];
}


function ordutf8($string, &$offset) {
    $string = class_stringTools::utf8_decode($string);
    $code = ord(substr($string, $offset,1));
    if ($code >= 128) {        //otherwise 0xxxxxxx
        if ($code < 224) $bytesnumber = 2;                //110xxxxx
        else if ($code < 240) $bytesnumber = 3;        //1110xxxx
        else if ($code < 248) $bytesnumber = 4;    //11110xxx
        else return -1;
        $codetemp = $code - 192 - ($bytesnumber > 2 ? 32 : 0) - ($bytesnumber > 3 ? 16 : 0);
        for ($i = 2; $i <= $bytesnumber; $i++) {
            $offset ++;
            $code2 = ord(substr($string, $offset, 1)) - 128;        //10xxxxxx
            $codetemp = $codetemp*64 + $code2;
        }
        $code = $codetemp;
    }
    $offset += 1;
    if ($offset >= strlen($string)) $offset = -1;
    return $code;
}

The ordutf8 method is from php.net, but I had to modify it because I got strange values for $code. One time the value of $code was 252, which results in an undefined $bytesnumber.
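
Two things may contribute to the strange values, though both are guesses. First, GetStringWidth above walks the string byte by byte, so charlength receives single bytes rather than whole UTF-8 sequences. Second, 252 is the byte 0xFC, which is "ü" in ISO-8859-1, so if class_stringTools::utf8_decode behaves like PHP's utf8_decode, the string is no longer UTF-8 by the time it is parsed. The sketch below decodes the raw UTF-8 bytes directly, without any prior conversion, and sums the widths per index. The names utf8ToCodePoints and stringWidthUnits are hypothetical, the widths are placeholders, and continuation bytes are not validated.

```php
<?php
// Hypothetical helper: turn a UTF-8 string into a list of integer indexes.
function utf8ToCodePoints(string $s): array
{
    $codePoints = [];
    $len = strlen($s);
    for ($i = 0; $i < $len; ) {
        $byte = ord($s[$i]);
        if ($byte < 0x80) {                  // 0xxxxxxx
            $code = $byte;        $extra = 0;
        } elseif (($byte & 0xE0) === 0xC0) { // 110xxxxx
            $code = $byte & 0x1F; $extra = 1;
        } elseif (($byte & 0xF0) === 0xE0) { // 1110xxxx
            $code = $byte & 0x0F; $extra = 2;
        } elseif (($byte & 0xF8) === 0xF0) { // 11110xxx
            $code = $byte & 0x07; $extra = 3;
        } else {
            $i++;                            // invalid lead byte (e.g. 0xFC): skip it
            continue;
        }
        if ($i + $extra >= $len) {
            break;                           // truncated sequence at end of string
        }
        for ($j = 1; $j <= $extra; $j++) {   // 10xxxxxx continuation bytes
            $code = ($code << 6) | (ord($s[$i + $j]) & 0x3F);
        }
        $codePoints[] = $code;
        $i += $extra + 1;
    }
    return $codePoints;
}

// Hypothetical helper: sum the widths of a UTF-8 string from an
// integer-keyed $cw table; unknown indexes count as zero width.
function stringWidthUnits(string $s, array $cw): int
{
    $w = 0;
    foreach (utf8ToCodePoints($s) as $codePoint) {
        $w += $cw[$codePoint] ?? 0;
    }
    return $w;
}

$cw = [117 => 611, 252 => 611]; // placeholder widths
echo stringWidthUnits("u\u{00FC}", $cw), "\n";
```

Inside a GetStringWidth replacement the result would still need the same scaling as before, that is, multiplied by $this->FontSize and divided by 1000.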

It seems to work for now, but I am not very happy about editing the source of fpdf.php and of other modules. And I am surprised that nobody else reports the issues I struggled with.

I know I have written a lot, but I want to know whether anyone else has had the same issues. What do you think about the last modifications? Can you suggest any improvements? I really need a stable way to make FPDF support Unicode characters. Please help me.

It is a shame that the author of UFPDF has no time to support this.


