doushi2845 2019-04-28 11:21

Can a PDF file define 0 pages, or otherwise result in 0 as a page size?

I have a PHP script that uses Imagick, but it risks a NaN error if a PDF file provided by a user contains no pages, or contains a page with zero height or zero width. I am not sure whether this is possible within the PDF structure. Creating a JPEG from a page number larger than the total page count also causes an error. Is it generally possible for a valid PDF file wrapper to be sent without any actual page content?

The core question: how can we count and measure pages so that errors are caught properly before starting the conversion from PDF to JPEG?

In the function below I assume that a height or width of 0 might be possible, and guard against it with if($imH==0){$imH=1;}, but code based on an assumption doesn't feel right.

Parts of the function were adapted from a gist by umidjons: https://gist.github.com/umidjons/11037635

PHP code:

function genPdfThumbnail ( $src, $targ, $size=256, $page=1 ){

    if(file_exists($src) && !is_dir($src)): // source path must be available and cannot be a directory

        if(mime_content_type($src) != 'application/pdf'){return FALSE;} // source is not a pdf file returns a failure

        $sepa   =   '/'; // using '/' as path separation for nfs on linux.
        $targ   =   dirname($src).$sepa.$targ;
        $size   =   intval($size); // only use as integer, default is 256
        $page   =   intval($page); // only use as integer, default is 1     
        $page--; // default page 1, must be treated as 0 hereafter
        if ($page<0){$page=0;} // we cannot have negative values

        $img    =   new Imagick($src."[$page]");
        $imH    =   $img->getImageHeight();
        $imW    =   $img->getImageWidth();

        if ($imH==0) {$imH=1;} // if the pdf page has no height use 1 instead
        if ($imW==0) {$imW=1;} // if the pdf page has no width use 1 instead
        $sizR   =   round($size*(min($imW,$imH)/max($imW,$imH))); // relative pixels of the shorter side

        $img    ->  setImageColorspace(255); // prevent image colors from inverting
        $img    ->  setImageBackgroundColor('white'); // set background color before flatten
        $img    =   $img->flattenImages(); // prevent black zones on transparency in pdf
        $img    ->  setimageformat('jpeg');

        if ($imH == $imW){$img->thumbnailimage($size,$size);} // square page 
        if ($imH < $imW) {$img->thumbnailimage($size,$sizR);} // landscape page orientation
        if ($imH > $imW) {$img->thumbnailimage($sizR,$size);} // portrait page orientation      
        if(!is_dir(dirname($targ))){mkdir(dirname($targ),0777,true);} // if not there make target directory

        $img    ->  writeimage($targ);
        $img    ->  clear();
        $img    ->  destroy();

        if(file_exists( $targ )){ return $targ; } // return the path to the new file for further processing

    endif;
    return FALSE; // source file not available or Imagick didn't create jpeg file, returns a failure

}

Call the function like this, for example:

$newthumb = genPdfThumbnail('/nfs/vsp/server/u/user/public_html/any.pdf','thumbs/any.p01.jpg',150,'01');

2 answers

  • duanlu1959 2019-04-28 12:17

    Sure, a PDF file is a container format that can contain pretty much anything, including (only) metadata with 0 pages. Even setting that aside, this code makes it quite possible to request a thumbnail for page 21 of a document that only contains 5 pages.

    If that happens, the problem will occur on this line:

    $img    =   new Imagick($src."[$page]");
    

    This will throw an exception if the provided page does not exist. You can catch that exception and handle it however you want:

    try {
        $img = new Imagick($src."[$page]");
    } catch (ImagickException $error) {
        return false;
    }
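
    The zero-height or zero-width case from the question can be handled in the same fail-early style instead of substituting 1. Below is a minimal sketch, assuming a hypothetical helper named thumbnailBox() (it is not part of Imagick) that receives the values returned by getImageWidth() and getImageHeight() and returns null when either one is not positive:

    ```php
    <?php
    /**
     * Hypothetical helper (not part of Imagick): compute the thumbnail box
     * for a page so that the longer side equals $size.
     *
     * Returns [width, height], or null when any dimension is not positive.
     */
    function thumbnailBox(int $width, int $height, int $size): ?array
    {
        // Reject unusable input instead of silently replacing 0 with 1.
        if ($width <= 0 || $height <= 0 || $size <= 0) {
            return null;
        }

        // Shorter side scaled by the aspect ratio, never below 1 pixel.
        $short = max(1, (int) round($size * min($width, $height) / max($width, $height)));

        return $width >= $height
            ? [$size, $short]   // landscape or square: width is the longer side
            : [$short, $size];  // portrait: height is the longer side
    }
    ```

    Inside genPdfThumbnail() the result could be used as $box = thumbnailBox($imW, $imH, $size); followed by return FALSE; when $box is null, and $img->thumbnailImage($box[0], $box[1]); otherwise. That would also replace the three separate orientation checks.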
    

    If you want to read the number of pages beforehand, you can try to let Imagick parse the document first:

    $pdf = new Imagick($src);
    $pages = $pdf->getNumberImages();
    

    The function name is a bit misleading, see this comment in the PHP manual:

    "For PDFs this function indicates the number of pages on the PDF, NOT images that might be embedded within the PDF."

    Here as well, if the PDF document is invalid in some way, this can throw an exception so you might want to catch that and handle it:

    try {
        $pdf = new Imagick($src);
        $pages = $pdf->getNumberImages();
    } catch (ImagickException $error) {
        return false;
    }

    // $page is zero-based at this point, so valid indexes run from 0 to $pages - 1
    if ($page >= $pages) {
        return false;
    }
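
    Putting the two checks together, here is a rough sketch of how the page count and the requested page could be validated before any conversion starts. It assumes Imagick::pingImage() is acceptable for counting pages (it asks for image attributes rather than full pixel data; how much work that saves for PDFs depends on the underlying delegate and is not measured here). The names countPdfPages() and resolvePageIndex() are hypothetical helpers, not Imagick functions:

    ```php
    <?php
    /**
     * Hypothetical helper: count the pages of a PDF via Imagick::pingImage().
     * Returns null when Imagick is unavailable or the file cannot be read.
     */
    function countPdfPages(string $src): ?int
    {
        if (!class_exists('Imagick') || !is_file($src)) {
            return null;
        }

        try {
            $pdf = new Imagick();
            $pdf->pingImage($src);              // query attributes, not pixel data
            $pages = $pdf->getNumberImages();   // for PDFs: the number of pages
            $pdf->clear();

            return $pages;
        } catch (ImagickException $error) {
            return null;                        // unreadable or invalid document
        }
    }

    /**
     * Hypothetical helper: turn a 1-based page number into a zero-based index.
     * Returns null when the document has no such page, which includes
     * documents with 0 pages.
     */
    function resolvePageIndex(int $requestedPage, int $pageCount): ?int
    {
        if ($pageCount < 1 || $requestedPage < 1 || $requestedPage > $pageCount) {
            return null;
        }

        return $requestedPage - 1;
    }
    ```

    Note that this sketch rejects page numbers below 1 rather than clamping them to the first page as the function in the question does; which behaviour is preferable depends on the caller.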
    
    The asker selected this answer as the best answer.
