douqian4411 2014-10-22 07:05

How to get the page number from the examples (using PHP)

I have file names in several different formats.

How can I get 123.pdf, 124.pdf and 125.pdf from them? The length of the file names can vary; 14-5678 is not relevant here and should be ignored.

  • 14-5678_jobname_0123_.p1.PDF
  • 14-5678_jobname_0123_.p2.PDF
  • 14-5678_jobname_0125_.p1.PDF
  • Weired_filename_0123_bla_14-5678_jobname.p1.PDF
  • Weired_filename_0123_bla_14-5678_jobname.p2.PDF
  • Weired_filename_0125_bla_14-5678_jobname.p1.PDF
  • 14-5678_jobname_0123.p1.PDF
  • 14-5678_jobname_0123.p2.PDF
  • 14-5678_jobname_0125.p1.PDF
  • 0123_14-5678_jobname.p1.PDF
  • 0123_14-5678_jobname.p2.PDF
  • 0125_14-5678_jobname.p1.PDF
  • jobname_0123_14-5678.p1.PDF
  • jobname_0123_14-5678.p2.PDF
  • jobname_0125_14-5678.p1.PDF

I tried for hours with regex testers and am now totally stuck. I would love some PHP code that can do this job.

1 answer

  • dongpang1898 2014-10-22 07:40

    You need to match a run of four digits that is not preceded by a dash:

    /[^-](\d{4})/
    

    Decomposing the regex:

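    • [^-]: any single character that is not a dash
    • \d{4}: four digits
    • (\d{4}): capture the digits

    Because [^-] has to consume a character, this short form does not match when the four digits start the file name (as in 0123_14-5678_jobname.p1.PDF); the patterns in the examples below begin with .*?[^-]* instead, which also handles that case.

    You can then append .pdf to the captured digits to get your file name.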
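
As an alternative way to express "four digits not preceded by a dash", here is a minimal sketch using a negative lookbehind, which does not need to consume a character before the digits. The function name extractBaseNumber and the extra "no adjacent digit" checks are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper, not from the original answer.
function extractBaseNumber(string $name): ?string
{
    // (?<![-\d]) : the run must not follow a dash or another digit
    // (\d{4})    : capture exactly four digits
    // (?!\d)     : the run must not be followed by a fifth digit
    if (preg_match('/(?<![-\d])(\d{4})(?!\d)/', $name, $m) === 1) {
        return $m[1];
    }
    return null; // no suitable four-digit run found
}

// Two of the file names from the question, including one that starts with the digits.
foreach (['14-5678_jobname_0123_.p1.PDF', '0125_14-5678_jobname.p1.PDF'] as $f) {
    echo $f . ' => ' . extractBaseNumber($f) . '.pdf' . PHP_EOL;
}
```

With this pattern the 5678 in 14-5678 is skipped because it directly follows a dash, while a four-digit run at the very start of the name still matches.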

    Example with preg_replace and the file names you've given above in an array:

    // $files is assumed to be an array holding the file names from the question
    foreach ($files as $f) {
        echo "$f => " . preg_replace('/.*?[^-]*(\d{4}).+/', '$1.pdf', $f) . PHP_EOL;
    }
    

    ETA: if you want to factor in the page number, you could use this code:

    foreach ($files as $f) {
        // capture the four digits of the PDF name and the number in p1/p2;
        // skip names that do not match
        if (!preg_match('/.*?[^-]*(\d{4}).*?p(\d+)\.pdf/i', $f, $matches)) {
            continue;
        }
        $number = (int) $matches[1];
        $page = (int) $matches[2];
        // if the page number (from p1/p2) is greater than 1, add the offset to the PDF name number
        if ($page > 1) {
            $number += $page - 1;
        }
        // format the PDF name to be four digits long, zero-padded for shorter numbers
        echo "$f => " . sprintf('%04d.pdf', $number) . PHP_EOL;
    }
    

    Output:

    14-5678_jobname_0123_.p1.PDF => 0123.pdf
    14-5678_jobname_0123_.p2.PDF => 0124.pdf
    14-5678_jobname_0125_.p1.PDF => 0125.pdf
    Weired_filename_0123_bla_14-5678_jobname.p1.PDF => 0123.pdf
    Weired_filename_0123_bla_14-5678_jobname.p2.PDF => 0124.pdf
    Weired_filename_0125_bla_14-5678_jobname.p1.PDF => 0125.pdf
    14-5678_jobname_0123.p1.PDF => 0123.pdf
    14-5678_jobname_0123.p2.PDF => 0124.pdf
    14-5678_jobname_0125.p1.PDF => 0125.pdf
    0123_14-5678_jobname.p1.PDF => 0123.pdf
    0123_14-5678_jobname.p2.PDF => 0124.pdf
    0125_14-5678_jobname.p1.PDF => 0125.pdf
    jobname_0123_14-5678.p1.PDF => 0123.pdf
    jobname_0123_14-5678.p2.PDF => 0124.pdf
    jobname_0125_14-5678.p1.PDF => 0125.pdf
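
If the mapping is needed in more than one place, the page-offset logic above could be wrapped in a function so that individual names can be checked against the listing. This is only a sketch: the name targetPdfName and the choice to return null for non-matching names are assumptions, while the pattern is the one used in the answer.

```php
<?php
// Hypothetical wrapper around the answer's pattern; returns null when the name does not match.
function targetPdfName(string $name): ?string
{
    if (!preg_match('/.*?[^-]*(\d{4}).*?p(\d+)\.pdf/i', $name, $m)) {
        return null;
    }
    $base = (int) $m[1];                 // e.g. "0123" becomes 123
    $offset = max(0, (int) $m[2] - 1);   // p1 adds 0, p2 adds 1, and so on
    return sprintf('%04d.pdf', $base + $offset);
}

echo targetPdfName('14-5678_jobname_0123_.p2.PDF') . PHP_EOL;
```

Returning null instead of printing lets the caller decide how to handle file names that do not follow any of the expected formats.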
    
    This answer was selected by the asker as the best answer.
