doubu4826 2013-11-14 21:29
Accepted

Which is better: sha1_file(f) or sha1(file_get_contents(f))?

I want to hash files that are at least 5 MB and can grow to 1-2 GB. I am torn between these two methods, although they produce exactly the same result.

Method 1: sha1_file($file)
Method 2: sha1(file_get_contents($file))

I tested with a 10 MB file and there was not much difference in performance. Which is the better way to go at larger data sizes?


1 answer

  • dsgft1486 2013-11-14 21:38

    Use the highest-level form offered unless there is a compelling reason to do otherwise.

    In this case, the correct choice is sha1_file, because it is a higher-level function that only works with files. This 'restriction' allows it to take advantage of the fact that the file/source can be processed as a stream [1]: only a small part of the file is ever read into memory at a time.

    The second approach guarantees that 5 MB-2 GB of memory (the size of the file) is consumed, because file_get_contents reads the entire file into memory before the hash is generated. As file sizes increase and/or system resources become limited, this can have a very detrimental effect on performance.
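The memory difference can be observed directly. The following is a minimal sketch, not a benchmark: it assumes a throwaway 8 MiB temp file (the size and the `sha1demo` prefix are illustrative, not from the question) and uses memory_get_peak_usage() to compare the two methods. Exact numbers will vary with PHP version and configuration.

```php
<?php
// Assumption: an 8 MiB throwaway file stands in for the real 5 MB-2 GB input.
$size = 8 * 1024 * 1024;
$file = tempnam(sys_get_temp_dir(), 'sha1demo');

// Write the file in 64 KiB chunks so that creating it does not inflate peak memory.
$fh = fopen($file, 'wb');
$chunk = str_repeat('a', 64 * 1024);
for ($written = 0; $written < $size; $written += strlen($chunk)) {
    fwrite($fh, $chunk);
}
fclose($fh);

$peakBefore = memory_get_peak_usage();

// Method 1: streamed; only a small buffer is held in memory at a time.
$hashStreamed = sha1_file($file);
$peakStreamed = memory_get_peak_usage();

// Method 2: the whole file is loaded into a PHP string before hashing.
$hashLoaded = sha1(file_get_contents($file));
$peakLoaded = memory_get_peak_usage();

unlink($file);

printf("same hash: %s\n", $hashStreamed === $hashLoaded ? 'yes' : 'no');
printf("peak growth, sha1_file:               %d bytes\n", $peakStreamed - $peakBefore);
printf("peak growth, sha1(file_get_contents): %d bytes\n", $peakLoaded - $peakStreamed);
```

The second figure should be roughly the size of the file, while the first should stay close to zero. With a 1-2 GB input, Method 2 may also fail outright if the file exceeds the configured memory_limit.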


    [1] The source for sha1_file can be found on GitHub. Here is an extract showing only the lines relevant to stream processing:

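    PHP_FUNCTION(sha1_file)
    {
        /* ... lines not related to stream processing omitted ... */

        /* Open the file as a binary stream. */
        stream = php_stream_open_wrapper(arg, "rb", REPORT_ERRORS, NULL);

        PHP_SHA1Init(&context);
        /* Read one small buffer at a time and feed it to the hash. */
        while ((n = php_stream_read(stream, buf, sizeof(buf))) > 0) {
            PHP_SHA1Update(&context, buf, n);
        }
        PHP_SHA1Final(digest, &context);

        php_stream_close(stream);
    }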
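For illustration, the same open, read, update, finalize loop can be written in userland PHP with the incremental hashing functions hash_init(), hash_update(), and hash_final(). This is only a sketch of the pattern, not a replacement for sha1_file; the function name `sha1_stream_sketch` and the 8 KiB default chunk size are assumptions made for the example.

```php
<?php
/**
 * Hypothetical helper (not part of PHP): hashes a file chunk by chunk,
 * mirroring the loop in the C extract. Returns null on failure.
 */
function sha1_stream_sketch(string $path, int $chunkSize = 8192): ?string
{
    $stream = @fopen($path, 'rb');
    if ($stream === false) {
        return null; // could not open the file
    }

    $context = hash_init('sha1');
    while (!feof($stream)) {
        // Only one chunk is held in memory at a time.
        $chunk = fread($stream, $chunkSize);
        if ($chunk === false) {
            fclose($stream);
            return null; // read error
        }
        hash_update($context, $chunk);
    }
    fclose($stream);

    return hash_final($context); // lowercase hex digest, 40 characters
}

// Usage: compare against the built-in on a small temp file.
$file = tempnam(sys_get_temp_dir(), 'sha1demo');
file_put_contents($file, str_repeat("hello\n", 5000));
var_dump(sha1_stream_sketch($file) === sha1_file($file)); // bool(true)
unlink($file);
```

In practice sha1_file($file) already does this in C, so a hand-written loop like this is mainly useful for understanding why its memory use stays flat regardless of file size.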
    

    By using higher-level functions, the onus of providing a suitable implementation is placed on the developers of the library. In this case, that allowed a streaming implementation that scales with file size.
