douyun3022 2017-03-21 03:34

How to get the content added to a file since its last modification

I'm working on a Go project that needs to index recently added file content (using a framework called bleve), and I'm looking for a way to get the content added to a file since its last modification. My current workaround is to record the last indexed position of each file; on the next indexing run I read each file only from its recorded position onward.

So I wonder whether there is a library or built-in functionality for this. (It doesn't need to be restricted to Go; any language would work.)

I'd also really appreciate it if anyone has a better idea than my workaround!

Thanks


2 answers

  • dongzhe3171 2017-03-21 06:17

    It depends on how the files change.

    If the files are append-only, then you only need to record the last offset where you stopped indexing, and start from there.
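    The append-only case could be sketched roughly as follows. This is a minimal illustration, not a bleve API: `readAppended`, `appendString`, and `demo` are hypothetical names, and it assumes the stored offset is persisted somewhere between runs. It also assumes that a file shorter than the stored offset was truncated or rotated and should be re-read from the start.

    ```go
    package main

    import (
    	"fmt"
    	"io"
    	"os"
    	"path/filepath"
    )

    // readAppended (hypothetical helper) returns the bytes added to path after
    // offset, plus the new offset to store for the next indexing run.
    func readAppended(path string, offset int64) ([]byte, int64, error) {
    	f, err := os.Open(path)
    	if err != nil {
    		return nil, offset, err
    	}
    	defer f.Close()

    	info, err := f.Stat()
    	if err != nil {
    		return nil, offset, err
    	}
    	// Assumption: a file shorter than the stored offset was truncated or
    	// replaced, so it is no longer append-only; start again from zero.
    	if info.Size() < offset {
    		offset = 0
    	}
    	if _, err := f.Seek(offset, io.SeekStart); err != nil {
    		return nil, offset, err
    	}
    	data, err := io.ReadAll(f)
    	if err != nil {
    		return nil, offset, err
    	}
    	return data, offset + int64(len(data)), nil
    }

    // appendString (demo helper) appends s to path, creating the file if needed.
    func appendString(path, s string) error {
    	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
    	if err != nil {
    		return err
    	}
    	defer f.Close()
    	_, err = f.WriteString(s)
    	return err
    }

    // demo simulates two indexing passes over a growing temporary file and
    // returns what each pass would hand to the indexer.
    func demo() (first, second string, err error) {
    	dir, err := os.MkdirTemp("", "offset-demo")
    	if err != nil {
    		return "", "", err
    	}
    	defer os.RemoveAll(dir)
    	path := filepath.Join(dir, "app.log")

    	var offset int64 // a real indexer would persist this per file
    	if err = appendString(path, "line 1\n"); err != nil {
    		return "", "", err
    	}
    	data, offset, err := readAppended(path, offset)
    	if err != nil {
    		return "", "", err
    	}
    	first = string(data)

    	if err = appendString(path, "line 2\n"); err != nil {
    		return "", "", err
    	}
    	data, offset, err = readAppended(path, offset)
    	if err != nil {
    		return "", "", err
    	}
    	return first, string(data), nil
    }

    func main() {
    	first, second, err := demo()
    	if err != nil {
    		fmt.Fprintln(os.Stderr, err)
    		os.Exit(1)
    	}
    	fmt.Printf("first pass: %q\nsecond pass: %q\n", first, second)
    }
    ```

    The second pass sees only the bytes written after the first pass, which is the part that would be sent to the indexer.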

    If the changes can happen anywhere, and they mostly replace old bytes with new bytes (like changing the pixels of an image), then you could compute a checksum for each small chunk and index only the chunks whose checksums have changed.

    You can check out the crypto packages in the Go standard library (for example crypto/sha256) for computing hashes.
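    As a rough illustration of the chunk-checksum idea, the sketch below hashes fixed-size chunks with crypto/sha256 and reports which chunk indexes differ between two versions. The function names `chunkSums` and `changedChunks` and the 4-byte chunk size are assumptions made for the example; a real indexer would likely use much larger chunks and store the previous sums on disk.

    ```go
    package main

    import (
    	"crypto/sha256"
    	"fmt"
    )

    // chunkSums (hypothetical helper) splits data into chunks of at most size
    // bytes and returns the SHA-256 sum of each chunk.
    func chunkSums(data []byte, size int) [][sha256.Size]byte {
    	if size <= 0 {
    		return nil // guard against an endless loop on a bad chunk size
    	}
    	var sums [][sha256.Size]byte
    	for start := 0; start < len(data); start += size {
    		end := start + size
    		if end > len(data) {
    			end = len(data)
    		}
    		sums = append(sums, sha256.Sum256(data[start:end]))
    	}
    	return sums
    }

    // changedChunks returns the indexes of chunks in cur that are new or whose
    // checksum differs from the chunk at the same index in prev.
    func changedChunks(prev, cur [][sha256.Size]byte) []int {
    	var changed []int
    	for i, sum := range cur {
    		if i >= len(prev) || prev[i] != sum {
    			changed = append(changed, i)
    		}
    	}
    	return changed
    }

    func main() {
    	prev := chunkSums([]byte("aaaabbbbcccc"), 4)
    	cur := chunkSums([]byte("aaaaXbbbcccc"), 4) // one byte replaced in chunk 1
    	fmt.Println(changedChunks(prev, cur))       // prints [1]
    }
    ```

    Fixed-size chunks only help when bytes are replaced in place: an insertion shifts every later chunk boundary, so all following chunks would appear changed.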

    If the changes are line insertions/deletions in text files (like changes to source code), then a diff algorithm may help you find the differences, for example https://github.com/octavore/delta.
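    To show what a line diff would give the indexer, here is a small self-contained sketch based on the longest common subsequence of lines. It is a stand-in written for illustration and does not use or reproduce the API of the linked library; `addedLines` is a hypothetical name.

    ```go
    package main

    import "fmt"

    // addedLines (hypothetical helper) returns the lines of cur that are not
    // part of a longest common subsequence with prev, i.e. the inserted lines.
    func addedLines(prev, cur []string) []string {
    	// lcs[i][j] holds the LCS length of prev[i:] and cur[j:].
    	lcs := make([][]int, len(prev)+1)
    	for i := range lcs {
    		lcs[i] = make([]int, len(cur)+1)
    	}
    	for i := len(prev) - 1; i >= 0; i-- {
    		for j := len(cur) - 1; j >= 0; j-- {
    			switch {
    			case prev[i] == cur[j]:
    				lcs[i][j] = lcs[i+1][j+1] + 1
    			case lcs[i+1][j] >= lcs[i][j+1]:
    				lcs[i][j] = lcs[i+1][j]
    			default:
    				lcs[i][j] = lcs[i][j+1]
    			}
    		}
    	}

    	// Walk both versions, collecting lines that exist only in cur.
    	var added []string
    	i, j := 0, 0
    	for i < len(prev) && j < len(cur) {
    		switch {
    		case prev[i] == cur[j]:
    			i++ // unchanged line
    			j++
    		case lcs[i+1][j] >= lcs[i][j+1]:
    			i++ // line deleted from prev
    		default:
    			added = append(added, cur[j]) // line inserted in cur
    			j++
    		}
    	}
    	return append(added, cur[j:]...) // trailing lines are all new
    }

    func main() {
    	prev := []string{"a", "b", "c"}
    	cur := []string{"a", "x", "b", "c", "y"}
    	fmt.Println(addedLines(prev, cur)) // prints [x y]
    }
    ```

    The table uses memory proportional to the product of the two line counts, so for large files a dedicated diff library is probably the better choice.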

    This answer was selected by the asker as the best answer.