dongyu3659 2015-05-04 08:49
Views: 41
Accepted

Creating a zip archive with Unicode filenames using Go's archive/zip

package main

import (
    "archive/zip"
    "fmt"
    "io"
    "os"
    "path/filepath"
    "strings"
)

func main() {
    var (
        Path = os.Args[1]
        Name = os.Args[2]
    )

    File, _ := os.Create(Name)
    PS := strings.Split(Path, "\\")
    PathName := strings.Join(PS[:len(PS)-1], "\\")
    os.Chdir(PathName)
    Path = PS[len(PS)-1]
    defer File.Close()
    Zip := zip.NewWriter(File)
    defer Zip.Close()
    walk := func(Path string, info os.FileInfo, err error) error {
        if err != nil {
            fmt.Println(err)
            return err
        }
        if info.IsDir() {
            return nil
        }
        Src, _ := os.Open(Path)
        defer Src.Close()
        fmt.Println(Path)
        FileName, _ := Zip.Create(Path)
        io.Copy(FileName, Src)
        Zip.Flush()
        return nil
    }
    if err := filepath.Walk(Path, walk); err != nil {
        fmt.Println(err)
    }
}

This is my directory layout:

-----root
    |---2015-05(dir)
         |---中文.go
    |---package(dir)
    |---你好.go

When I zip this directory with the code above, the Chinese file names in the archive come out garbled. How can I fix this?

2 answers

  • doumanshan6314 2015-05-04 11:02

    The problem is that, by default, the ZIP specification expects entry names to be encoded in IBM Code Page 437, not in Unicode. From APPENDIX D of the specification:

    APPENDIX D.1 The ZIP format has historically supported only the original IBM PC character encoding set, commonly referred to as IBM Code Page 437. This limits storing file name characters to only those within the original MS-DOS range of values and does not properly support file names in other character encodings, or languages. To address this limitation, this specification will support the following change.

    Support for Unicode names was added later. An entry signals it with a special bit, general purpose bit 11, also called the Language encoding flag (EFS):

    Section 4.4.4 - General purpose bit flag - Bit 11 - Language encoding flag (EFS). If this bit is set, the filename and comment fields for this file MUST be encoded using UTF-8.

    APPENDIX D.2 If general purpose bit 11 is unset, the file name and comment should conform to the original ZIP character encoding. If general purpose bit 11 is set, the filename and comment must support The Unicode Standard, Version 4.1.0 or greater using the character encoding form defined by the UTF-8 storage specification. The Unicode Standard is published by the The Unicode Consortium (www.unicode.org). UTF-8 encoded data stored within ZIP files is expected to not include a byte order mark (BOM).
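To make the bit arithmetic concrete, here is a minimal sketch of what "bit 11" means as a number. The names utf8Flag, hasUTF8Flag and withUTF8Flag are hypothetical helpers made up for illustration; they are not part of archive/zip.

```go
package main

import "fmt"

// utf8Flag is general purpose bit 11, the Language encoding flag (EFS).
// The name is an assumption for this sketch, not a library constant.
const utf8Flag uint16 = 1 << 11 // 0x800

// hasUTF8Flag reports whether bit 11 is set in a general purpose bit flag value.
func hasUTF8Flag(flags uint16) bool { return flags&utf8Flag != 0 }

// withUTF8Flag returns flags with bit 11 set and every other bit unchanged.
func withUTF8Flag(flags uint16) uint16 { return flags | utf8Flag }

func main() {
	fmt.Printf("%#x\n", utf8Flag)             // 0x800
	fmt.Println(hasUTF8Flag(0x0008))          // false: only bit 3 is set
	fmt.Printf("%#x\n", withUTF8Flag(0x0008)) // 0x808
}
```

Using a bitwise OR rather than a plain assignment keeps any other flag bits that are already set.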

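    Go exposes the general purpose bit flag as the Flags field of the FileHeader struct. Zip.Create gives you no way to set this bit, and when this answer was written it defaulted to 0. (Since Go 1.10 the writer sets bit 11 automatically when the name and comment are valid UTF-8 and at least one of them needs it, unless FileHeader.NonUTF8 is true.)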

    So the easiest way to add support for Unicode names is to simply set bit 11 to one. Instead of

    FileName, _ := Zip.Create(Path)
    

    Start your zip entry with:

    h := &zip.FileHeader{Name: Path, Method: zip.Deflate, Flags: 0x800}
    FileName, _ := Zip.CreateHeader(h)
    

    The first line creates a FileHeader whose Flags field is 0x800 (bit 11), which declares that the file name is encoded as UTF-8. That matches what ends up in the archive: Go writes a string to an io.Writer byte for byte, and Go strings holding file names are normally UTF-8 already.
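As a quick way to check the effect, the sketch below writes one entry with a Chinese name into an in-memory archive, reads it back, and reports the name and whether bit 11 is set. The helper names buildArchive and firstEntry and the sample entry name are assumptions for illustration; only the archive/zip calls come from the standard library.

```go
package main

import (
	"archive/zip"
	"bytes"
	"errors"
	"fmt"
	"log"
)

// buildArchive returns an in-memory zip with a single entry whose
// header declares a UTF-8 name by setting bit 11 (0x800).
func buildArchive(name string, content []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	h := &zip.FileHeader{Name: name, Method: zip.Deflate, Flags: 0x800}
	w, err := zw.CreateHeader(h)
	if err != nil {
		return nil, err
	}
	if _, err := w.Write(content); err != nil {
		return nil, err
	}
	// Close writes the central directory; the archive is invalid without it.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// firstEntry reads the archive back and returns the first entry's
// name and whether its header has bit 11 set.
func firstEntry(data []byte) (string, bool, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return "", false, err
	}
	if len(zr.File) == 0 {
		return "", false, errors.New("archive has no entries")
	}
	f := zr.File[0]
	return f.Name, f.Flags&0x800 != 0, nil
}

func main() {
	data, err := buildArchive("2015-05/中文.go", []byte("package main\n"))
	if err != nil {
		log.Fatal(err)
	}
	name, flagged, err := firstEntry(data)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(name, flagged)
}
```

The entry name uses forward slashes, which is what the ZIP format expects regardless of the host operating system.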

    Note:

    This preserves UTF-8 file names, but not every zip reader or extractor honors the flag. For example, the built-in Windows file handler, Windows Explorer, will not decode the names as UTF-8, whereas a more capable zip tool (e.g. SecureZip) will recognize the UTF-8 file names and extract them properly.
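Putting the pieces together, here is one possible sketch of the question's directory walk with the flag applied and errors checked. It is not the asker's exact program: zipDir and demo are hypothetical names, entry names are made relative to the root directory rather than including it, and filepath.ToSlash is used so the names carry forward slashes even on Windows. The sample tree reuses the file names from the question.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
	"log"
	"os"
	"path/filepath"
)

// zipDir writes every non-directory file under root into a zip stream on w.
// Entry names are relative to root, use forward slashes, and are flagged
// as UTF-8 through general purpose bit 11 (0x800).
func zipDir(root string, w io.Writer) error {
	zw := zip.NewWriter(w)
	err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		if info.IsDir() {
			return nil
		}
		rel, err := filepath.Rel(root, path)
		if err != nil {
			return err
		}
		h := &zip.FileHeader{Name: filepath.ToSlash(rel), Method: zip.Deflate, Flags: 0x800}
		dst, err := zw.CreateHeader(h)
		if err != nil {
			return err
		}
		src, err := os.Open(path)
		if err != nil {
			return err
		}
		defer src.Close() // runs when this callback returns, once per file
		_, err = io.Copy(dst, src)
		return err
	})
	if err != nil {
		zw.Close()
		return err
	}
	return zw.Close()
}

// demo builds a throwaway tree, zips it in memory, and returns the
// entry names read back from the archive.
func demo() ([]string, error) {
	root, err := os.MkdirTemp("", "zipdemo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	if err := os.MkdirAll(filepath.Join(root, "2015-05"), 0o755); err != nil {
		return nil, err
	}
	for _, p := range []string{"2015-05/中文.go", "你好.go"} {
		full := filepath.Join(root, filepath.FromSlash(p))
		if err := os.WriteFile(full, []byte("package main\n"), 0o644); err != nil {
			return nil, err
		}
	}
	var buf bytes.Buffer
	if err := zipDir(root, &buf); err != nil {
		return nil, err
	}
	zr, err := zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
	if err != nil {
		return nil, err
	}
	var names []string
	for _, f := range zr.File {
		names = append(names, f.Name)
	}
	return names, nil
}

func main() {
	names, err := demo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(names)
}
```

Whether a given extractor then shows the names correctly still depends on that tool honoring bit 11, as noted above.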
