dongyu3659 2015-05-04 08:49
Accepted

Creating a zip archive with Unicode file names using Go's archive/zip

package main

import (
    "archive/zip"
    "fmt"
    "io"
    "os"
    "path/filepath"
    "strings"
)

func main() {
    var (
        Path = os.Args[1]
        Name = os.Args[2]
    )

    File, _ := os.Create(Name)
    PS := strings.Split(Path, "\\")
    PathName := strings.Join(PS[:len(PS)-1], "\\")
    os.Chdir(PathName)
    Path = PS[len(PS)-1]
    defer File.Close()
    Zip := zip.NewWriter(File)
    defer Zip.Close()
    walk := func(Path string, info os.FileInfo, err error) error {
        if err != nil {
            fmt.Println(err)
            return err
        }
        if info.IsDir() {
            return nil
        }
        Src, _ := os.Open(Path)
        defer Src.Close()
        fmt.Println(Path)
        FileName, _ := Zip.Create(Path)
        io.Copy(FileName, Src)
        Zip.Flush()
        return nil
    }
    if err := filepath.Walk(Path, walk); err != nil {
        fmt.Println(err)
    }
}

This is my directory layout:

-----root
    |---2015-05(dir)
         |---中文.go
    |---package(dir)
    |---你好.go

When I zip this directory with the code above, the Chinese file names come out garbled. How can I fix this?


2 answers

  • doumanshan6314 2015-05-04 11:02

    The problem is that, by default, the Zip specification only supports a limited character set for entry names: IBM Code Page 437, of which only the ASCII range is reliably portable. More specifically (source: APPENDIX D of the specification):

    APPENDIX D.1 The ZIP format has historically supported only the original IBM PC character encoding set, commonly referred to as IBM Code Page 437. This limits storing file name characters to only those within the original MS-DOS range of values and does not properly support file names in other character encodings, or languages. To address this limitation, this specification will support the following change.

    Support for Unicode names was added later. It is signalled by a special bit, general purpose bit 11, also called the Language encoding flag (EFS):

    Section 4.4.4 - General purpose bit flag - Bit 11 - Language encoding flag (EFS). If this bit is set, the filename and comment fields for this file MUST be encoded using UTF-8.

    APPENDIX D.2 If general purpose bit 11 is unset, the file name and comment should conform to the original ZIP character encoding. If general purpose bit 11 is set, the filename and comment must support The Unicode Standard, Version 4.1.0 or greater using the character encoding form defined by the UTF-8 storage specification. The Unicode Standard is published by the The Unicode Consortium (www.unicode.org). UTF-8 encoded data stored within ZIP files is expected to not include a byte order mark (BOM).

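    The general purpose bit flag is present and supported by Go: it is the Flags field of the FileHeader struct. Unfortunately, Go has no method for setting this bit, and by default it is 0. (This describes the Go releases current when this answer was written; since Go 1.10 the zip Writer sets the bit automatically for valid UTF-8 names that contain non-ASCII characters, unless FileHeader.NonUTF8 is set.)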

    So the easiest way to add support for Unicode names is to simply set bit 11 to one. Instead of

    FileName, _ := Zip.Create(Path)
    

    Start your zip entry with:

    // 0x800 is general purpose bit 11: the entry name is UTF-8 encoded.
    h := &zip.FileHeader{Name: Path, Method: zip.Deflate, Flags: 0x800}
    FileName, _ := Zip.CreateHeader(h)
    

    The first line creates a FileHeader whose Flags field is set to 0x800 (bit 11). This tells readers that the file name is encoded in UTF-8, which matches what Go writes: a Go string built from source literals or typical file system names is UTF-8, and its bytes are written to the archive unchanged.

    Note:

    Doing this preserves the UTF-8 file names in the archive, but not every zip reader or extractor honors the flag. For example, the built-in zip handler of Windows Explorer does not decode the names as UTF-8, whereas a more capable zip tool (e.g. SecureZip) recognizes the UTF-8 file names and extracts them correctly.
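Whether a particular extractor honors the flag has to be tested with that tool, but it is easy to confirm what was actually written. The sketch below builds a small in-memory archive and reads it back with zip.NewReader, reporting bit 11 for each entry. `buildSample` and `utf8Flags` are hypothetical names used only for this illustration.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"log"
)

// utf8Flag is general purpose bit 11 (the language encoding flag, EFS).
const utf8Flag = 0x800

// buildSample writes a two-entry archive into memory: an ASCII name created
// with the defaults, and a Chinese name with bit 11 set explicitly.
func buildSample() ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	if _, err := zw.Create("hello.txt"); err != nil {
		return nil, err
	}
	h := &zip.FileHeader{Name: "你好.go", Method: zip.Deflate, Flags: utf8Flag}
	if _, err := zw.CreateHeader(h); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// utf8Flags maps each entry name in the archive to whether bit 11 is set.
func utf8Flags(data []byte) (map[string]bool, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, err
	}
	flags := make(map[string]bool)
	for _, f := range zr.File {
		flags[f.Name] = f.Flags&utf8Flag != 0
	}
	return flags, nil
}

func main() {
	data, err := buildSample()
	if err != nil {
		log.Fatal(err)
	}
	flags, err := utf8Flags(data)
	if err != nil {
		log.Fatal(err)
	}
	for name, set := range flags {
		fmt.Printf("%s: bit 11 set = %v\n", name, set)
	}
}
```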

    This answer was selected by the asker as the best answer.