dongweihuan8610 2016-04-01 16:10

Normalizing text input to ASCII in Golang

I am building a small tool that parses a user's input, finds common writing pitfalls, and flags them so the user can improve their text. So far everything works well except for text that uses curly quotes instead of plain ASCII straight quotes. My current hack is a separate string replacement for each of the opening and closing single and double curly quotes, like so:

cleanedData := bytes.Replace([]byte(data), []byte("’"), []byte("'"), -1)

I feel there must be a better way to handle this in the stdlib, one that would also let me convert other non-ASCII characters to an ASCII equivalent. Any help would be greatly appreciated.


1 answer

  • dongshuobei1037 2016-04-01 16:37

    The strings.Map function looks to me like what you want.

    I don't know of a generic ToASCII-style function, but strings.Map offers a nice way to map runes to other runes.

    Example (updated):

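    package main

    import (
        "fmt"
        "strings"
    )

    func main() {
        data := "Hello “Frank” or ‹François› as you like to be ‘called’"
        fmt.Printf("Original: %s\n", data)
        cleanedData := strings.Map(normalize, data)
        fmt.Printf("Cleaned: %s\n", cleanedData)
    }

    // normalize maps curly and angle quotation marks to their ASCII
    // equivalents and returns every other rune unchanged.
    func normalize(in rune) rune {
        switch in {
        case '“', '‹', '”', '›':
            return '"'
        case '‘', '’':
            return '\''
        }
        return in
    }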
    

    Output:

    Original: Hello “Frank” or ‹François› as you like to be ‘called’
    Cleaned: Hello "Frank" or "François" as you like to be 'called'
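
    If more characters need handling, the same strings.Map idea can be driven by a lookup table. The following is only a sketch under stated assumptions: asciiTable and toASCII are hypothetical names, the table is deliberately incomplete, and dropping every unmapped non-ASCII rune is just one possible policy. It relies on the documented behavior that strings.Map removes a rune when the mapping function returns a negative value.

    ```go
    package main

    import (
        "fmt"
        "strings"
        "unicode"
    )

    // asciiTable is a hypothetical, incomplete lookup table; extend it
    // with whatever characters your input actually contains.
    var asciiTable = map[rune]rune{
        '\u201c': '"',  // left double quotation mark
        '\u201d': '"',  // right double quotation mark
        '\u2039': '"',  // single left-pointing angle quotation mark
        '\u203a': '"',  // single right-pointing angle quotation mark
        '\u2018': '\'', // left single quotation mark
        '\u2019': '\'', // right single quotation mark
        '\u2013': '-',  // en dash
        '\u2014': '-',  // em dash
        '\u00a0': ' ',  // no-break space
    }

    // toASCII keeps ASCII runes, replaces runes found in asciiTable,
    // and drops every other rune.
    func toASCII(s string) string {
        return strings.Map(func(r rune) rune {
            if r <= unicode.MaxASCII {
                return r
            }
            if repl, ok := asciiTable[r]; ok {
                return repl
            }
            return -1 // a negative value tells strings.Map to drop the rune
        }, s)
    }

    func main() {
        fmt.Println(toASCII("\u201cFrank\u201d \u2013 \u2018called\u2019"))
    }
    ```

    Note that strings.Map can only replace one rune with one rune, so replacements that expand (for example an ellipsis character to three dots) would need something like strings.NewReplacer instead. Dropping unmapped runes also loses letters such as the ç in François, so you may prefer to return the rune unchanged, as normalize does above.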
    
    This answer was accepted by the asker.
