dongweihuan8610 2016-04-01 16:10

Normalizing text input to ASCII in Golang

I am building a small tool that parses a user's input, finds common writing pitfalls, and flags them so the user can improve their text. So far everything works well except for text that uses curly quotes instead of straight ASCII quotes. My current hack does a string replacement for each opening and closing curly quote, single and double, like so:

cleanedData := bytes.Replace([]byte(data), []byte("’"), []byte("'"), -1)

I feel like there must be a better way to handle this in the stdlib, one that would also let me convert other non-ASCII characters to an ASCII equivalent. Any help would be greatly appreciated.

1 answer

  • dongshuobei1037 2016-04-01 16:37

    The strings.Map function looks to me like what you want.

    I don't know of a generic 'ToAscii' type function, but Map has a nice approach for mapping runes to other runes.
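
    As a minimal sketch of how Map treats its mapping function: returning a negative value drops the rune entirely, so a crude "ASCII only" filter is possible. The names dropNonASCII and asciiOnly are hypothetical helpers made up for this illustration, not part of the stdlib.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// asciiOnly (hypothetical helper) keeps ASCII runes and asks
// strings.Map to drop everything else by returning a negative value.
func asciiOnly(r rune) rune {
	if r > unicode.MaxASCII {
		return -1
	}
	return r
}

// dropNonASCII (hypothetical helper) removes every non-ASCII rune from s.
func dropNonASCII(s string) string {
	return strings.Map(asciiOnly, s)
}

func main() {
	fmt.Println(dropNonASCII("naïve café")) // prints "nave caf"
}
```

    Dropping characters loses information, so this is probably only useful as a last step after the characters you care about have been mapped explicitly.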

    Example (updated):

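    package main

    import (
        "fmt"
        "strings"
    )

    func main() {
        data := "Hello “Frank” or ‹François› as you like to be ‘called’"
        fmt.Printf("Original: %s\n", data)
        cleanedData := strings.Map(normalize, data)
        fmt.Printf("Cleaned: %s\n", cleanedData)
    }

    // normalize maps curly and angle quotation marks to straight ASCII
    // quotes and returns every other rune unchanged.
    func normalize(in rune) rune {
        switch in {
        case '“', '‹', '”', '›':
            return '"'
        case '‘', '’':
            return '\''
        }
        return in
    }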
    

    Output:

    Original: Hello “Frank” or ‹François› as you like to be ‘called’
    Cleaned: Hello "Frank" or "François" as you like to be 'called'
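
    One limitation worth noting: Map replaces one rune with exactly one rune, so it cannot turn an ellipsis character into three dots. A possible alternative, sketched here with a deliberately incomplete table, is strings.NewReplacer, which takes old/new string pairs. The names asciiReplacer and toASCII are hypothetical and chosen only for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// asciiReplacer holds an illustrative, incomplete set of old/new pairs.
// Extend it with whatever characters show up in your input.
var asciiReplacer = strings.NewReplacer(
	"“", `"`, "”", `"`,
	"‘", "'", "’", "'",
	"…", "...",
)

// toASCII (hypothetical helper) applies every replacement in the table.
func toASCII(s string) string {
	return asciiReplacer.Replace(s)
}

func main() {
	fmt.Println(toASCII("“Wait…” she said, ‘fine’"))
}
```

    Characters that are not in the table, such as the ç in François, pass through unchanged in this sketch.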
    
    The asker selected this answer as the best answer.