dqxmf02844 2016-04-10 10:26
Views: 199
Accepted

How to convert escape characters in HTML tags?

How can we directly convert "\u003chtml\u003e" to "<html>"? Conversion of "<html>" to "\u003chtml\u003e" is quite easy using json.Marshal(), but json.Unmarshal() is quite lengthy and cumbersome. Is there any direct way to do that in golang?

3 answers

  • dongpan1416 2016-04-10 11:34

    You can use strconv.Unquote() to do the conversion.

    One thing to be aware of: strconv.Unquote() only accepts input that is itself quoted, that is, input that starts and ends with a double quote " or a backtick `. Backtick-quoted input is treated as a raw string, so its escapes are not decoded. We therefore have to wrap the text in double quotes ourselves.

    Example:

    // Important to use backtick ` (raw string literal)
    // else the compiler will unquote it (interpreted string literal)!
    
    s := `\u003chtml\u003e`
    fmt.Println(s)
    s2, err := strconv.Unquote(`"` + s + `"`)
    if err != nil {
        panic(err)
    }
    fmt.Println(s2)
    

    Output:

    \u003chtml\u003e
    <html>
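
    The manual quoting above fails if the input itself contains a double quote character. One possible workaround is to decode the text one character at a time with strconv.UnquoteChar(), passing 0 as the quote argument so that bare quote characters are allowed. This is only a sketch; decodeEscapes is a hypothetical helper name, not a standard library function.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode/utf8"
)

// decodeEscapes (hypothetical helper) decodes Go-style escape sequences
// such as \u003c in s. Unlike strconv.Unquote, it needs no surrounding
// quotes and tolerates bare " and ' characters in the input.
func decodeEscapes(s string) (string, error) {
	var b strings.Builder
	for len(s) > 0 {
		// quote == 0: bare quote characters are allowed, \" and \' are not.
		r, multibyte, tail, err := strconv.UnquoteChar(s, 0)
		if err != nil {
			return "", err
		}
		if r < utf8.RuneSelf || !multibyte {
			b.WriteByte(byte(r)) // single byte (ASCII, or a \x.. / octal escape)
		} else {
			b.WriteRune(r) // encode the code point as UTF-8
		}
		s = tail
	}
	return b.String(), nil
}

func main() {
	out, err := decodeEscapes(`say "\u003chtml\u003e"`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

    Here the program prints say "<html>", whereas wrapping the same input in double quotes and calling strconv.Unquote() returns a syntax error.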
    

    Note: To do HTML text escaping and unescaping, you can use the html package. Quoting its doc:

    Package html provides functions for escaping and unescaping HTML text.

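    But the html package (specifically html.UnescapeString()) does not decode Unicode escape sequences of the form \uxxxx. It only decodes HTML entities: named ones such as &lt; and numeric character references of the form &#decimal; or &#xHH;.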

    Example:

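    fmt.Println(html.UnescapeString(`\u003chtml\u003e`)) // unchanged: \u escapes are not decoded
    fmt.Println(html.UnescapeString(`&#60;html&#62;`))   // decoded: decimal character references
    fmt.Println(html.UnescapeString(`&#x3c;html&#x3e;`)) // decoded: hexadecimal character references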
    

    Output:

    \u003chtml\u003e
    <html>
    <html>
    

    Note #2:

    Also note that if you write code like this:

    s := "\u003chtml\u003e"
    

    The compiler itself decodes the escapes, because this is an interpreted string literal. The variable already holds <html>, so there is nothing left to unquote and you can't use it to test the conversion. To get the escaped text into the source, use either a raw string literal (backticks) or an interpreted string literal with doubled backslashes:

    s := "\u003chtml\u003e" // Interpreted string literal (unquoted by the compiler!)
    fmt.Println(s)
    
    s2 := `\u003chtml\u003e` // Raw string literal (no unquoting will take place)
    fmt.Println(s2)
    
    s3 := "\\u003chtml\\u003e" // Interpreted string literal with escaped backslashes
                               // (the compiler turns each \\ into a single \)
    fmt.Println(s3)
    

    Output:

    <html>
    \u003chtml\u003e
    \u003chtml\u003e
    