doukaojie8573 2018-09-27 03:41

How do I extract the complete HTML (including tags) from XML?

I have the following code:

package main

import (
    "encoding/xml"
    "fmt"
)

func main() {
    xr := &xmlResponse{}

    if err := xml.Unmarshal([]byte(x), xr); err != nil { // xr is already a pointer
        panic(err)
    }

    fmt.Printf("%+v", xr)
}

type xmlResponse struct {
    //Title string `xml:"title,omitempty"`
    Title struct {
        BoldWords []struct {
            Bold string `xml:",chardata"`
        } `xml:"bold,omitempty"`
        Title string `xml:",chardata" `
    } `xml:"title,omitempty"`
}

var x = `<?xml version="1.0" encoding="utf-8"?>
<mytag version="1.0">
  <title><bold>Go</bold> is a programming language. I repeat: <bold>Go</bold> is a programming language.</title>
</mytag>`

This outputs:

&{Title:{BoldWords:[{Bold:Go} {Bold:Go}] Title: is a programming language. I repeat:  is a programming language.}}

How do I get:

<bold>Go</bold> is a programming language. I repeat: <bold>Go</bold> is a programming language.

In other words, I need the tags themselves, kept in their original positions, rather than just a slice of the bolded items. Unmarshaling the title as a plain string (e.g. uncommenting the first "Title" field in the xmlResponse struct) drops the bolded items entirely.

1 answer

  • dswsl2016 2018-09-27 03:48

    From the encoding/xml docs for Unmarshal:

    If the XML element contains character data, that data is
    accumulated in the first struct field that has tag ",chardata".
    The struct field may have type []byte or string. If there is no
    such field, the character data is discarded.

    That is not what you want here, because ",chardata" collects only the text between the child elements. What you are looking for is:

    If the struct has a field of type []byte or string with tag
    ",innerxml", Unmarshal accumulates the raw XML nested inside the
    element in that field. The rest of the rules still apply.

    So use ",innerxml" instead of ",chardata".

    package main
    
    import (
        "encoding/xml"
        "fmt"
    )
    
    func main() {
        xr := &xmlResponse{}
    
        if err := xml.Unmarshal([]byte(x), xr); err != nil { // xr is already a pointer
            panic(err)
        }
    
        fmt.Printf("%+v", xr)
    }
    
    type xmlResponse struct {
        //Title string `xml:"title,omitempty"`
        Title struct {
            Title string `xml:",innerxml" `
        } `xml:"title,omitempty"`
    }
    
    var x = `<?xml version="1.0" encoding="utf-8"?>
    <mytag version="1.0">
      <title><bold>Go</bold> is a programming language. I repeat: <bold>Go</bold> is a programming language.</title>
    </mytag>`
    

    Outputs:

    &{Title:{Title:<bold>Go</bold> is a programming language. I repeat: <bold>Go</bold> is a programming language.}}
    


    This answer was accepted by the asker as the best answer.