Parsing a huge XML file with Go

Asked 2014-11-05 11:41

We need to parse a huge XML file using Go. We'd like to use a SAX-like, event-based approach built on the xml.NewDecoder() and decoder.Token() library calls. We've created the appropriate struct types with XML annotations. Everything easy peasy so far.

Now we walk through the file and detect the xml.StartElement tokens. And here comes the problem: we need to decode ONLY the attributes of this start token and then continue into its content. If we call decoder.DecodeElement(), the whole content of the element is consumed (decoded, or skipped in our scenario).

How can we decode only the attributes of a specific StartElement and then continue into the element's body?

duanrebo3559 2014-11-06 21:24
I parse Wikipedia XML dumps (~50GB XML files) in go-wikiparse using plain struct/reflect decoding. It's super simple.

The strategy is basically this:

First, read the envelope token:

    d := xml.NewDecoder(r)
    _, err := d.Token()
    if err != nil {
        return nil, err
    }

For example, for <someDocument><billions-of-other-things/></someDocument> that will give you the someDocument start element.

Then you can decode each following element into a struct, one at a time, in a loop:

    var i item
    d.Decode(&i)

It doesn't need much RAM, since only one item is held in memory at a time, and it's super easy to parse.
Accepted by the asker as the best answer.