Parsing a huge XML file with Go

We need to parse a huge XML file using Go. We'd like to use a SAX-like event based algorithm using xml.NewDecoder() and decoder.Token() library calls. We've created the appropriate struct types with XML annotations. Everything easy peasy so far.

Now, we go through the file and detect the xml.StartElement tokens. And here comes the problem. We need to decode ONLY the attributes of this start token and then continue into its content. If we call decoder.DecodeElement(), the whole content of the element is decoded (or, in our scenario, skipped).

How can we decode only the attributes of a specific StartElement and then continue into the element's body?

1 answer

duanrebo3559 2014-11-06 21:24
I parse wikipedia xml dumps (~50GB xml files) in go-wikiparse using plain struct/reflect decoding. It's super simple.

The strategy is basically this:

First, read the envelope token:

    d := xml.NewDecoder(r)
    _, err := d.Token()
    if err != nil {
        return nil, err
    }

e.g., for <someDocument><billions-of-other-things/></someDocument> that will give you someDocument.

Then, you can just struct decode the next things in a loop:

    var i item
    d.Decode(&i)

Not much RAM, and it's super easy to parse.
This answer was selected as the best answer by the asker.
