How do I parse a file of arbitrary length?

I have a text file that I'd like to parse with records like this:

===================
name: John Doe
Education: High School Diploma
Education: Bachelor's Degree
Education: Sun Java Certified Programmer
Age: 29
===================
name: Bob Bear
Education: High School Diploma
Age: 18
===================
name: Jane Doe
Education: High School Diploma
Education: Bachelor's Degree
Education: Master's Degree
Education: AWS Certified Solution Architect Professional
Age: 25

As you can see, the set of fields in such a text file is fixed, but some of them repeat an arbitrary number of times. The records are separated by a fixed-length ==== delimiter.

How would I write parsing logic for this sort of problem? I am thinking of using a switch on the start of each line as it is read, but the logic to handle the repeating fields baffles me.

du229908 2018-08-03 18:52
A good way to approach this sort of problem is to "divide and conquer". That is, divide the overall problem into smaller sub-problems that are easier to manage, and then solve each of them individually. If you've planned properly, then once you've finished each of the sub-problems you will have solved the whole problem.

Start by thinking about modeling. The document appears to contain a list of records. What should those records be called? What named fields should the records contain, and what types should they have? How would you represent them idiomatically in Go? For example, you might decide to call each record a Person with fields like these:

    type Person struct {
        Name        string
        Credentials []string
        Age         int
    }
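The repeating Education field that the question asks about maps onto the Credentials slice: each occurrence is simply appended. Below is a minimal sketch of that one sub-problem, assuming every data line has the form "Key: value". The helper name applyField is hypothetical and not part of the original answer, and strings.Cut requires Go 1.18 or later.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Person mirrors the struct suggested in the answer.
type Person struct {
	Name        string
	Credentials []string
	Age         int
}

// applyField is a hypothetical helper: it parses one "Key: value" line
// and updates p. A repeated Education line is appended to the slice, so
// any number of repetitions is handled without special-case logic.
func applyField(p *Person, line string) error {
	key, value, ok := strings.Cut(line, ":")
	if !ok {
		return fmt.Errorf("malformed line %q", line)
	}
	value = strings.TrimSpace(value)

	// Keys are compared case-insensitively because the sample data
	// mixes "name" with "Education" and "Age".
	switch strings.ToLower(strings.TrimSpace(key)) {
	case "name":
		p.Name = value
	case "education":
		p.Credentials = append(p.Credentials, value)
	case "age":
		age, err := strconv.Atoi(value)
		if err != nil {
			return fmt.Errorf("invalid age %q: %w", value, err)
		}
		p.Age = age
	default:
		return fmt.Errorf("unknown field %q", key)
	}
	return nil
}
```

Calling applyField once per data line of a record, with the same *Person each time, fills in the whole record; a switch on the key is enough, as the question suspected.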

Next, think about what the interface (signature) of your parse function should look like. Should it emit an array of people? Should it use a visitor pattern and emit a person as soon as it's parsed? What constraints should drive the answer? Are memory or compute time constraints important? Does the user of the parser want any control over the parsing work such as canceling? Do they need metadata such as the total number of records contained in the document? Will the input always be from a file or a string, maybe from an HTTP request or a network socket? How will these choices drive your design?

    func ParsePeople(string) ([]Person, error)                   // ?
    func ParsePeople(io.Reader) ([]Person, error)                // ?
    func ParsePeople(io.Reader, func(Person) bool) error         // ? (second argument is a visitor)

Finally you can implement your parser to fulfill the interface that you've decided on. A straightforward approach here would be to read the input file line-by-line and take an action according to the contents of the line. For example (in pseudocode):

    forEach line = inputFile.line
      if line is a separator
        emit or store the last parsed person, if present
        create a new person to store parsed fields
      else if line is a data field
        parse the data
        update the person with the parsed data
      end
    end
    return the parsed records or final record, if emitting

Each line of pseudocode above represents a sub-problem that should be easier to solve than the whole.
This answer was selected by the asker as the best answer.
