I need to read Unicode files that may or may not contain a byte-order mark. I could of course check the first few bytes of the file myself, and discard a BOM if I find one. But before I do, is there any standard way of doing this, either in the core libraries or a third party?
- duan0531 2014-01-27 07:45
No standard way, IIRC (and the standard library would really be a wrong layer to implement such a check in) so here are two examples of how you could deal with it yourself.
One is to use a buffered reader above your data stream:
    package main

    import (
        "bufio"
        "log"
        "os"
    )

    func main() {
        fd, err := os.Open("filename")
        if err != nil {
            log.Fatal(err)
        }
        defer fd.Close()

        br := bufio.NewReader(fd)
        r, _, err := br.ReadRune()
        if err != nil {
            log.Fatal(err)
        }
        if r != '\uFEFF' {
            br.UnreadRune() // Not a BOM -- put the rune back
        }
        // Now work with br as you would do with fd
        // ...
    }
Another approach, which works with objects implementing the io.Seeker interface, is to read the first three bytes and, if they're not a BOM, call Seek() to go back to the beginning, like in:

    package main

    import (
        "io"
        "log"
        "os"
    )

    func main() {
        fd, err := os.Open("filename")
        if err != nil {
            log.Fatal(err)
        }
        defer fd.Close()

        var bom [3]byte
        _, err = io.ReadFull(fd, bom[:])
        if err != nil {
            log.Fatal(err)
        }
        if bom[0] != 0xef || bom[1] != 0xbb || bom[2] != 0xbf {
            _, err = fd.Seek(0, io.SeekStart) // Not a BOM -- seek back to the beginning
            if err != nil {
                log.Fatal(err)
            }
        }
        // The next read operation on fd will read real data
        // ...
    }
This is possible since instances of *os.File (what os.Open() returns) support seeking and hence implement io.Seeker. Note that that's not the case for, say, the Body reader of HTTP responses, since you can't "rewind" it. bufio.Reader works around this limitation of non-seekable streams by performing some buffering (obviously); that's what allows you to UnreadRune() on it.
Note that both examples assume the file we're dealing with is encoded in UTF-8. If you need to deal with other (or unknown) encodings, things get more complicated.
This answer was selected by the asker as the best answer.