douju5062
2015-01-19 09:01

Using Go regular expressions to get xlsx cell data?

Accepted

I am using a regular expression to get data out of an .xlsx file, but I am new to regexp and not good at it. Could anyone help me?

package main

import (
        "fmt"
        "regexp"
)

func main() {
        input := `
        <sheetData>
        <row r="2" spans="1:15">
        <c r="A2" s="5" ><v>{{range .txt}}</v></c>
        <c r="B2" s="5" t="s"><v>1</v></c>
        <c r="C2" s="5" t="s"><v>2</v></c>
        <c r="D2" s="5" t="s"><v>3</v></c>
        <c r="E2" s="5" />
        <c r="K2" s="6" t="s"><v>21</v></c>
    </row> 
    <row r="3" spans="1:15">
        <c r="A3" s="5" t="s"><v>0</v></c>
        <c r="B3" s="5" t="s"><v>1</v></c>
        <c r="C3" s="5" t="s"><v>2</v></c>
        <c r="D3" s="5" t="s"><v>3</v></c>
        <c r="E3" s="5" />
        <c r="K3" s="6" t="s"><v>21</v></c>
    </row> 
    </sheetData>`
        // (?s) lets "." match newlines, so a single match can span a whole <row> element.
        r := regexp.MustCompile(`(?s)<row[^>]*?r="(\d+)"[^>]*>.*?</row>`)
        r2 := regexp.MustCompile(`<v>(.*?)</v>`)
        for _, row := range r.FindAllString(input, -1) {
                for i, m := range r2.FindAllString(row, -1) {
                        // Each match still includes the surrounding <v>...</v> tags.
                        fmt.Println(i, m)
                }
        }
}

Question:

  1. How do I get the string {{range .txt}} and strip off the surrounding <v>...</v> tags?

  2. How do I get the "3" from r="3" on a row, and the "A3, B3, C3..." from the r attribute of each <c> element?

Thanks in advance!


1 answer

  • duanmingting9544 6 years ago

    I think regexp is the wrong tool for this job. Try xml:

    import "encoding/xml"
    
    // Could probably pick better names for these.
    type C struct {
        XMLName xml.Name `xml:"c"`
        V       string   `xml:"v"`
        R       string   `xml:"r,attr"`
    }
    type Row struct {
        XMLName xml.Name `xml:"row"`
        C       []C      `xml:"c"`
    }
    type Result struct {
        XMLName xml.Name `xml:"sheetData"`
        Row     []Row    `xml:"row"`
    }
    v := Result{}
    
    err := xml.Unmarshal([]byte(input), &v)
    if err != nil {
        fmt.Printf("error: %v", err)
        return
    }
    for _, r := range v.Row {
        for _, c := range r.C {
            fmt.Printf("%v %v\n", c.V, c.R)
        }
    }
    

    This will print:

    {{range .txt}} A2
    1 B2
    2 C2
    3 D2
    ...
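
The second part of the question (the "3" from r="3" on each row) can probably be handled the same way, by mapping the row's r attribute as well. Below is a minimal, self-contained sketch under that assumption. The names Cell, Row, SheetData, parseSheet and the shortened sample input are hypothetical and chosen only for illustration; the only library call is xml.Unmarshal from encoding/xml.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Cell mirrors a <c> element; Ref holds its r attribute (e.g. "A3").
type Cell struct {
	Ref   string `xml:"r,attr"`
	Value string `xml:"v"`
}

// Row mirrors a <row> element; Num holds its r attribute (e.g. "3").
type Row struct {
	Num   string `xml:"r,attr"`
	Cells []Cell `xml:"c"`
}

// SheetData mirrors the <sheetData> root element.
type SheetData struct {
	Rows []Row `xml:"row"`
}

// parseSheet (hypothetical helper) decodes a <sheetData> fragment into rows.
func parseSheet(input string) ([]Row, error) {
	var sd SheetData
	if err := xml.Unmarshal([]byte(input), &sd); err != nil {
		return nil, err
	}
	return sd.Rows, nil
}

// sample is a shortened copy of the input from the question.
const sample = `<sheetData>
<row r="2" spans="1:15">
<c r="A2" s="5" ><v>{{range .txt}}</v></c>
<c r="E2" s="5" />
</row>
<row r="3" spans="1:15">
<c r="A3" s="5" t="s"><v>0</v></c>
<c r="K3" s="6" t="s"><v>21</v></c>
</row>
</sheetData>`

func main() {
	rows, err := parseSheet(sample)
	if err != nil {
		fmt.Printf("error: %v\n", err)
		return
	}
	for _, r := range rows {
		for _, c := range r.Cells {
			// A cell without a <v> child (such as E2) has an empty Value.
			fmt.Printf("row %s, cell %s: %q\n", r.Num, c.Ref, c.Value)
		}
	}
}
```

Cells without a <v> child decode with an empty Value, so they do not need special handling as they would with a regular expression.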
    