I have a simple HTML table, and want to get all cell values even if it's HTML code inside.
Trying to use xml unmarshal, but didn't get the right struct tags, values or attributes.
import (
"fmt"
"encoding/xml"
)
type XMLTable struct {
XMLName xml.Name `xml:"TABLE"`
Row []struct{
Cell string `xml:"TD"`
}`xml:"TR"`
}
func main() {
raw_html_table := `
<TABLE><TR>
<TD>lalalal</TD>
<TD>papapap</TD>
<TD>fafafa</TD>
<TD>
<form action=\"/addedUrl/;jsessionid=KJHSDFKJLSDF293847odhf" method=POST>
<input type=hidden name=acT value=\"Dev\">
<input type=hidden name=acA value=\"Anyval\">
<input type=submit name=submit value=Stop>
</form>
</TD>
</TR>
</TABLE>`
table := XMLTable{}
fmt.Printf("%q
", []byte(raw_html_table)[:15])
err := xml.Unmarshal([]byte(raw_html_table), &table)
if err != nil {
fmt.Printf("error: %v", err)
}
}
As an additional info, I don't care about cell content if it's HTML code (take only []byte
/ string
values). So I may delete cell content before unmarshaling, but this way is also not so easy.
Any suggestions with standard golang libs would be welcome.