一、背景
pdf文档中有一个表格(请见附一),已获得百度文档识别返回的结果(请见附二,节选红框所示的三行)。
二、目标
将返回的结果整理成python_dataframe表格
三、请帮助
如何使用python实现?谢谢
附一:原始表格
附二:初步识别代码(json)
a = [
{
"words_location": {"top": 382, "left": 124, "width": 52, "height": 12},
"word": "豪华客房",
},
{
"words_location": {"top": 383, "left": 280, "width": 59, "height": 11},
"word": "14501530",
},
{
"words_location": {"top": 383, "left": 425, "width": 23, "height": 10},
"word": "450",
},
{
"words_location": {"top": 383, "left": 553, "width": 28, "height": 11},
"word": "510",
},
{
"words_location": {"top": 383, "left": 689, "width": 25, "height": 10},
"word": "NA ",
},
{
"words_location": {"top": 412, "left": 113, "width": 76, "height": 13},
"word": "高级豪华客房",
},
{
"words_location": {"top": 414, "left": 277, "width": 61, "height": 11},
"word": "5001580",
},
{
"words_location": {"top": 413, "left": 424, "width": 23, "height": 11},
"word": "500",
},
{
"words_location": {"top": 413, "left": 554, "width": 26, "height": 11},
"word": "560",
},
{
"words_location": {"top": 413, "left": 690, "width": 22, "height": 10},
"word": "NA ",
},
{
"words_location": {"top": 442, "left": 111, "width": 76, "height": 12},
"word": "行攻豪华客房",
},
{
"words_location": {"top": 444, "left": 278, "width": 60, "height": 12},
"word": "1700/1700",
},
{
"words_location": {"top": 444, "left": 424, "width": 25, "height": 10},
"word": "600",
},
{
"words_location": {"top": 444, "left": 554, "width": 27, "height": 10},
"word": "600",
},
{
"words_location": {"top": 444, "left": 689, "width": 22, "height": 10},
"word": "NA ",
},
]