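How can I batch-extract data from multiple txt files and save the extracted data to a new txt file?
The figure shows the txt files in the folder.
The red box in the figure below marks the data that needs to be extracted.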
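One possible way to approach the batch extraction asked about above, as a minimal sketch. The screenshots are not available here, so the rule for what counts as "data to extract" is an assumption: the hypothetical starts_with_keyword predicate keeps lines that begin with a given keyword, and the function names, keyword, and file names are placeholders rather than anything from the original post.

```python
from pathlib import Path


def extract_lines(folder, out_file, keep_line):
    """Copy every line accepted by keep_line from each .txt file in folder into out_file."""
    folder = Path(folder)
    out_file = Path(out_file)
    kept = []
    for txt in sorted(folder.glob("*.txt")):
        if txt.resolve() == out_file.resolve():
            continue  # do not read back the output file itself
        with open(txt, encoding="utf-8") as f:
            for line in f:
                line = line.rstrip("\n")
                if keep_line(line):
                    kept.append(line)
    # "w" overwrites the output, so rerunning the script does not duplicate lines
    with open(out_file, "w", encoding="utf-8") as f:
        for line in kept:
            f.write(line + "\n")
    return kept


def starts_with_keyword(line, keyword="value"):
    # Hypothetical rule: keep lines that begin with the keyword
    return line.startswith(keyword)
```

A call such as extract_lines("txt_folder", "result.txt", starts_with_keyword) would then collect the matching lines from every txt file in txt_folder. If the wanted data sits at fixed line numbers or matches a pattern instead, only the predicate needs to change.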
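'''
This code uses the UC dataset as an example. The UC dataset contains 21 classes, with 100 images per class.
class_label_dict maps each UC class name to its integer label.
'''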
import numpy as np
import os
class_label_dict = {
    "agricultural": 0,
    "airplane": 1,
    "baseballdiamond": 2,
    "beach": 3,
    "buildings": 4,
    "chaparral": 5,
    "denseresidential": 6,
    "forest": 7,
    "freeway": 8,
    "golfcourse": 9,
    "harbor": 10,
    "intersection": 11,
    "mediumresidential": 12,
    "mobilehomepark": 13,
    "overpass": 14,
    "parkinglot": 15,
    "river": 16,
    "runway": 17,
    "sparseresidential": 18,
    "storagetanks": 19,
    "tenniscourt": 20,
}
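file_path = "G:/UC"  # directory where the image files are stored
path_list = os.listdir(file_path)  # returns a list of the entries in the directory
path_name = []
for i in path_list:
    # i[:-6] drops the last 6 characters of the file name, which must leave the class name
    path_name.append(file_path + "/" + i + " " + str(class_label_dict[i[:-6]]))
# Sort so that the files of each class form one contiguous block
path_name.sort()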
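train_path = []
test_path = []
trains_idx = []
tests_idx = []
for i in range(21):
    # Indices of the 100 images belonging to class i in the sorted list
    start = i * 100
    end = (i + 1) * 100
    idx = np.arange(start, end)
    # Shuffle within the class, then take 80 for training and 20 for testing
    np.random.shuffle(idx)
    train_idx = idx[0:80]
    test_idx = idx[80:]
    trains_idx.extend(train_idx)
    tests_idx.extend(test_idx)
# Convert to an array so it can be indexed with the index lists
path_name = np.array(path_name)
train_path = path_name[trains_idx]
test_path = path_name[tests_idx]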
# "a" opens each file in append mode without overwriting; the file is created in the
# current directory if it does not exist. Delete old output before rerunning,
# otherwise the new lines are appended to the old ones.
# The with statement closes the file automatically, so no f.close() is needed.
with open("data.txt", "a") as f:
    for file_name in path_name:
        f.write(file_name + "\n")
with open("train.txt", "a") as f:
    for file_name in train_path:
        f.write(file_name + "\n")
with open("test.txt", "a") as f:
    for file_name in test_path:
        f.write(file_name + "\n")
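Each line of the output txt files has the form "file location/file name label", with the path and the label separated by a space, as shown in the figure below: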
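To illustrate how files in this format might be consumed later, here is a minimal sketch that reads such a list back into (path, label) pairs. The function name read_list_file is hypothetical and not part of the original script; it only assumes that the label is the last space-separated field on each line.

```python
def read_list_file(list_file):
    """Return (path, label) pairs from a file whose lines look like '<path> <label>'."""
    samples = []
    with open(list_file) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Split on the last space only, so a path containing spaces stays intact
            path, label = line.rsplit(" ", 1)
            samples.append((path, int(label)))
    return samples
```

Calling read_list_file("train.txt") would give a list that can be passed to whatever image-loading code comes next. Splitting from the right is a deliberate choice here, since the script writes the label after the path and a directory name may itself contain spaces.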