
Deep Learning Configuration Notes


Deep Learning

Environment Setup

Installation

conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio cudatoolkit=10.2 -c pytorch

Download the framework build that matches your setup.

Test

import torch

# Note: the device queries below raise an error when no CUDA device is available.
print('CUDA version:', torch.version.cuda)
print('PyTorch version:', torch.__version__)
print('GPU available:', 'yes' if torch.cuda.is_available() else 'no')
print('GPU count:', torch.cuda.device_count())
print('Current GPU model:', torch.cuda.get_device_name())
print('CUDA compute capability of the current GPU:', torch.cuda.get_device_capability())
print('Total memory of the current GPU:', torch.cuda.get_device_properties(0).total_memory / 1024 / 1024 / 1024, 'GB')
print('Tensor Core support:', 'yes' if torch.cuda.get_device_properties(0).major >= 7 else 'no')
print('GPU memory usage:', torch.cuda.memory_allocated(0) / torch.cuda.get_device_properties(0).total_memory * 100, '%')

Set the package mirror

pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

Test

python detect.py --source data/images/bus.jpg --weights pretrained/yolov5s.pt --conf-thres 0.1

// --batch-size: number of images per batch; --epochs: number of training epochs; --conf-thres: confidence threshold (in the example above, only detections with confidence above 0.1 are output)


python train.py --img 640 --batch-size 16 --epochs 300 --data mydata.yaml --weights yolov5s.pt --cache-images --device 0

python train.py --batch-size 4 --epochs 200 --data workshop/data.yaml --weights pretrained/yolov5s.pt

python detect.py --source C:\Users\XY\myproject\yolov5\data\testimg --weights runs/train/exp15/weights/best.pt --conf-thres 0.1
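
To make the effect of `--conf-thres` concrete, here is a minimal sketch of confidence filtering in plain Python. The `(label, confidence)` tuple layout and the function name `filter_by_confidence` are assumptions made for this illustration only; they are not taken from detect.py.

```python
from typing import List, Tuple

# Hypothetical detection layout for this sketch: (label, confidence).
Detection = Tuple[str, float]


def filter_by_confidence(detections: List[Detection], conf_thres: float = 0.1) -> List[Detection]:
    """Keep only the detections whose confidence is above conf_thres."""
    return [d for d in detections if d[1] > conf_thres]


detections = [("bus", 0.92), ("person", 0.35), ("person", 0.08)]
kept = filter_by_confidence(detections, conf_thres=0.1)
print(kept)
```

A low threshold such as 0.1 keeps weak detections, which is handy while debugging a freshly trained model; a higher value trades recall for fewer false positives.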



You can specify a hyperparameter file during training with the `--hyp` argument, for example:

```shell
python train.py --batch-size 4 --epochs 10 --data workshop/data.yaml --weights pretrained/yolov5s.pt --hyp data/hyps/hyp.scratch-high.yaml
```

Kill a process

taskkill /f /im chromedriver.exe

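Google Colab

Mount Google Drive

Colab runs on Linux and uses Python notebooks. When running shell commands such as python from a notebook cell, prefix them with `!`.

from google.colab import drive
drive.mount('/content/drive/')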

Copy the dataset

cp /content/drive/MyDrive/workshop.zip /content/yolov5/data/

cp /content/drive/MyDrive/data/testimg.zip /content/yolov5/data/

Unzip the dataset

unzip ../test.zip -d ../

!unzip /content/yolov5/data/workshop.zip -d /content/yolov5/data

Running

  • Change the dataset paths in data.yaml to Linux paths, for example:

    names:
    - bench
    - Material Box
    - roll paper
    - Masks
    nc: 4
    train: data/workshop/images/train
    val: data/workshop/images/val

    names:
    - bench
    - Material Box
    - roll paper
    - Masks
    nc: 4
    train: data/rollworkshop/images/train
    val: data/rollworkshop/images/val
  • File layout

    Move the dataset into yolov5/data

  • Run

    !python train.py --batch-size 16 --epochs 50 --data workshop/data.yaml --weights pretrained/yolov5s.pt

    !python train.py --batch-size 256 --img 256 --cfg models/yolov5s.yaml --epochs 50 --data workshop/data.yaml --weights pretrained/yolov5s.pt

    !python train.py --batch-size 256 --img 256 --epochs 50 --data workshop02/data.yaml --weights pretrained/yolov5s.pt --hyp data/hyps/hyp.scratch-low.yaml --cfg models/yolov5s.yaml
  • Test

    python detect.py --source data/images/bus.jpg --weights pretrained/yolov5s.pt --conf-thres 0.1

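Hyperparameters

lr0: initial learning rate; 1e-2 for the SGD optimizer, 1e-3 for the Adam optimizer.
lrf: final learning-rate factor used by the OneCycleLR schedule; the final learning rate is lr0 multiplied by lrf.
momentum: momentum for the SGD optimizer, or beta1 for the Adam optimizer.
weight_decay: optimizer weight decay, 5e-4.
warmup_epochs: number of epochs in the learning-rate warmup phase, 3.0.
warmup_momentum: momentum during the warmup phase.
warmup_bias_lr: bias learning rate during the warmup phase.
box: weight of the box regression loss.
cls: weight of the classification loss.
cls_pw: positive-sample weight of the classification loss.
obj: weight of the objectness loss.
obj_pw: positive-sample weight of the objectness loss.
iou_t: IoU threshold used during training.
anchor_t: anchor threshold.
fl_gamma: gamma value of the focal loss.
hsv_h: factor for HSV hue augmentation.
hsv_s: factor for HSV saturation augmentation.
hsv_v: factor for HSV value (brightness) augmentation.
degrees: range of rotation angles.
translate: range of the translation fraction.
scale: range of the scaling fraction.
shear: range of shear angles.
perspective: strength of the perspective transform.
flipud: probability of a vertical (up-down) flip.
fliplr: probability of a horizontal (left-right) flip.
mosaic: probability of mosaic augmentation.
mixup: probability of mixup augmentation.
copy_paste: probability of segment copy-paste augmentation.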
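
To see how lr0 and lrf interact, here is a small sketch that computes the final learning rate and a cosine-shaped decay factor between the two. It assumes a cosine schedule that goes from 1.0 at the first epoch to lrf at the last; the function names are made up for this example, and the schedule actually used depends on the YOLOv5 version and training flags.

```python
import math


def final_lr(lr0: float, lrf: float) -> float:
    """Final learning rate as described above: lr0 multiplied by lrf."""
    return lr0 * lrf


def cosine_lr_factor(epoch: int, epochs: int, lrf: float) -> float:
    """Multiplier applied to lr0 at a given epoch (assumed cosine decay).

    Returns 1.0 at epoch 0 and lrf at epoch == epochs.
    """
    return ((1 - math.cos(epoch * math.pi / epochs)) / 2) * (lrf - 1) + 1


lr0, lrf, epochs = 0.01, 0.1, 300
for epoch in (0, 150, 300):
    print(epoch, lr0 * cosine_lr_factor(epoch, epochs, lrf))
```

With lr0 = 0.01 and lrf = 0.1 the learning rate ends at roughly 0.001, so lowering lrf makes the late epochs more conservative without touching the starting rate.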

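Training options

weights: YOLOv5s.pt, path to the pretrained weights file.
cfg: YOLOv5s.yaml, path to the model configuration file.
data: climbladder.yaml, path to the dataset configuration file.
hyp: hyp.scratch-low.yaml, path to the hyperparameter configuration file.
device: 0, index of the GPU device to use.
epochs: 300, total number of training epochs.
multi-scale: False, whether to enable multi-scale training.
batch-size: 16, number of samples per batch.
single-cls: False, whether to train as a single-class detector.
imgsz: 640, input image size.
optimizer: Adam, type of optimizer to use.
rect: False, whether to use rectangular training.
sync-bn: False, whether to enable synchronized batch normalization.
resume: False, whether to resume training from a previous checkpoint.
local_rank: -1, local process rank for multi-GPU training.
nosave: False, whether to disable saving checkpoints.
patience: 100, number of epochs early stopping waits without improvement.
noval: False, whether to skip validation during training.
noautoanchor: False, whether to disable automatic anchor fitting.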
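
The patience option is easier to reason about with a toy example. The sketch below stops once the best score has not improved for `patience` epochs. The class name `EarlyStopper` and the single `fitness` score are assumptions for this illustration, not the code train.py runs.

```python
class EarlyStopper:
    """Stop training when the best fitness is `patience` epochs old (illustrative sketch)."""

    def __init__(self, patience: int = 100):
        self.patience = patience
        self.best_fitness = float("-inf")
        self.best_epoch = 0

    def should_stop(self, epoch: int, fitness: float) -> bool:
        # Remember the epoch of the best score seen so far.
        if fitness > self.best_fitness:
            self.best_fitness = fitness
            self.best_epoch = epoch
        # Stop once that best epoch lies `patience` or more epochs in the past.
        return (epoch - self.best_epoch) >= self.patience


stopper = EarlyStopper(patience=2)
history = [0.1, 0.3, 0.2, 0.25, 0.1]
decisions = [stopper.should_stop(epoch, fitness) for epoch, fitness in enumerate(history)]
print(decisions)
```

With patience set to 100 and only 50 epochs of training, as in some of the commands above, a rule like this would never trigger.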

Datasets

Split the dataset

import os
import shutil
from sklearn.model_selection import train_test_split

# Folders containing the images and the label files
images_folder = r'C:\Users\XY\Desktop\zhuizhong\data2\workshop01\img'
labels_folder = r'C:\Users\XY\Desktop\zhuizhong\data2\workshop01\txt'

# Output path of the dataset
dataset_folder = r'C:\Users\XY\Desktop\zhuizhong\data2\1'

# Fraction of the data used for validation
val_ratio = 0.15

# Create the dataset directory structure
os.makedirs(os.path.join(dataset_folder, 'images', 'train'), exist_ok=True)
os.makedirs(os.path.join(dataset_folder, 'images', 'val'), exist_ok=True)
os.makedirs(os.path.join(dataset_folder, 'labels', 'train'), exist_ok=True)
os.makedirs(os.path.join(dataset_folder, 'labels', 'val'), exist_ok=True)

# List the image files
image_files = [f for f in os.listdir(images_folder) if f.endswith('.jpg')]

# Split the image files into a training set and a validation set
train_images, val_images = train_test_split(image_files, test_size=val_ratio)

# Copy the training images and their label files
for image_file in train_images:
    shutil.copy(os.path.join(images_folder, image_file), os.path.join(dataset_folder, 'images', 'train', image_file))
    label_file = os.path.splitext(image_file)[0] + '.txt'
    shutil.copy(os.path.join(labels_folder, label_file), os.path.join(dataset_folder, 'labels', 'train', label_file))

# Copy the validation images and their label files
for image_file in val_images:
    shutil.copy(os.path.join(images_folder, image_file), os.path.join(dataset_folder, 'images', 'val', image_file))
    label_file = os.path.splitext(image_file)[0] + '.txt'
    shutil.copy(os.path.join(labels_folder, label_file), os.path.join(dataset_folder, 'labels', 'val', label_file))
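
The split above changes on every run because no seed is fixed. As a rough sketch of a reproducible alternative that needs only the standard library, the helper below shuffles with a seeded generator; `split_files` and its rounding rule are assumptions of this example, so the validation count may differ slightly from what scikit-learn produces.

```python
import random
from typing import List, Tuple


def split_files(files: List[str], val_ratio: float = 0.15, seed: int = 0) -> Tuple[List[str], List[str]]:
    """Return (train, val) lists; the same seed always gives the same split."""
    files = sorted(files)          # make the result independent of directory listing order
    rng = random.Random(seed)      # local generator, does not touch the global random state
    rng.shuffle(files)
    n_val = max(1, round(len(files) * val_ratio)) if files else 0
    return files[n_val:], files[:n_val]


files = [f"{i:05d}.jpg" for i in range(20)]
train, val = split_files(files, val_ratio=0.15, seed=42)
print(len(train), len(val))
```

If you prefer to keep scikit-learn, passing `random_state=42` to `train_test_split` gives the same kind of repeatability.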

Partial data augmentation

import cv2
import os
import xml.etree.ElementTree as ET
import numpy as np
# Image folder
image_folder = r'C:\Users\XY\Desktop\data\test6\images'
# XML folder
xml_folder = r'C:\Users\XY\Desktop\data\test6\labels'
# Output folder
output_folder = r'C:\Users\XY\Desktop\data\test6\outimg'

# Iterate over all image files
for filename in os.listdir(image_folder):
    # Image file name without the extension
    image_name = os.path.splitext(filename)[0]
    # Path of the matching XML file
    xml_path = os.path.join(xml_folder, image_name + '.xml')

    # Read the XML file
    tree = ET.parse(xml_path)
    root = tree.getroot()

    # Iterate over all object elements
    for obj in root.findall('object'):
        # Class name and bounding-box coordinates
        class_name = obj.find('name').text
        bndbox = obj.find('bndbox')
        xmin = int(bndbox.find('xmin').text)
        ymin = int(bndbox.find('ymin').text)
        xmax = int(bndbox.find('xmax').text)
        ymax = int(bndbox.find('ymax').text)

        # Read the image and crop the object
        image_path = os.path.join(image_folder, filename)
        image = cv2.imread(image_path)
        cropped_image = image[ymin:ymax, xmin:xmax]

        # Create the per-class output folder if it does not exist
        class_output_folder = os.path.join(output_folder, class_name)
        if not os.path.exists(class_output_folder):
            os.makedirs(class_output_folder)

        # Random padding: paste the crop into the centre of a larger random-noise image

        # scale = np.random.uniform(2, 3)
        # new_height = int(cropped_image.shape[0] * scale)
        # new_width = int(cropped_image.shape[1] * scale)
        # new_image = np.random.randint(0, 256, (new_height, new_width, 3), dtype=np.uint8)
        # new_image[(new_height - cropped_image.shape[0]) // 2:(new_height + cropped_image.shape[0]) // 2,
        # (new_width - cropped_image.shape[1]) // 2:(new_width + cropped_image.shape[1]) // 2] = cropped_image
        # cropped_image = new_image

        # Gaussian-noise padding
        # scale = np.random.uniform(2, 3)
        # new_height = int(cropped_image.shape[0] * scale)
        # new_width = int(cropped_image.shape[1] * scale)
        #
        # mean = 0
        # stddev = 10
        # noise = np.random.normal(mean, stddev, (new_height, new_width, 3))
        # noise = noise.reshape(new_height, new_width, -1).astype(np.uint8)
        #
        # new_image = cv2.add(cv2.GaussianBlur(noise, (5, 5), cv2.BORDER_DEFAULT), 128)
        # new_image[(new_height - cropped_image.shape[0]) // 2:(new_height + cropped_image.shape[0]) // 2,
        # (new_width - cropped_image.shape[1]) // 2:(new_width + cropped_image.shape[1]) // 2] = cropped_image
        # cropped_image = new_image

        # Save the crop, using the image name, class name and box coordinates as the new file name
        cropped_image_filename = f'{image_name}_{class_name}_{xmin}_{ymin}_{xmax}_{ymax}.jpg'
        cropped_image_path = os.path.join(class_output_folder, cropped_image_filename)
        cv2.imwrite(cropped_image_path, cropped_image)

        # # Copy the original XML file and modify its content
        # new_tree = ET.parse(xml_path)
        # new_root = new_tree.getroot()
        # new_bndbox = new_root.find('object').find('bndbox')
        # new_bndbox.find('xmin').text = str(0)
        # new_bndbox.find('ymin').text = str(0)
        # new_bndbox.find('xmax').text = str(xmax - xmin)
        # new_bndbox.find('ymax').text = str(ymax - ymin)

        # Copy the original XML file and modify its content
        new_tree = ET.parse(xml_path)
        new_root = new_tree.getroot()
        flag = 0
        # Iterate over all object elements of the copy
        for other in new_root.findall('object'):
            # Class name of this object
            cr_class_name = other.find('name').text

            # Remove objects of a different class, and keep only the first object of the cropped class
            if class_name != cr_class_name:
                new_root.remove(other)
            else:
                flag += 1
                if flag > 1:
                    new_root.remove(other)

        # Update the bounding-box coordinates: the crop is now the whole image
        new_bndbox = new_root.find('object').find('bndbox')
        new_bndbox.find('xmin').text = str(0)
        new_bndbox.find('ymin').text = str(0)
        new_bndbox.find('xmax').text = str(xmax - xmin)
        new_bndbox.find('ymax').text = str(ymax - ymin)
        # new_size = new_root.find('annotation').find('size')
        # new_size.find('width').text = str(xmax - xmin)
        # new_size.find('height').text = str(ymax - ymin)

        # Update the image size
        new_size = new_root.find('size')
        new_size.find('width').text = str(xmax - xmin)
        new_size.find('height').text = str(ymax - ymin)

        # Update the image size after crop padding
        # new_size = new_root.find('size')
        # new_size.find('width').text = str((new_width + cropped_image.shape[1]) // 2)
        # new_size.find('height').text = str((new_height + cropped_image.shape[0]) // 2)

        # Update the box after crop padding
        # new_bndbox = new_root.find('object').find('bndbox')
        # new_bndbox.find('xmin').text = str((new_width - cropped_image.shape[1]) // 2)
        # new_bndbox.find('ymin').text = str((new_height - cropped_image.shape[0]) // 2)
        # new_bndbox.find('xmax').text = str((new_width + cropped_image.shape[1]) // 2)
        # new_bndbox.find('ymax').text = str((new_height + cropped_image.shape[0]) // 2)

        # Save the new XML file
        xml_output_folder = r'C:\Users\XY\Desktop\data\test6\outxml'
        if not os.path.exists(xml_output_folder):
            os.makedirs(xml_output_folder)
        new_xml_filename = f'{image_name}_{class_name}_{xmin}_{ymin}_{xmax}_{ymax}.xml'
        new_xml_path = os.path.join(xml_output_folder, new_xml_filename)
        new_tree.write(new_xml_path)
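
Annotated boxes sometimes extend past the image border or collapse to zero size, and cropping with such a box produces an empty array that cv2.imwrite cannot save. A small guard along these lines could run before the crop; `clamp_box` is a hypothetical helper written for this note, not part of the script above.

```python
from typing import Optional, Tuple

Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax)


def clamp_box(xmin: int, ymin: int, xmax: int, ymax: int, width: int, height: int) -> Optional[Box]:
    """Clip a box to the image bounds; return None if nothing is left to crop."""
    xmin = max(0, min(xmin, width))
    xmax = max(0, min(xmax, width))
    ymin = max(0, min(ymin, height))
    ymax = max(0, min(ymax, height))
    if xmax <= xmin or ymax <= ymin:
        return None  # degenerate box: skip it instead of writing an empty crop
    return xmin, ymin, xmax, ymax


print(clamp_box(-5, 10, 120, 50, width=100, height=80))
```

In the loop above, a `None` result would be a reason to `continue` to the next object.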

TXT to XML

# .txt --> .xml
# -*- coding:UTF-8 -*-
import os
import cv2
txt_path = "C:\\Users\\XY\\Desktop\\data\\test3\\txt\\"
img_path = "C:\\Users\\XY\\Desktop\\data\\test3\\images\\"
xml_path = "C:\\Users\\XY\\Desktop\\data\\test3\\xml\\"

def txt_to_xml(txt_path, img_path, xml_path):
    # 1. Dictionary that maps the class ids in the labels to class names
    class_names = {
        '0': "bench",'1': "Material Box",'2': "roll paper",'3': "Masks"
    }
    # 2. List the txt label folder
    files = os.listdir(txt_path)
    # Name of the image processed so far (the "old" image)
    pre_img_name = ''
    # 3. Iterate over the folder
    for i, name in enumerate(files):
        # Many folders contain this file and it cannot be deleted, so just skip it
        if name == "desktop.ini":
            continue
        print(name)
        # 4. Open the txt file
        txtFile = open(txt_path + name)
        # Read all lines
        txtList = txtFile.readlines()
        txtFile.close()
        # Image name
        img_name = name[:-4]
        pic = cv2.imread(img_path + img_name + ".jpg")
        # Image size
        Pheight, Pwidth, Pdepth = pic.shape
        # 5. Iterate over every line of the txt file
        for row in txtList:
            # Split one line of the txt file on ' '
            oneline = row.strip().split(" ")
            # This is a new image
            if img_name != pre_img_name:
                # 6. Create the xml file
                xml_file = open((xml_path + img_name + '.xml'), 'w')
                xml_file.write('<annotation>\n')
                xml_file.write(' <folder>images</folder>\n')
                xml_file.write(' <filename>' + img_name + '.jpg' + '</filename>\n')
                xml_file.write(' <path>' + img_path + img_name + '.jpg' + '</path>\n')
                xml_file.write(' <source>\n')
                # xml_file.write(' <database>orgaquant</database>\n')
                # xml_file.write('<annotation>organoids</annotation>\n')
                xml_file.write(' <database>Unknown</database>\n')
                xml_file.write(' </source>\n')
                xml_file.write(' <size>\n')
                xml_file.write(' <width>' + str(Pwidth) + '</width>\n')
                xml_file.write(' <height>' + str(Pheight) + '</height>\n')
                xml_file.write(' <depth>' + str(Pdepth) + '</depth>\n')
                xml_file.write(' </size>\n')
                xml_file.write(' <segmented>0</segmented>\n')
                xml_file.write(' <object>\n')
                xml_file.write(' <name>' + class_names[oneline[0]] + '</name>\n')
                xml_file.write(' <pose>Unspecified</pose>\n')
                xml_file.write(' <truncated>0</truncated>\n')
                xml_file.write(' <difficult>0</difficult>\n')
                xml_file.write(' <bndbox>\n')
                xml_file.write(' <xmin>' + str(
                    int(((float(oneline[1])) * Pwidth) - (float(oneline[3])) * 0.5 * Pwidth)) + '</xmin>\n')
                xml_file.write(' <ymin>' + str(
                    int(((float(oneline[2])) * Pheight) - (float(oneline[4])) * 0.5 * Pheight)) + '</ymin>\n')
                xml_file.write(' <xmax>' + str(
                    int(((float(oneline[1])) * Pwidth) + (float(oneline[3])) * 0.5 * Pwidth)) + '</xmax>\n')
                xml_file.write(' <ymax>' + str(
                    int(((float(oneline[2])) * Pheight) + (float(oneline[4])) * 0.5 * Pheight)) + '</ymax>\n')
                xml_file.write(' </bndbox>\n')
                xml_file.write(' </object>\n')
                xml_file.close()
                pre_img_name = img_name # Mark it as the "old" image
            else: # Not a new image but the "old" one
                # 7. Same image: only append another object
                xml_file = open((xml_path + img_name + '.xml'), 'a')
                xml_file.write(' <object>\n')
                xml_file.write(' <name>' + class_names[oneline[0]] + '</name>\n')
                xml_file.write(' <pose>Unspecified</pose>\n')
                xml_file.write(' <truncated>0</truncated>\n')
                xml_file.write(' <difficult>0</difficult>\n')
                ''' Add these here and above as needed
                xml_file.write(' <pose>Unspecified</pose>\n')
                xml_file.write(' <truncated>0</truncated>\n')
                xml_file.write(' <difficult>0</difficult>\n')
                '''
                xml_file.write(' <bndbox>\n')
                xml_file.write(' <xmin>' + str(
                    int(((float(oneline[1])) * Pwidth) - (float(oneline[3])) * 0.5 * Pwidth)) + '</xmin>\n')
                xml_file.write(' <ymin>' + str(
                    int(((float(oneline[2])) * Pheight) - (float(oneline[4])) * 0.5 * Pheight)) + '</ymin>\n')
                xml_file.write(' <xmax>' + str(
                    int(((float(oneline[1])) * Pwidth) + (float(oneline[3])) * 0.5 * Pwidth)) + '</xmax>\n')
                xml_file.write(' <ymax>' + str(
                    int(((float(oneline[2])) * Pheight) + (float(oneline[4])) * 0.5 * Pheight)) + '</ymax>\n')
                xml_file.write(' </bndbox>\n')
                xml_file.write(' </object>\n')
                xml_file.close()

        # 8. After the whole txt file has been read, write the closing </annotation>
        xml_file1 = open((xml_path + pre_img_name + '.xml'), 'a')
        xml_file1.write('</annotation>')
        xml_file1.close()
    print("Done !")


# Change these to your own folders; note that each folder path must end with a path separator
txt_to_xml(txt_path, img_path, xml_path)
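
The coordinate arithmetic buried in the `xml_file.write` calls above can be pulled out into one small function, which makes it easier to check by hand. The sketch below uses the same formulas; the name `yolo_to_voc` is introduced here only for illustration.

```python
from typing import Tuple


def yolo_to_voc(xc: float, yc: float, w: float, h: float, img_w: int, img_h: int) -> Tuple[int, int, int, int]:
    """Convert a normalized YOLO box (centre x, centre y, width, height)
    to VOC pixel corners (xmin, ymin, xmax, ymax)."""
    xmin = int(xc * img_w - w * 0.5 * img_w)
    ymin = int(yc * img_h - h * 0.5 * img_h)
    xmax = int(xc * img_w + w * 0.5 * img_w)
    ymax = int(yc * img_h + h * 0.5 * img_h)
    return xmin, ymin, xmax, ymax


# A box covering the central half of a 640x480 image.
print(yolo_to_voc(0.5, 0.5, 0.5, 0.5, img_w=640, img_h=480))
```

Note that `int()` truncates toward zero, so corners can shift by up to one pixel compared with the original annotation.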

XML to TXT

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# xml to txt
import os
import xml.etree.ElementTree as ET

path = 'C:\\Users\\XY\\Desktop\\data\\test\\'
classes = ["bench", "Material Box", "roll paper", "Masks"] # class names


def convert(size, box):
    # size is (width, height); box is (xmin, xmax, ymin, ymax) in pixels
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[1]) / 2.0
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)


def convert_annotation(image_id):
    with open(path + 'xml\\%s.xml' % (image_id), encoding='UTF-8') as in_file:
        tree = ET.parse(in_file)
    # print(image_id)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)

    # Write the txt label file
    with open(path + 'txt\\%s.txt' % (image_id), 'w') as out_file:
        for obj in root.iter('object'):
            cls = obj.find('name').text
            # print(cls)
            if cls not in classes:
                continue
            cls_id = classes.index(cls)
            xmlbox = obj.find('bndbox')
            b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text), float(xmlbox.find('ymin').text),
                 float(xmlbox.find('ymax').text))
            bb = convert((w, h), b)
            # out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
            out_file.write(str(cls_id) + " " + " ".join(["{:.6f}".format(a) for a in bb]) + '\n')


xml_path = path + 'xml\\'

# xml list
img_xmls = os.listdir(xml_path)
for img_xml in img_xmls:
    label_name = os.path.splitext(img_xml)[0]
    print(label_name)
    convert_annotation(label_name)
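
For a quick sanity check of the conversion without touching any files, the same arithmetic as `convert` can be run on a single box. The helper names below (`voc_to_yolo`, `format_label`) are made up for this sketch, and the argument order is the more common (xmin, ymin, xmax, ymax) rather than the (xmin, xmax, ymin, ymax) tuple used above.

```python
from typing import Tuple


def voc_to_yolo(xmin: float, ymin: float, xmax: float, ymax: float,
                img_w: int, img_h: int) -> Tuple[float, float, float, float]:
    """Convert VOC pixel corners to a normalized YOLO box (xc, yc, w, h)."""
    xc = (xmin + xmax) / 2.0 / img_w
    yc = (ymin + ymax) / 2.0 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return xc, yc, w, h


def format_label(cls_id: int, box: Tuple[float, float, float, float]) -> str:
    """One line of a YOLO label file, with six decimals as in the script above."""
    return str(cls_id) + " " + " ".join("{:.6f}".format(v) for v in box)


box = voc_to_yolo(160, 120, 480, 360, img_w=640, img_h=480)
print(format_label(0, box))
```

All four output values should fall in the range 0 to 1; anything outside it usually means the width and height in the XML `size` element do not match the image.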

Batch rename

import os

# Folders containing the images and the label files
image_folder = r'C:\Users\XY\Desktop\zhuizhong\data2\workshop01\img'
label_folder = r'C:\Users\XY\Desktop\zhuizhong\data2\workshop01\txt'

# Initialize the counter
counter = 4000

# Iterate over all images in the image folder
for filename in os.listdir(image_folder):
    # Build the new file names
    new_filename = '{:05d}.jpg'.format(counter)
    new_label_filename = '{:05d}.txt'.format(counter)

    # Rename the image file
    image_path = os.path.join(image_folder, filename)
    new_image_path = os.path.join(image_folder, new_filename)
    os.rename(image_path, new_image_path)

    # Rename the matching label file
    label_filename = os.path.splitext(filename)[0] + '.txt'
    label_path = os.path.join(label_folder, label_filename)
    new_label_path = os.path.join(label_folder, new_label_filename)
    os.rename(label_path, new_label_path)

    # Update the counter
    counter += 1
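
Renaming in place can go wrong if a new name is already taken by a file that has not been renamed yet. One cautious approach, sketched below, is to build the full old-to-new mapping first and look for such clashes before calling `os.rename`. Both helper names are hypothetical, and the sketch only inspects names; it does not touch the file system.

```python
import os
from typing import Dict, List


def build_rename_plan(filenames: List[str], start: int = 4000, width: int = 5) -> Dict[str, str]:
    """Map each existing file name to a zero-padded sequential name, keeping its extension."""
    plan = {}
    for offset, name in enumerate(sorted(filenames)):
        ext = os.path.splitext(name)[1]
        plan[name] = f"{start + offset:0{width}d}{ext}"
    return plan


def find_collisions(plan: Dict[str, str]) -> List[str]:
    """New names that are also the current name of a different file (possible overwrite)."""
    sources = set(plan)
    return sorted(new for old, new in plan.items() if new in sources and new != old)


plan = build_rename_plan(["b.jpg", "a.jpg", "04000.jpg"])
print(plan)
print(find_collisions(plan))
```

If the collision list is not empty, renaming into a fresh folder (or through temporary names) avoids overwriting anything.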

Rotation

import cv2
import os
import xml.etree.ElementTree as ET
import numpy as np
# Image folder
imgs_path = r'C:\Users\XY\Desktop\bakcup\data\mask\img'
# XML folder
xmls_path = r'C:\Users\XY\Desktop\bakcup\data\mask\xzxml'
# Output folder for images
img_save_path = r'C:\Users\XY\Desktop\bakcup\data\mask\outimg'
# Output folder for XML files
xml_save_path = r'C:\Users\XY\Desktop\bakcup\data\mask\xzimg'
for images in os.listdir(imgs_path):
    # rotate_img
    # splitext removes only the extension; rstrip('.jpg') would strip any trailing '.', 'j', 'p', 'g' characters
    oriname = os.path.splitext(images)[0]
    img_path = os.path.join(imgs_path, oriname + '.jpg')
    img = cv2.imread(img_path)
    # number = oriname.rsplit('_', 1)
    # print(number)
    new_number = 30000 + int(oriname)
    H, W, C = img.shape

    # dst = np.rot90(img)
    #
    # img_filename = '90' + '_' + str(new_number) + '.jpg'
    # img_filepath = os.path.join(img_save_path, img_filename)
    # cv2.imwrite(img_filepath, dst)

    # np.rot90 rotates counter-clockwise by k * 90 degrees; 360 is an unrotated copy
    for angle in [90, 180, 270, 360]:
        rotated_img = np.rot90(img, k=angle // 90)
        img_filename = str(angle) + '_' + str(new_number) + '.jpg'
        img_filepath = os.path.join(img_save_path, img_filename)
        cv2.imwrite(img_filepath, rotated_img)

        # rotate_xml: re-read the original annotation for every angle
        xml_path = os.path.join(xmls_path, oriname + '.xml')
        tree = ET.parse(xml_path)
        root = tree.getroot()
        size = root.find('size')
        for obj in root.findall('object'):
            bndbox = obj.find('bndbox')
            Xmin = int(bndbox.find('xmin').text)
            Ymin = int(bndbox.find('ymin').text)
            Xmax = int(bndbox.find('xmax').text)
            Ymax = int(bndbox.find('ymax').text)
            # Update the coordinates
            if angle == 90:
                bndbox.find('xmin').text = str(Ymin)
                bndbox.find('ymin').text = str(W - Xmax)
                bndbox.find('xmax').text = str(Ymax)
                bndbox.find('ymax').text = str(W - Xmin)
            elif angle == 180:
                bndbox.find('xmin').text = str(W - Xmax)
                bndbox.find('ymin').text = str(H - Ymax)
                bndbox.find('xmax').text = str(W - Xmin)
                bndbox.find('ymax').text = str(H - Ymin)
            elif angle == 270:
                bndbox.find('xmin').text = str(H - Ymax)
                bndbox.find('ymin').text = str(Xmin)
                bndbox.find('xmax').text = str(H - Ymin)
                bndbox.find('ymax').text = str(Xmax)
        # Width and height swap only for the 90 and 270 degree rotations
        if angle in (90, 270):
            size.find('width').text = str(H)
            size.find('height').text = str(W)
        # Store the name of the rotated image so the XML matches the saved file
        root.find('filename').text = img_filename
        xml_filename = str(angle) + '_' + str(new_number) + '.xml'
        xml_filepath = os.path.join(xml_save_path, xml_filename)
        tree.write(xml_filepath)
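
The three coordinate branches above are all repetitions of one rule: a single counter-clockwise quarter turn maps (xmin, ymin, xmax, ymax) to (ymin, W - xmax, ymax, W - xmin) and swaps width and height. The sketch below applies that rule repeatedly, so the 180 and 270 degree cases follow from the 90 degree one. `rotate_box_ccw` is a name invented for this note, and box edges are treated as continuous coordinates, matching the script's use of W and H rather than W - 1 and H - 1.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax)


def rotate_box_ccw(box: Box, img_w: int, img_h: int, quarter_turns: int) -> Tuple[Box, Tuple[int, int]]:
    """Rotate a box counter-clockwise by quarter_turns * 90 degrees.

    Returns the rotated box and the (width, height) of the rotated image.
    """
    xmin, ymin, xmax, ymax = box
    w, h = img_w, img_h
    for _ in range(quarter_turns % 4):
        # One counter-clockwise quarter turn: a point (x, y) moves to (y, w - x).
        xmin, ymin, xmax, ymax = ymin, w - xmax, ymax, w - xmin
        w, h = h, w  # the image dimensions swap on every quarter turn
    return (xmin, ymin, xmax, ymax), (w, h)


for k in range(1, 5):
    print(k * 90, rotate_box_ccw((10, 20, 30, 60), img_w=100, img_h=80, quarter_turns=k))
```

Four quarter turns return the original box and size, which is a handy self-check when changing the formulas.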