Custom Object Detection with YOLO
2024-05-04 10:05
We know that systems can be trained to detect certain objects. So how do we train a system to detect our own custom objects? Let's go through it step by step.
1. Creating the Dataset
Machines learn from datasets. A dataset must contain both images and labels. For example, suppose my goal is to build a system that detects tanks.
I prepared tank images downloaded from the internet. We then need to label the images with a third-party tool such as LabelImg or MakeSense. We will use MakeSense in this example; it is available here: https://www.makesense.ai/
After uploading all the images, click the Object Detection option. You will see the photos you uploaded. You need to mark the region of each object.
We have marked the object's region. (If an image contains several objects, all of them must be labeled.) Then look at the "Labels" panel on the right side of the page.
Click the plus icon and enter the object's name there.
Now our dataset is ready! Let's download it.
The prepared dataset looks like the image above.
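MakeSense can export annotations in YOLO format: one .txt file per image, where each line is `class_id x_center y_center width height` with all coordinates normalized to [0, 1]. As a rough sketch (assuming that export format), converting one such line back to pixel coordinates looks like this:

```python
def yolo_line_to_pixels(line, img_w, img_h):
    """Convert one YOLO-format label line to a pixel bounding box.

    YOLO format: "class_id x_center y_center width height",
    all coordinates normalized to [0, 1].
    Returns (class_id, x, y, w, h) with (x, y) the top-left corner.
    """
    parts = line.split()
    class_id = int(parts[0])
    xc, yc, w, h = (float(p) for p in parts[1:5])
    box_w = int(w * img_w)
    box_h = int(h * img_h)
    x = int(xc * img_w - box_w / 2)
    y = int(yc * img_h - box_h / 2)
    return class_id, x, y, box_w, box_h

# A 416x416 image with a tank centered in the frame, half its size:
print(yolo_line_to_pixels("0 0.5 0.5 0.5 0.5", 416, 416))  # (0, 104, 104, 208, 208)
```

This is also the inverse of the math the detection script below performs when it turns network outputs back into rectangles.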
2. Training with Colab
We will use Google Colab for training.
What is Colab?
Colaboratory, or "Colab" for short, is a product from Google Research. Colab allows anybody to write and execute arbitrary Python code through the browser, and is especially well suited to machine learning, data analysis and education.
Training on a GPU is much faster than on a CPU.
Follow these steps for training:
- Open Google Drive and create a folder named "yolov3"
- Upload your dataset as "images.zip"
- Download the training file "Train_YoloV3.ipynb" from: https://github.com/turgay2317/yolov3-training-files
- Open Colab: https://colab.research.google.com/
- Select the "Upload" tab and choose the downloaded training file "Train_YoloV3.ipynb"
When the upload finishes, you will see a page like the one below.
The notebook applies these steps: check for an Nvidia GPU, mount Google Drive, clone and compile Darknet, create the obj.names file for the labels, extract the images from images.zip, and start training.
When training finishes, you can find the weights file in the Google Drive/yolov3 directory.
We will use this weights file. Download it.
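For reference, the files the notebook generates for Darknet are just small text files. Below is a minimal sketch of what obj.names and the companion obj.data contain for a single "Tank" class; the paths in obj.data are the conventional Darknet ones, assumed here rather than taken from the notebook itself:

```python
import os

def write_darknet_config(out_dir, class_names):
    """Write the two small text files Darknet training expects:
    obj.names (one class name per line) and obj.data (paths and class count)."""
    os.makedirs(out_dir, exist_ok=True)
    names_path = os.path.join(out_dir, "obj.names")
    with open(names_path, "w") as f:
        f.write("\n".join(class_names) + "\n")
    data_path = os.path.join(out_dir, "obj.data")
    with open(data_path, "w") as f:
        f.write("classes = %d\n" % len(class_names))
        f.write("train = data/train.txt\n")
        f.write("valid = data/test.txt\n")
        f.write("names = data/obj.names\n")
        # backup is where Darknet periodically saves the weights file
        f.write("backup = /content/drive/MyDrive/yolov3\n")
    return names_path, data_path

write_darknet_config("darknet_cfg", ["Tank"])
```

Because the backup path points into Google Drive, the weights survive even if the Colab session is interrupted.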
3. Preparing the Detection Files
We will use these files:
- yolov3_training_last.weights -> the trained weights file
- coco.names -> contains the labels of the custom objects
- yolo_object_detection.py -> object detection with OpenCV
- yolov3_testing.cfg -> network configuration
You can download the other files from here: https://github.com/turgay2317/yolov3-training-files
4. Running Custom Object Detection
Don't forget to install OpenCV and the required libraries. Then run the "yolo_object_detection.py" file!
import cv2
import numpy as np
import glob
import random

# Load the trained network
net = cv2.dnn.readNet("yolov3_training_last.weights", "yolov3_testing.cfg")

# Name of the custom class
classes = ["Tank"]

# Test images
images_path = glob.glob(r"tests/*.jpeg")

layer_names = net.getLayerNames()
output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers()]
colors = np.random.uniform(0, 255, size=(len(classes), 3))

# Loop through the images in random order
random.shuffle(images_path)
for img_path in images_path:
    # Loading image
    img = cv2.imread(img_path)
    img = cv2.resize(img, None, fx=0.4, fy=0.4)
    height, width, channels = img.shape

    # Detecting objects (1/255 scale, 416x416 input, BGR -> RGB swap)
    blob = cv2.dnn.blobFromImage(img, 0.00392, (416, 416), (0, 0, 0), True, crop=False)
    net.setInput(blob)
    outs = net.forward(output_layers)

    # Showing information on the screen
    class_ids = []
    confidences = []
    boxes = []
    for out in outs:
        for detection in out:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            if confidence > 0.3:
                # Object detected
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)
                # Rectangle coordinates
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                boxes.append([x, y, w, h])
                confidences.append(float(confidence))
                class_ids.append(class_id)

    # Non-maximum suppression removes overlapping duplicate boxes
    indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4)
    font = cv2.FONT_HERSHEY_PLAIN
    for i in range(len(boxes)):
        if i in indexes:
            x, y, w, h = boxes[i]
            label = str(classes[class_ids[i]])
            color = colors[class_ids[i]]
            cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
            cv2.putText(img, label, (x, y + 30), font, 3, color, 2)

    cv2.imshow("Image", img)
    key = cv2.waitKey(0)

cv2.destroyAllWindows()
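The call cv2.dnn.NMSBoxes(boxes, confidences, 0.5, 0.4) performs non-maximum suppression: it keeps the highest-scoring box and drops any other box whose overlap (IoU) with a kept box exceeds the threshold. A minimal pure-Python sketch of the same idea (an illustration, not OpenCV's exact implementation):

```python
def iou(a, b):
    """Intersection-over-union of two [x, y, w, h] boxes."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def nms(boxes, scores, score_thr=0.5, iou_thr=0.4):
    """Return the indexes of the boxes kept after non-maximum suppression."""
    # Consider boxes above the score threshold, best score first
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_thr),
        key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep

# Two heavily overlapping detections of the same tank plus one distant box:
boxes = [[10, 10, 100, 100], [12, 12, 100, 100], [300, 300, 50, 50]]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2] - the near-duplicate box 1 is suppressed
```

This is why the same object, which often triggers several YOLO grid cells at once, ends up with a single rectangle on screen.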
Detection example
Finally, you can see that our detection succeeded. Our system can now detect tanks.
5. Custom Object Detection in Video
import cv2
import numpy as np
import time

# Load Yolo
net = cv2.dnn.readNet("yolov3_training_last.weights", "yolov3_testing.cfg")

# Load the class labels
classes = []
with open("coco.names", "r") as f:
    classes = [line.strip() for line in f.readlines()]

layer_names = net.getLayerNames()
output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers()]
colors = np.random.uniform(0, 255, size=(len(classes), 3))

# Open the video file
cap = cv2.VideoCapture("tests/test.mp4")
font = cv2.FONT_HERSHEY_PLAIN
starting_time = time.time()
frame_id = 0
while True:
    ret, frame = cap.read()
    if not ret:  # stop when the video ends
        break
    frame_id += 1
    height, width, channels = frame.shape

    # Detecting objects
    blob = cv2.dnn.blobFromImage(frame, 0.00392, (416, 416), (0, 0, 0), True, crop=False)
    net.setInput(blob)
    outs = net.forward(output_layers)

    # Showing information on the screen
    class_ids = []
    confidences = []
    boxes = []
    for out in outs:
        for detection in out:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            if confidence > 0.1:
                # Object detected
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)
                # Rectangle coordinates
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                boxes.append([x, y, w, h])
                confidences.append(float(confidence))
                class_ids.append(class_id)

    indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.8, 0.3)
    for i in range(len(boxes)):
        if i in indexes:
            x, y, w, h = boxes[i]
            label = str(classes[class_ids[i]])
            confidence = confidences[i]
            color = colors[class_ids[i]]
            cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)
            cv2.putText(frame, label + " " + str(round(confidence, 2)), (x, y + 30), font, 3, color, 3)

    # Overlay the average frame rate since the start
    elapsed_time = time.time() - starting_time
    fps = frame_id / elapsed_time
    cv2.putText(frame, "FPS: " + str(round(fps, 2)), (10, 50), font, 4, (0, 0, 0), 3)
    cv2.imshow("Image", frame)
    key = cv2.waitKey(1)
    if key == 27:  # Esc key
        break

cap.release()
cv2.destroyAllWindows()
6. Custom Object Detection with a Camera
import cv2
import numpy as np
import time

# Load Yolo
net = cv2.dnn.readNet("yolov3_training_last.weights", "yolov3_testing.cfg")

# Load the class labels
classes = []
with open("coco.names", "r") as f:
    classes = [line.strip() for line in f.readlines()]

layer_names = net.getLayerNames()
output_layers = [layer_names[i - 1] for i in net.getUnconnectedOutLayers()]
colors = np.random.uniform(0, 255, size=(len(classes), 3))

# Open the default camera instead of a video file
cap = cv2.VideoCapture(0)
font = cv2.FONT_HERSHEY_PLAIN
starting_time = time.time()
frame_id = 0
while True:
    ret, frame = cap.read()
    if not ret:  # stop if the camera fails to deliver a frame
        break
    frame_id += 1
    height, width, channels = frame.shape

    # Detecting objects
    blob = cv2.dnn.blobFromImage(frame, 0.00392, (416, 416), (0, 0, 0), True, crop=False)
    net.setInput(blob)
    outs = net.forward(output_layers)

    # Showing information on the screen
    class_ids = []
    confidences = []
    boxes = []
    for out in outs:
        for detection in out:
            scores = detection[5:]
            class_id = np.argmax(scores)
            confidence = scores[class_id]
            if confidence > 0.1:
                # Object detected
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * height)
                w = int(detection[2] * width)
                h = int(detection[3] * height)
                # Rectangle coordinates
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                boxes.append([x, y, w, h])
                confidences.append(float(confidence))
                class_ids.append(class_id)

    indexes = cv2.dnn.NMSBoxes(boxes, confidences, 0.8, 0.3)
    for i in range(len(boxes)):
        if i in indexes:
            x, y, w, h = boxes[i]
            label = str(classes[class_ids[i]])
            confidence = confidences[i]
            color = colors[class_ids[i]]
            cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)
            cv2.putText(frame, label + " " + str(round(confidence, 2)), (x, y + 30), font, 3, color, 3)

    # Overlay the average frame rate since the start
    elapsed_time = time.time() - starting_time
    fps = frame_id / elapsed_time
    cv2.putText(frame, "FPS: " + str(round(fps, 2)), (10, 50), font, 4, (0, 0, 0), 3)
    cv2.imshow("Image", frame)
    key = cv2.waitKey(1)
    if key == 27:  # Esc key
        break

cap.release()
cv2.destroyAllWindows()
And that's it: you can now detect your own custom objects!
