
Modifying the Anchor Size

In practical applications, following the MS COCO standard, we define a target object as a small object if it is no larger than 32x32 pixels or if it occupies less than 0.01 of the original image.
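The rule above can be written as a small predicate. This is only a sketch: the function name `is_small_object` is hypothetical, and reading "ratio" as box area divided by image area is an assumption, since the text does not say which ratio is meant.

```python
def is_small_object(box_w, box_h, img_w, img_h,
                    max_side=32, min_area_ratio=0.01):
    """Return True if a box counts as a small object under the rule above.

    max_side and min_area_ratio are the two thresholds quoted in the text.
    Interpreting the ratio as (box area / image area) is an assumption.
    """
    # Condition 1: the box fits inside a max_side x max_side square
    fits_in_square = box_w <= max_side and box_h <= max_side
    # Condition 2: the box covers less than min_area_ratio of the image
    area_ratio = (box_w * box_h) / float(img_w * img_h)
    return fits_in_square or area_ratio < min_area_ratio
```

For example, a 20x20 box in a 640x480 image satisfies the first condition, while a 40x40 box in a 1000x1000 image only satisfies the second.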

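In an anchor-based detection algorithm (taking the Faster RCNN detection network as an example), as shown in the figure below, the algorithm generates anchors of different sizes on every output feature map of the backbone according to fixed rules. The region proposal network (RPN), whose output is closely tied to the size, class, and location of the final predicted boxes, then predicts whether each anchor contains an object and how far the object box is offset from the anchor box.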
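To make "offset from the anchor box" concrete, here is a minimal sketch of how a predicted offset is turned back into a box. It assumes the center/size parameterization from the Faster R-CNN paper (shifts scaled by the anchor size, log-space width and height). The article does not show this step, so the function name `decode_box` and the tuple layout are illustrative.

```python
import math


def decode_box(anchor, deltas):
    """Apply predicted offsets to one anchor.

    anchor: (x_ctr, y_ctr, w, h) of the anchor box
    deltas: (tx, ty, tw, th) predicted for that anchor
    Returns the decoded box as (x_ctr, y_ctr, w, h).
    """
    xa, ya, wa, ha = anchor
    tx, ty, tw, th = deltas
    # Center shifts are expressed in units of the anchor's width and height
    x = xa + tx * wa
    y = ya + ty * ha
    # Width and height are predicted in log space relative to the anchor
    w = wa * math.exp(tw)
    h = ha * math.exp(th)
    return x, y, w, h
```

Because the offsets are relative to the anchor, an anchor that is far larger than the object forces the network to predict large corrections, which is one reason the anchor sizes matter for small objects.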

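To improve detection of small objects, we can modify the anchor sizes so that suitable anchors are generated.

Below we describe in detail how to modify the anchor sizes to improve small-object detection. As the figure above shows, anchors are generated on the output feature map of the backbone. If we choose anchors that match small objects well, we improve both the classification accuracy and the localization precision for small objects, and therefore their overall detection accuracy.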
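As a rough illustration of what "generated on the output feature map" means, the sketch below shifts a small set of base anchors to every cell of a feature map. The function name `tile_anchors` and the stride of 16 pixels per cell are assumptions for illustration; the real stride depends on the backbone.

```python
import numpy as np


def tile_anchors(base_anchors, feat_h, feat_w, feat_stride=16):
    """Shift (A, 4) base anchors to every cell of a feat_h x feat_w map.

    base_anchors: array of [xmin, ymin, xmax, ymax] around the first cell
    feat_stride: assumed number of image pixels per feature-map cell
    Returns an array of shape (feat_h * feat_w * A, 4) in image coordinates.
    """
    # Pixel offset of every feature-map cell along x and y
    shift_x = np.arange(feat_w) * feat_stride
    shift_y = np.arange(feat_h) * feat_stride
    sx, sy = np.meshgrid(shift_x, shift_y)
    # The same (dx, dy) is added to both corners of an anchor
    shifts = np.stack([sx.ravel(), sy.ravel(), sx.ravel(), sy.ravel()], axis=1)
    # Broadcast (K, 1, 4) + (1, A, 4) -> (K, A, 4), then flatten
    all_anchors = shifts[:, np.newaxis, :] + base_anchors[np.newaxis, :, :]
    return all_anchors.reshape(-1, 4)
```

Every cell receives the same set of shapes, so the choice of base anchors decides which object sizes can be matched anywhere in the image.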

Taking the anchor generation code of the Faster RCNN network as an example, we show how to adjust the input parameters to control the anchor sizes.

import numpy as np


# Given an anchor's top-left and bottom-right corners, return its width, height, and center
def _whctrs(anchor):
    """
    :param anchor: list, anchor coordinates [xmin, ymin, xmax, ymax]
    :return: width, height, and center (x, y) of the anchor
    """
    w = anchor[2] - anchor[0] + 1
    h = anchor[3] - anchor[1] + 1
    x_ctr = anchor[0] + 0.5 * (w - 1)
    y_ctr = anchor[1] + 0.5 * (h - 1)
    return w, h, x_ctr, y_ctr


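# Given widths and heights around one center point, output the corresponding
# anchors (prediction windows). When called from _ratio_enum, the anchors have
# roughly equal area and differ only in aspect ratio.
def _mkanchors(ws, hs, x_ctr, y_ctr):
    """
    :param ws: numpy array, anchor widths
    :param hs: numpy array, anchor heights
    :param x_ctr: x coordinate of the anchor center
    :param y_ctr: y coordinate of the anchor center
    :return: numpy array of shape (N, 4), anchors as [xmin, ymin, xmax, ymax]
    """
    # Turn ws and hs into column vectors of shape (N, 1)
    ws = ws[:, np.newaxis]
    hs = hs[:, np.newaxis]
    # Build the anchors in corner form
    anchors = np.hstack((x_ctr - 0.5 * (ws - 1),
                         y_ctr - 0.5 * (hs - 1),
                         x_ctr + 0.5 * (ws - 1),
                         y_ctr + 0.5 * (hs - 1)))
    return anchors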


# Enlarge the given anchor by each factor in scales
def _scale_enum(anchor, scales):
    """
    :param anchor: list, anchor coordinates [xmin, ymin, xmax, ymax]
    :param scales: numpy array, factors by which the anchor's width and height are multiplied
    :return: numpy array, one anchor per scale
    """
    # Get the anchor's width, height, and center
    w, h, x_ctr, y_ctr = _whctrs(anchor)
    # Multiply the width and height by each factor in scales
    ws = w * scales
    hs = h * scales
    # Build the anchors from the new widths and heights and the shared center
    anchors = _mkanchors(ws, hs, x_ctr, y_ctr)
    return anchors


# Compute the anchor coordinates for each aspect ratio
def _ratio_enum(anchor, ratios):
    """
    :param anchor: base anchor
    :param ratios: numpy array, height-to-width ratios of the anchors
    :return: numpy array, one anchor per ratio
    """
    # Get the anchor's width, height, and center
    w, h, x_ctr, y_ctr = _whctrs(anchor)
    # Area of the anchor
    size = w * h
    # Keep the area (approximately) unchanged and compute the width and
    # height for each ratio, where ratio = height / width
    size_ratios = size / ratios
    ws = np.round(np.sqrt(size_ratios))
    hs = np.round(ws * ratios)
    # Build the anchors from the widths, heights, and center
    anchors = _mkanchors(ws, hs, x_ctr, y_ctr)
    return anchors


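def generate_anchors(base_size=16, ratios=(0.5, 1, 2),
                     scales=2 ** np.arange(3, 6)):
    """
    :param base_size: int, side length of the base anchor
    :param ratios: sequence, height-to-width ratios of the anchors
    :param scales: sequence, factors by which the anchor side lengths are enlarged
    :return: numpy array of shape (len(ratios) * len(scales), 4), the generated anchors
    """
    # Accept lists and tuples as well as numpy arrays
    ratios = np.asarray(ratios, dtype=float)
    scales = np.asarray(scales)
    # An anchor can be represented in two ways: by its top-left and
    # bottom-right corners, or by its center, width, and height.
    # Here we build the base anchor in corner form; with base_size=16:
    # base_anchor = [0, 0, 15, 15]
    base_anchor = np.array([1, 1, base_size, base_size]) - 1
    # Generate one anchor per aspect ratio
    ratio_anchors = _ratio_enum(base_anchor, ratios)
    # Enlarge each of those anchors by every factor in scales
    anchors = np.vstack([_scale_enum(ratio_anchors[i, :], scales)
                         for i in range(ratio_anchors.shape[0])])
    return anchors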


if __name__ == '__main__':
    import time

    t = time.time()
    # Generate the anchors
    a = generate_anchors(base_size=16, ratios=[0.25, 0.5, 1, 2],
                         scales=2 ** np.arange(2, 6))
    # Print the time taken to generate them
    print(time.time() - t)
    print(a)

The result is as follows:

0.00026607513427734375
[[ -56.   -8.   71.   23.]
 [-120.  -24.  135.   39.]
 [-248.  -56.  263.   71.]
 [-504. -120.  519.  135.]
 [ -38.  -16.   53.   31.]
 [ -84.  -40.   99.   55.]
 [-176.  -88.  191.  103.]
 [-360. -184.  375.  199.]
 [ -24.  -24.   39.   39.]
 [ -56.  -56.   71.   71.]
 [-120. -120.  135.  135.]
 [-248. -248.  263.  263.]
 [ -14.  -36.   29.   51.]
 [ -36.  -80.   51.   95.]
 [ -80. -168.   95.  183.]
 [-168. -344.  183.  359.]]

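For every anchor produced by the function above, we can use the backbone architecture to compute the size of its receptive field on the original image, and then compare that size with the size of the annotated boxes on the original image.

In practice, the annotated boxes on the original image are already available. We therefore analyze the sizes of these boxes and work backwards to the parameters of the anchor generation function, so that the generated anchors better match the sizes of the small objects.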
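One simple way to check how well a set of anchors matches the annotated boxes is to compare shapes only, ignoring position. The sketch below is an assumption about how such a check could look, not the author's procedure; `shape_iou`, `anchor_coverage`, and the 0.5 threshold are illustrative.

```python
def shape_iou(w1, h1, w2, h2):
    """IoU of two boxes placed at the same top-left corner (shape-only IoU)."""
    inter = min(w1, w2) * min(h1, h2)
    return inter / float(w1 * h1 + w2 * h2 - inter)


def anchor_coverage(anchor_sizes, box_sizes, iou_thresh=0.5):
    """Fraction of boxes whose best-matching anchor shape reaches iou_thresh.

    anchor_sizes: list of (width, height) of the generated anchors
    box_sizes: list of (width, height) of the annotated boxes
    """
    hits = 0
    for bw, bh in box_sizes:
        # Best shape overlap this box can get from any anchor
        best = max(shape_iou(aw, ah, bw, bh) for aw, ah in anchor_sizes)
        if best >= iou_thresh:
            hits += 1
    return hits / float(len(box_sizes))
```

With anchors of 128x128 and 256x256, a 20x20 box has a shape IoU of only about 0.02 with its best anchor, so it is not covered. Adding a 16x16 anchor raises that IoU to 0.64.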

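The anchor generation function generate_anchors given above has three adjustable parameters:

The first parameter, base_size, is the side length of the base anchor.
The second parameter, ratios=[0.5, 1, 2], keeps the area approximately unchanged and reshapes the anchor so that its height-to-width ratio becomes 1:2, 1:1, and 2:1, yielding a new set of anchors, as shown in the figure below: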

The third parameter, scales=2 ** np.arange(3, 6), enlarges the side lengths of each anchor by a factor of [8, 16, 32], yielding a new set of anchors, as shown in the figure below:
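The combined effect of the three parameters can be tabulated with a few lines of plain Python. This sketch mirrors the arithmetic of `_ratio_enum` and `_scale_enum` (round the width, derive the height from it, then multiply by the scale); the helper name `anchor_sizes` is hypothetical.

```python
import math


def anchor_sizes(base_size=16, ratios=(0.5, 1, 2), scales=(8, 16, 32)):
    """Return the (width, height) of every anchor, in generation order."""
    sizes = []
    area = base_size * base_size
    for ratio in ratios:
        # Same rounding order as _ratio_enum: width first, then height
        w = round(math.sqrt(area / ratio))
        h = round(w * ratio)
        for scale in scales:
            sizes.append((w * scale, h * scale))
    return sizes


print(anchor_sizes())
# [(184, 96), (368, 192), (736, 384), (128, 128), (256, 256), (512, 512),
#  (88, 176), (176, 352), (352, 704)]
```

With the default parameters the smallest anchor is 88x176 pixels, far larger than a 32x32 object, which is why smaller scales (or a smaller base_size) are needed for small objects.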

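Modifying the Number of Anchors

From the relationship between anchor sizes and predicted boxes described above, it follows that designing the anchor sizes around the object sizes of the actual application improves, to some extent, the classification and localization accuracy for small objects.

In practice, however, we also meet datasets in which object sizes vary over a wide range and most objects are small. In that case, changing the anchor sizes through the parameter values alone may not be enough to improve classification and localization for all objects. We also need to increase the number of anchors appropriately, so that anchors at more scales can match objects of different sizes.

As the anchor generation function above shows, changing how many values anchor ratio or anchor scale contains produces more anchors. During prediction, this means more candidate boxes of different sizes are generated to match more objects of different sizes.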
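The anchor count grows multiplicatively, which is worth keeping in mind when adding values. The sketch below simply counts anchors; the function name `anchor_counts` and the 38x50 feature-map size are hypothetical examples.

```python
def anchor_counts(ratios, scales, feat_h, feat_w):
    """Return (anchors per feature-map cell, anchors per image)."""
    # generate_anchors emits one anchor for every (ratio, scale) pair
    per_cell = len(ratios) * len(scales)
    # The same set is repeated at every cell of the feature map
    return per_cell, per_cell * feat_h * feat_w
```

On a 38x50 feature map, 3 ratios and 3 scales give 9 anchors per cell and 17100 per image, while 4 ratios and 4 scales give 16 per cell and 30400 per image.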

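Below we introduce a general-purpose method used in practice, and show in detail how to modify the size and number of anchors on different datasets so that the anchor mechanism produces anchors that better fit the actual object sizes.

Step 1: Parse and read the object annotations, then compute and collect the coordinates of the target objects. The code is as follows: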

import os
import tqdm
import xml.etree.ElementTree as ET
import config  # project-local settings module; must define PLANE_CUT_DATASET (dataset root)


def convert_annotation(year, classes, image_name):
    """
    :param year: str, dataset version (e.g. '2012' for VOC2012)
    :param classes: list, class names of the dataset
    :param image_name: str, name of the annotated image
    :return: list of tuples (xmin, ymin, xmax, ymax, cls_id)
    """
    # Build the path of the annotation file that belongs to this image
    xml_path = os.path.join(config.PLANE_CUT_DATASET,
                            'VOC%s/Annotations/%s.xml' % (year, image_name))
    # Parse the XML with the standard-library ElementTree module
    tree = ET.parse(xml_path)
    # Walk through every object annotation under the root
    b = []
    root = tree.getroot()
    for obj in root.iter('object'):
        # Whether the object is flagged as difficult to detect
        difficult = obj.find('difficult').text
        # Class name; skip objects outside the class list and difficult ones
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        # Look up the class id
        cls_id = classes.index(cls)
        # Read the bounding box
        bbox = obj.find('bndbox')

        box_info = (int(bbox.find('xmin').text),
                    int(bbox.find('ymin').text),
                    int(bbox.find('xmax').text),
                    int(bbox.find('ymax').text), cls_id)
        # Skip the empty background annotations produced by image cropping
        if box_info == (1, 1, 1, 1, 0):
            continue
        else:
            b.append(box_info)
    return b


def main():
    # Dataset version, splits, and classes
    sets = [('2012', 'train'), ('2012', 'val')]
    classes = ["plane"]
    wd = os.getcwd()
    # Generate one annotation summary file per split in `sets`
    for year, image_set in sets:
        # File that lists the image names of this split
        temp_path = 'VOC%s/ImageSets/Main/%s.txt' % (year, image_set)
        # Read the image names
        with open(os.path.join(config.PLANE_CUT_DATASET, temp_path)) as fp:
            image_names = fp.read().strip().split()
        # Open the annotation summary file for writing
        with open('%s_%s.txt' % (year, image_set), 'w') as info_fp:
            # Write the annotations of every image
            for image_name in tqdm.tqdm(image_names):
                # Parse the annotation file of this image
                idx_info = convert_annotation(year, classes, image_name)
                if idx_info:
                    info_fp.write('%s/VOCdevkit/VOC%s/JPEGImages/%s.jpg'
                                  % (wd, year, image_name))
                    # Append each box as "xmin,ymin,xmax,ymax,cls_id"
                    for box_info in idx_info:
                        info_fp.write(" " + ",".join([str(a) for a in box_info]))
                    info_fp.write('\n')


if __name__ == '__main__':
    main()

Step 2: Use k-means to cluster the collected object statistics and obtain the center of each cluster. The code is as follows:

第二步：使用 Kmeans 方法對目標(biāo)物體的統(tǒng)計數(shù)據(jù)進行聚類，得到每一類的中心位置信息，代碼如下：

import numpy as np


class KMEANS(object):

    def __init__(self, cluster_number, filename):
        self.cluster_number = cluster_number
        # Use the file name passed in rather than a hard-coded one
        self.filename = filename

    # Compute the IoU matrix between every annotated box and every cluster center
    def iou(self, boxes, clusters):
        """
        :param boxes: numpy array, each row is the (width, height) of one annotated box
        :param clusters: numpy array with one row per cluster center, initially rows picked at random from boxes
        :return: numpy array of shape (n, k), IoU of every box with every cluster center
        """
        # Number of annotated boxes and number of cluster centers
        n = boxes.shape[0]
        k = self.cluster_number
        # Box areas, each repeated k times and arranged as an (n, k) array
        box_area = boxes[:, 0] * boxes[:, 1]
        box_area = box_area.repeat(k)
        box_area = np.reshape(box_area, (n, k))
        # Cluster-center areas, tiled n times and arranged as an (n, k) array
        cluster_area = clusters[:, 0] * clusters[:, 1]
        cluster_area = np.tile(cluster_area, [1, n])
        cluster_area = np.reshape(cluster_area, (n, k))
        # Build the box-width matrix and the cluster-width matrix
        box_w_matrix = np.reshape(boxes[:, 0].repeat(k), (n, k))
        cluster_w_matrix = np.reshape(np.tile(clusters[:, 0], (1, n)), (n, k))
        # Element-wise minimum of the two width matrices
        min_w_matrix = np.minimum(cluster_w_matrix, box_w_matrix)
        # Build the box-height matrix and the cluster-height matrix
        box_h_matrix = np.reshape(boxes[:, 1].repeat(k), (n, k))
        cluster_h_matrix = np.reshape(np.tile(clusters[:, 1], (1, n)), (n, k))
        # Element-wise minimum of the two height matrices
        min_h_matrix = np.minimum(cluster_h_matrix, box_h_matrix)
        # Element-wise product: intersection area of each box with each
        # cluster center (both treated as sharing the same top-left corner)
        inter_area = np.multiply(min_w_matrix, min_h_matrix)
        # IoU of each box with each cluster center
        result = inter_area / (box_area + cluster_area - inter_area)
        return result

    # Compute the mean IoU
    def avg_iou(self, boxes, clusters):
        """
        :param boxes: numpy array, each row is the (width, height) of one annotated box
        :param clusters: numpy array, each row is one cluster center
        :return: mean over all boxes of the IoU with their best-matching cluster center
        """
        accuracy = np.mean([np.max(self.iou(boxes, clusters), axis=1)])
        return accuracy

    # Run k-means
    def kmeans(self, boxes, k, dist=np.median):
        """
        :param boxes: numpy array, each row is the (width, height) of one annotated box
        :param k: number of clusters
        :param dist: function used to recompute a cluster center from its members
        :return: cluster centers
        """
        # Number of annotated boxes
        box_number = boxes.shape[0]
        # Allocate an uninitialized array of shape (number of boxes, number of clusters)
        distances = np.empty((box_number, k))
        # Cluster assignment of the previous iteration, initialized to zeros
        last_nearest = np.zeros((box_number,))
        np.random.seed()
        # Pick k distinct boxes at random as the initial cluster centers
        clusters = boxes[np.random.choice(
            box_number, k, replace=False)]  # init k clusters
        while True:
            # "Distance" of every box to every center: 1 - IoU
            distances = 1 - self.iou(boxes, clusters)
            # Converged when the assignment no longer changes
            current_nearest = np.argmin(distances, axis=1)

            if (last_nearest == current_nearest).all():
                break
            # Recompute the cluster centers
            for cluster in range(k):
                members = boxes[current_nearest == cluster]
                # Keep the previous center if no box was assigned to this cluster
                if len(members) > 0:
                    clusters[cluster] = dist(members, axis=0)  # update clusters
            last_nearest = current_nearest
        return clusters

    # Write the anchors computed by k-means to a txt file
    def result_txt(self, data):
        """
        :param data: cluster centers
        :return:
        """
        with open("yolo_anchors.txt", 'w') as f:
            row = np.shape(data)[0]
            for i in range(row):
                if i == 0:
                    x_y = "%d,%d" % (data[i][0], data[i][1])
                else:
                    x_y = ", %d,%d" % (data[i][0], data[i][1])
                f.write(x_y)

    # Read the width and height of every annotated box
    def get_box_info(self):
        box_info_list = []
        # Open the annotation summary file in read-only mode
        with open(self.filename, 'r') as fp:
            # Each line holds one image path followed by its boxes
            for line in fp:
                # Fields are separated by spaces; field 0 is the image path
                infos = line.split(" ")
                length = len(infos)
                # Compute the width and height of every box on this line
                for i in range(1, length):
                    width = int(infos[i].split(",")[2]) - \
                            int(infos[i].split(",")[0])
                    height = int(infos[i].split(",")[3]) - \
                             int(infos[i].split(",")[1])
                    box_info_list.append([width, height])
        # Convert the list to a numpy array and return it
        result = np.array(box_info_list)
        return result

    # Run k-means and report the clustering result
    def get_clusters(self):
        # Width and height of every box in the annotation summary file
        all_boxes = self.get_box_info()
        # Cluster the box sizes to obtain the cluster centers
        result = self.kmeans(all_boxes, k=self.cluster_number)
        # Sort the cluster centers by the first column (width)
        result = result[np.lexsort(result.T[0, None])]
        # Save the cluster centers to a txt file
        self.result_txt(result)
        print("value of {} anchors:\n {}".format(self.cluster_number, result))
        print("Accuracy: {:.2f}%".format(
            self.avg_iou(all_boxes, result) * 100))


if __name__ == "__main__":
    # Number of cluster centers and the training data file
    cluster_number = 9
    filename = "2012_train.txt"
    # Create the KMEANS object
    kmeans = KMEANS(cluster_number, filename)
    # Compute the cluster centers and the mean IoU between centers and boxes
    kmeans.get_clusters()

Step 3: Modify the parameters of the anchor generation mechanism according to the cluster centers.
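The article does not give code for this step. One possible way to read it, sketched below under stated assumptions, is to express each cluster center (width, height) as the ratio and scale that generate_anchors would need in order to reproduce it. The helper name `center_to_ratio_scale` is hypothetical; ratio follows the height / width convention of `_ratio_enum`, and rounding is ignored.

```python
import math


def center_to_ratio_scale(w, h, base_size=16):
    """Express one cluster center as (ratio, scale) for generate_anchors.

    ratio = height / width, matching _ratio_enum
    scale = side-length multiplier relative to a base_size x base_size anchor
    """
    ratio = h / float(w)
    # An anchor with this ratio and scale has area (base_size * scale) ** 2
    scale = math.sqrt(w * h) / base_size
    return ratio, scale


# Hypothetical cluster centers in pixels, for illustration only
centers = [(32, 32), (64, 32)]
params = [center_to_ratio_scale(w, h) for w, h in centers]
```

Note that generate_anchors combines every ratio with every scale, so passing K ratios and K scales yields up to K * K anchors rather than the K cluster centers. Choosing a few representative ratios and scales that span the computed values is a common compromise.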

