
Smart Object Detection: Building a YOLOv4 Object Detection Platform with the MobileNet Family (v1, v2, v3) in PyTorch

2022-02-28

Preface

Let's take a look at how to build a YOLOv4 object detection platform on top of the MobileNet family.

Source Code Download

https://github.com/bubbliiiing/mobilenet-yolov4-pytorch (if you like it, feel free to give it a star).

Implementation Approach for the Backbone Replacement

1. Network structure breakdown and replacement strategy

For YoloV4, the whole network can be divided into three parts:

1. The backbone feature-extraction network (Backbone), which in the original diagram is CSPdarknet53.
2. The enhanced feature-extraction network, which in the diagram is SPP plus PANet.
3. The prediction head (YoloHead), which uses the extracted features to make predictions.

Of these: the first part, the backbone, performs the initial feature extraction; from it we obtain three preliminary effective feature layers. The second part, the enhanced feature-extraction network, fuses those three preliminary feature layers to extract better features, yielding three more effective feature layers. The third part, the prediction head, turns those refined feature layers into prediction results.

Of these three parts, the first and second are the easier ones to modify. There is little to change in the third part, since it is only a combination of 3x3 and 1x1 convolutions.

The MobileNet family was designed for classification, and its backbone exists to extract features. We can therefore use a MobileNet network in place of CSPdarknet53 in YoloV4: as long as we feed feature layers with the same shapes as the three preliminary effective feature layers into the enhanced feature extraction, the MobileNet family drops straight into YoloV4.

2. The MobileNet family

This article uses three backbone feature-extraction networks: MobileNetV1, MobileNetV2, and MobileNetV3.

a. MobileNetV1

MobileNet is a lightweight deep neural network proposed by Google for mobile phones and other embedded devices. Its core idea is the depthwise separable convolution block.

Take a single convolution layer as an example: suppose a 3x3 convolution has 16 input channels and 32 output channels. Concretely, 32 convolution kernels of size 3x3 sweep over all 16 input channels to produce the 32 output channels, requiring 16 x 32 x 3 x 3 = 4608 parameters.

With a depthwise separable convolution block, 16 kernels of size 3x3 each sweep over one of the 16 input channels, producing 16 feature maps. Then, before any fusion step, 32 kernels of size 1x1 sweep over those 16 feature maps, for a total of 16 x 3 x 3 + 16 x 32 x 1 x 1 = 656 parameters. Clearly, depthwise separable convolution greatly reduces the model's parameter count.
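As a quick check (my addition, not code from the original repo), these two parameter counts can be verified directly in PyTorch:

import torch.nn as nn

# Standard 3x3 convolution: 16 input channels -> 32 output channels
standard = nn.Conv2d(16, 32, kernel_size=3, padding=1, bias=False)
print(sum(p.numel() for p in standard.parameters()))  # 4608 = 16*32*3*3

# Depthwise separable version: 3x3 depthwise (groups=16) + 1x1 pointwise
depthwise = nn.Conv2d(16, 16, kernel_size=3, padding=1, groups=16, bias=False)
pointwise = nn.Conv2d(16, 32, kernel_size=1, bias=False)
separable = sum(p.numel() for p in depthwise.parameters()) + sum(p.numel() for p in pointwise.parameters())
print(separable)  # 656 = 16*3*3 + 16*32*1*1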

The original post shows a diagram of the depthwise separable convolution structure here. When building the model, setting the convolution's groups argument to the number of input channels (in_filters) implements the depthwise convolution; a 1x1 convolution then adjusts the number of channels.

Intuitively: each 3x3 kernel is only one channel deep; it slides over the input tensor one channel at a time, and each pass produces one output channel. Once the depthwise pass is done, a 1x1 convolution adjusts the depth.

Below is the structure of MobileNet, where Conv dw denotes the depthwise convolution; each one is followed by a 1x1 convolution for channel processing. The table in the original post shows the MobileNetV1-1 structure. Since I could not find PyTorch weight files for MobileNetV1-1 and only have weights for MobileNetV1-0.25, the MobileNetV1 version used in this article is MobileNetV1-0.25.

MobileNetV1-0.25 is MobileNetV1-1 with every channel count compressed to 1/4 of the original. For YoloV4, we need to take its last three effective feature layers, at three different shapes, for the enhanced feature extraction.

In the code, we extract these as out1, out2, and out3.

import torch
import torch.nn as nn


def conv_bn(inp, oup, stride=1):
    return nn.Sequential(
        nn.Conv2d(inp, oup, 3, stride, 1, bias=False),
        nn.BatchNorm2d(oup),
        nn.ReLU6(inplace=True)
    )

def conv_dw(inp, oup, stride=1):
    return nn.Sequential(
        # 3x3 depthwise convolution (groups=inp)
        nn.Conv2d(inp, inp, 3, stride, 1, groups=inp, bias=False),
        nn.BatchNorm2d(inp),
        nn.ReLU6(inplace=True),

        # 1x1 pointwise convolution to adjust channels
        nn.Conv2d(inp, oup, 1, 1, 0, bias=False),
        nn.BatchNorm2d(oup),
        nn.ReLU6(inplace=True),
    )

class MobileNetV1(nn.Module):
    def __init__(self):
        super(MobileNetV1, self).__init__()
        self.stage1 = nn.Sequential(
            # 640,640,3 -> 320,320,32
            conv_bn(3, 32, 2),
            # 320,320,32 -> 320,320,64
            conv_dw(32, 64, 1),

            # 320,320,64 -> 160,160,128
            conv_dw(64, 128, 2),
            conv_dw(128, 128, 1),

            # 160,160,128 -> 80,80,256
            conv_dw(128, 256, 2),
            conv_dw(256, 256, 1),
        )
        # 80,80,256 -> 40,40,512
        self.stage2 = nn.Sequential(
            conv_dw(256, 512, 2),
            conv_dw(512, 512, 1),
            conv_dw(512, 512, 1),
            conv_dw(512, 512, 1),
            conv_dw(512, 512, 1),
            conv_dw(512, 512, 1),
        )
        # 40,40,512 -> 20,20,1024
        self.stage3 = nn.Sequential(
            conv_dw(512, 1024, 2),
            conv_dw(1024, 1024, 1),
        )
        self.avg = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(1024, 1000)

    def forward(self, x):
        x = self.stage1(x)
        x = self.stage2(x)
        x = self.stage3(x)
        x = self.avg(x)
        x = x.view(-1, 1024)
        x = self.fc(x)
        return x

def mobilenet_v1(pretrained=False, progress=True):
    model = MobileNetV1()
    if pretrained:
        print("mobilenet_v1 has no pretrained model")
    return model

if __name__ == "__main__":
    from torchsummary import summary

    # Use device to choose whether the network runs on GPU or CPU
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = mobilenet_v1().to(device)
    summary(model, input_size=(3, 416, 416))
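As a quick sanity check (my addition), you can confirm the shapes of the three effective feature layers for a 416x416 input:

# For a 416x416 input, the three effective feature layers come out as
# (1, 256, 52, 52), (1, 512, 26, 26) and (1, 1024, 13, 13).
model = MobileNetV1()
x = torch.randn(1, 3, 416, 416)
out1 = model.stage1(x)
out2 = model.stage2(out1)
out3 = model.stage3(out2)
print(out1.shape, out2.shape, out3.shape)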

b. MobileNetV2

MobileNetV2 is the upgraded version of MobileNet. Its most important characteristic is the Inverted resblock; the entire MobileNetV2 is built from Inverted resblocks.

An Inverted resblock has two parts. The main branch first uses a 1x1 convolution to expand the channel dimension, then a 3x3 depthwise separable convolution to extract features, and finally a 1x1 convolution to project back down. The residual branch connects the input directly to the output.

The overall network structure follows (the original post shows the structure table here; each Inverted resblock performs the operations described above). For YoloV4, we again need to take the last three effective feature layers, at three different shapes, for the enhanced feature extraction.

In the code, we extract these as out1, out2, and out3.

from torch import nn
try:
    from torchvision.models.utils import load_state_dict_from_url
except ImportError:
    # Newer torchvision versions moved this helper to torch.hub
    from torch.hub import load_state_dict_from_url

model_urls = {
    'mobilenet_v2': 'https://download.pytorch.org/models/mobilenet_v2-b0353104.pth',
}


def _make_divisible(v, divisor, min_value=None):
    # Round v to the nearest multiple of divisor, never dropping below 90% of v
    if min_value is None:
        min_value = divisor
    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v

class ConvBNReLU(nn.Sequential):
    def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1):
        padding = (kernel_size - 1) // 2
        super(ConvBNReLU, self).__init__(
            nn.Conv2d(in_planes, out_planes, kernel_size, stride, padding, groups=groups, bias=False),
            nn.BatchNorm2d(out_planes),
            nn.ReLU6(inplace=True)
        )

class InvertedResidual(nn.Module):
    def __init__(self, inp, oup, stride, expand_ratio):
        super(InvertedResidual, self).__init__()
        self.stride = stride
        assert stride in [1, 2]

        hidden_dim = int(round(inp * expand_ratio))
        self.use_res_connect = self.stride == 1 and inp == oup

        layers = []
        if expand_ratio != 1:
            # 1x1 pointwise convolution to expand channels
            layers.append(ConvBNReLU(inp, hidden_dim, kernel_size=1))
        layers.extend([
            # 3x3 depthwise convolution
            ConvBNReLU(hidden_dim, hidden_dim, stride=stride, groups=hidden_dim),
            # 1x1 linear projection back down
            nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
            nn.BatchNorm2d(oup),
        ])
        self.conv = nn.Sequential(*layers)

    def forward(self, x):
        if self.use_res_connect:
            return x + self.conv(x)
        else:
            return self.conv(x)


class MobileNetV2(nn.Module):
    def __init__(self, num_classes=1000, width_mult=1.0, inverted_residual_setting=None, round_nearest=8):
        super(MobileNetV2, self).__init__()
        block = InvertedResidual
        input_channel = 32
        last_channel = 1280

        if inverted_residual_setting is None:
            inverted_residual_setting = [
                # t (expand ratio), c (channels), n (repeats), s (stride)
                [1, 16, 1, 1],
                [6, 24, 2, 2],
                [6, 32, 3, 2],
                [6, 64, 4, 2],
                [6, 96, 3, 1],
                [6, 160, 3, 2],
                [6, 320, 1, 1],
            ]

        if len(inverted_residual_setting) == 0 or len(inverted_residual_setting[0]) != 4:
            raise ValueError("inverted_residual_setting should be non-empty "
                             "or a 4-element list, got {}".format(inverted_residual_setting))

        input_channel = _make_divisible(input_channel * width_mult, round_nearest)
        self.last_channel = _make_divisible(last_channel * max(1.0, width_mult), round_nearest)
        features = [ConvBNReLU(3, input_channel, stride=2)]

        for t, c, n, s in inverted_residual_setting:
            output_channel = _make_divisible(c * width_mult, round_nearest)
            for i in range(n):
                stride = s if i == 0 else 1
                features.append(block(input_channel, output_channel, stride, expand_ratio=t))
                input_channel = output_channel

        features.append(ConvBNReLU(input_channel, self.last_channel, kernel_size=1))
        self.features = nn.Sequential(*features)

        self.classifier = nn.Sequential(
            nn.Dropout(0.2),
            nn.Linear(self.last_channel, num_classes),
        )

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out')
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.zeros_(m.bias)

    def forward(self, x):
        x = self.features(x)
        x = x.mean([2, 3])
        x = self.classifier(x)
        return x

def mobilenet_v2(pretrained=False, progress=True):
    model = MobileNetV2()
    if pretrained:
        state_dict = load_state_dict_from_url(model_urls['mobilenet_v2'], model_dir="model_data",
                                              progress=progress)
        model.load_state_dict(state_dict)

    return model

if __name__ == "__main__":
    print(mobilenet_v2())

c. MobileNetV3

MobileNetV3 uses a special bneck structure.

The bneck structure (shown in a figure in the original post) combines the following four ideas:

a. MobileNetV2's inverted residual with linear bottleneck: first expand the channel dimension with a 1x1 convolution, then perform the operations below, with a residual connection.

b. MobileNetV1's depthwise separable convolutions: after the 1x1 expansion, apply a 3x3 depthwise separable convolution.

c. A lightweight attention module, which works by re-weighting each channel.

d. h-swish instead of swish: the structure uses the h-swish activation function in place of swish, reducing computation and improving performance.

The original post shows the structure table for the whole MobileNetV3 here. How do you read the table? Go column by column: the first column, Input, gives the shape of each feature layer as it changes through MobileNetV3; the second column, Operator, gives the block each feature layer is about to pass through (you can see that feature extraction in MobileNetV3 goes through many bneck structures); the third and fourth columns give the expanded channel count inside the bneck's inverted residual and the channel count of the feature layer fed into the bneck; the fifth column, SE, indicates whether attention is applied in that layer; the sixth column, NL, gives the kind of activation function, where HS is h-swish and RE is ReLU; the seventh column, s, gives the stride used by each block.

For YoloV4, we again need to take the last three effective feature layers, at three different shapes, for the enhanced feature extraction.

In the code, we extract these as out1, out2, and out3.
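To tie the table to the implementation: each row of the table corresponds to one row of the cfgs list in the code below. As an illustration (the comments are my reading of one row that appears in the code):

# Reading one cfgs row, with columns k, t, c, SE, HS, s:
k, t, c, use_se, use_hs, s = [5, 6, 160, 1, 1, 2]
# k = 5      -> 5x5 depthwise kernel
# t = 6      -> expansion ratio (hidden_dim is roughly 6x the input channels)
# c = 160    -> output channels of the block
# use_se = 1 -> squeeze-and-excite attention enabled
# use_hs = 1 -> h-swish activation (0 would mean ReLU)
# s = 2      -> stride 2, halving the spatial size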

import math

import torch
import torch.nn as nn


def _make_divisible(v, divisor, min_value=None):
    if min_value is None:
        min_value = divisor
    new_v = max(min_value, int(v + divisor / 2) // divisor * divisor)
    # Make sure that rounding down does not go down by more than 10%.
    if new_v < 0.9 * v:
        new_v += divisor
    return new_v

class h_sigmoid(nn.Module):
    def __init__(self, inplace=True):
        super(h_sigmoid, self).__init__()
        self.relu = nn.ReLU6(inplace=inplace)

    def forward(self, x):
        return self.relu(x + 3) / 6


class h_swish(nn.Module):
    def __init__(self, inplace=True):
        super(h_swish, self).__init__()
        self.sigmoid = h_sigmoid(inplace=inplace)

    def forward(self, x):
        return x * self.sigmoid(x)


class SELayer(nn.Module):
    def __init__(self, channel, reduction=4):
        super(SELayer, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channel, _make_divisible(channel // reduction, 8)),
            nn.ReLU(inplace=True),
            nn.Linear(_make_divisible(channel // reduction, 8), channel),
            h_sigmoid()
        )

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avg_pool(x).view(b, c)
        y = self.fc(y).view(b, c, 1, 1)
        return x * y


def conv_3x3_bn(inp, oup, stride):
    return nn.Sequential(
        nn.Conv2d(inp, oup, 3, stride, 1, bias=False),
        nn.BatchNorm2d(oup),
        h_swish()
    )


def conv_1x1_bn(inp, oup):
    return nn.Sequential(
        nn.Conv2d(inp, oup, 1, 1, 0, bias=False),
        nn.BatchNorm2d(oup),
        h_swish()
    )


class InvertedResidual(nn.Module):
    def __init__(self, inp, hidden_dim, oup, kernel_size, stride, use_se, use_hs):
        super(InvertedResidual, self).__init__()
        assert stride in [1, 2]

        self.identity = stride == 1 and inp == oup

        if inp == hidden_dim:
            self.conv = nn.Sequential(
                # dw
                nn.Conv2d(hidden_dim, hidden_dim, kernel_size, stride, (kernel_size - 1) // 2, groups=hidden_dim, bias=False),
                nn.BatchNorm2d(hidden_dim),
                h_swish() if use_hs else nn.ReLU(inplace=True),
                # Squeeze-and-Excite
                SELayer(hidden_dim) if use_se else nn.Identity(),
                # pw-linear
                nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
                nn.BatchNorm2d(oup),
            )
        else:
            self.conv = nn.Sequential(
                # pw
                nn.Conv2d(inp, hidden_dim, 1, 1, 0, bias=False),
                nn.BatchNorm2d(hidden_dim),
                h_swish() if use_hs else nn.ReLU(inplace=True),
                # dw
                nn.Conv2d(hidden_dim, hidden_dim, kernel_size, stride, (kernel_size - 1) // 2, groups=hidden_dim, bias=False),
                nn.BatchNorm2d(hidden_dim),
                # Squeeze-and-Excite
                SELayer(hidden_dim) if use_se else nn.Identity(),
                h_swish() if use_hs else nn.ReLU(inplace=True),
                # pw-linear
                nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
                nn.BatchNorm2d(oup),
            )

    def forward(self, x):
        if self.identity:
            return x + self.conv(x)
        else:
            return self.conv(x)


class MobileNetV3(nn.Module):
    def __init__(self, num_classes=1000, width_mult=1.):
        super(MobileNetV3, self).__init__()
        # setting of inverted residual blocks
        self.cfgs = [
            # k,   t,   c, SE, HS, s
            [3,   1,  16, 0, 0, 1],
            [3,   4,  24, 0, 0, 2],
            [3,   3,  24, 0, 0, 1],
            [5,   3,  40, 1, 0, 2],
            [5,   3,  40, 1, 0, 1],
            [5,   3,  40, 1, 0, 1],
            [3,   6,  80, 0, 1, 2],
            [3, 2.5,  80, 0, 1, 1],
            [3, 2.3,  80, 0, 1, 1],
            [3, 2.3,  80, 0, 1, 1],
            [3,   6, 112, 1, 1, 1],
            [3,   6, 112, 1, 1, 1],
            [5,   6, 160, 1, 1, 2],
            [5,   6, 160, 1, 1, 1],
            [5,   6, 160, 1, 1, 1]
        ]

        input_channel = _make_divisible(16 * width_mult, 8)
        layers = [conv_3x3_bn(3, input_channel, 2)]

        block = InvertedResidual
        for k, t, c, use_se, use_hs, s in self.cfgs:
            output_channel = _make_divisible(c * width_mult, 8)
            exp_size = _make_divisible(input_channel * t, 8)
            layers.append(block(input_channel, exp_size, output_channel, k, s, use_se, use_hs))
            input_channel = output_channel
        self.features = nn.Sequential(*layers)

        self.conv = conv_1x1_bn(input_channel, exp_size)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        output_channel = _make_divisible(1280 * width_mult, 8) if width_mult > 1.0 else 1280
        self.classifier = nn.Sequential(
            nn.Linear(exp_size, output_channel),
            h_swish(),
            nn.Dropout(0.2),
            nn.Linear(output_channel, num_classes),
        )

        self._initialize_weights()

    def forward(self, x):
        x = self.features(x)
        x = self.conv(x)
        x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))
                if m.bias is not None:
                    m.bias.data.zero_()
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
            elif isinstance(m, nn.Linear):
                n = m.weight.size(1)
                m.weight.data.normal_(0, 0.01)
                m.bias.data.zero_()

def mobilenet_v3(pretrained=False, **kwargs):
    model = MobileNetV3(**kwargs)
    if pretrained:
        state_dict = torch.load('./model_data/mobilenetv3-large-1cd25616.pth')
        model.load_state_dict(state_dict, strict=True)
    return model


3. Integrating the feature layers into the YoloV4 network

For YoloV4, we need to use the three effective feature layers obtained from the backbone to build the enhanced feature pyramid.

Using the mobilenet_v1, mobilenet_v2, and mobilenet_v3 functions defined in the previous step, we can obtain the three effective feature layers of each MobileNet backbone.

These three effective feature layers can replace the effective feature layers of YoloV4's original CSPdarknet53 backbone.

To further reduce the parameter count, we can also replace the ordinary convolutions used in YoloV4's feature pyramid with depthwise separable convolutions.

The implementation is as follows:

import torch
import torch.nn as nn
from collections import OrderedDict
from nets.mobilenet_v1 import mobilenet_v1
from nets.mobilenet_v2 import mobilenet_v2
from nets.mobilenet_v3 import mobilenet_v3

class MobileNetV1(nn.Module):
    def __init__(self, pretrained=False):
        super(MobileNetV1, self).__init__()
        self.model = mobilenet_v1(pretrained=pretrained)

    def forward(self, x):
        out3 = self.model.stage1(x)
        out4 = self.model.stage2(out3)
        out5 = self.model.stage3(out4)
        return out3, out4, out5

class MobileNetV2(nn.Module):
    def __init__(self, pretrained=False):
        super(MobileNetV2, self).__init__()
        self.model = mobilenet_v2(pretrained=pretrained)

    def forward(self, x):
        out3 = self.model.features[:7](x)
        out4 = self.model.features[7:14](out3)
        out5 = self.model.features[14:18](out4)
        return out3, out4, out5

class MobileNetV3(nn.Module):
    def __init__(self, pretrained=False):
        super(MobileNetV3, self).__init__()
        self.model = mobilenet_v3(pretrained=pretrained)

    def forward(self, x):
        out3 = self.model.features[:7](x)
        out4 = self.model.features[7:13](out3)
        out5 = self.model.features[13:16](out4)
        return out3, out4, out5

def conv2d(filter_in, filter_out, kernel_size, groups=1, stride=1):
    pad = (kernel_size - 1) // 2 if kernel_size else 0
    return nn.Sequential(OrderedDict([
        ("conv", nn.Conv2d(filter_in, filter_out, kernel_size=kernel_size, stride=stride, padding=pad, groups=groups, bias=False)),
        ("bn", nn.BatchNorm2d(filter_out)),
        ("relu", nn.ReLU6(inplace=True)),
    ]))

def conv_dw(filter_in, filter_out, stride=1):
    return nn.Sequential(
        nn.Conv2d(filter_in, filter_in, 3, stride, 1, groups=filter_in, bias=False),
        nn.BatchNorm2d(filter_in),
        nn.ReLU6(inplace=True),

        nn.Conv2d(filter_in, filter_out, 1, 1, 0, bias=False),
        nn.BatchNorm2d(filter_out),
        nn.ReLU6(inplace=True),
    )

#---------------------------------------------------#
#   SPP structure: pool with kernels of different
#   sizes, then stack the pooled results
#---------------------------------------------------#
class SpatialPyramidPooling(nn.Module):
    def __init__(self, pool_sizes=[5, 9, 13]):
        super(SpatialPyramidPooling, self).__init__()

        self.maxpools = nn.ModuleList([nn.MaxPool2d(pool_size, 1, pool_size // 2) for pool_size in pool_sizes])

    def forward(self, x):
        features = [maxpool(x) for maxpool in self.maxpools[::-1]]
        features = torch.cat(features + [x], dim=1)

        return features

#---------------------------------------------------#
#   Convolution + upsampling
#---------------------------------------------------#
class Upsample(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(Upsample, self).__init__()

        self.upsample = nn.Sequential(
            conv2d(in_channels, out_channels, 1),
            nn.Upsample(scale_factor=2, mode='nearest')
        )

    def forward(self, x):
        x = self.upsample(x)
        return x

#---------------------------------------------------#
#   Three-convolution block
#---------------------------------------------------#
def make_three_conv(filters_list, in_filters):
    m = nn.Sequential(
        conv2d(in_filters, filters_list[0], 1),
        conv_dw(filters_list[0], filters_list[1]),
        conv2d(filters_list[1], filters_list[0], 1),
    )
    return m

#---------------------------------------------------#
#   Five-convolution block
#---------------------------------------------------#
def make_five_conv(filters_list, in_filters):
    m = nn.Sequential(
        conv2d(in_filters, filters_list[0], 1),
        conv_dw(filters_list[0], filters_list[1]),
        conv2d(filters_list[1], filters_list[0], 1),
        conv_dw(filters_list[0], filters_list[1]),
        conv2d(filters_list[1], filters_list[0], 1),
    )
    return m

#---------------------------------------------------#
#   Final YoloV4 output heads
#---------------------------------------------------#
def yolo_head(filters_list, in_filters):
    m = nn.Sequential(
        conv_dw(in_filters, filters_list[0]),

        nn.Conv2d(filters_list[0], filters_list[1], 1),
    )
    return m

#---------------------------------------------------#
#   yolo_body
#---------------------------------------------------#
class YoloBody(nn.Module):
    def __init__(self, num_anchors, num_classes, backbone="mobilenetv2", pretrained=False):
        super(YoloBody, self).__init__()
        #  backbone
        if backbone == "mobilenetv1":
            self.backbone = MobileNetV1(pretrained=pretrained)
            alpha = 1
            in_filters = [256, 512, 1024]
        elif backbone == "mobilenetv2":
            self.backbone = MobileNetV2(pretrained=pretrained)
            alpha = 1
            in_filters = [32, 96, 320]
        elif backbone == "mobilenetv3":
            self.backbone = MobileNetV3(pretrained=pretrained)
            alpha = 1
            in_filters = [40, 112, 160]
        else:
            raise ValueError('Unsupported backbone - `{}`, Use mobilenetv1, mobilenetv2, mobilenetv3.'.format(backbone))

        self.conv1           = make_three_conv([int(512 * alpha), int(1024 * alpha)], in_filters[2])
        self.SPP             = SpatialPyramidPooling()
        self.conv2           = make_three_conv([int(512 * alpha), int(1024 * alpha)], int(2048 * alpha))

        self.upsample1       = Upsample(int(512 * alpha), int(256 * alpha))
        self.conv_for_P4     = conv2d(in_filters[1], int(256 * alpha), 1)
        self.make_five_conv1 = make_five_conv([int(256 * alpha), int(512 * alpha)], int(512 * alpha))

        self.upsample2       = Upsample(int(256 * alpha), int(128 * alpha))
        self.conv_for_P3     = conv2d(in_filters[0], int(128 * alpha), 1)
        self.make_five_conv2 = make_five_conv([int(128 * alpha), int(256 * alpha)], int(256 * alpha))
        # 3*(5+num_classes) = 3*(5+20) = 3*(4+1+20) = 75
        final_out_filter2    = num_anchors * (5 + num_classes)
        self.yolo_head3      = yolo_head([int(256 * alpha), final_out_filter2], int(128 * alpha))

        self.down_sample1    = conv_dw(int(128 * alpha), int(256 * alpha), stride=2)
        self.make_five_conv3 = make_five_conv([int(256 * alpha), int(512 * alpha)], int(512 * alpha))
        final_out_filter1    = num_anchors * (5 + num_classes)
        self.yolo_head2      = yolo_head([int(512 * alpha), final_out_filter1], int(256 * alpha))

        self.down_sample2    = conv_dw(int(256 * alpha), int(512 * alpha), stride=2)
        self.make_five_conv4 = make_five_conv([int(512 * alpha), int(1024 * alpha)], int(1024 * alpha))
        final_out_filter0    = num_anchors * (5 + num_classes)
        self.yolo_head1      = yolo_head([int(1024 * alpha), final_out_filter0], int(512 * alpha))


    def forward(self, x):
        #  backbone
        x2, x1, x0 = self.backbone(x)

        P5 = self.conv1(x0)
        P5 = self.SPP(P5)
        P5 = self.conv2(P5)

        P5_upsample = self.upsample1(P5)
        P4 = self.conv_for_P4(x1)
        P4 = torch.cat([P4, P5_upsample], dim=1)
        P4 = self.make_five_conv1(P4)

        P4_upsample = self.upsample2(P4)
        P3 = self.conv_for_P3(x2)
        P3 = torch.cat([P3, P4_upsample], dim=1)
        P3 = self.make_five_conv2(P3)

        P3_downsample = self.down_sample1(P3)
        P4 = torch.cat([P3_downsample, P4], dim=1)
        P4 = self.make_five_conv3(P4)

        P4_downsample = self.down_sample2(P4)
        P5 = torch.cat([P4_downsample, P5], dim=1)
        P5 = self.make_five_conv4(P5)

        out2 = self.yolo_head3(P3)
        out1 = self.yolo_head2(P4)
        out0 = self.yolo_head1(P5)

        return out0, out1, out2
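A minimal usage sketch (my addition; assuming VOC's 20 classes and the usual 3 anchors per scale, run from the repo root so the nets imports resolve):

model = YoloBody(num_anchors=3, num_classes=20, backbone="mobilenetv2")
x = torch.randn(1, 3, 416, 416)
out0, out1, out2 = model(x)
print(out0.shape)  # torch.Size([1, 75, 13, 13]); 75 = 3 * (5 + 20)
print(out1.shape)  # torch.Size([1, 75, 26, 26])
print(out2.shape)  # torch.Size([1, 75, 52, 52])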

Training Your Own YoloV4 Model

First, go to GitHub and download the repository. After downloading, unzip it and open the folder with your IDE. Note that the opened root directory must be correct; if the relative paths are wrong, the code will not run. Make sure the root directory you open is the directory where the files are stored.

1. Preparing the dataset

This article trains in VOC format. You need to prepare your own dataset before training; if you don't have one, you can download the VOC12+07 dataset via the GitHub link and try it out. Before training, place the label files in the Annotation folder under VOCdevkit/VOC2007, and the image files in the JPEGImages folder under VOCdevkit/VOC2007. The dataset layout is then done.
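The expected layout looks like this (a sketch; the folder names follow the description above):

VOCdevkit/
└── VOC2007/
    ├── Annotation/   # .xml label files
    ├── ImageSets/    # txt splits generated by voc_annotation.py
    └── JPEGImages/   # .jpg image files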

2. Processing the dataset

After arranging the dataset, we need to process it to obtain the 2007_train.txt and 2007_val.txt files used for training, using voc_annotation.py in the root directory.

voc_annotation.py has a few parameters to set: annotation_mode, classes_path, trainval_percent, train_percent, and VOCdevkit_path. For a first training run, you can modify only classes_path.

'''
annotation_mode specifies what this script computes when run.
annotation_mode = 0 runs the whole labeling pipeline: it generates the txt files in
VOCdevkit/VOC2007/ImageSets as well as 2007_train.txt and 2007_val.txt for training.
annotation_mode = 1 only generates the txt files in VOCdevkit/VOC2007/ImageSets.
annotation_mode = 2 only generates 2007_train.txt and 2007_val.txt for training.
'''
annotation_mode     = 0
'''
Must be modified: used to generate the object information in 2007_train.txt and 2007_val.txt.
It just has to match the classes_path used for training and prediction.
If the generated 2007_train.txt contains no object information,
it is because the classes were not set correctly.
Only takes effect when annotation_mode is 0 or 2.
'''
classes_path        = 'model_data/voc_classes.txt'
'''
trainval_percent sets the ratio of (training set + validation set) to test set; by default (train + val) : test = 9 : 1.
train_percent sets the ratio of training set to validation set within (training set + validation set); by default train : val = 9 : 1.
Only takes effect when annotation_mode is 0 or 1.
'''
trainval_percent    = 0.9
train_percent       = 0.9
'''
Points to the folder containing the VOC dataset.
By default it points to the VOC dataset in the root directory.
'''
VOCdevkit_path  = 'VOCdevkit'

classes_path points to the txt listing the detection classes. For the VOC dataset, the txt we use is voc_classes.txt. When training on your own dataset, create your own cls_classes.txt containing the classes you want to distinguish.
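As an illustration (these class names are placeholders), a cls_classes.txt is simply one class name per line:

cat
dog
person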

3. Starting network training

With 2007_train.txt and 2007_val.txt generated by voc_annotation.py, we can start training. There are quite a few training parameters; after downloading the repository, read the comments carefully. The most important one is still classes_path in train.py.

classes_path points to the txt of detection classes, and it must be the same txt as in voc_annotation.py! It must be modified when training on your own dataset! After modifying classes_path, you can run train.py to start training; after several epochs, the weights are written to the logs folder.

In addition, the backbone parameter selects the backbone feature-extraction network: mobilenetv1, mobilenetv2, or mobilenetv3.

Before training, make sure the MobileNet version you choose matches the pretrained weights you use.

The other parameters work as follows:

#-------------------------------#
#   Whether to use CUDA
#   Set to False if you have no GPU
#-------------------------------#
Cuda = True
#--------------------------------------------------------#
#   Be sure to modify classes_path before training so
#   that it matches your own dataset
#--------------------------------------------------------#
classes_path    = 'model_data/voc_classes.txt'
#---------------------------------------------------------------------#
#   anchors_path is the txt file of anchor boxes; usually left unchanged.
#   anchors_mask helps the code find the matching anchors; usually left unchanged.
#---------------------------------------------------------------------#
anchors_path    = 'model_data/yolo_anchors.txt'
anchors_mask    = [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
#------------------------------------------------------------------------------------------------------#
#   For weight files, see the README; download from Baidu Netdisk. Pretrained weights are
#   generic across datasets because the features are generic.
#   Pretrained weights are necessary in 99% of cases; without them the weights are too random,
#   feature extraction is ineffective, and training results will be poor.
#   Dimension-mismatch warnings when training on your own dataset are normal: the predictions
#   differ, so the dimensions naturally differ.
#   To resume from a checkpoint, set model_path to an already-trained weight file in the logs folder.
#------------------------------------------------------------------------------------------------------#
model_path      = 'model_data/yolov4_mobilenet_v1_voc.pth'
#------------------------------------------------------#
#   Input shape; must be a multiple of 32
#------------------------------------------------------#
input_shape     = [416, 416]
#-------------------------------#
#   Backbone feature-extraction network:
#   mobilenetv1
#   mobilenetv2
#   mobilenetv3
#   ghostnet
#-------------------------------#
backbone        = "mobilenetv1"
#----------------------------------#
#   Whether to use the backbone's pretrained weights.
#   This covers only the backbone and is independent of model_path.
#----------------------------------#
pretrained      = False
#------------------------------------------------------#
#   YoloV4 tricks:
#   mosaic           mosaic data augmentation, True or False
#                    (in practice mosaic augmentation is unstable, so it defaults to False)
#   Cosine_scheduler cosine-annealing learning rate, True or False
#   label_smoothing  label smoothing, usually below 0.01, e.g. 0.01 or 0.005
#------------------------------------------------------#
mosaic              = False
Cosine_lr           = False
label_smoothing     = 0

4. Predicting with the trained model

Prediction uses two files: yolo.py and predict.py. First open yolo.py and modify model_path and classes_path; these two parameters must be changed.

Again, the backbone parameter selects the backbone feature-extraction network: mobilenetv1, mobilenetv2, or mobilenetv3.

model_path points to the trained weight file in the logs folder, and classes_path points to the txt of detection classes. After making these changes, run predict.py to detect; enter an image path when prompted.
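What the edit typically looks like (a sketch only; check the repo's yolo.py for the exact field names, and note that 'your_trained_weights.pth' is a placeholder):

_defaults = {
    "model_path"   : 'logs/your_trained_weights.pth',
    "classes_path" : 'model_data/cls_classes.txt',
    "backbone"     : "mobilenetv2",
}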

