
ResNet and Its Variants: Structure Overview and Effectiveness Analysis

2020-08-13 11:40

[Intro] In 2020, the major CV conferences again saw many works that build on ResNet, such as Res2Net, ResNeSt, IResNet, and SCNet. To better understand how the ResNet family has evolved, we revisit the ResNet series in a dedicated topic, hoping to help readers gain a deeper understanding of the ResNet lineage. This article covers ResNet, preResNet, and ResNeXt, together with their implementations.

          ResNet


• Paper: https://arxiv.org/abs/1512.03385

• Code: https://github.com/KaimingHe/deep-residual-networks

• PyTorch version: https://github.com/Cadene/pretrained-models.pytorch


Motivation and innovations
Deep learning progressed from LeNet to AlexNet, then to VGGNet and GoogLeNet, with networks growing ever deeper. Experience shows that depth matters greatly: deeper networks can extract low-, mid-, and high-level features from an image. But once a network is deep enough, simply stacking more layers on top causes problems. The first is vanishing/exploding gradients: backpropagation cannot deliver effective gradient updates to the earlier layers, so their parameters stop updating. The second is degradation: stacking too many layers makes optimization harder and increases both training error and test error, and notably this increase is not caused by overfitting.

ResNet tackles the increased training difficulty of deeper networks. It introduces the residual module, consisting of two 3×3 convolutions and a shortcut connection. The shortcut effectively mitigates gradient vanishing during backpropagation in very deep networks, so performance does not degrade as depth grows. Shortcut connections are another key idea in deep learning; beyond computer vision, they have been applied to machine translation and speech recognition/synthesis. Moreover, a ResNet with shortcuts can be viewed as an ensemble of many networks of different depths that share parameters, with the number of such networks growing exponentially in the layer count. Worth noting: earlier work had already used cross-layer connections to center responses and gradients; the Inception structure is essentially also a cross-layer connection; and Highway networks use cross-layer connections as well.


ResNet's key points:

• The residual structure lets the network go deeper, converge faster, and optimize more easily, while using fewer parameters and lower complexity than earlier models.

• ResNet relies heavily on batch normalization layers rather than Dropout.

• For very deep networks (over 50 layers), ResNet uses the more efficient bottleneck structure, which greatly reduces the parameter and computation cost.



ResNet's residual structure

To address the degradation problem, a new deep residual learning block is introduced. For a stack of layers with input x, denote the features it learns as H(x). We now ask it to learn the residual F(x) = H(x) − x instead, so the originally learned feature becomes F(x) + x. The rationale is that learning the residual is easier than learning the original mapping directly. When the residual is 0, the stacked layers perform only an identity mapping, so at worst the network's performance does not degrade. In practice the residual is not 0, so the stacked layers learn new features on top of the input and achieve better performance.


In essence, the target function H(x) is left unchanged, but the network is split into two branches: a residual mapping F(x) and an identity mapping x, so the network only needs to learn the residual. We can analyze the residual unit mathematically. The structure above can be written as:

y_l = h(x_l) + F(x_l, W_l),    x_{l+1} = f(y_l)

where x_l and x_{l+1} are the input and output of the l-th residual unit (each unit generally contains several layers), F is the residual function representing the learned residual, h(x_l) = x_l is the identity mapping, and f is the ReLU activation. From this, the feature learned from a shallow layer l up to a deep layer L is:

x_L = x_l + Σ_{i=l}^{L-1} F(x_i, W_i)

By the chain rule, the backward gradient is:

∂loss/∂x_l = ∂loss/∂x_L · ∂x_L/∂x_l = ∂loss/∂x_L · (1 + ∂/∂x_l Σ_{i=l}^{L-1} F(x_i, W_i))

The first factor, ∂loss/∂x_L, is the gradient of the loss arriving at x_L; the 1 inside the parentheses shows that the shortcut propagates the gradient losslessly, while the other, residual term must pass through layers with weights and is therefore not passed on directly. The residual gradients will not all conveniently equal −1, and even when they are small, the presence of the 1 keeps the gradient from vanishing. Residual learning is therefore easier. Note that the derivation above is not a rigorous proof.
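The additive 1 contributed by the shortcut can be checked numerically. A minimal pure-Python sketch, using a hypothetical scalar toy residual F(x) = w·x in place of a weighted layer:

```python
# Numerical check that the skip connection contributes an additive 1 to the
# gradient: for x_{l+1} = x_l + F(x_l) with a toy residual F(x) = w * x,
# the exact derivative dx_{l+1}/dx_l is 1 + w.
def residual_step(x, w):
    return x + w * x  # identity branch + residual branch

def numeric_grad(f, x, eps=1e-6):
    # central finite difference
    return (f(x + eps) - f(x - eps)) / (2 * eps)

w = 0.01  # even a tiny residual weight leaves the gradient near 1
g = numeric_grad(lambda x: residual_step(x, w), 2.0)
print(round(g, 6))  # ≈ 1 + w = 1.01
```

Without the identity branch, the same toy step x → w·x would give a gradient of w = 0.01, and stacking many such steps would shrink the gradient geometrically.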

Why does the residual structure work?

1. Adaptive depth: the degradation problem reflects how hard it is for a multi-layer network to fit an identity mapping H(x) = x. With the residual structure, fitting the identity becomes easy: learn all the residual parameters to 0 and keep only the identity shortcut. So when the network does not need all of its depth, more of its intermediate units can act as identities, and fewer otherwise. (Admittedly, layers that fit a pure identity are unlikely in practice, but point 2 below still gives them a role; for why multi-layer networks struggle to fit identity mappings, which involves signals-and-systems background, see https://www.zhihu.com/question/293243905/answer/484708047)

2. Differential amplifier: if the optimal H(x) is close to the identity mapping, the network can more easily detect the small perturbations around the identity.

3. Model ensemble: a ResNet behaves like an ensemble of many networks; deleting some blocks from a ResNet barely affects overall performance, whereas VGGNet collapses. See the NIPS paper: Residual Networks Behave Like Ensembles of Relatively Shallow Networks.

4. Mitigating gradient vanishing: differentiating a residual unit with respect to its input shows that, thanks to the shortcut, the total gradient adds 1 on top of the derivative of F(x).

Here is an intuitive picture:

As shown above, a truck full of "gradient" goods arrives on the left. Customers normally queue to collect items one by one, and if the queue is too long, the people at the back get nothing. Now one person takes the "shortcut lane", collects a share of "gradient" directly from the truck, and hands it to the people at the back, so those queuing behind receive more "gradient".


The benefit of the bottleneck (1×1 convolutions): two kinds of residual units

Let's quantify the computational advantage of 1×1 convolutions. Consider the bottleneck structure (right side of the figure above) with 256-d input features. Its parameter count is 1×1×256×64 + 3×3×64×64 + 1×1×64×256 = 69,632. With the same input/output dimensions but two 3×3 convolutions instead of the 1×1 convolutions, the count is (3×3×256×256)×2 = 1,179,648. A quick calculation shows the 1×1 bottleneck cuts the cost to about 5.9% of the original, an excellent trade.
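The arithmetic can be reproduced directly:

```python
# Parameter counts from the text: a 256-d bottleneck (1x1 -> 3x3 -> 1x1)
# versus two plain 3x3 convolutions with the same input/output width
# (biases ignored, as in the paper).
bottleneck = 1 * 1 * 256 * 64 + 3 * 3 * 64 * 64 + 1 * 1 * 64 * 256
plain = (3 * 3 * 256 * 256) * 2
ratio = bottleneck / plain
print(bottleneck, plain, round(ratio * 100, 1))  # 69632 1179648 5.9
```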


For details see: [Fundamentals] What are 1×1 convolutions actually good for?


ResNet's network architecture design:



ResNet implementation in PyTorch


import torch.nn as nn
import torch.nn.functional as F


def conv3x3(in_planes, out_planes, stride=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=1, bias=False)


class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
        self.bn3 = nn.BatchNorm2d(planes * 4)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)

        return out


class ResNet(nn.Module):

    def __init__(self, block, layers, num_classes, grayscale):
        self.inplanes = 64
        if grayscale:
            in_dim = 1
        else:
            in_dim = 3
        super(ResNet, self).__init__()
        self.conv1 = nn.Conv2d(in_dim, 64, kernel_size=7, stride=2, padding=3,
                               bias=False)
        self.bn1 = nn.BatchNorm2d(64)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
        self.avgpool = nn.AvgPool2d(7, stride=1, padding=2)
        # fc input is 2048 (= 512 * block.expansion): this variant appears to
        # be sized for 28x28 inputs, where layer4's output is 1x1 spatially
        # and the avgpool below is skipped
        self.fc = nn.Linear(2048, num_classes)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                # He initialization, as in the original paper
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, (2. / n) ** .5)
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

    def _make_layer(self, block, planes, blocks, stride=1):
        downsample = None
        if stride != 1 or self.inplanes != planes * block.expansion:
            # 1x1 conv on the shortcut to match shape when dimensions change
            downsample = nn.Sequential(
                nn.Conv2d(self.inplanes, planes * block.expansion,
                          kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(planes * block.expansion),
            )

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample))
        self.inplanes = planes * block.expansion
        for i in range(1, blocks):
            layers.append(block(self.inplanes, planes))

        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        # x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        logits = self.fc(x)
        probas = F.softmax(logits, dim=1)
        return logits, probas


def resnet101(num_classes, grayscale):
    """Constructs a ResNet-101 model."""
    model = ResNet(block=Bottleneck,
                   layers=[3, 4, 23, 3],
                   num_classes=num_classes,
                   grayscale=grayscale)
    return model
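As a sanity check on the name, the layer count implied by `layers=[3, 4, 23, 3]` can be tallied by hand: one stem conv, three convs per bottleneck block, and the final fully connected layer.

```python
# ResNet-101 layer tally: stem conv + 3 convs per Bottleneck + final fc
# (the 1x1 downsampling convs on shortcuts are not counted, by convention).
layers = [3, 4, 23, 3]
depth = 1 + 3 * sum(layers) + 1
print(depth)  # 101
```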


preResNet


• Paper: https://arxiv.org/abs/1603.05027

• Code: https://github.com/KaimingHe/resnet-1k-layers


Main ideas and improvements
This paper analyzes the propagation formulas behind the residual module, with the goal of creating a "direct" path for information propagation, not only within a residual unit but through the whole network. A series of experiments shows that when identity mappings are used as the skip connections and the activation is placed after BN in the pre-activation arrangement, forward and backward signals can propagate directly from any block to any other block. Ablation studies confirm the importance of these identity mappings, which motivates a new residual unit that trains more easily and generalizes better. We call it preResNet; it mainly reorders the layers inside the residual module, comparing different placements of ReLU and BN. Compared with the classic residual module (a): (b) shares BN on the addition path, which interferes more with shortcut propagation, making the network harder to train and worse-performing; (c) moves ReLU directly after BN, which makes the branch output always non-negative and reduces the network's representational capacity; (d) moves ReLU earlier, fixing (c)'s non-negativity issue, but that ReLU no longer benefits from BN; (e) moves both ReLU and BN earlier, fixing (d)'s problem. The shortcut in preResNet's arrangement (e) passes information more directly and therefore outperforms ResNet.


PreResNet implementation in PyTorch


import torch.nn as nn


__all__ = ['preresnet20', 'preresnet32', 'preresnet44',
           'preresnet56', 'preresnet110', 'preresnet1202']


def conv3x3(in_planes, out_planes, stride=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=1, bias=False)


class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
        # pre-activation order: BN -> ReLU -> conv
        self.bn_1 = nn.BatchNorm2d(inplanes)
        self.relu = nn.ReLU(inplace=True)
        self.conv_1 = conv3x3(inplanes, planes, stride)
        self.bn_2 = nn.BatchNorm2d(planes)
        self.conv_2 = conv3x3(planes, planes)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.bn_1(x)
        out = self.relu(out)
        out = self.conv_1(out)

        out = self.bn_2(out)
        out = self.relu(out)
        out = self.conv_2(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual  # no ReLU after the addition in pre-activation units

        return out


class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        self.bn_1 = nn.BatchNorm2d(inplanes)
        self.conv_1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
        self.bn_2 = nn.BatchNorm2d(planes)
        self.conv_2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride,
                                padding=1, bias=False)
        self.bn_3 = nn.BatchNorm2d(planes)
        self.conv_3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.bn_1(x)
        out = self.relu(out)
        out = self.conv_1(out)

        out = self.bn_2(out)
        out = self.relu(out)
        out = self.conv_2(out)

        out = self.bn_3(out)
        out = self.relu(out)
        out = self.conv_3(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual

        return out


class PreResNet(nn.Module):

    def __init__(self, depth, num_classes=1000, block_name='BasicBlock'):
        super(PreResNet, self).__init__()
        # Model type specifies number of layers for CIFAR-10 model
        if block_name.lower() == 'basicblock':
            assert (depth - 2) % 6 == 0, \
                "When using BasicBlock, depth should be 6n+2, e.g. 20, 32, 44, 56, 110, 1202"
            n = (depth - 2) // 6
            block = BasicBlock
        elif block_name.lower() == 'bottleneck':
            assert (depth - 2) % 9 == 0, \
                "When using Bottleneck, depth should be 9n+2, e.g. 20, 29, 47, 56, 110, 1199"
            n = (depth - 2) // 9
            block = Bottleneck
        else:
            raise ValueError('block_name should be BasicBlock or Bottleneck')

        self.inplanes = 16
        self.conv_1 = nn.Conv2d(3, 16, kernel_size=3, padding=1,
                                bias=False)
        self.layer1 = self._make_layer(block, 16, n)
        self.layer2 = self._make_layer(block, 32, n, stride=2)
        self.layer3 = self._make_layer(block, 64, n, stride=2)
        self.bn = nn.BatchNorm2d(64 * block.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.avgpool = nn.AvgPool2d(8)
        self.fc = nn.Linear(64 * block.expansion, num_classes)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight.data)
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

    def _make_layer(self, block, planes, blocks, stride=1):
        downsample = None
        if stride != 1 or self.inplanes != planes * block.expansion:
            # plain 1x1 conv shortcut; no BN here, since the blocks pre-activate
            downsample = nn.Sequential(
                nn.Conv2d(self.inplanes, planes * block.expansion,
                          kernel_size=1, stride=stride, bias=False))

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample))
        self.inplanes = planes * block.expansion
        for _ in range(1, blocks):
            layers.append(block(self.inplanes, planes))

        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.conv_1(x)  # 32x32

        x = self.layer1(x)  # 32x32
        x = self.layer2(x)  # 16x16
        x = self.layer3(x)  # 8x8
        x = self.bn(x)
        x = self.relu(x)

        x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)

        return x


def preresnet20(num_classes):
    return PreResNet(depth=20, num_classes=num_classes)


def preresnet32(num_classes):
    return PreResNet(depth=32, num_classes=num_classes)


def preresnet44(num_classes):
    return PreResNet(depth=44, num_classes=num_classes)


def preresnet56(num_classes):
    return PreResNet(depth=56, num_classes=num_classes)


def preresnet110(num_classes):
    return PreResNet(depth=110, num_classes=num_classes)


def preresnet1202(num_classes):
    return PreResNet(depth=1202, num_classes=num_classes)
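The depth arguments used by these constructors all satisfy the 6n+2 rule asserted in `PreResNet.__init__`: three stages of n two-conv BasicBlocks, plus the stem conv and the fc layer.

```python
# Check that every constructor depth above is of the form 6n + 2:
# 3 stages * (2 convs per BasicBlock * n blocks) + stem conv + fc.
for depth in [20, 32, 44, 56, 110, 1202]:
    n = (depth - 2) // 6
    assert depth == 3 * (2 * n) + 2, depth
print("all depths are 6n + 2")
```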



ResNeXt


• Paper: https://arxiv.org/abs/1611.05431


Paper ideas and main improvements
Traditional approaches improve performance by making networks deeper or wider, at the cost of more computation. ResNeXt aims to improve accuracy without changing model complexity. Inspired by the lean and efficient Inception module, the paper proposes a simple, highly modular image-classification architecture. The network is built by aggregating identical modules: it borrows Inception's "split-transform-merge" strategy, but builds the multi-branch structure from branches with the same topology. This multi-branch strategy introduces a new dimension, called "cardinality", which is an important factor alongside depth and width. As a capable evolution of ResNet, ResNeXt thus adds cardinality beyond width and depth: without making the network deeper or wider, increasing the cardinality improves accuracy while also reducing the number of hyperparameters.

ResNeXt's key points:

• It keeps ResNet's shortcut connections and repeatedly stacks the same module combination.

• It replaces ResNet's non-shortcut branch with multiple branches.

• The branches are processed independently.

• It uses 1×1 convolutions to reduce computation, combining the strengths of ResNet and Inception.

• The most essential difference from Inception is the topology of the branches within a block: Inception uses a different topology per branch to raise expressiveness and combine receptive fields, while ResNeXt uses the same topology for every branch, i.e. ResNeXt's branches are homogeneous.

Because ResNeXt is homogeneous, it inherits the design philosophy of VGG/ResNet: keep the network topology fixed. This shows up in two rules:

• Blocks that produce feature maps of the same size share the same structural hyperparameters.

• Each time the spatial resolution is halved (downsampling), the width of the convolutions doubles.


Neuron-level connections

Aggregated transformations

ResNeXt's module output is:

y = x + Σ_{i=1}^{C} T_i(x)

where C is the cardinality and every transformation T_i shares the same topology.

Moreover, ResNeXt is implemented elegantly with grouped convolutions. The paper finds that increasing the number of branches is a more effective way to improve performance than going deeper or wider. The name ResNeXt signals that this is the next generation (next) of ResNet.
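Why grouped convolution can stand in for the explicit branches follows from a parameter count: C independent 3×3 convolutions on d channels each hold exactly as many weights as one 3×3 convolution with groups=C over the concatenated C·d channels. A quick check for the 32×4d setting:

```python
# 32 independent 3x3 convs each mapping 4 -> 4 channels, versus one
# grouped 3x3 conv over the concatenated 128 channels with groups=32
# (bias-free, as in the ResNeXt block).
cardinality, d = 32, 4
D = cardinality * d                              # 128 mid channels in total
split_params = cardinality * (3 * 3 * d * d)     # per-branch convolutions
grouped_params = 3 * 3 * (D // cardinality) * D  # one conv with groups=32
print(split_params, grouped_params)  # 4608 4608
```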


If a ResNeXt block has only two conv layers, the front and back can each be folded into one large conv layer.

Reading figure (a):

The essential part of ResNeXt lies only in the middle, sandwiched between the topmost and bottommost conv layers:

1. The first split convs all receive the same input, and the branches share the same topology. By analogy with the distributive law of multiplication, this is just one conv's output split into pieces (same input, different outputs).

2. The last convs all contribute to a single shared output, so they can be merged into one conv (different inputs, same output).

3. Only the middle 3×3 convs differ in both input and output: their inputs, parameters, and outputs are all distinct, so they cannot be merged and remain mutually independent. This is the key part of the model. The model can therefore be reduced to the final equivalent form shown below:




ResNeXt's network architecture design:



ResNeXt implementation in PyTorch


import torch.nn as nn
import torch.nn.functional as F


__all__ = ['resnext29_8x64d', 'resnext29_16x64d']


class Bottleneck(nn.Module):

    def __init__(self, in_channels, out_channels, stride,
                 cardinality, base_width, expansion):
        super(Bottleneck, self).__init__()
        width_ratio = out_channels / (expansion * 64.)
        # D: total width of the grouped 3x3 conv (cardinality * group width)
        D = cardinality * int(base_width * width_ratio)

        self.relu = nn.ReLU(inplace=True)

        self.conv_reduce = nn.Conv2d(
            in_channels, D, kernel_size=1, stride=1, padding=0, bias=False)
        self.bn_reduce = nn.BatchNorm2d(D)
        # the grouped convolution implements the homogeneous branches
        self.conv_conv = nn.Conv2d(
            D, D, kernel_size=3, stride=stride, padding=1,
            groups=cardinality, bias=False)
        self.bn = nn.BatchNorm2d(D)
        self.conv_expand = nn.Conv2d(
            D, out_channels, kernel_size=1, stride=1, padding=0, bias=False)
        self.bn_expand = nn.BatchNorm2d(out_channels)

        self.shortcut = nn.Sequential()
        if in_channels != out_channels:
            self.shortcut.add_module(
                'shortcut_conv',
                nn.Conv2d(in_channels, out_channels, kernel_size=1,
                          stride=stride, padding=0, bias=False))
            self.shortcut.add_module(
                'shortcut_bn', nn.BatchNorm2d(out_channels))

    def forward(self, x):
        out = self.conv_reduce(x)
        out = self.relu(self.bn_reduce(out))
        out = self.conv_conv(out)
        out = self.relu(self.bn(out))
        out = self.conv_expand(out)
        out = self.bn_expand(out)
        residual = self.shortcut(x)
        return self.relu(residual + out)


class ResNeXt(nn.Module):
    """
    ResNeXt optimized for the CIFAR dataset, as specified in
    https://arxiv.org/pdf/1611.05431.pdf
    """

    def __init__(self, cardinality, depth, num_classes,
                 base_width, expansion=4):
        """ Constructor
        Args:
            cardinality: number of convolution groups.
            depth: number of layers.
            num_classes: number of classes.
            base_width: base number of channels in each group.
            expansion: factor to adjust the channel dimensionality.
        """
        super(ResNeXt, self).__init__()
        self.cardinality = cardinality
        self.depth = depth
        self.block_depth = (self.depth - 2) // 9
        self.base_width = base_width
        self.expansion = expansion
        self.num_classes = num_classes
        self.output_size = 64
        self.stages = [64, 64 * self.expansion, 128 * self.expansion,
                       256 * self.expansion]

        self.conv_1_3x3 = nn.Conv2d(3, 64, 3, 1, 1, bias=False)
        self.bn_1 = nn.BatchNorm2d(64)
        self.stage_1 = self.block('stage_1', self.stages[0], self.stages[1], 1)
        self.stage_2 = self.block('stage_2', self.stages[1], self.stages[2], 2)
        self.stage_3 = self.block('stage_3', self.stages[2], self.stages[3], 2)
        self.fc = nn.Linear(self.stages[3], num_classes)
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight.data)
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()

    def block(self, name, in_channels, out_channels, pool_stride=2):
        block = nn.Sequential()
        for bottleneck in range(self.block_depth):
            name_ = '%s_bottleneck_%d' % (name, bottleneck)
            if bottleneck == 0:
                # first block of the stage downsamples and widens
                block.add_module(
                    name_,
                    Bottleneck(in_channels, out_channels, pool_stride,
                               self.cardinality, self.base_width,
                               self.expansion))
            else:
                block.add_module(
                    name_,
                    Bottleneck(out_channels, out_channels, 1,
                               self.cardinality, self.base_width,
                               self.expansion))
        return block

    def forward(self, x):
        x = self.conv_1_3x3(x)
        x = F.relu(self.bn_1(x), inplace=True)
        x = self.stage_1(x)
        x = self.stage_2(x)
        x = self.stage_3(x)
        x = F.avg_pool2d(x, 8, 1)
        x = x.view(-1, self.stages[3])
        return self.fc(x)


def resnext29_8x64d(num_classes):
    return ResNeXt(cardinality=8, depth=29, num_classes=num_classes,
                   base_width=64)


def resnext29_16x64d(num_classes):
    return ResNeXt(cardinality=16, depth=29, num_classes=num_classes,
                   base_width=64)
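The same depth bookkeeping as for the other models applies here: with depth=29, each of the three stages receives (29 − 2) // 9 = 3 bottleneck blocks of three convs each, which together with the stem conv and the fc layer adds back up to 29.

```python
# ResNeXt-29 layer tally: stem conv + 3 convs per Bottleneck
# ((depth - 2) // 9 blocks in each of the 3 stages) + final fc.
depth = 29
block_depth = (depth - 2) // 9
tally = 1 + 3 * (3 * block_depth) + 1
print(tally)  # 29
```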




