Reposted from: author 博學的咖喱醬 @ Zhihu (with permission)

Source: https://zhuanlan.zhihu.com/p/599936122

Editor: 極市平臺

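Semantic Segmentation Tips

Choosing a model

Use small images (about 256x256) when choosing a model. Small images allow a large batch size, so model selection takes less time, and for the same number of training iterations the loss drops about as much on small images as on large ones.
First settle on suitable hyperparameters such as the learning rate and the loss-function parameters, then train the candidate models with identical settings. A good model's loss drops quickly in the early phase, and this shows most clearly when pretraining on small images. On large images the loss drops slowly, so the trend is hard to read.
Settle on the model's parameter count. For some networks, too many parameters produce an abnormal loss curve, usually because of too much downsampling or too many attention channels. Reducing the parameter count until the curve behaves gives a suitable model size.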
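When comparing candidate models by parameter count, a small helper is enough. The sketch below is illustrative only: `make_candidate` is a hypothetical stand-in for a real segmentation network, not a model from this article.

```python
import torch.nn as nn


def count_parameters(model: nn.Module) -> int:
    """Return the number of trainable parameters in `model`."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


def make_candidate(width: int) -> nn.Module:
    """Hypothetical stand-in for a candidate segmentation model of a given width."""
    return nn.Sequential(
        nn.Conv2d(3, width, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(width),
        nn.ReLU(inplace=True),
        nn.Conv2d(width, 2, kernel_size=1),
    )


# Print the parameter count of each candidate before spending time on training.
for width in (16, 32, 64):
    print(width, count_parameters(make_candidate(width)))
```

If a candidate's loss curve looks abnormal, the same helper makes it easy to check how far the parameter count drops after removing a downsampling stage or narrowing the attention channels.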

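Training the model

After training to convergence with rough loss-function parameters, adjust those parameters slightly, for example the ratio of bwloss to dloss; this may improve test-set accuracy.
For semantic segmentation, losses such as focal loss are auxiliary. They speed up training in the early phase and carry a small share of the total loss, generally below 0.2 and typically about 0.1. Dice loss is the main loss: it is effective for small-foreground segmentation and directly improves test-set accuracy. Pretraining typically uses focal loss + Dice loss, and the final fine-training stage uses Dice loss alone.
Pretrain on small images (about 256x256), then train on medium images (about 512x512), and finally fine-train on large images (about 1024x1024). Small images allow a very large batch size and converge faster for the same number of iterations: by a rough estimate about twice as fast as medium images, with large images the slowest. Large images usually allow only a batch size of 1, cannot be trained on directly, and converge poorly, so they are generally reserved for fine-training. (Being able to train on several image sizes is an advantage of semantic segmentation; tasks such as object detection can rarely do this, so their input size is tightly constrained.)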
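The loss weighting and the three-stage schedule described above could be sketched as follows. This is a minimal binary-segmentation sketch under stated assumptions: the convex combination `(1 - w) * dice + w * focal`, the `gamma` and `eps` defaults, and the focal weight used in the medium stage are all assumptions, not values taken from the article.

```python
import torch
import torch.nn.functional as F


def dice_loss(logits, target, eps: float = 1.0):
    """Soft Dice loss for binary segmentation; logits and target are (N, 1, H, W)."""
    prob = torch.sigmoid(logits)
    dims = (1, 2, 3)
    intersection = (prob * target).sum(dims)
    denom = prob.sum(dims) + target.sum(dims)
    return (1.0 - (2.0 * intersection + eps) / (denom + eps)).mean()


def focal_loss(logits, target, gamma: float = 2.0):
    """Binary focal loss: cross-entropy that is down-weighted on easy pixels."""
    bce = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
    p_t = torch.exp(-bce)  # probability assigned to the true class
    return ((1.0 - p_t) ** gamma * bce).mean()


def total_loss(logits, target, focal_weight: float):
    """Dice is the main loss; focal is the auxiliary loss with a small share."""
    dice = dice_loss(logits, target)
    if focal_weight == 0.0:
        return dice  # Dice-only fine-training
    return (1.0 - focal_weight) * dice + focal_weight * focal_loss(logits, target)


# (image size, focal weight) per stage. The sizes follow the text; keeping the
# focal term in the medium stage is an assumption.
STAGES = [(256, 0.1), (512, 0.1), (1024, 0.0)]
```

A training loop would iterate over `STAGES`, resize the inputs to the stage's image size, pick the largest batch size that fits in memory, and call `total_loss` with the stage's focal weight.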

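Improving the model

Max pooling improves robustness to noise. Average pooling or strided-convolution pooling preserves positional information.
Skip (cross-layer) connections speed up training without affecting model capacity. These structures are usually wrapped into a module. Many existing models, such as DFANet, use elegant skip-connection structures and are easier to train as a result.
A bottleneck structure sometimes works better; a guess is that it acts like an encoder-decoder and denoises. For example, the convolutions can go inch->outch//2; outch//2->outch//2; outch//2->outch. Here outch//2 may be smaller than inch, and the structure is still effective.
Using many separable convolutions keeps the parameter count small while the capacity stays large. Separable convolutions work better than asymmetric or large-kernel convolutions. Asymmetric convolutions are generally used on very small (about 32x32) downsampled feature maps, and large kernels are generally used at the very start of the network, at the input. A U-Net-style network is not very deep, so large kernels may be the better choice there. An FCN-style network can be stacked very deep, so separable convolutions work better.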
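To make the parameter saving of separable convolutions concrete, the sketch below compares a standard 3x3 convolution with a depthwise + pointwise pair producing the same output shape. The channel counts (64 and 128) are arbitrary illustrative values.

```python
import torch
import torch.nn as nn


def n_params(module: nn.Module) -> int:
    """Total number of parameters in a module."""
    return sum(p.numel() for p in module.parameters())


in_ch, out_ch, k = 64, 128, 3

# Standard convolution: every output channel sees every input channel at every tap.
standard = nn.Conv2d(in_ch, out_ch, kernel_size=k, padding=k // 2, bias=False)

# Separable convolution: per-channel spatial filter, then a 1x1 channel mix.
separable = nn.Sequential(
    nn.Conv2d(in_ch, in_ch, kernel_size=k, padding=k // 2, groups=in_ch, bias=False),  # depthwise
    nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),  # pointwise
)

x = torch.randn(1, in_ch, 32, 32)
assert standard(x).shape == separable(x).shape  # both give (1, 128, 32, 32)

print(n_params(standard))   # 64 * 128 * 3 * 3 weights
print(n_params(separable))  # 64 * 3 * 3 + 64 * 128 weights
```

With these sizes the separable version needs roughly an eighth of the weights, which is what makes it practical to stack many such layers in a deep FCN-style network.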
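In semantic segmentation, attention is essentially denoising: it sets unneeded information (noise) to zero, just as ReLU does, which is why ReLU is especially effective in semantic segmentation. Attention modules help a model reach its potential and converge faster. Like BN, they can be placed after a block, and using many of them has no side effects. There are two main kinds, channel attention and spatial attention. Channel attention alone can make the loss hard to train (abnormal); adding spatial attention alleviates this and makes training easier. Common choices are the SE module and CBAM.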
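A channel gate followed by a spatial gate could look like the sketch below. It is a simplified, CBAM-style illustration rather than a faithful reimplementation of SE or CBAM; the class names, the `reduction` ratio, and the 7x7 spatial kernel are assumptions.

```python
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """SE-style gate: one weight in (0, 1) per channel."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        mid = max(1, channels // reduction)
        self.mlp = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, mid, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, kernel_size=1),
        )

    def forward(self, x):
        return x * torch.sigmoid(self.mlp(x))


class SpatialAttention(nn.Module):
    """One weight in (0, 1) per pixel, computed from the channel-wise mean and max."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        pooled = torch.cat(
            [x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1
        )
        return x * torch.sigmoid(self.conv(pooled))


class ChannelSpatialAttention(nn.Module):
    """Channel gate followed by a spatial gate."""

    def __init__(self, channels: int):
        super().__init__()
        self.channel = ChannelAttention(channels)
        self.spatial = SpatialAttention()

    def forward(self, x):
        return self.spatial(self.channel(x))
```

Both gates multiply the features by values between 0 and 1, so the module can only attenuate activations, which matches the denoising view of attention given above.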
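In an encoder-decoder network the encoder does most of the work, so it should have many parameters. The decoder, in contrast, should be as simple as possible with very few parameters: for example, a bias-free 1x1 convolution to aggregate the encoder's information, or parameter-free bilinear interpolation to upsample. Upsampling with transposed convolutions and similar techniques makes training harder, because it increases the decoder's parameter count.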
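A decoder in this spirit can be as small as one bias-free 1x1 convolution plus bilinear upsampling. The sketch below is illustrative; `LightDecoder`, the channel count of 128, and the transposed-convolution settings used for comparison are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LightDecoder(nn.Module):
    """Bias-free 1x1 conv to mix encoder channels, then parameter-free upsampling."""

    def __init__(self, enc_ch: int, num_classes: int):
        super().__init__()
        self.classify = nn.Conv2d(enc_ch, num_classes, kernel_size=1, bias=False)

    def forward(self, feat, out_size):
        logits = self.classify(feat)
        return F.interpolate(logits, size=out_size, mode="bilinear", align_corners=False)


def n_params(module: nn.Module) -> int:
    return sum(p.numel() for p in module.parameters())


decoder = LightDecoder(enc_ch=128, num_classes=2)
# A transposed convolution that upsamples by 2, shown only for comparison.
deconv = nn.ConvTranspose2d(128, 2, kernel_size=4, stride=2, padding=1)

feat = torch.randn(1, 128, 16, 16)
out = decoder(feat, out_size=(64, 64))
print(out.shape, n_params(decoder), n_params(deconv))
```

Even a single transposed-convolution layer carries many times the parameters of the 1x1 head, and a full transposed-convolution decoder would need several such layers to reach the input resolution.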

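Understanding

A neural network can be viewed as denoising, i.e. a process of continually refining information, which means continually removing redundant information or noise. Attention, BN, and ReLU can all be seen, to some extent, as ways of removing noise. This is why the bottleneck structure works: it distills the useful information at the bottleneck and then processes it. If the 3x3 convolution processes information, then the 1x1 convolution reduces dimensionality and refines it, so 1x1 expand => 3x3 process => 1x1 refine is also an effective structure.
The bottleneck structure is essentially denoising. It generally works better near the front of the network, with the convolution weights at the bottleneck initialized to positive values: use torch.nn.init.uniform_(conv.weight, a=0, b=1) to initialize the weights and nn.init.zeros_(conv.bias) to set the bias to zero. The reason is that denoising works well at the bottleneck.
The main role of the 1x1 convolution is to copy the image (increase the channel count) for the 3x3 convolution to process, so channels should be increased with a 1x1 convolution rather than a 3x3 one. The 3x3 convolution processes the image, and a separable convolution is enough for that. A 1x1 convolution can also merge many images (channels) into fewer (reduce the channel count), i.e. combine the processed results.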
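The 1x1 expand => 3x3 process => 1x1 refine pattern can be sketched as a small module. This is an illustration of the idea only, not a block from the DFANet code that follows; the class name, the `expand` factor, and the placement of BN and ReLU are assumptions.

```python
import torch
import torch.nn as nn


class ExpandProcessRefine(nn.Module):
    """1x1 conv raises channels, 3x3 depthwise conv processes, 1x1 conv merges."""

    def __init__(self, in_ch: int, out_ch: int, expand: int = 4):
        super().__init__()
        mid = in_ch * expand
        self.body = nn.Sequential(
            # 1x1: copy the input into more channels
            nn.Conv2d(in_ch, mid, kernel_size=1, bias=False),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
            # 3x3 depthwise: process each channel spatially
            nn.Conv2d(mid, mid, kernel_size=3, padding=1, groups=mid, bias=False),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
            # 1x1: combine the processed channels into fewer channels
            nn.Conv2d(mid, out_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        return self.body(x)


block = ExpandProcessRefine(in_ch=16, out_ch=32)
y = block(torch.randn(2, 16, 20, 20))
print(y.shape)
```

The spatial size is unchanged because the depthwise convolution uses padding 1 and stride 1; only the channel count moves from 16 to 64 and then to 32.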

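##################################### DFANet model definition ####################################

import torch
import torch.nn as nn
import torch.nn.functional as F


# Plain convolution layer, used like Conv2d (no BN or activation):
# 5x5 depthwise conv -> attention -> 1x1 pointwise conv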
class SeparableConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1, bottle=False):
        super(SeparableConv2d, self).__init__()
        self.sconv = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, kernel_size=5, padding=2, groups=in_ch, stride=stride),
            Attention(in_ch),
            nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),
        )
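        if bottle:
            # Bottleneck position: start both convolutions with positive weights
            # drawn from U(0, 1) and a zero depthwise bias.
            torch.nn.init.uniform_(self.sconv[0].weight, a=0, b=1)
            torch.nn.init.uniform_(self.sconv[2].weight, a=0, b=1)
            nn.init.zeros_(self.sconv[0].bias)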

    def forward(self, x):
        return self.sconv(x)


# Bottleneck structure: in_ch -> out_ch//reduction -> out_ch//reduction -> out_ch, plus a projected shortcut
class Block(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1, reduction = 2, bottle=False):
        super(Block, self).__init__()
        self.block = nn.Sequential(
            SeparableConv2d(in_ch, out_ch//reduction, stride=stride),
            nn.BatchNorm2d(out_ch//reduction),
            nn.ReLU(inplace=True),
            SeparableConv2d(out_ch//reduction, out_ch//reduction, bottle=bottle),
            nn.BatchNorm2d(out_ch//reduction),
            nn.ReLU(inplace=True),
            SeparableConv2d(out_ch//reduction, out_ch),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        self.proj = nn.Sequential(
            nn.MaxPool2d(stride, stride=stride),
            nn.Conv2d(in_ch, out_ch, 1, bias=False)
        )
    def forward(self, x):
        out = self.block(x)
        identity = self.proj(x)
        return out + identity


class enc(nn.Module):
    def __init__(self, in_ch, out_ch):
        super(enc, self).__init__()
        self.encblocks = nn.Sequential(
            Block(in_ch, out_ch, stride=2),
            Block(out_ch, out_ch),
            Block(out_ch, out_ch),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            Attention(out_ch),
        )

    def forward(self, x):
        return self.encblocks(x)


class Attention(nn.Module):
    def __init__(self, channels):
        super(Attention, self).__init__()
        # e.g. channels 2 -> 1, 32 -> 4, 162 -> 9; clamped so channels=1 does not give 0
        mid_channels = max(1, int((channels/2)**0.5))
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.sharedMLP = nn.Sequential(
            nn.Conv2d(channels, mid_channels, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, mid_channels, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, channels, kernel_size=1),
        )

    def forward(self, x):
        avg = self.sharedMLP(self.avg_pool(x))
        x = x * torch.sigmoid(avg)
        return x


class SubBranch(nn.Module):
    def __init__(self, in_ch, chs, branch_index):
        super(SubBranch,self).__init__()
        self.encs = nn.ModuleList()
        current_ch = in_ch
        for i, ch in enumerate(chs):
            self.encs.append(enc(current_ch, ch))
            if branch_index != 0 and i < len(chs)-2:
                current_ch = chs[i]+chs[i+1]
            else:
                current_ch = ch
        self.branch_index = branch_index

    def forward(self, x0, *args):
        retlist = []
        if self.branch_index == 0:
            for i, enc in enumerate(self.encs):
                retlist.append(enc(x0 if i==0 else retlist[-1]))
        else:
            for i, enc in enumerate(self.encs):
                if i == 0:
                    retlist.append(enc(x0))
                elif i-1 < len(args):
                    retlist.append(enc(torch.cat([retlist[-1], args[i-1]], 1)))
                else:
                    retlist.append(enc(retlist[-1]))

        return retlist


class DFA_Encoder(nn.Module):
    def __init__(self, chs, m): # m sub-branches
        super(DFA_Encoder,self).__init__()
        self.branchs = nn.ModuleList()
        for i in range(m):
            current_inch = chs[0] if i==0 else chs[1]+chs[-1]
            self.branchs.append(SubBranch(current_inch, chs[1:], branch_index=i))
        self.n = len(chs) - 1
        self.m = m 

    def forward(self, x):
        lowfeatures = [None]*self.m
        highfeatures = [None]*self.m
        lastvariables = [None]*(self.n-2) # intermediate features: all encoder outputs except the first (low) and last (high)
        lasthighfeaturelist = []
        for i, branch in enumerate(self.branchs):
            tem = torch.cat([x if i==0 else lowfeatures[i-1]]+lasthighfeaturelist, 1)
            lowfeatures[i], *lastvariables, highfeatures[i] = branch(tem, *lastvariables)
            if i != self.m-1:
                lasthighfeaturelist = [F.interpolate(highfeatures[i], size=lowfeatures[i].shape[2:], mode='bilinear', align_corners=True)]
        return lowfeatures, highfeatures

class DFA_Decoder(nn.Module):
    def __init__(self, chs, out_ch, m):
        super(DFA_Decoder,self).__init__()
        self.lowconv = nn.Sequential(
            nn.Conv2d(chs[1], chs[0], kernel_size=1, bias=False),
            nn.BatchNorm2d(chs[0]),
        )
        self.highconv = nn.Sequential(
            nn.Conv2d(chs[-1], chs[0], kernel_size=1, bias=False),
            nn.BatchNorm2d(chs[0]),
        )

        self.shuffleconv = nn.Conv2d(chs[0], out_ch, kernel_size=1, bias=True)
        self.m = m

    def forward(self, lows, highs, proj): # proj is unused
        for i in range(1, self.m):
            lows[i] = F.interpolate(lows[i], size=lows[i-1].shape[2:], mode='bilinear', align_corners=True)
        for i in range(self.m):
            highs[i] = F.interpolate(highs[i], size=lows[0].shape[2:], mode='bilinear', align_corners=True)

        x_low = self.lowconv(sum(lows))
        x_high = self.highconv(sum(highs))
        x_sf = self.shuffleconv(x_low + x_high)
        return F.interpolate(x_sf, scale_factor=2, mode='bilinear', align_corners=True) # no effective upsampling method (bilinear x2 is used as a fallback)


################################# PreModule ########################################
class PreModule(nn.Module):
    def __init__(self, in_ch, out_ch):
        super(PreModule, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=7, padding=3, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
        self.block = Block(out_ch, out_ch, reduction=8, bottle=True) # if this block downsamples, val tops out at 0.53

    def forward(self, x):
        out = self.conv(x)
        return self.block(out)


class DFANet(nn.Module):
    def __init__(self, chs, in_ch, out_ch, m): # changed to chs=[32, 64, 128]
        super(DFANet, self).__init__()
        self.premodule = PreModule(in_ch, chs[0]) 
        self.encoder = DFA_Encoder(chs, m)
        self.decoder = DFA_Decoder(chs, out_ch, m)

    def forward(self, x):
        x = self.premodule(x)
        lows, highs = self.encoder(x)
        y = self.decoder(lows, highs, x)
        return torch.softmax(y, dim=1)

      
