
CeiT: A Faster-Training ViT with Multi-Layer Feature Extraction

5,777 words, ~12 minute read

2022-01-15 07:34



[GiantPandaCV Intro]

Work from SenseTime and Nanyang Technological University that, like related efforts, uses convolution to strengthen the model's ability to extract low-level features and capture locality. The core contribution is the LCA module, which captures multi-layer feature representations. Compared with DeiT, it trains faster.




Introduction

Earlier Transformer architectures need large amounts of extra data or extra supervision (DeiT) to reach performance comparable to convolutional networks. To overcome this drawback, the authors combine CNN ideas to compensate for the Transformer's weaknesses and propose CeiT:

(1) An Image-to-Tokens (I2T) module that produces embeddings from low-level features.

(2) The Transformer's Feed-Forward module is replaced with a Locally-enhanced Feed-Forward (LeFF) module, which increases correlation between neighboring tokens.

(3) Layer-wise Class Token Attention (LCA) captures multi-layer feature representations.

With these modifications, model efficiency and generalization improve, and convergence is also faster, as shown in the figure below:

Method

1. Image-to-Tokens

A convolution + pooling stem replaces the large 7x7 patchification of the original ViT.

2. LeFF

The tokens are reassembled into a feature map, a depthwise separable convolution adds locality, and a Linear layer then maps the result back to tokens.

3. LCA

The first two components are fairly conventional; the last is the distinctive one: Layer-wise Class-token Attention, applied after all the Transformer layers, as shown below:

The LCA module takes the class tokens produced by every Transformer block as input and applies an MSA + FFN on top of them to obtain the final logits. The authors argue this captures multi-scale representations.

Experiments

Comparison with SOTA:

I2T ablation:

LeFF ablation:

LCA effectiveness:

Convergence speed comparison:

Code

Module 1: I2T (Image-to-Token)

# I2T
self.conv = nn.Sequential(
    nn.Conv2d(in_channels, out_channels, conv_kernel, stride, 4),
    nn.BatchNorm2d(out_channels),
    nn.MaxPool2d(pool_kernel, stride)
)

feature_size = image_size // 4

assert feature_size % patch_size == 0, 'Image dimensions must be divisible by the patch size.'
num_patches = (feature_size // patch_size) ** 2
patch_dim = out_channels * patch_size ** 2
self.to_patch_embedding = nn.Sequential(
    Rearrange('b c (h p1) (w p2) -> b (h w) (p1 p2 c)', p1=patch_size, p2=patch_size),
    nn.Linear(patch_dim, dim),
)

Module 2: LeFF

import torch
import torch.nn as nn
from einops.layers.torch import Rearrange

class LeFF(nn.Module):

    def __init__(self, dim=192, scale=4, depth_kernel=3):
        super().__init__()

        scale_dim = dim * scale
        self.up_proj = nn.Sequential(
            nn.Linear(dim, scale_dim),
            Rearrange('b n c -> b c n'),
            nn.BatchNorm1d(scale_dim),
            nn.GELU(),
            Rearrange('b c (h w) -> b c h w', h=14, w=14)
        )

        self.depth_conv = nn.Sequential(
            nn.Conv2d(scale_dim, scale_dim, kernel_size=depth_kernel, padding=1,
                      groups=scale_dim, bias=False),
            nn.BatchNorm2d(scale_dim),
            nn.GELU(),
            Rearrange('b c h w -> b (h w) c', h=14, w=14)
        )

        self.down_proj = nn.Sequential(
            nn.Linear(scale_dim, dim),
            Rearrange('b n c -> b c n'),
            nn.BatchNorm1d(dim),
            nn.GELU(),
            Rearrange('b c n -> b n c')
        )

    def forward(self, x):
        x = self.up_proj(x)
        x = self.depth_conv(x)
        x = self.down_proj(x)
        return x

# Residual, PreNorm, and Attention are defined in the linked repository.
class TransformerLeFF(nn.Module):
    def __init__(self, dim, depth, heads, dim_head, scale=4, depth_kernel=3, dropout=0.):
        super().__init__()
        self.layers = nn.ModuleList([])
        for _ in range(depth):
            self.layers.append(nn.ModuleList([
                Residual(PreNorm(dim, Attention(dim, heads=heads, dim_head=dim_head, dropout=dropout))),
                Residual(PreNorm(dim, LeFF(dim, scale, depth_kernel)))
            ]))

    def forward(self, x):
        c = list()
        for attn, leff in self.layers:
            x = attn(x)
            cls_tokens = x[:, 0]       # class token bypasses LeFF
            c.append(cls_tokens)       # collect every layer's class token for LCA
            x = leff(x[:, 1:])
            x = torch.cat((cls_tokens.unsqueeze(1), x), dim=1)
        return x, torch.stack(c).transpose(0, 1)

Module 3: LCA

from torch import einsum
from einops import rearrange

class LCAttention(nn.Module):
    def __init__(self, dim, heads=8, dim_head=64, dropout=0.):
        super().__init__()
        inner_dim = dim_head * heads
        project_out = not (heads == 1 and dim_head == dim)

        self.heads = heads
        self.scale = dim_head ** -0.5

        self.to_qkv = nn.Linear(dim, inner_dim * 3, bias=False)

        self.to_out = nn.Sequential(
            nn.Linear(inner_dim, dim),
            nn.Dropout(dropout)
        ) if project_out else nn.Identity()

    def forward(self, x):
        b, n, _, h = *x.shape, self.heads
        qkv = self.to_qkv(x).chunk(3, dim=-1)
        q, k, v = map(lambda t: rearrange(t, 'b n (h d) -> b h n d', h=h), qkv)
        q = q[:, :, -1, :].unsqueeze(2)  # only the L-th (last) class token is used as the query

        dots = einsum('b h i d, b h j d -> b h i j', q, k) * self.scale

        attn = dots.softmax(dim=-1)

        out = einsum('b h i j, b h j d -> b h i d', attn, v)
        out = rearrange(out, 'b h n d -> b n (h d)')
        out = self.to_out(out)
        return out

# PreNorm and FeedForward are defined in the linked repository.
class LCA(nn.Module):
    # The residual connection is removed here: the paper does not explicitly
    # mention one, although the code also works with a residual connection.
    def __init__(self, dim, heads, dim_head, mlp_dim, dropout=0.):
        super().__init__()
        self.layers = nn.ModuleList([])
        self.layers.append(nn.ModuleList([
            PreNorm(dim, LCAttention(dim, heads=heads, dim_head=dim_head, dropout=dropout)),
            PreNorm(dim, FeedForward(dim, mlp_dim, dropout=dropout))
        ]))

    def forward(self, x):
        for attn, ff in self.layers:
            x = attn(x) + x[:, -1].unsqueeze(1)
            x = x[:, -1].unsqueeze(1) + ff(x)
        return x

References

          https://arxiv.org/abs/2103.11816

          https://github.com/rishikksh20/CeiT-pytorch/blob/master/ceit.py




