
          (Paper included) ICCV 2021 | GUPNet: A Monocular 3D Detection Network Based on Geometry Uncertainty Projection


          2021-09-05 10:21

          Author | 柒柒@知乎
          Link | https://zhuanlan.zhihu.com/p/397105796


          Paper title: Geometry Uncertainty Projection Network for Monocular 3D Object Detection
          Affiliations: The University of Sydney, SenseTime Computer Vision Group, et al.
          Paper:
          https://arxiv.org/pdf/2107.13774.pdf

          The paper in one sentence:

          Use geometric relations to quantify the uncertainty of the depth estimate.


          The authors' points:




          Existing methods with the projection model usually estimate the height of 2D and 3D bounding box first and then infer the depth via the projection formula.


          The authors also provide an illustrative figure.
          It shows that even a height estimation error of only 0.1 m can shift the estimated depth by as much as 4 m.


          We can find that a slight bias (0.1m) of 3D heights could cause a significant shift (even 4m) in the projected depth.
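          The amplification can be checked with a few lines of arithmetic. The sketch below assumes the standard pinhole relation depth = f * h3d / h2d; the focal length, the 2D box height, and the function names are illustrative assumptions, not values taken from the paper.

```python
def projected_depth(focal_px: float, h3d_m: float, h2d_px: float) -> float:
    """Pinhole projection: depth (m) = focal length (px) * 3D height (m) / 2D height (px)."""
    return focal_px * h3d_m / h2d_px


def depth_shift(focal_px: float, h2d_px: float,
                h3d_true_m: float, h3d_biased_m: float) -> float:
    """Change in projected depth caused by a biased 3D height estimate."""
    return (projected_depth(focal_px, h3d_biased_m, h2d_px)
            - projected_depth(focal_px, h3d_true_m, h2d_px))


if __name__ == "__main__":
    # Assumed numbers: a distant object that is only 18 px tall in the image,
    # seen by a camera with a 720 px focal length.
    shift = depth_shift(focal_px=720.0, h2d_px=18.0,
                        h3d_true_m=1.5, h3d_biased_m=1.6)
    print(f"0.1 m height bias -> {shift:.1f} m depth shift")
```

          The shift equals f / h2d times the height bias, so the smaller the object appears in the image, the more a given height error is amplified.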


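          2. The first problem the authors discuss is inference reliability. Why this one? The reason was already given in the first point: "slight height bias → significant depth shift". In other words, uncertainty in the height prediction leads to uncertainty in the estimated depth.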


          The first problem is inference reliability. A small quality change in the 3D height estimation would cause a large change in the depth estimation quality. This makes the model cannot predict reliable uncertainty or confidence easily, leading to uncontrollable outputs.


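          3. The second problem the authors discuss is training stability. Why this one? Again, because the height prediction is inaccurate. Early in training, the predicted object heights tend to be far off, so the depths computed from them are far off too. Errors that large make the network hard to train and hurt its overall performance.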


          Another problem is the instability of model training. In particular, at the beginning of the training phase, the estimation of 2D/3D height tends to be noisy, and the errors will be amplified and cause outrageous depth estimation. Consequently, the training process of the network will be misled, which will lead to the degradation of the final performance.


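          The overall network design therefore targets two problems: inference reliability and training stability. Geometry Uncertainty Projection (GUP) handles inference reliability, and Hierarchical Task Learning (HTL) handles training stability. The pipeline can be read as:
          input 2D image → predict 2D + 3D boxes → GUP module refines the depth → detection results.
          Figure: network architecture.
          The backbone is the same as in typical monocular 3D detectors, so I will skip it and focus on how the two main modules, GUP and HTL, work.
          First, the Geometry Uncertainty Projection (GUP) module. How does it differ from a conventional localization module? In short, the most notable difference is that earlier methods output a single depth value, whereas the GUP module outputs a depth value plus an uncertainty. The uncertainty characterizes how reliable the depth value is, which addresses the inference reliability problem the authors raise.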


          The overall module builds the projection process in the probability framework rather than single values so that the model can compute the theoretical uncertainty for the inferred depth, which can indicate the depth inference reliability and also be helpful for the depth learning.


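          Concretely:
          predict the object's 3D height → project to obtain the depth → predict an offset → depth + offset gives the final depth and its uncertainty.

          a) Predict the object's 3D height.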


          To achieve this goal, we first assume the prediction of the 3D height for each object is a Laplace distribution. The distribution parameters are predicted by the 3D size stream in an end-to-end way. The average denotes the regression target output and the variation is the uncertainty of the inference.


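          Next, how do we steer the network in the direction we want? That is the loss function's job. The authors therefore design a dedicated loss for the 3D height prediction.
          From the form of this loss, it is smallest when the mean equals the ground truth and the variance goes to 0.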
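          For reference, a loss with this behavior is the negative log-likelihood of a Laplace distribution. The form below is a sketch under that assumption; the symbols $\mu_h$ and $\sigma_h$ (predicted mean and standard deviation of the 3D height) and $h^{gt}$ (ground-truth height) are names chosen here for illustration.

```latex
\mathcal{L}_{h_{3d}} = \frac{\sqrt{2}}{\sigma_h} \left| \mu_h - h^{gt} \right| + \log \sigma_h
```

          The first term weights the regression error by the predicted confidence, and the $\log \sigma_h$ term stops the network from lowering the loss by simply inflating the uncertainty.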
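          b) Project to obtain the depth. Computing depth from geometric relations is a well-worn topic, so I will not go into detail; it is the standard projection formula.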



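          Substituting the 3D height predicted above yields the depth. Because the 3D height follows a Laplace distribution, the depth computed from it follows a Laplace distribution as well.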


          Based on the learned height distribution, the depth distribution of the projection output can be approximated as above, where X is the standard Laplace distribution.
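          The projection step can be written out as follows. This is a sketch assuming the pinhole relation and treating the 2D height as a fixed number; $f$ (focal length in pixels), $h_{2d}$ (2D box height in pixels), $\mu_h$ and $\lambda_h$ (mean and Laplace scale of the 3D height, with $\sigma_h = \sqrt{2}\,\lambda_h$), and $d_p$, $\mu_p$, $\sigma_p$ (projected depth, its mean and standard deviation) are symbol names assumed here. $X$ is the standard Laplace variable mentioned in the quote.

```latex
d_p = \frac{f \cdot h_{3d}}{h_{2d}}
    = \frac{f \left( \lambda_h X + \mu_h \right)}{h_{2d}}
    = \frac{f \lambda_h}{h_{2d}}\, X + \frac{f \mu_h}{h_{2d}},
\qquad
\mu_p = \frac{f \mu_h}{h_{2d}}, \quad
\sigma_p = \frac{f \sigma_h}{h_{2d}}
```

          Both the mean and the standard deviation of the height are scaled by the same factor $f / h_{2d}$, which is why a small height uncertainty turns into a large depth uncertainty for distant objects.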


          c) Predict an offset. Nothing special here: it simply adds one more uncertainty-aware term on top of the projected depth.


          We also assume that the learned bias is a Laplace distribution and independent with the projection one.




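          d) Combine the depth and the offset. The projected depth and the offset are simply added, and the mean and variance follow the usual rules for summing independent distributions. What properties do we want this estimated depth to have? Just as with the predicted 3D height, we want its mean to approach the ground truth and its variance to approach 0. This is what the depth loss enforces.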


          The overall loss would push the projection results close to the ground truth and the gradient would affect the depth bias, the 2D height and the 3D height simultaneously. Besides, the uncertainty of 3D height and depth bias is also trained in the optimization process.
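          The four steps above can be sketched in a few functions. This is a minimal illustration, not the authors' implementation: it assumes the pinhole projection, independent height and bias terms, and a Laplace negative log-likelihood as the loss. All function names and numbers are hypothetical.

```python
import math
from typing import Tuple


def project_height_to_depth(focal_px: float, h2d_px: float,
                            mu_h: float, sigma_h: float) -> Tuple[float, float]:
    """Scale the 3D-height mean and std by f / h2d to get the projected depth."""
    scale = focal_px / h2d_px
    return scale * mu_h, scale * sigma_h


def add_independent(mu_a: float, sigma_a: float,
                    mu_b: float, sigma_b: float) -> Tuple[float, float]:
    """Sum of two independent variables: means add, variances add."""
    return mu_a + mu_b, math.sqrt(sigma_a ** 2 + sigma_b ** 2)


def laplace_nll(mu: float, sigma: float, target: float) -> float:
    """Laplace negative log-likelihood (up to a constant), with sigma as the std."""
    return math.sqrt(2.0) / sigma * abs(mu - target) + math.log(sigma)


if __name__ == "__main__":
    # Assumed predictions for one object.
    mu_p, sigma_p = project_height_to_depth(focal_px=720.0, h2d_px=18.0,
                                            mu_h=1.5, sigma_h=0.075)
    mu_d, sigma_d = add_independent(mu_p, sigma_p, mu_b=0.5, sigma_b=4.0)
    loss = laplace_nll(mu_d, sigma_d, target=62.0)
    print(f"depth mean={mu_d:.2f} m, std={sigma_d:.2f} m, loss={loss:.3f}")
```

          Because the final mean and standard deviation are functions of the height, the 2D height, and the bias, a gradient on this single loss reaches all of them, which matches the behavior the quote describes.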




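          That completes the first module, GUP.
          Second, the Hierarchical Task Learning (HTL) module. As mentioned above, it addresses instability during training. The approach is quite simple: since training all components jointly is unstable, train them hierarchically, assigning each task its own training weight to control its importance during training.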
          The GUP module mainly addresses the error amplification effect in the inference stage. Yet, this effect also damages the training procedure. Specifically, at the beginning of the training, the prediction of both h2d and h3d are far from accurate, which will mislead the overall training and damage the performance. To tackle this problem, we design a Hierarchical Task Learning (HTL) to control weights for each task at each epoch.
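          To make the idea of per-task weights concrete, here is a deliberately simplified sketch. It is not the paper's weighting formula: the task hierarchy, the progress measure, and all names are assumptions made for illustration. A task's weight is the product of how far its prerequisite tasks have progressed, so downstream tasks only start to matter once upstream ones have been learned.

```python
from typing import Dict, List

# Hypothetical hierarchy: each task lists the tasks it depends on.
PREREQS: Dict[str, List[str]] = {
    "det2d": [],
    "size3d": ["det2d"],
    "depth": ["det2d", "size3d"],
}


def learning_progress(loss_history: List[float]) -> float:
    """Fraction of the initial loss already removed, clipped to [0, 1]."""
    if len(loss_history) < 2 or loss_history[0] <= 0:
        return 0.0
    progress = 1.0 - loss_history[-1] / loss_history[0]
    return min(1.0, max(0.0, progress))


def task_weights(histories: Dict[str, List[float]]) -> Dict[str, float]:
    """Weight each task by the product of its prerequisites' progress."""
    weights: Dict[str, float] = {}
    for task, prereqs in PREREQS.items():
        weight = 1.0  # tasks without prerequisites train at full weight
        for prereq in prereqs:
            weight *= learning_progress(histories[prereq])
        weights[task] = weight
    return weights


if __name__ == "__main__":
    # Assumed per-epoch loss values for each task.
    histories = {
        "det2d": [2.0, 1.0],
        "size3d": [4.0, 3.0],
        "depth": [10.0, 9.0],
    }
    print(task_weights(histories))
```

          At the start of training every prerequisite has zero progress, so only the first-stage task receives weight; the noisy early height predictions therefore contribute little to the depth loss.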
          Experimental results:
          KITTI test set
          Nothing much to say here: as usual, the method shows improvements.
