The Four Basic Tasks of Computer Vision (Classification, Localization, Detection, Segmentation)
Reposted from: Author | Zhang Hao (張皓); Source | Zhihu (https://zhuanlan.zhihu.com/p/31727402)
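Introduction: Deep learning has become one of the fastest-growing and most exciting areas of machine learning. Many influential papers have been published, and many high-quality open-source deep learning frameworks are available. Papers, however, are usually terse and assume that the reader already understands deep learning well, so beginners often get stuck on individual concepts and struggle through papers they only half understand. Conversely, even with an easy-to-use framework, someone who does not know the common concepts and basic ideas of deep learning will not know how to design, diagnose, or debug a network for a real task, and will still end up stuck.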



Image classification












Object localization
Object detection










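The paragraph that follows compares three regression losses for bounding-box coordinates: L2, L1, and smooth L1. A minimal pure-Python sketch of the three, with their gradients with respect to the residual, may help make the comparison concrete. The function names are illustrative and do not come from the original article. The smooth L1 form is the one commonly attributed to Fast R-CNN (quadratic below a residual of 1, linear above). The `log_size_target` helper assumes an R-CNN-style encoding of box width or height relative to a reference box, which the article does not spell out.

```python
import math


def l2_loss(r):
    """Squared (L2) loss on a residual r = prediction - target."""
    return 0.5 * r * r


def l2_grad(r):
    """Gradient of the L2 loss: grows linearly with the residual."""
    return r


def l1_loss(r):
    """Absolute (L1) loss on a residual."""
    return abs(r)


def l1_grad(r):
    """Gradient of the L1 loss: jumps from -1 to +1 at r = 0."""
    if r > 0:
        return 1.0
    if r < 0:
        return -1.0
    return 0.0  # subgradient chosen at the kink


def smooth_l1_loss(r):
    """Smooth L1: quadratic for |r| < 1, linear otherwise."""
    a = abs(r)
    return 0.5 * a * a if a < 1.0 else a - 0.5


def smooth_l1_grad(r):
    """Gradient of smooth L1: continuous, and bounded by 1 in magnitude."""
    if abs(r) < 1.0:
        return r
    return 1.0 if r > 0 else -1.0


def log_size_target(size, reference_size):
    """Assumed log-space encoding of a box width or height."""
    return math.log(size / reference_size)


if __name__ == "__main__":
    # Compare gradients for small residuals and for outliers.
    for r in (0.1, 1.0, 10.0, 100.0):
        print(r, l2_grad(r), l1_grad(r), smooth_l1_grad(r))
```

Printing the gradients for residuals of 0.1, 1, 10, and 100 shows the L2 gradient growing with the residual while the L1 and smooth L1 gradients stay bounded by 1. Only the smooth L1 gradient is both bounded and continuous at zero.
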
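The L2 loss is sensitive to outliers: because the residual is squared, an outlier produces a large loss value and a correspondingly large gradient, which makes gradient explosion likely during training. The L1 loss, by contrast, has a discontinuous gradient. In log space the dynamic range of the values is much smaller, so the regression also becomes much easier to train. Some authors instead optimize a smooth L1 loss. Normalizing the regression targets beforehand also helps training.

Semantic segmentation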




Instance segmentation


