Image Classification: Tips and Tricks from 13 Kaggle Competitions
Introduction

These competitions are:
Intel Image Classification: https://www.kaggle.com/puneet6060/intel-image-classification
Recursion Cellular Image Classification: https://www.kaggle.com/c/recursion-cellular-image-classification
SIIM-ISIC Melanoma Classification: https://www.kaggle.com/c/siim-isic-melanoma-classification
APTOS 2019 Blindness Detection: https://www.kaggle.com/c/aptos2019-blindness-detection/notebooks
Diabetic Retinopathy Detection: https://www.kaggle.com/c/diabetic-retinopathy-detection
ML Project - Image Classification: https://www.kaggle.com/c/image-classification-fashion-mnist/notebooks
Cdiscount's Image Classification Challenge: https://www.kaggle.com/c/cdiscount-image-classification-challenge/notebooks
Plant Seedlings Classification: https://www.kaggle.com/c/plant-seedlings-classification/notebooks
Aesthetic Visual Analysis: https://www.kaggle.com/c/aesthetic-visual-analysis/notebooks
We will cover three main aspects of debugging a deep learning solution:
Data
Model
Loss function
There are also many example projects (and references) for you to consult.
Data
Image preprocessing + EDA

Visualisation: https://www.kaggle.com/allunia/protein-atlas-exploration-and-baseline#Building-a-baseline-model-
Dealing with class imbalance: https://www.kaggle.com/rohandeysarkar/ultimate-image-classification-guide-2020
Filling missing values (labels, features, etc.): https://www.kaggle.com/datafan07/analysis-of-melanoma-metadata-and-effnet-ensemble
Normalisation: https://www.kaggle.com/vincee/intel-image-classification-cnn-keras
Pre-processing: https://www.kaggle.com/ratthachat/aptos-eye-preprocessing-in-diabetic-retinopathy#3.A-Important-Update-on-Color-Version-of-Cropping-&-Ben's-Preprocessing
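The normalisation step above can be sketched in plain NumPy. The ImageNet channel statistics below are the usual default for pretrained backbones, not values taken from the linked notebook:

```python
import numpy as np

def normalize(img, mean, std):
    """Scale pixels to [0, 1], then standardise each colour channel."""
    img = img.astype(np.float32) / 255.0
    return (img - mean) / std

# ImageNet channel statistics, the common default for pretrained backbones.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

img = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)
out = normalize(img, MEAN, STD)
```

If you train from scratch instead of fine-tuning, statistics computed on your own training set are the safer choice.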
Data augmentation

Horizontal Flip: https://www.kaggle.com/datafan07/analysis-of-melanoma-metadata-and-effnet-ensemble
Random Rotate and Random Dihedral: https://www.kaggle.com/iafoss/pretrained-resnet34-with-rgby-0-460-public-lb
Hue, Saturation, Contrast, Brightness, Crop: https://www.kaggle.com/cdeotte/triple-stratified-kfold-with-tfrecords
Colour jitter: https://www.kaggle.com/nroman/melanoma-pytorch-starter-efficientnet
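The flip and dihedral augmentations above can be sketched at the array level in NumPy (the linked kernels use library transforms; this only illustrates the idea):

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(img):
    """Randomly apply a horizontal flip and a multiple-of-90-degree rotation."""
    if rng.random() < 0.5:
        img = img[:, ::-1]           # horizontal flip
    k = int(rng.integers(0, 4))      # 0-3 quarter turns (dihedral group)
    return np.rot90(img, k)

img = np.arange(16, dtype=np.float32).reshape(4, 4)
aug = augment(img)
```

Colour-space augmentations (hue, saturation, jitter) work the same way but operate on channel values rather than pixel positions.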
Model

Develop a baseline
In Jeremy Howard's words: "You should be able to quickly test whether you are heading in a promising direction within 15 minutes, using 50% or less of the dataset; if not, you have to rethink everything."
Develop a model big enough to overfit
Add more layers
Use a better architecture
Use a better training procedure
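A quick way to check "big enough to overfit" in practice is to fit a tiny, clean subset and confirm the training loss collapses. A minimal illustration with logistic regression and plain gradient descent (toy data, not from any of the competitions):

```python
import numpy as np

# Toy, linearly separable data: the training loss should collapse toward zero.
x = np.array([-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0])
y = (x > 0).astype(float)

w, b = 0.0, 0.0
for _ in range(3000):                        # plain gradient descent
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))   # sigmoid predictions
    w -= 0.5 * np.mean((p - y) * x)
    b -= 0.5 * np.mean(p - y)

loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
```

If the loss plateaus well above zero on such a subset, the model is too small or the pipeline has a bug.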
Architectures
Residual Networks
Wide Residual Networks
Inception
EfficientNet
Swish activation
Residual Attention Network
Training procedure
Mixed-Precision Training
Large Batch-Size Training
Cross-Validation Set
Weight Initialization
Self-Supervised Training (Knowledge Distillation)
Learning Rate Scheduler
Learning Rate Warmup
Early Stopping
Differential Learning Rates
Ensemble
Transfer Learning
Fine-Tuning
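Two items from the list above, learning rate warmup and a learning rate scheduler, are often combined into one schedule. A minimal sketch of linear warmup into cosine decay (the hyperparameter values are illustrative, not taken from any kernel):

```python
import math

def lr_at(step, total_steps, base_lr=3e-4, warmup_steps=500):
    """Linear warmup to base_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```

Each optimiser step, set the optimiser's learning rate to `lr_at(step, total_steps)` before updating the weights; most frameworks also ship this as a built-in scheduler.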
Hyperparameter tuning

Regularization
Adding Dropout: https://www.kaggle.com/allunia/protein-atlas-exploration-and-baseline
Adding or changing the position of Batch Norm: https://www.kaggle.com/allunia/protein-atlas-exploration-and-baseline
Data augmentation: https://www.kaggle.com/cdeotte/triple-stratified-kfold-with-tfrecords
Mixup: https://arxiv.org/abs/1710.09412
Weight regularization: https://www.kaggle.com/allunia/protein-atlas-exploration-and-baseline
Gradient clipping: https://www.kaggle.com/allunia/protein-atlas-exploration-and-baseline
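Of these, mixup is the least self-explanatory; a NumPy sketch of the idea from the linked paper, assuming one-hot labels:

```python
import numpy as np

rng = np.random.default_rng(0)

def mixup(x, y, alpha=0.2):
    """Blend each sample (and its one-hot label) with a random partner."""
    lam = rng.beta(alpha, alpha)                 # mixing coefficient
    idx = rng.permutation(len(x))                # random partner for each sample
    return lam * x + (1 - lam) * x[idx], lam * y + (1 - lam) * y[idx]

x = rng.normal(size=(4, 8))        # batch of 4 feature vectors
y = np.eye(3)[[0, 1, 2, 1]]        # one-hot labels, 3 classes
x_mix, y_mix = mixup(x, y)
```

The mixed labels stay valid probability distributions, so the usual cross-entropy loss applies unchanged.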
Loss functions

Here are some of the most popular loss functions, along with example projects where you will find tricks to improve your model's performance:
Label smoothing
Focal loss
SparseMax loss and Weighted cross-entropy
BCE loss, BCE with logits loss and Categorical cross-entropy loss
Additive Angular Margin Loss for Deep Face Recognition
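The first two losses in the list can be sketched as NumPy reference implementations (illustrative only; the focal loss below omits the optional alpha class weight):

```python
import numpy as np

def label_smoothing_ce(logits, targets, eps=0.1):
    """Cross-entropy against targets smoothed toward the uniform distribution."""
    n = logits.shape[1]
    logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    smooth = np.full_like(logp, eps / n)                 # eps spread uniformly
    smooth[np.arange(len(targets)), targets] += 1.0 - eps
    return -np.mean((smooth * logp).sum(axis=1))

def focal_loss(p, y, gamma=2.0):
    """Binary focal loss: down-weights easy, well-classified examples."""
    pt = np.where(y == 1, p, 1 - p)                      # prob of the true class
    return -np.mean((1 - pt) ** gamma * np.log(pt))
```

With `eps=0` the smoothed loss reduces to ordinary cross-entropy; with `gamma=0` the focal loss reduces to BCE.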
Evaluation + error analysis

Here we run ablation studies and analyze our experimental results. We identify our model's strengths and weaknesses and pinpoint areas to improve next. At this stage you can use the following techniques, and see how they are implemented in the linked examples:
Tracking metrics and Confusion matrix: https://www.kaggle.com/vincee/intel-image-classification-cnn-keras
Grad-CAM: https://arxiv.org/pdf/1610.02391v1.pdf
Test Time Augmentation (TTA): https://www.kaggle.com/iafoss/pretrained-resnet34-with-rgby-0-460-public-lb
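Test-time augmentation can be sketched generically: run the model on several augmented views and average the predictions. The `model` here is any callable, a stand-in for a trained network:

```python
import numpy as np

def predict_tta(model, img):
    """Average model predictions over the original and flipped views."""
    views = [img, img[:, ::-1], img[::-1, :]]   # identity + two flips
    return np.mean([model(v) for v in views], axis=0)

# Stand-in "model": any function mapping an image to class probabilities.
def dummy(v):
    return np.array([v.mean(), 1.0 - v.mean()])

img = np.linspace(0.0, 1.0, 12).reshape(3, 4)
probs = predict_tta(dummy, img)
```

Only use views the task is actually invariant to: a vertical flip is fine for cell images but may hurt on natural scenes.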
Final thoughts
There are many ways to tune your model, and new ideas keep emerging. Deep learning is a fast-moving field with no silverver bullet: we have to run many experiments, and enough trial and error leads to breakthroughs.
Copyright notice
Source: AI公園 | Author: Prince Canuma | Translated by: ronghuaiyang
For academic sharing only; copyright belongs to the original author.
