A Review of Loss Functions, with PyTorch 1.7 and TensorFlow 2
TensorFlow 2.3 / PyTorch 1.7.0

01 Cross-Entropy Loss (CrossEntropyLoss)
For a single event, the larger the probability of the event, the smaller its information content. Information content is defined for a single event; in practice there are many possible outcomes, which is where entropy comes in: entropy measures the uncertainty of a random variable and is the expected information content over all possible events. Cross-entropy describes the gap between two distributions; the smaller the cross-entropy, the closer the assumed distribution is to the true distribution and the better the model.
In classification models (not only binary ones) such as logistic regression and neural networks, the last layer usually passes through a sigmoid (or softmax) function and outputs a probability (or a vector of probabilities). This probability reflects how likely the sample is to be positive (the probability vector covers all classes). Cross-entropy is then used to measure the gap between the predicted distribution and the true distribution; put loosely, the input of the cross-entropy loss is the output of a softmax or sigmoid. Several conclusions (advantages) can be derived from the theoretical formula of the cross-entropy loss (the derivation is not covered in detail here):
The further the prediction is from the target, the faster the parameters are adjusted and the faster training converges;
For logistic regression the resulting objective is convex, so it does not get stuck in local optima (this guarantee does not extend to deep networks in general).
The standard (binary) form of the cross-entropy loss is:

loss = -[y * log(p) + (1 - y) * log(1 - p)]

where y is the label of the sample (1 for the positive class, 0 for the negative class) and p is the predicted probability that the sample is positive.
The multi-class form is:

loss = -sum_{c=1..M} y_c * log(p_c)

where M is the number of classes, y_c is an indicator variable (0 or 1) that is 1 if the sample belongs to class c and 0 otherwise, and p_c is the predicted probability that the observed sample belongs to class c.
BinaryCrossentropy[1]: binary classification, usually paired with Sigmoid
tf.keras.losses.BinaryCrossentropy(from_logits=False, label_smoothing=0, reduction=losses_utils.ReductionV2.AUTO, name='binary_crossentropy')
Parameters:
from_logits: default False. True means the inputs are raw logits; False means the output layer has already applied a probability transform (sigmoid/softmax).
label_smoothing: a float in [0, 1]. Adds noise to the labels, reducing the weight of the true class in the loss and thereby helping to suppress overfitting.
reduction: a tf.keras.losses.Reduction value, default AUTO; defines how the loss is reduced.
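A minimal usage sketch (the labels and scores below are made up for illustration), showing how from_logits switches between probability inputs and raw logits:

import tensorflow as tf

y_true = [[0.], [1.], [1.], [0.]]
y_prob = [[0.1], [0.8], [0.6], [0.3]]      # outputs already passed through sigmoid
bce = tf.keras.losses.BinaryCrossentropy()             # from_logits=False (default)
print(bce(y_true, y_prob).numpy())

y_logit = [[-2.2], [1.4], [0.4], [-0.8]]   # raw scores, no sigmoid applied
bce_logits = tf.keras.losses.BinaryCrossentropy(from_logits=True)
print(bce_logits(y_true, y_logit).numpy())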
binary_crossentropy[2]
tf.keras.losses.binary_crossentropy(y_true, y_pred, from_logits=False, label_smoothing=0)
Parameters:
from_logits: default False. True means the inputs are raw logits; False means the output layer has already applied a probability transform (sigmoid/softmax).
label_smoothing: a float in [0, 1]. Adds noise to the labels, reducing the weight of the true class in the loss and thereby helping to suppress overfitting.
CategoricalCrossentropy[3]: multi-class classification, usually paired with Softmax
tf.keras.losses.CategoricalCrossentropy(from_logits=False, label_smoothing=0, reduction=losses_utils.ReductionV2.AUTO, name='categorical_crossentropy')
Parameters:
from_logits: default False. True means the inputs are raw logits; False means the output layer has already applied a probability transform (softmax).
label_smoothing: a float in [0, 1]. Adds noise to the labels, reducing the weight of the true class in the loss and thereby helping to suppress overfitting.
reduction: a tf.keras.losses.Reduction value, default AUTO; defines how the loss is reduced.
categorical_crossentropy[4]
tf.keras.losses.categorical_crossentropy(y_true, y_pred, from_logits=False, label_smoothing=0)
Parameters:
from_logits: default False. True means the inputs are raw logits; False means the output layer has already applied a probability transform (softmax).
label_smoothing: a float in [0, 1]. Adds noise to the labels, reducing the weight of the true class in the loss and thereby helping to suppress overfitting.
SparseCategoricalCrossentropy[5]: multi-class classification, usually paired with Softmax. The difference from CategoricalCrossentropy is that CategoricalCrossentropy expects one-hot encoded labels, whereas SparseCategoricalCrossentropy expects a single integer class index per sample.
tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False, reduction=losses_utils.ReductionV2.AUTO, name='sparse_categorical_crossentropy')
Parameters:
from_logits: default False. True means the inputs are raw logits; False means the output layer has already applied a probability transform (softmax).
reduction: a tf.keras.losses.Reduction value, default AUTO; defines how the loss is reduced.
sparse_categorical_crossentropy[6]
tf.keras.losses.sparse_categorical_crossentropy(y_true, y_pred, from_logits=False, axis=-1)
Parameters:
from_logits: default False. True means the inputs are raw logits; False means the output layer has already applied a probability transform (softmax).
axis: default -1; the axis along which the cross-entropy is computed.
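A small sketch (made-up softmax outputs) contrasting the two label formats; both calls should return the same per-sample losses:

import tensorflow as tf

probs = [[0.05, 0.95, 0.0], [0.1, 0.8, 0.1]]    # softmax outputs over 3 classes

cce = tf.keras.losses.CategoricalCrossentropy()
print(cce([[0., 1., 0.], [0., 0., 1.]], probs).numpy())   # one-hot labels

scce = tf.keras.losses.SparseCategoricalCrossentropy()
print(scce([1, 2], probs).numpy())                        # integer class indices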
BCELoss[7]
torch.nn.BCELoss(weight: Optional[torch.Tensor] = None, size_average=None, reduce=None, reduction: str = 'mean')
Parameters:
weight: an optional manual rescaling weight applied to the loss of each batch element; its size must match the target.
size_average: bool (deprecated); True returns the mean of the losses, False returns their sum.
reduce: bool (deprecated); whether the return value is reduced to a scalar, default True.
reduction: string, one of 'none' | 'mean' | 'sum'.
BCEWithLogitsLoss[8]: much like TensorFlow's from_logits parameter; it merges a Sigmoid into BCELoss
torch.nn.BCEWithLogitsLoss(weight: Optional[torch.Tensor] = None, size_average=None, reduce=None, reduction: str = 'mean', pos_weight: Optional[torch.Tensor] = None)
Parameters:
weight: an optional manual rescaling weight applied to the loss of each batch element; its size must match the target.
size_average: bool (deprecated); True returns the mean of the losses, False returns their sum.
reduce: bool (deprecated); whether the return value is reduced to a scalar, default True.
reduction: string, one of 'none' | 'mean' | 'sum'.
pos_weight: weight for the positive examples; a value > 1 increases recall, a value < 1 increases precision, so it can be used to trade recall against precision.
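A quick sketch (made-up logits) showing that BCEWithLogitsLoss on raw scores matches BCELoss applied after an explicit sigmoid:

import torch
import torch.nn as nn

logits = torch.tensor([0.8, -1.2, 0.3])
target = torch.tensor([1., 0., 1.])

loss_fused = nn.BCEWithLogitsLoss()(logits, target)
loss_manual = nn.BCELoss()(torch.sigmoid(logits), target)
print(loss_fused.item(), loss_manual.item())   # numerically (almost) identical; the fused version is more stable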
CrossEntropyLoss[9]
torch.nn.CrossEntropyLoss(weight: Optional[torch.Tensor] = None, size_average=None, ignore_index: int = -100, reduce=None, reduction: str = 'mean')
Parameters:
weight: an optional rescaling weight for each class; its size must match the number of classes.
size_average: bool (deprecated); True returns the mean of the losses, False returns their sum.
ignore_index: ignore a given class when computing the loss (its loss is 0); when averaging, samples of that class are excluded from both the numerator and the denominator.
reduce: bool (deprecated); whether the return value is reduced to a scalar, default True.
reduction: string, one of 'none' | 'mean' | 'sum'.
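A minimal sketch (random logits): torch.nn.CrossEntropyLoss expects raw, unnormalized scores and integer class indices, because it applies LogSoftmax and NLLLoss internally; there is no from_logits switch as in TensorFlow:

import torch
import torch.nn as nn

logits = torch.randn(4, 3)             # raw scores for 4 samples and 3 classes (no softmax)
target = torch.tensor([0, 2, 1, 1])    # integer class indices, not one-hot
loss = nn.CrossEntropyLoss()(logits, target)
print(loss.item())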
02 KL Divergence
When computing the loss between predictions and ground-truth labels, we want to close the gap between their distributions, i.e. the predicted distribution produced by the model should be as close as possible to the actual data distribution. KL divergence (relative entropy) measures the difference between two probability distributions. The model seeks a maximum-likelihood estimate; taking the negative log turns this into a minimization, which is equivalent to minimizing the KL divergence, so minimizing the KL divergence yields the maximum-likelihood solution. The KL divergence consists of two parts, a cross-entropy term and an entropy term: KL = cross-entropy - entropy. Entropy measures the amount of information needed to remove uncertainty; loosely speaking, it depends only on the true distribution, and that part is fixed, so optimizing the KL divergence amounts to optimizing the cross-entropy. The KL divergence is:

D_KL(p || q) = sum_x p(x) * log(p(x) / q(x))

Connecting this with the cross-entropy above, the formula can be rewritten as (KL divergence = cross-entropy - entropy):

D_KL(p || q) = H(p, q) - H(p)

In supervised learning the label of every training sample is known, so the true distribution is fixed and minimizing the KL divergence between the labels and the predictions is equivalent to minimizing the cross-entropy.
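A quick numeric check of KL = cross-entropy - entropy with two made-up distributions:

import numpy as np

p = np.array([0.7, 0.2, 0.1])     # "true" distribution (made up)
q = np.array([0.5, 0.3, 0.2])     # predicted distribution (made up)

kl = np.sum(p * np.log(p / q))    # KL divergence D_KL(p || q)
ce = np.sum(p * -np.log(q))       # cross-entropy H(p, q)
h = np.sum(p * -np.log(p))        # entropy H(p)
print(kl, ce - h)                 # the two printed values agree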
KLD | kullback_leibler_divergence[10]
tf.keras.losses.KLD(y_true, y_pred)
KLDivergence[11]
tf.keras.losses.KLDivergence(reduction=losses_utils.ReductionV2.AUTO, name='kl_divergence')
Parameters:
reduction: a tf.keras.losses.Reduction value, default AUTO; defines how the loss is reduced.
KLDivLoss[12]
torch.nn.KLDivLoss(size_average=None, reduce=None, reduction: str = 'mean', log_target: bool = False)
Parameters:
size_average: bool (deprecated); True returns the mean of the losses, False returns their sum.
reduce: bool (deprecated); whether the return value is reduced to a scalar, default True.
reduction: one of 'none' (no reduction), 'mean' (mean of the losses), 'sum' (sum of the losses); default 'mean'.
log_target: default False; specifies whether the target is passed in log space.
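A usage sketch (random tensors): a common pitfall is that KLDivLoss expects the input as log-probabilities and the target as plain probabilities (unless log_target=True):

import torch
import torch.nn as nn
import torch.nn.functional as F

pred_logits = torch.randn(2, 3)
target_probs = F.softmax(torch.randn(2, 3), dim=1)   # target as a probability distribution

kld = nn.KLDivLoss(reduction='batchmean')
loss = kld(F.log_softmax(pred_logits, dim=1), target_probs)   # input must be log-probabilities
print(loss.item())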
03 Mean Absolute Error (L1-norm Loss)
It minimizes the total absolute difference between the target value y and the estimate f(x):

loss = sum_i |y_i - f(x_i)|
The gradient is constant regardless of how close the prediction is to the target, which can easily cause divergence or overshooting the minimum. The derivative is not continuous at zero, which makes optimization harder; this is the main reason the L1 loss is not more widely used.
It converges faster than the L2 loss when the error is large (as a comparison of the two curves shows, L1 provides a larger and more stable gradient), and it is more robust to outliers, as the sketch below illustrates.
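A small comparison on made-up data with one outlier, showing how much more the outlier inflates the squared error than the absolute error:

import numpy as np

y_true = np.array([1.0, 1.2, 0.9, 1.1, 10.0])   # the last value is an outlier
y_pred = np.array([1.0, 1.1, 1.0, 1.0, 1.0])

mae = np.mean(np.abs(y_true - y_pred))    # L1: about 1.86, the outlier contributes ~1.8
mse = np.mean((y_true - y_pred) ** 2)     # L2: about 16.2, the outlier contributes ~16.2
print(mae, mse)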
MAE | mean_absolute_error[13]
tf.keras.losses.MAE(y_true, y_pred)
MeanAbsoluteError[14]
tf.keras.losses.MeanAbsoluteError(reduction=losses_utils.ReductionV2.AUTO, name='mean_absolute_error')
Parameters:
reduction: a tf.keras.losses.Reduction value, default AUTO; defines how the loss is reduced.
MeanAbsolutePercentageError[15]: mean absolute percentage error
tf.keras.losses.MeanAbsolutePercentageError(reduction=losses_utils.ReductionV2.AUTO, name='mean_absolute_percentage_error')
Formula: loss = 100 * abs(y_true - y_pred) / y_true
Parameters:
reduction: a tf.keras.losses.Reduction value, default AUTO; defines how the loss is reduced.
MAPE | mean_absolute_percentage_error[16]: mean absolute percentage error
tf.keras.losses.MAPE(y_true, y_pred)
Formula: loss = 100 * mean(abs((y_true - y_pred) / y_true), axis=-1)
Huber[17]
tf.keras.losses.Huber(delta=1.0, reduction=losses_utils.ReductionV2.AUTO, name='huber_loss')
Formula: for each element x of error = y_true - y_pred, loss = 0.5 * x^2 if |x| <= delta, otherwise loss = delta * |x| - 0.5 * delta^2
Parameters:
delta: float, the point at which the Huber loss changes from quadratic to linear.
reduction: a tf.keras.losses.Reduction value, default AUTO; defines how the loss is reduced.
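A minimal sketch (made-up values): with delta=1.0 the small errors are penalized quadratically while the large error is penalized only linearly:

import tensorflow as tf

y_true = [[0.], [2.], [4.]]
y_pred = [[0.5], [2.2], [1.0]]    # the last prediction has a large error

h = tf.keras.losses.Huber(delta=1.0)
print(h(y_true, y_pred).numpy())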
L1Loss[18]
torch.nn.L1Loss(size_average=None, reduce=None, reduction: str = 'mean')
Parameters:
size_average: bool (deprecated); True returns the mean of the losses, False returns their sum.
reduce: bool (deprecated); whether the return value is reduced to a scalar, default True.
reduction: one of 'none' (no reduction), 'mean' (mean of the losses), 'sum' (sum of the losses); default 'mean'.
l1_loss[19]
torch.nn.functional.l1_loss(input, target, size_average=None, reduce=None, reduction='mean')
SmoothL1Loss[20]: a smoothed version of the L1 loss, also known as the Huber loss (the two coincide when beta = delta = 1).

When |x - y| < beta, loss = 0.5 * (x - y)^2 / beta; otherwise loss = |x - y| - 0.5 * beta.
torch.nn.SmoothL1Loss(size_average=None, reduce=None, reduction: str = 'mean', beta: float = 1.0)
Parameters:
size_average: bool (deprecated); True returns the mean of the losses, False returns their sum.
reduce: bool (deprecated); whether the return value is reduced to a scalar, default True.
reduction: one of 'none' (no reduction), 'mean' (mean of the losses), 'sum' (sum of the losses); default 'mean'.
beta: default 1; the threshold at which the loss switches between the L2 and L1 branches.
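A small sketch (made-up values) comparing the two per-element losses; the small error is damped quadratically while the large error only loses the constant 0.5 * beta:

import torch
import torch.nn as nn

pred = torch.tensor([0.2, 3.0])
target = torch.tensor([0.0, 0.0])

print(nn.L1Loss(reduction='none')(pred, target))                   # tensor([0.2000, 3.0000])
print(nn.SmoothL1Loss(reduction='none', beta=1.0)(pred, target))   # tensor([0.0200, 2.5000])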
smooth_l1_loss[21]
torch.nn.functional.smooth_l1_loss(input, target, size_average=None, reduce=None, reduction='mean', beta=1.0)
04 Mean Squared Error Loss (L2-norm Loss)
It minimizes the sum of squared differences between the target value y and the estimate f(x):

loss = sum_i (y_i - f(x_i))^2
It converges more slowly than L1 near the optimum, because the gradient keeps shrinking as the prediction approaches the target. It is more sensitive to anomalous data than L1: because of the squaring, outliers produce very large losses.
On the other hand it makes training easier: since the gradient shrinks as the prediction approaches the target, it does not easily overshoot the minimum, although it can settle into a local optimum. Its derivative has a closed form, so optimization and implementation are straightforward, which is why many regression tasks use MSE as the loss function.
MeanSquaredError[22]
tf.keras.losses.MeanSquaredError(reduction=losses_utils.ReductionV2.AUTO, name='mean_squared_error')
Formula: loss = square(y_true - y_pred)
Parameters:
reduction: a tf.keras.losses.Reduction value, default AUTO; defines how the loss is reduced.
MSE | mean_squared_error[23]
tf.keras.losses.MSE(y_true, y_pred)
Formula: loss = mean(square(y_true - y_pred), axis=-1)
MeanSquaredLogarithmicError[24]
tf.keras.losses.MeanSquaredLogarithmicError(reduction=losses_utils.ReductionV2.AUTO, name='mean_squared_logarithmic_error')
Formula: loss = square(log(y_true + 1.) - log(y_pred + 1.))
Parameters:
reduction: a tf.keras.losses.Reduction value, default AUTO; defines how the loss is reduced.
MSLE | mean_squared_logarithmic_error[25]
tf.keras.losses.MSLE(y_true, y_pred)
Formula: loss = mean(square(log(y_true + 1) - log(y_pred + 1)), axis=-1)
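A small sketch (made-up values) contrasting MSE and MSLE: MSE is dominated by the large-magnitude sample, while MSLE compares ratios, so both samples contribute on a similar scale:

import tensorflow as tf

y_true = [[10.], [1000.]]
y_pred = [[15.], [1500.]]

print(tf.keras.losses.MSE(y_true, y_pred).numpy())    # per-sample: [25., 250000.]
print(tf.keras.losses.MSLE(y_true, y_pred).numpy())   # per-sample: roughly [0.14, 0.16]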
MSELoss[26]
torch.nn.MSELoss(size_average=None, reduce=None, reduction: str = 'mean')
Parameters:
size_average: bool (deprecated); True returns the mean of the losses, False returns their sum.
reduce: bool (deprecated); whether the return value is reduced to a scalar, default True.
reduction: one of 'none' (no reduction), 'mean' (mean of the losses), 'sum' (sum of the losses); default 'mean'.
mse_loss[27]
torch.nn.functional.mse_loss(input, target, size_average=None, reduce=None, reduction='mean')
05 Hinge Loss
The hinge loss is typically used for binary classification, with labels y in {-1, +1} and a real-valued prediction y_pred. The objective requires that when y_pred is greater than or equal to +1 or less than or equal to -1 (with the correct sign), the classifier is confident about the result and the loss is 0; when the prediction lies in (-1, 1), the classifier is uncertain about the result and the loss is non-zero. Clearly the loss reaches its maximum when y_pred = 0. For a label y, the loss of the current prediction y_pred is:

loss = max(0, 1 - y * y_pred)
CategoricalHinge[28]
tf.keras.losses.CategoricalHinge(reduction=losses_utils.ReductionV2.AUTO, name='categorical_hinge')
Formula: loss = maximum(neg - pos + 1, 0), where neg = maximum((1 - y_true) * y_pred) and pos = sum(y_true * y_pred)
Parameters:
reduction: a tf.keras.losses.Reduction value, default AUTO; defines how the loss is reduced.
categorical_hinge[29]
tf.keras.losses.categorical_hinge(y_true, y_pred)
Formula: loss = maximum(neg - pos + 1, 0), where neg = maximum((1 - y_true) * y_pred) and pos = sum(y_true * y_pred)
Hinge[30]
tf.keras.losses.Hinge(reduction=losses_utils.ReductionV2.AUTO, name='hinge')
Formula: loss = maximum(1 - y_true * y_pred, 0); y_true values should be -1 or 1. If binary (0 or 1) labels are provided, they are converted to -1 or 1.
Parameters:
reduction: a tf.keras.losses.Reduction value, default AUTO; defines how the loss is reduced.
hinge[31]
tf.keras.losses.hinge(y_true, y_pred)
Formula: loss = mean(maximum(1 - y_true * y_pred, 0), axis=-1)
SquaredHinge[32]
tf.keras.losses.SquaredHinge(reduction=losses_utils.ReductionV2.AUTO, name='squared_hinge')
Formula: loss = square(maximum(1 - y_true * y_pred, 0)); y_true values should be -1 or 1. If binary (0 or 1) labels are provided, they are converted to -1 or 1.
Parameters:
reduction: a tf.keras.losses.Reduction value, default AUTO; defines how the loss is reduced.
squared_hinge[33]
tf.keras.losses.squared_hinge(y_true, y_pred)
Formula: loss = mean(square(maximum(1 - y_true * y_pred, 0)), axis=-1)
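A minimal sketch with made-up -1/+1 labels and raw scores; only the predictions with margin below 1 incur a loss:

import tensorflow as tf

y_true = [[-1.], [1.], [1.]]
y_pred = [[-0.8], [0.3], [2.0]]

print(tf.keras.losses.hinge(y_true, y_pred).numpy())          # [0.2, 0.7, 0.0]
print(tf.keras.losses.squared_hinge(y_true, y_pred).numpy())  # [0.04, 0.49, 0.0]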
HingeEmbeddingLoss[34]: for each element, when y = 1 the loss is x, and when y = -1 the loss is max(0, margin - x), where x is the input (typically a distance between two samples).
torch.nn.HingeEmbeddingLoss(margin: float = 1.0, size_average=None, reduce=None, reduction: str = 'mean')
Parameters:
margin: float, default 1.
size_average: bool (deprecated); True returns the mean of the losses, False returns their sum.
reduce: bool (deprecated); whether the return value is reduced to a scalar, default True.
reduction: one of 'none' (no reduction), 'mean' (mean of the losses), 'sum' (sum of the losses); default 'mean'.
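A small sketch with made-up pairwise distances: similar pairs (y = 1) are charged their distance, dissimilar pairs (y = -1) are only charged if they come closer than the margin:

import torch
import torch.nn as nn

x = torch.tensor([0.3, 0.9, 1.5])     # e.g. distances between pairs of samples
y = torch.tensor([1., -1., -1.])      # 1 = similar pair, -1 = dissimilar pair

loss = nn.HingeEmbeddingLoss(margin=1.0, reduction='none')(x, y)
print(loss)   # tensor([0.3000, 0.1000, 0.0000])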
06 Cosine Similarity

cosine_similarity(A, B) = (A · B) / (||A|| * ||B||)
CosineSimilarity[35]: note that the resulting value is a negative number between -1 and 0, where 0 indicates orthogonality and values closer to -1 indicate greater similarity. If either y_true or y_pred is a zero vector, the cosine similarity is 0 regardless of how close the predictions and targets are.
tf.keras.losses.CosineSimilarity(axis=-1, reduction=losses_utils.ReductionV2.AUTO, name='cosine_similarity')
Formula: loss = -sum(l2_norm(y_true) * l2_norm(y_pred))
Parameters:
axis: default -1; the axis along which the cosine similarity is computed.
reduction: a tf.keras.losses.Reduction value, default AUTO; defines how the loss is reduced.
cosine_similarity[36]
tf.keras.losses.cosine_similarity(y_true, y_pred, axis=-1)
Formula: loss = -sum(l2_norm(y_true) * l2_norm(y_pred))
Parameters:
axis: default -1; the axis along which the cosine similarity is computed.
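A tiny sketch with made-up vectors: the first pair is orthogonal (contribution 0), the second pair is identical (contribution -1), and the default reduction averages them:

import tensorflow as tf

y_true = [[0., 1.], [1., 1.]]
y_pred = [[1., 0.], [1., 1.]]

cos = tf.keras.losses.CosineSimilarity(axis=-1)
print(cos(y_true, y_pred).numpy())   # -0.5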
CosineEmbeddingLoss[37]: when y = 1 the loss is 1 - cos(x1, x2), and when y = -1 the loss is max(0, cos(x1, x2) - margin).
torch.nn.CosineEmbeddingLoss(margin: float = 0.0, size_average=None, reduce=None, reduction: str = 'mean')
Parameters:
margin: float, should be between -1 and 1 (0 to 0.5 is suggested), default 0.
size_average: bool (deprecated); True returns the mean of the losses, False returns their sum.
reduce: bool (deprecated); whether the return value is reduced to a scalar, default True.
reduction: one of 'none' (no reduction), 'mean' (mean of the losses), 'sum' (sum of the losses); default 'mean'.
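A small sketch with made-up embeddings: the first pair should be similar but is orthogonal, the second should be dissimilar but is identical, so both are penalized:

import torch
import torch.nn as nn

x1 = torch.tensor([[1., 0.], [1., 1.]])
x2 = torch.tensor([[0., 1.], [1., 1.]])
y = torch.tensor([1., -1.])     # 1 = should be similar, -1 = should be dissimilar

loss = nn.CosineEmbeddingLoss(margin=0.0, reduction='none')(x1, x2, y)
print(loss)   # tensor([1., 1.])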
07 Summary
External links:
[1] https://www.tensorflow.org/api_docs/python/tf/keras/losses/BinaryCrossentropy
[2] https://www.tensorflow.org/api_docs/python/tf/keras/losses/binary_crossentropy
[3] https://www.tensorflow.org/api_docs/python/tf/keras/losses/CategoricalCrossentropy
[4] https://www.tensorflow.org/api_docs/python/tf/keras/losses/categorical_crossentropy
[5] https://www.tensorflow.org/api_docs/python/tf/keras/losses/SparseCategoricalCrossentropy
[6] https://www.tensorflow.org/api_docs/python/tf/keras/losses/sparse_categorical_crossentropy
[7] https://pytorch.org/docs/stable/generated/torch.nn.BCELoss.html
[8] https://pytorch.org/docs/stable/generated/torch.nn.BCEWithLogitsLoss.html
[9] https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html
[10] https://www.tensorflow.org/api_docs/python/tf/keras/losses/KLD
[11] https://www.tensorflow.org/api_docs/python/tf/keras/losses/KLDivergence
[12] https://pytorch.org/docs/stable/generated/torch.nn.KLDivLoss.html
[13] https://www.tensorflow.org/api_docs/python/tf/keras/losses/MAE
[14] https://www.tensorflow.org/api_docs/python/tf/keras/losses/MeanAbsoluteError
[15] https://www.tensorflow.org/api_docs/python/tf/keras/losses/MeanAbsolutePercentageError
[16] https://www.tensorflow.org/api_docs/python/tf/keras/losses/MAPE
[17] https://www.tensorflow.org/api_docs/python/tf/keras/losses/Huber
[18] https://pytorch.org/docs/stable/generated/torch.nn.L1Loss.html
[19] https://pytorch.org/docs/stable/nn.functional.html?highlight=loss#torch.nn.functional.l1_loss
[20] https://pytorch.org/docs/stable/generated/torch.nn.SmoothL1Loss.html
[21] https://pytorch.org/docs/stable/nn.functional.html?highlight=loss#torch.nn.functional.smooth_l1_loss
[22] https://www.tensorflow.org/api_docs/python/tf/keras/losses/MeanSquaredError
[23] https://www.tensorflow.org/api_docs/python/tf/keras/losses/MSE
[24] https://www.tensorflow.org/api_docs/python/tf/keras/losses/MeanSquaredLogarithmicError
[25] https://www.tensorflow.org/api_docs/python/tf/keras/losses/MSLE
[26] https://pytorch.org/docs/stable/generated/torch.nn.MSELoss.html
[27] https://pytorch.org/docs/stable/nn.functional.html?highlight=loss#torch.nn.functional.mse_loss
[28] https://www.tensorflow.org/api_docs/python/tf/keras/losses/CategoricalHinge
[29] https://www.tensorflow.org/api_docs/python/tf/keras/losses/categorical_hinge
[30] https://www.tensorflow.org/api_docs/python/tf/keras/losses/Hinge
[31] https://www.tensorflow.org/api_docs/python/tf/keras/losses/hinge
[32] https://www.tensorflow.org/api_docs/python/tf/keras/losses/SquaredHinge
[33] https://www.tensorflow.org/api_docs/python/tf/keras/losses/squared_hinge
[34] https://pytorch.org/docs/stable/generated/torch.nn.HingeEmbeddingLoss.html
[35] https://www.tensorflow.org/api_docs/python/tf/keras/losses/CosineSimilarity
[36] https://www.tensorflow.org/api_docs/python/tf/keras/losses/cosine_similarity
[37] https://pytorch.org/docs/stable/generated/torch.nn.CosineEmbeddingLoss.html