[Data Competition] A Kaggle trick: using the Sigmoid function for regression problems!
Designing a regression loss function around a sigmoid output

This is a very interesting loss design. Whenever your task is a regression problem it is worth a try. It is not guaranteed to work on every problem, but on certain problems it can bring a large improvement, and at worst it gives you one more model for later stacking.
The design comes from data scientist danzel, who describes why it works as follows:
I used a sigmoid-output and scaled its range afterwards (to look like the target). Training like this helps the model to converge faster and gives better results.

Suppose our regression problem minimizes the squared loss, and the $i$-th label is $y_i$, $i = 1, \dots, N$, where $N$ is the number of samples.
1. Baseline loss
The output layer is usually Dense(1, activation = 'linear'), i.e. the prediction $\hat{y}_i$ is an unbounded linear output and we minimize $L = \frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2$.
2. Sigmoid-based loss
The output layer is Dense(1, activation = 'sigmoid') * (max_val - min_val) + min_val, i.e. $\hat{y}_i = \sigma(z_i)\,(\text{max\_val} - \text{min\_val}) + \text{min\_val}$, where $z_i$ is the raw network output and min_val / max_val are the minimum and maximum of the training targets; the same squared loss is then minimized on this rescaled prediction. A minimal sketch of the two output heads is shown below.
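To make the two formulations concrete, here is a minimal Keras sketch of the two output heads. The function name build_head, the hidden size, and the min_val/max_val arguments are illustrative assumptions, not the exact model used later in this post:
import tensorflow as tf
from tensorflow.keras.layers import Dense, Input, Lambda
from tensorflow.keras.models import Model

def build_head(input_dim, min_val, max_val, use_sigmoid=True):
    # min_val / max_val are illustrative names for the min/max of the training target
    inp = Input(shape=(input_dim,))
    x = Dense(64, activation='relu')(inp)  # hidden size here is arbitrary
    if use_sigmoid:
        # sigmoid output lives in (0, 1); rescale it back to the target range
        out = Dense(1, activation='sigmoid')(x)
        out = Lambda(lambda t: t * (max_val - min_val) + min_val)(out)
    else:
        # baseline: plain linear output
        out = Dense(1, activation='linear')(x)
    return Model(inp, out)
The only difference between the two heads is that the sigmoid variant squashes the raw output into (0, 1) and then linearly rescales it into the target range.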

Does the claim above actually hold up? Let's take a Kaggle dataset and run the experiment; seeing is believing. Interested readers can download the data from the links at the end of this post.
1. Import packages
1.1 Import the packages we use
import pandas                as pd
from sklearn.metrics         import mean_squared_error
from sklearn.model_selection import KFold
import xgboost               as xgb
from tqdm                    import tqdm
import numpy                 as np
import tensorflow            as tf
from lightgbm                import LGBMRegressor
import seaborn               as sns

def RMSE(y_true, y_pred):
    return tf.sqrt(tf.reduce_mean(tf.square(y_true - y_pred)))
1.2 Load the data
train = pd.read_csv('./data/train.csv')
test  = pd.read_csv('./data/test.csv')
sub   = pd.read_csv('./data/sample_submission.csv')
2. Data preprocessing
2.1 Concatenate train and test
train_test = pd.concat([train, test], axis=0, ignore_index=True)
train_test.head()
|   | id | cont1 | cont2 | cont3 | cont4 | cont5 | cont6 | cont7 | cont8 | cont9 | cont10 | cont11 | cont12 | cont13 | cont14 | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0.670390 | 0.811300 | 0.643968 | 0.291791 | 0.284117 | 0.855953 | 0.890700 | 0.285542 | 0.558245 | 0.779418 | 0.921832 | 0.866772 | 0.878733 | 0.305411 | 7.243043 |
| 1 | 3 | 0.388053 | 0.621104 | 0.686102 | 0.501149 | 0.643790 | 0.449805 | 0.510824 | 0.580748 | 0.418335 | 0.432632 | 0.439872 | 0.434971 | 0.369957 | 0.369484 | 8.203331 |
| 2 | 4 | 0.834950 | 0.227436 | 0.301584 | 0.293408 | 0.606839 | 0.829175 | 0.506143 | 0.558771 | 0.587603 | 0.823312 | 0.567007 | 0.677708 | 0.882938 | 0.303047 | 7.776091 |
| 3 | 5 | 0.820708 | 0.160155 | 0.546887 | 0.726104 | 0.282444 | 0.785108 | 0.752758 | 0.823267 | 0.574466 | 0.580843 | 0.769594 | 0.818143 | 0.914281 | 0.279528 | 6.957716 |
| 4 | 8 | 0.935278 | 0.421235 | 0.303801 | 0.880214 | 0.665610 | 0.830131 | 0.487113 | 0.604157 | 0.874658 | 0.863427 | 0.983575 | 0.900464 | 0.935918 | 0.435772 | 7.951046 |
2.2 GaussRank preprocessing for the neural network
For the details, see the RankGauss write-up shared previously.
import numpy as np
from joblib import Parallel, delayed
from scipy.interpolate import interp1d
from scipy.special import erf, erfinv
from sklearn.preprocessing import QuantileTransformer, PowerTransformer
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.utils.validation import FLOAT_DTYPES, check_array, check_is_fitted


class GaussRankScaler(BaseEstimator, TransformerMixin):
    """Transform features by scaling each feature to a normal distribution.

    Parameters
    ----------
    epsilon : float, optional, default 1e-4
        A small amount added to the lower bound or subtracted
        from the upper bound. This value prevents infinite number
        from occurring when applying the inverse error function.
    copy : boolean, optional, default True
        If False, try to avoid a copy and do inplace scaling instead.
        This is not guaranteed to always work inplace; e.g. if the data is
        not a NumPy array, a copy may still be returned.
    n_jobs : int or None, optional, default None
        Number of jobs to run in parallel.
        ``None`` means 1 and ``-1`` means using all processors.
    interp_kind : str or int, optional, default 'linear'
        Specifies the kind of interpolation as a string
        ('linear', 'nearest', 'zero', 'slinear', 'quadratic', 'cubic',
        'previous', 'next', where 'zero', 'slinear', 'quadratic' and 'cubic'
        refer to a spline interpolation of zeroth, first, second or third
        order; 'previous' and 'next' simply return the previous or next value
        of the point) or as an integer specifying the order of the spline
        interpolator to use.
    interp_copy : bool, optional, default False
        If True, the interpolation function makes internal copies of x and y.
        If False, references to `x` and `y` are used.

    Attributes
    ----------
    interp_func_ : list
        The interpolation function for each feature in the training set.
    """

    def __init__(self, epsilon=1e-4, copy=True, n_jobs=None, interp_kind='linear', interp_copy=False):
        self.epsilon     = epsilon
        self.copy        = copy
        self.interp_kind = interp_kind
        self.interp_copy = interp_copy
        self.fill_value  = 'extrapolate'
        self.n_jobs      = n_jobs

    def fit(self, X, y=None):
        """Fit interpolation function to link rank with original data for future scaling

        Parameters
        ----------
        X : array-like, shape (n_samples, n_features)
            The data used to fit interpolation function for later scaling along the features axis.
        y
            Ignored
        """
        X = check_array(X, copy=self.copy, estimator=self, dtype=FLOAT_DTYPES, force_all_finite=True)
        self.interp_func_ = Parallel(n_jobs=self.n_jobs)(delayed(self._fit)(x) for x in X.T)
        return self

    def _fit(self, x):
        x = self.drop_duplicates(x)
        rank = np.argsort(np.argsort(x))
        bound = 1.0 - self.epsilon
        factor = np.max(rank) / 2.0 * bound
        scaled_rank = np.clip(rank / factor - bound, -bound, bound)
        return interp1d(
            x, scaled_rank, kind=self.interp_kind, copy=self.interp_copy, fill_value=self.fill_value)

    def transform(self, X, copy=None):
        """Scale the data with the Gauss Rank algorithm

        Parameters
        ----------
        X : array-like, shape (n_samples, n_features)
            The data used to scale along the features axis.
        copy : bool, optional (default: None)
            Copy the input X or not.
        """
        check_is_fitted(self, 'interp_func_')
        copy = copy if copy is not None else self.copy
        X = check_array(X, copy=copy, estimator=self, dtype=FLOAT_DTYPES, force_all_finite=True)
        X = np.array(Parallel(n_jobs=self.n_jobs)(delayed(self._transform)(i, x) for i, x in enumerate(X.T))).T
        return X

    def _transform(self, i, x):
        return erfinv(self.interp_func_[i](x))

    def inverse_transform(self, X, copy=None):
        """Scale back the data to the original representation

        Parameters
        ----------
        X : array-like, shape [n_samples, n_features]
            The data used to scale along the features axis.
        copy : bool, optional (default: None)
            Copy the input X or not.
        """
        check_is_fitted(self, 'interp_func_')
        copy = copy if copy is not None else self.copy
        X = check_array(X, copy=copy, estimator=self, dtype=FLOAT_DTYPES, force_all_finite=True)
        X = np.array(Parallel(n_jobs=self.n_jobs)(delayed(self._inverse_transform)(i, x) for i, x in enumerate(X.T))).T
        return X

    def _inverse_transform(self, i, x):
        inv_interp_func = interp1d(self.interp_func_[i].y, self.interp_func_[i].x, kind=self.interp_kind,
                                   copy=self.interp_copy, fill_value=self.fill_value)
        return inv_interp_func(erf(x))

    @staticmethod
    def drop_duplicates(x):
        is_unique = np.zeros_like(x, dtype=bool)
        is_unique[np.unique(x, return_index=True)[1]] = True
        return x[is_unique]
2.3 Apply the GaussRank transform
feature_names = ['cont1', 'cont2', 'cont3', 'cont4', 'cont5', 'cont6', 'cont7', 'cont8', 'cont9', 'cont10', 'cont11', 'cont12', 'cont13', 'cont14']
scaler_linear = GaussRankScaler(interp_kind='linear')
for c in feature_names:
    train_test[c + '_linear_grank'] = scaler_linear.fit_transform(train_test[c].values.reshape(-1, 1))

gaussian_linear_feature_names = [c + '_linear_grank' for c in feature_names]
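As an optional sanity check (not in the original notebook), one can confirm that the transformed columns are centered near zero and roughly bell-shaped; the exact scale depends on the erfinv mapping used above:
# optional sanity check on the GaussRank-transformed columns
print(train_test[gaussian_linear_feature_names].describe().T[['mean', 'std', 'min', 'max']])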
3. Build the NN models
from tensorflow.keras import regularizers
from sklearn.model_selection import KFold, StratifiedKFold
import tensorflow as tf
# import tensorflow_addons as tfa
import tensorflow.keras.backend as K
from tensorflow.keras.layers import *
from tensorflow.keras.models import *
from tensorflow.keras.optimizers import *
from tensorflow.keras.callbacks import *
from tensorflow.keras.layers import Input
import os
3.1 Train / validation split
Split the data into training and validation folds with KFold (shuffle is left off here, so the folds are sequential).
tr = train_test.iloc[:train.shape[0], :].copy()
te = train_test.iloc[train.shape[0]:, :].copy()

# shuffle=False, so a random_state would have no effect and is omitted here
kf  = KFold(n_splits=5, shuffle=False)
cnt = 0
for trn_idx, test_idx in kf.split(tr, tr['target']):
    # skip the first fold and keep the second one as the held-out validation fold
    if cnt == 0:
        cnt += 1
        continue
    X_tr_gbdt, X_val_gbdt = tr[feature_names].iloc[trn_idx], tr[feature_names].iloc[test_idx]
    X_tr_dnn_linear_gaussian, X_val_dnn_linear_gaussian = tr[gaussian_linear_feature_names].iloc[trn_idx], tr[gaussian_linear_feature_names].iloc[test_idx]
    y_tr, y_val = tr['target'].iloc[trn_idx], tr['target'].iloc[test_idx]
    break
3.2 MLP model (sigmoid output): 0.7108
Regression with a sigmoid output rescaled to the target range.
class MLP_Model(tf.keras.Model):
    def __init__(self):
        super(MLP_Model, self).__init__()
        self.dense1    = Dense(1000, activation='relu')
        self.drop1     = Dropout(0.25)
        self.dense2    = Dense(500, activation='relu')
        self.drop2     = Dropout(0.25)
        self.dense_out = Dense(1, activation='sigmoid')

    def call(self, inputs):
        # min/max of the training target, used to rescale the sigmoid output
        min_target = 0
        max_target = 10.26757
        x1 = self.dense1(inputs)
        x1 = self.drop1(x1)
        x2 = self.dense2(x1)
        x2 = self.drop2(x2)
        outputs = self.dense_out(x2)
        # sigmoid output in (0, 1), rescaled back to the target range
        outputs = outputs * (max_target - min_target) + min_target
        return outputs
import time

def RMSE(y_true, y_pred):
    return tf.sqrt(tf.reduce_mean(tf.square(y_true - y_pred)))

model = MLP_Model()
adam  = tf.optimizers.Adam(lr=1e-3)
model.compile(optimizer=adam, loss=RMSE)

K.clear_session()
model_weights  = './models/model_gauss_mlp_mlp.h5'
checkpoint     = ModelCheckpoint(model_weights, monitor='loss', verbose=0, save_best_only=True, mode='min',
                                 save_weights_only=True)
plateau        = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=10, verbose=1, min_delta=1e-4, mode='min')
early_stopping = EarlyStopping(monitor="val_loss", patience=25)

history = model.fit(X_tr_dnn_linear_gaussian.values, y_tr.values,
                    validation_data=(X_val_dnn_linear_gaussian.values, y_val.values),
                    batch_size=1024, epochs=100,
                    callbacks=[plateau, checkpoint, early_stopping],
                    verbose=2)
Train on 240000 samples, validate on 60000 samples
Epoch 1/100
240000/240000 - 1s - loss: 0.8020 - val_loss: 0.7203
Epoch 2/100
240000/240000 - 0s - loss: 0.7345 - val_loss: 0.7225
Epoch 3/100
240000/240000 - 0s - loss: 0.7290 - val_loss: 0.7183
Epoch 4/100
240000/240000 - 0s - loss: 0.7270 - val_loss: 0.7197
Epoch 5/100
240000/240000 - 0s - loss: 0.7247 - val_loss: 0.7170
Epoch 6/100
240000/240000 - 0s - loss: 0.7232 - val_loss: 0.7190
Epoch 7/100
240000/240000 - 0s - loss: 0.7227 - val_loss: 0.7157
Epoch 8/100
240000/240000 - 0s - loss: 0.7205 - val_loss: 0.7215
Epoch 9/100
240000/240000 - 0s - loss: 0.7199 - val_loss: 0.7144
Epoch 10/100
240000/240000 - 0s - loss: 0.7185 - val_loss: 0.7148
Epoch 11/100
240000/240000 - 0s - loss: 0.7175 - val_loss: 0.7176
Epoch 12/100
240000/240000 - 0s - loss: 0.7170 - val_loss: 0.7147
Epoch 13/100
240000/240000 - 0s - loss: 0.7165 - val_loss: 0.7142
Epoch 14/100
240000/240000 - 0s - loss: 0.7157 - val_loss: 0.7140
Epoch 15/100
240000/240000 - 0s - loss: 0.7150 - val_loss: 0.7132
Epoch 16/100
240000/240000 - 0s - loss: 0.7145 - val_loss: 0.7127
Epoch 17/100
240000/240000 - 0s - loss: 0.7136 - val_loss: 0.7127
Epoch 18/100
240000/240000 - 0s - loss: 0.7131 - val_loss: 0.7124
Epoch 19/100
240000/240000 - 0s - loss: 0.7126 - val_loss: 0.7165
Epoch 20/100
240000/240000 - 0s - loss: 0.7120 - val_loss: 0.7130
Epoch 21/100
240000/240000 - 0s - loss: 0.7116 - val_loss: 0.7119
Epoch 22/100
240000/240000 - 0s - loss: 0.7111 - val_loss: 0.7129
Epoch 23/100
240000/240000 - 0s - loss: 0.7104 - val_loss: 0.7129
Epoch 24/100
240000/240000 - 0s - loss: 0.7102 - val_loss: 0.7136
Epoch 25/100
240000/240000 - 0s - loss: 0.7097 - val_loss: 0.7120
Epoch 26/100
240000/240000 - 0s - loss: 0.7089 - val_loss: 0.7126
Epoch 27/100
240000/240000 - 0s - loss: 0.7084 - val_loss: 0.7154
Epoch 28/100
240000/240000 - 0s - loss: 0.7078 - val_loss: 0.7111
Epoch 29/100
240000/240000 - 0s - loss: 0.7075 - val_loss: 0.7132
Epoch 30/100
240000/240000 - 0s - loss: 0.7074 - val_loss: 0.7126
Epoch 31/100
240000/240000 - 0s - loss: 0.7062 - val_loss: 0.7129
Epoch 32/100
240000/240000 - 0s - loss: 0.7059 - val_loss: 0.7119
Epoch 33/100
240000/240000 - 0s - loss: 0.7054 - val_loss: 0.7135
Epoch 34/100
240000/240000 - 0s - loss: 0.7048 - val_loss: 0.7108
Epoch 35/100
240000/240000 - 0s - loss: 0.7048 - val_loss: 0.7116
Epoch 36/100
240000/240000 - 0s - loss: 0.7037 - val_loss: 0.7161
Epoch 37/100
240000/240000 - 0s - loss: 0.7034 - val_loss: 0.7131
Epoch 38/100
240000/240000 - 0s - loss: 0.7031 - val_loss: 0.7148
Epoch 39/100
240000/240000 - 0s - loss: 0.7022 - val_loss: 0.7113
Epoch 40/100
240000/240000 - 0s - loss: 0.7013 - val_loss: 0.7117
Epoch 41/100
240000/240000 - 0s - loss: 0.7012 - val_loss: 0.7124
Epoch 42/100
240000/240000 - 0s - loss: 0.7008 - val_loss: 0.7116
Epoch 43/100
240000/240000 - 0s - loss: 0.7001 - val_loss: 0.7124
Epoch 44/100
Epoch 00044: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
240000/240000 - 0s - loss: 0.6995 - val_loss: 0.7113
Epoch 45/100
240000/240000 - 0s - loss: 0.6962 - val_loss: 0.7116
Epoch 46/100
240000/240000 - 0s - loss: 0.6954 - val_loss: 0.7118
Epoch 47/100
240000/240000 - 0s - loss: 0.6940 - val_loss: 0.7116
Epoch 48/100
240000/240000 - 0s - loss: 0.6938 - val_loss: 0.7120
Epoch 49/100
240000/240000 - 0s - loss: 0.6930 - val_loss: 0.7118
Epoch 50/100
240000/240000 - 0s - loss: 0.6927 - val_loss: 0.7123
Epoch 51/100
240000/240000 - 0s - loss: 0.6920 - val_loss: 0.7123
Epoch 52/100
240000/240000 - 0s - loss: 0.6915 - val_loss: 0.7125
Epoch 53/100
240000/240000 - 0s - loss: 0.6912 - val_loss: 0.7144
Epoch 54/100
Epoch 00054: ReduceLROnPlateau reducing learning rate to 0.0002500000118743628.
240000/240000 - 0s - loss: 0.6905 - val_loss: 0.7146
Epoch 55/100
240000/240000 - 0s - loss: 0.6885 - val_loss: 0.7123
Epoch 56/100
240000/240000 - 0s - loss: 0.6874 - val_loss: 0.7135
Epoch 57/100
240000/240000 - 0s - loss: 0.6872 - val_loss: 0.7136
Epoch 58/100
240000/240000 - 0s - loss: 0.6868 - val_loss: 0.7138
Epoch 59/100
240000/240000 - 0s - loss: 0.6863 - val_loss: 0.7134
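For reference, here is a short snippet (not part of the original run) for reloading the checkpointed weights and scoring the held-out fold; note that the checkpoint above monitors the training loss, so this simply reports the validation RMSE of those saved weights:
# reload the checkpointed weights (best training loss) and score the validation fold
model.load_weights(model_weights)
val_pred = model.predict(X_val_dnn_linear_gaussian.values, batch_size=1024).reshape(-1)
print('validation RMSE:', np.sqrt(mean_squared_error(y_val, val_pred)))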
3.3 MLP model (linear output): 0.7137
class MLP_Model(tf.keras.Model):
    def __init__(self):
        super(MLP_Model, self).__init__()
        self.dense1    = Dense(1000, activation='relu')
        self.drop1     = Dropout(0.25)
        self.dense2    = Dense(500, activation='relu')
        self.drop2     = Dropout(0.25)
        self.dense_out = Dense(1)  # plain linear output head

    def call(self, inputs):
        x1 = self.dense1(inputs)
        x1 = self.drop1(x1)
        x2 = self.dense2(x1)
        x2 = self.drop2(x2)
        outputs = self.dense_out(x2)
        return outputs
model = MLP_Model()
adam  = tf.optimizers.Adam(lr=1e-3)
model.compile(optimizer=adam, loss=RMSE)

K.clear_session()
model_weights  = './models/model_gauss_mlp_mlp.h5'
checkpoint     = ModelCheckpoint(model_weights, monitor='loss', verbose=0, save_best_only=True, mode='min',
                                 save_weights_only=True)
plateau        = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=10, verbose=1, min_delta=1e-4, mode='min')
early_stopping = EarlyStopping(monitor="val_loss", patience=25)

history = model.fit(X_tr_dnn_linear_gaussian.values, y_tr.values,
                    validation_data=(X_val_dnn_linear_gaussian.values, y_val.values),
                    batch_size=1024, epochs=100,
                    callbacks=[plateau, checkpoint, early_stopping],
                    verbose=2)
Train on 240000 samples, validate on 60000 samples
Epoch 1/100
240000/240000 - 1s - loss: 1.3292 - val_loss: 0.7767
Epoch 2/100
240000/240000 - 0s - loss: 0.8163 - val_loss: 0.7251
Epoch 3/100
240000/240000 - 0s - loss: 0.8072 - val_loss: 0.7251
Epoch 4/100
240000/240000 - 0s - loss: 0.8040 - val_loss: 0.7496
Epoch 5/100
240000/240000 - 0s - loss: 0.7997 - val_loss: 0.7324
Epoch 6/100
240000/240000 - 0s - loss: 0.7982 - val_loss: 0.7271
Epoch 7/100
240000/240000 - 0s - loss: 0.7936 - val_loss: 0.7202
Epoch 8/100
240000/240000 - 0s - loss: 0.7950 - val_loss: 0.7249
Epoch 9/100
240000/240000 - 0s - loss: 0.7914 - val_loss: 0.7284
Epoch 10/100
240000/240000 - 0s - loss: 0.7882 - val_loss: 0.7313
Epoch 11/100
240000/240000 - 0s - loss: 0.7886 - val_loss: 0.7303
Epoch 12/100
240000/240000 - 0s - loss: 0.7857 - val_loss: 0.7292
Epoch 13/100
240000/240000 - 0s - loss: 0.7855 - val_loss: 0.7257
Epoch 14/100
240000/240000 - 0s - loss: 0.7847 - val_loss: 0.7204
Epoch 15/100
240000/240000 - 0s - loss: 0.7825 - val_loss: 0.7224
Epoch 16/100
240000/240000 - 0s - loss: 0.7813 - val_loss: 0.7220
Epoch 17/100
Epoch 00017: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
240000/240000 - 0s - loss: 0.7808 - val_loss: 0.7208
Epoch 18/100
240000/240000 - 0s - loss: 0.7752 - val_loss: 0.7187
Epoch 19/100
240000/240000 - 0s - loss: 0.7743 - val_loss: 0.7234
Epoch 20/100
240000/240000 - 0s - loss: 0.7730 - val_loss: 0.7190
Epoch 21/100
240000/240000 - 0s - loss: 0.7750 - val_loss: 0.7196
Epoch 22/100
240000/240000 - 0s - loss: 0.7742 - val_loss: 0.7286
Epoch 23/100
240000/240000 - 0s - loss: 0.7722 - val_loss: 0.7198
Epoch 24/100
240000/240000 - 0s - loss: 0.7720 - val_loss: 0.7227
Epoch 25/100
240000/240000 - 0s - loss: 0.7724 - val_loss: 0.7176
Epoch 26/100
240000/240000 - 0s - loss: 0.7705 - val_loss: 0.7194
Epoch 27/100
240000/240000 - 0s - loss: 0.7689 - val_loss: 0.7206
Epoch 28/100
240000/240000 - 0s - loss: 0.7696 - val_loss: 0.7168
Epoch 29/100
240000/240000 - 0s - loss: 0.7695 - val_loss: 0.7171
Epoch 30/100
240000/240000 - 0s - loss: 0.7681 - val_loss: 0.7164
Epoch 31/100
240000/240000 - 0s - loss: 0.7676 - val_loss: 0.7225
Epoch 32/100
240000/240000 - 0s - loss: 0.7681 - val_loss: 0.7177
Epoch 33/100
240000/240000 - 0s - loss: 0.7660 - val_loss: 0.7198
Epoch 34/100
240000/240000 - 0s - loss: 0.7668 - val_loss: 0.7202
Epoch 35/100
240000/240000 - 0s - loss: 0.7653 - val_loss: 0.7160
Epoch 36/100
240000/240000 - 0s - loss: 0.7647 - val_loss: 0.7248
Epoch 37/100
240000/240000 - 0s - loss: 0.7638 - val_loss: 0.7173
Epoch 38/100
240000/240000 - 0s - loss: 0.7626 - val_loss: 0.7197
Epoch 39/100
240000/240000 - 0s - loss: 0.7624 - val_loss: 0.7182
Epoch 40/100
240000/240000 - 0s - loss: 0.7615 - val_loss: 0.7195
Epoch 41/100
240000/240000 - 0s - loss: 0.7621 - val_loss: 0.7195
Epoch 42/100
240000/240000 - 0s - loss: 0.7616 - val_loss: 0.7192
Epoch 43/100
240000/240000 - 0s - loss: 0.7604 - val_loss: 0.7162
Epoch 44/100
240000/240000 - 0s - loss: 0.7592 - val_loss: 0.7152
Epoch 45/100
240000/240000 - 0s - loss: 0.7600 - val_loss: 0.7193
Epoch 46/100
240000/240000 - 0s - loss: 0.7594 - val_loss: 0.7206
Epoch 47/100
240000/240000 - 0s - loss: 0.7578 - val_loss: 0.7201
Epoch 48/100
240000/240000 - 0s - loss: 0.7583 - val_loss: 0.7164
Epoch 49/100
240000/240000 - 0s - loss: 0.7581 - val_loss: 0.7163
Epoch 50/100
240000/240000 - 0s - loss: 0.7572 - val_loss: 0.7163
Epoch 51/100
240000/240000 - 0s - loss: 0.7554 - val_loss: 0.7166
Epoch 52/100
240000/240000 - 0s - loss: 0.7564 - val_loss: 0.7212
Epoch 53/100
240000/240000 - 0s - loss: 0.7560 - val_loss: 0.7156
Epoch 54/100
Epoch 00054: ReduceLROnPlateau reducing learning rate to 0.0002500000118743628.
240000/240000 - 0s - loss: 0.7547 - val_loss: 0.7180
Epoch 55/100
240000/240000 - 0s - loss: 0.7530 - val_loss: 0.7154
Epoch 56/100
240000/240000 - 0s - loss: 0.7534 - val_loss: 0.7150
Epoch 57/100
240000/240000 - 0s - loss: 0.7531 - val_loss: 0.7148
Epoch 58/100
240000/240000 - 0s - loss: 0.7530 - val_loss: 0.7156
Epoch 59/100
240000/240000 - 0s - loss: 0.7523 - val_loss: 0.7166
Epoch 60/100
240000/240000 - 0s - loss: 0.7522 - val_loss: 0.7152
Epoch 61/100
240000/240000 - 0s - loss: 0.7520 - val_loss: 0.7155
Epoch 62/100
240000/240000 - 0s - loss: 0.7514 - val_loss: 0.7148
Epoch 63/100
240000/240000 - 0s - loss: 0.7514 - val_loss: 0.7149
Epoch 64/100
240000/240000 - 0s - loss: 0.7506 - val_loss: 0.7156
Epoch 65/100
240000/240000 - 0s - loss: 0.7508 - val_loss: 0.7150
Epoch 66/100
240000/240000 - 0s - loss: 0.7516 - val_loss: 0.7154
Epoch 67/100
Epoch 00067: ReduceLROnPlateau reducing learning rate to 0.0001250000059371814.
240000/240000 - 0s - loss: 0.7507 - val_loss: 0.7153
Epoch 68/100
240000/240000 - 0s - loss: 0.7502 - val_loss: 0.7149
Epoch 69/100
240000/240000 - 0s - loss: 0.7497 - val_loss: 0.7147
Epoch 70/100
240000/240000 - 0s - loss: 0.7496 - val_loss: 0.7148
Epoch 71/100
240000/240000 - 0s - loss: 0.7502 - val_loss: 0.7142
Epoch 72/100
240000/240000 - 0s - loss: 0.7492 - val_loss: 0.7148
Epoch 73/100
240000/240000 - 0s - loss: 0.7487 - val_loss: 0.7148
Epoch 74/100
240000/240000 - 0s - loss: 0.7485 - val_loss: 0.7143
Epoch 75/100
240000/240000 - 0s - loss: 0.7496 - val_loss: 0.7154
Epoch 76/100
240000/240000 - 0s - loss: 0.7482 - val_loss: 0.7144
Epoch 77/100
240000/240000 - 0s - loss: 0.7488 - val_loss: 0.7142
Epoch 78/100
240000/240000 - 0s - loss: 0.7492 - val_loss: 0.7145
Epoch 79/100
240000/240000 - 0s - loss: 0.7483 - val_loss: 0.7143
Epoch 80/100
240000/240000 - 0s - loss: 0.7478 - val_loss: 0.7143
Epoch 81/100
Epoch 00081: ReduceLROnPlateau reducing learning rate to 6.25000029685907e-05.
240000/240000 - 0s - loss: 0.7481 - val_loss: 0.7143
Epoch 82/100
240000/240000 - 0s - loss: 0.7480 - val_loss: 0.7146
Epoch 83/100
240000/240000 - 0s - loss: 0.7477 - val_loss: 0.7141
Epoch 84/100
240000/240000 - 0s - loss: 0.7471 - val_loss: 0.7139
Epoch 85/100
240000/240000 - 0s - loss: 0.7475 - val_loss: 0.7140
Epoch 86/100
240000/240000 - 0s - loss: 0.7473 - val_loss: 0.7141
Epoch 87/100
240000/240000 - 0s - loss: 0.7469 - val_loss: 0.7141
Epoch 88/100
240000/240000 - 0s - loss: 0.7474 - val_loss: 0.7148
Epoch 89/100
240000/240000 - 0s - loss: 0.7467 - val_loss: 0.7138
Epoch 90/100
240000/240000 - 0s - loss: 0.7466 - val_loss: 0.7142
Epoch 91/100
240000/240000 - 0s - loss: 0.7460 - val_loss: 0.7141
Epoch 92/100
240000/240000 - 0s - loss: 0.7465 - val_loss: 0.7138
Epoch 93/100
240000/240000 - 0s - loss: 0.7469 - val_loss: 0.7142
Epoch 94/100
240000/240000 - 0s - loss: 0.7467 - val_loss: 0.7141
Epoch 95/100
240000/240000 - 0s - loss: 0.7465 - val_loss: 0.7148
Epoch 96/100
240000/240000 - 0s - loss: 0.7465 - val_loss: 0.7138
Epoch 97/100
240000/240000 - 0s - loss: 0.7461 - val_loss: 0.7138
Epoch 98/100
240000/240000 - 0s - loss: 0.7456 - val_loss: 0.7140
Epoch 99/100
Epoch 00099: ReduceLROnPlateau reducing learning rate to 3.125000148429535e-05.
240000/240000 - 0s - loss: 0.7463 - val_loss: 0.7139
Epoch 100/100
240000/240000 - 0s - loss: 0.7461 - val_loss: 0.7137
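To round things off, a hedged sketch (not in the original notebook) of predicting on the test rows and filling the sample submission; it assumes sub follows the usual id/target layout of this competition, and model here is whichever MLP was trained last (the linear one above):
# predict on the test rows using the same GaussRank features
test_pred = model.predict(te[gaussian_linear_feature_names].values, batch_size=1024).reshape(-1)
sub['target'] = test_pred  # assumes sample_submission.csv has an 'id'/'target' layout
sub.to_csv('./submission_mlp.csv', index=False)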

On this dataset the sigmoid-scaled head both converges faster (compare the first-epoch losses above) and ends with a better validation RMSE, roughly 0.7108 versus 0.7137 for the plain linear head, so the trick is at least worth trying or keeping as a stacking input.
Data and original discussion: https://www.kaggle.com/c/tabular-playground-series-jan-2021/data https://www.kaggle.com/c/tabular-playground-series-jan-2021/discussion/216037
