Master Manual Hyperparameter Optimization of Machine Learning Models in 5 Minutes
Stochastic optimization algorithms can be used in place of grid search and random search for hyperparameter optimization. In this tutorial you will discover how to tune the hyperparameters of the Perceptron algorithm using stochastic hill climbing, and how to manually optimize the hyperparameters of the XGBoost gradient boosting algorithm.
This tutorial covers three parts:
Manual hyperparameter optimization
Perceptron hyperparameter optimization
XGBoost hyperparameter optimization
For more on grid search and random search for hyperparameter optimization, see: https://machinelearningmastery.com/hyperparameter-optimization-with-random-search-and-grid-search/
We will use the make_classification() function to define a binary classification problem with 1,000 rows and five input variables. The example below creates the dataset and summarizes the shape of the data.
# define a binary classification dataset
from sklearn.datasets import make_classification
# define dataset
X, y = make_classification(n_samples=1000, n_features=5, n_informative=2, n_redundant=1, random_state=1)
# summarize the shape of the dataset
print(X.shape, y.shape)
(1000, 5) (1000,)
# perceptron default hyperparameters for binary classification
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.linear_model import Perceptron
# define dataset
X, y = make_classification(n_samples=1000, n_features=5, n_informative=2, n_redundant=1, random_state=1)
# define model
model = Perceptron()
# define evaluation procedure
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate model
scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
# report result
print('Mean Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))
Mean Accuracy: 0.786 (0.069)
The two Perceptron hyperparameters we will tune are the learning rate (eta0) and the regularization weight (alpha).
The objective() function below implements this, taking the dataset and a list of configuration values. It unpacks the configuration values (learning rate and regularization weight), uses them to configure the model, evaluates the model, and returns the mean accuracy.
# objective function
def objective(X, y, cfg):
    # unpack config
    eta, alpha = cfg
    # define model
    model = Perceptron(penalty='elasticnet', alpha=alpha, eta0=eta)
    # define evaluation procedure
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    # evaluate model
    scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
    # calculate mean accuracy
    result = mean(scores)
    return result
The randn() NumPy function generates random numbers drawn from a Gaussian distribution. The step() function below uses it to take a step in the search space, generating a new configuration from the existing one.
# take a step in the search space
def step(cfg, step_size):
    # unpack the configuration
    eta, alpha = cfg
    # step eta
    new_eta = eta + randn() * step_size
    # check the bounds of eta
    if new_eta <= 0.0:
        new_eta = 1e-8
    # step alpha
    new_alpha = alpha + randn() * step_size
    # check the bounds of alpha
    if new_alpha < 0.0:
        new_alpha = 0.0
    # return the new configuration
    return [new_eta, new_alpha]
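As a quick illustration (not part of the original tutorial code), the step() function can be called on its own to see how a new candidate configuration is perturbed from an existing one. This assumes the imports and the step() definition above have already been run; the starting values [0.5, 0.1] are arbitrary.
# demonstrate a single random step from an arbitrary [eta, alpha] configuration
current = [0.5, 0.1]
candidate = step(current, 0.1)
print(current, '->', candidate)  # the new values vary from run to run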
The objective() function is used to evaluate candidate solutions, while the step() function takes a step in the search space. The search first generates a random initial solution, in this case with eta and alpha values in the range 0 to 1. The initial solution is then evaluated and taken as the current best working solution.
# starting point for the search
solution = [rand(), rand()]
# evaluate the initial point
solution_eval = objective(X, y, solution)
# take a step
candidate = step(solution, step_size)
# evaluate candidate point
candidate_eval = objective(X, y, candidate)
# check if we should keep the new point
if candidate_eval >= solution_eval:
    # store the new point
    solution, solution_eval = candidate, candidate_eval
    # report progress
    print('>%d, cfg=%s %.5f' % (i, solution, solution_eval))
The hillclimbing() function below implements the stochastic hill climbing algorithm for tuning the Perceptron algorithm, taking the dataset, the objective function, the number of iterations, and the step size as arguments.
# hill climbing local search algorithm
def hillclimbing(X, y, objective, n_iter, step_size):
    # starting point for the search
    solution = [rand(), rand()]
    # evaluate the initial point
    solution_eval = objective(X, y, solution)
    # run the hill climb
    for i in range(n_iter):
        # take a step
        candidate = step(solution, step_size)
        # evaluate candidate point
        candidate_eval = objective(X, y, candidate)
        # check if we should keep the new point
        if candidate_eval >= solution_eval:
            # store the new point
            solution, solution_eval = candidate, candidate_eval
            # report progress
            print('>%d, cfg=%s %.5f' % (i, solution, solution_eval))
    return [solution, solution_eval]
# define the total iterations
n_iter = 100
# step size in the search space
step_size = 0.1
# perform the hill climbing search
cfg, score = hillclimbing(X, y, objective, n_iter, step_size)
print('Done!')
print('cfg=%s: Mean Accuracy: %f' % (cfg, score))
# manually search perceptron hyperparameters for binary classification
from numpy import mean
from numpy.random import randn
from numpy.random import rand
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.linear_model import Perceptron

# objective function
def objective(X, y, cfg):
    # unpack config
    eta, alpha = cfg
    # define model
    model = Perceptron(penalty='elasticnet', alpha=alpha, eta0=eta)
    # define evaluation procedure
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    # evaluate model
    scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
    # calculate mean accuracy
    result = mean(scores)
    return result

# take a step in the search space
def step(cfg, step_size):
    # unpack the configuration
    eta, alpha = cfg
    # step eta
    new_eta = eta + randn() * step_size
    # check the bounds of eta
    if new_eta <= 0.0:
        new_eta = 1e-8
    # step alpha
    new_alpha = alpha + randn() * step_size
    # check the bounds of alpha
    if new_alpha < 0.0:
        new_alpha = 0.0
    # return the new configuration
    return [new_eta, new_alpha]

# hill climbing local search algorithm
def hillclimbing(X, y, objective, n_iter, step_size):
    # starting point for the search
    solution = [rand(), rand()]
    # evaluate the initial point
    solution_eval = objective(X, y, solution)
    # run the hill climb
    for i in range(n_iter):
        # take a step
        candidate = step(solution, step_size)
        # evaluate candidate point
        candidate_eval = objective(X, y, candidate)
        # check if we should keep the new point
        if candidate_eval >= solution_eval:
            # store the new point
            solution, solution_eval = candidate, candidate_eval
            # report progress
            print('>%d, cfg=%s %.5f' % (i, solution, solution_eval))
    return [solution, solution_eval]

# define dataset
X, y = make_classification(n_samples=1000, n_features=5, n_informative=2, n_redundant=1, random_state=1)
# define the total iterations
n_iter = 100
# step size in the search space
step_size = 0.1
# perform the hill climbing search
cfg, score = hillclimbing(X, y, objective, n_iter, step_size)
print('Done!')
print('cfg=%s: Mean Accuracy: %f' % (cfg, score))
>0, cfg=[0.5827274503894747, 0.260872709578015] 0.70533
>4, cfg=[0.5449820307807399, 0.3017271170801444] 0.70567
>6, cfg=[0.6286475606495414, 0.17499090243915086] 0.71933
>7, cfg=[0.5956196828965779, 0.0] 0.78633
>8, cfg=[0.5878361167354715, 0.0] 0.78633
>10, cfg=[0.6353507984485595, 0.0] 0.78633
>13, cfg=[0.5690530537610675, 0.0] 0.78633
>17, cfg=[0.6650936023999641, 0.0] 0.78633
>22, cfg=[0.9070451625704087, 0.0] 0.78633
>23, cfg=[0.9253366187387938, 0.0] 0.78633
>26, cfg=[0.9966143540220266, 0.0] 0.78633
>31, cfg=[1.0048613895650054, 0.002162219228449132] 0.79133
Done!
cfg=[1.0048613895650054, 0.002162219228449132]: Mean Accuracy: 0.791333
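As a follow-on illustration (not part of the original code above), a final Perceptron model could be fit with the configuration found by the search and used to make a prediction. The eta0 and alpha values below are taken from the run above and will differ from run to run.
# fit a final model with the hyperparameters found above (a sketch; values depend on the run)
from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron
# define dataset
X, y = make_classification(n_samples=1000, n_features=5, n_informative=2, n_redundant=1, random_state=1)
# configure the model with the best found configuration
model = Perceptron(penalty='elasticnet', eta0=1.0048613895650054, alpha=0.002162219228449132)
# fit the model on the whole dataset
model.fit(X, y)
# make a prediction for one row of data
row = X[0].reshape(1, -1)
print('Predicted: %d' % model.predict(row)[0])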
First install the XGBoost library, for example via pip:
sudo pip install xgboost
# xgboost
import xgboost
print("xgboost", xgboost.__version__)
xgboost 1.0.1
An XGBoost model for binary classification can be defined with default hyperparameters as follows:
# define model
model = XGBClassifier()
# xgboost with default hyperparameters for binary classification
from numpy import mean
from numpy import std
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from xgboost import XGBClassifier
# define dataset
X, y = make_classification(n_samples=1000, n_features=5, n_informative=2, n_redundant=1, random_state=1)
# define model
model = XGBClassifier()
# define evaluation procedure
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
# evaluate model
scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
# report result
print('Mean Accuracy: %.3f (%.3f)' % (mean(scores), std(scores)))
Mean Accuracy: 0.849 (0.040)
For more on tuning the gradient boosting algorithm, see: https://machinelearningmastery.com/configure-gradient-boosting-algorithm/
We will tune four key XGBoost hyperparameters: the learning rate (learning_rate), the number of trees (n_estimators), the subsample percentage (subsample), and the tree depth (max_depth).
The objective() function below unpacks the hyperparameters of the XGBoost model, configures the model with them, and then evaluates its mean classification accuracy.
# objective function
def objective(X, y, cfg):
    # unpack config
    lrate, n_tree, subsam, depth = cfg
    # define model
    model = XGBClassifier(learning_rate=lrate, n_estimators=n_tree, subsample=subsam, max_depth=depth)
    # define evaluation procedure
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    # evaluate model
    scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
    # calculate mean accuracy
    result = mean(scores)
    return result
Next, we need a step() function for this search space; it perturbs each hyperparameter separately and keeps each value within sensible bounds.
# take a step in the search space
def step(cfg):
    # unpack config
    lrate, n_tree, subsam, depth = cfg
    # learning rate
    lrate = lrate + randn() * 0.01
    if lrate <= 0.0:
        lrate = 1e-8
    if lrate > 1:
        lrate = 1.0
    # number of trees
    n_tree = round(n_tree + randn() * 50)
    if n_tree <= 0.0:
        n_tree = 1
    # subsample percentage
    subsam = subsam + randn() * 0.1
    if subsam <= 0.0:
        subsam = 1e-8
    if subsam > 1:
        subsam = 1.0
    # max tree depth
    depth = round(depth + randn() * 7)
    if depth <= 1:
        depth = 1
    # return new config
    return [lrate, n_tree, subsam, depth]
The hillclimbing() algorithm must also be updated to define an initial solution with appropriate values. In this case, we define the initial solution using sensible defaults, matching the default hyperparameters or close to them.
# starting point for the search
solution = step([0.1, 100, 1.0, 7])
# xgboost manual hyperparameter optimization for binary classification
from numpy import mean
from numpy.random import randn
from numpy.random import rand
from numpy.random import randint
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from xgboost import XGBClassifier

# objective function
def objective(X, y, cfg):
    # unpack config
    lrate, n_tree, subsam, depth = cfg
    # define model
    model = XGBClassifier(learning_rate=lrate, n_estimators=n_tree, subsample=subsam, max_depth=depth)
    # define evaluation procedure
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    # evaluate model
    scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
    # calculate mean accuracy
    result = mean(scores)
    return result

# take a step in the search space
def step(cfg):
    # unpack config
    lrate, n_tree, subsam, depth = cfg
    # learning rate
    lrate = lrate + randn() * 0.01
    if lrate <= 0.0:
        lrate = 1e-8
    if lrate > 1:
        lrate = 1.0
    # number of trees
    n_tree = round(n_tree + randn() * 50)
    if n_tree <= 0.0:
        n_tree = 1
    # subsample percentage
    subsam = subsam + randn() * 0.1
    if subsam <= 0.0:
        subsam = 1e-8
    if subsam > 1:
        subsam = 1.0
    # max tree depth
    depth = round(depth + randn() * 7)
    if depth <= 1:
        depth = 1
    # return new config
    return [lrate, n_tree, subsam, depth]

# hill climbing local search algorithm
def hillclimbing(X, y, objective, n_iter):
    # starting point for the search
    solution = step([0.1, 100, 1.0, 7])
    # evaluate the initial point
    solution_eval = objective(X, y, solution)
    # run the hill climb
    for i in range(n_iter):
        # take a step
        candidate = step(solution)
        # evaluate candidate point
        candidate_eval = objective(X, y, candidate)
        # check if we should keep the new point
        if candidate_eval >= solution_eval:
            # store the new point
            solution, solution_eval = candidate, candidate_eval
            # report progress
            print('>%d, cfg=[%s] %.5f' % (i, solution, solution_eval))
    return [solution, solution_eval]

# define dataset
X, y = make_classification(n_samples=1000, n_features=5, n_informative=2, n_redundant=1, random_state=1)
# define the total iterations
n_iter = 200
# perform the hill climbing search
cfg, score = hillclimbing(X, y, objective, n_iter)
print('Done!')
print('cfg=[%s]: Mean Accuracy: %f' % (cfg, score))
>0, cfg=[[0.1058242692126418, 67, 0.9228490731610172, 12]] 0.85933
>1, cfg=[[0.11060813799692253, 51, 0.859353656735739, 13]] 0.86100
>4, cfg=[[0.11890247679234153, 58, 0.7135275461723894, 12]] 0.86167
>5, cfg=[[0.10226257987735601, 61, 0.6086462443373852, 17]] 0.86400
>15, cfg=[[0.11176962034280596, 106, 0.5592742266405146, 13]] 0.86500
>19, cfg=[[0.09493587069112454, 153, 0.5049124222437619, 34]] 0.86533
>23, cfg=[[0.08516531024154426, 88, 0.5895201311518876, 31]] 0.86733
>46, cfg=[[0.10092590898175327, 32, 0.5982811365027455, 30]] 0.86867
>75, cfg=[[0.099469211050998, 20, 0.36372573610040404, 32]] 0.86900
>96, cfg=[[0.09021536590375884, 38, 0.4725379807796971, 20]] 0.86900
>100, cfg=[[0.08979482274655906, 65, 0.3697395430835758, 14]] 0.87000
>110, cfg=[[0.06792737273465625, 89, 0.33827505722318224, 17]] 0.87000
>118, cfg=[[0.05544969684589669, 72, 0.2989721608535262, 23]] 0.87200
>122, cfg=[[0.050102976159097, 128, 0.2043203965148931, 24]] 0.87200
>123, cfg=[[0.031493266763680444, 120, 0.2998819062922256, 30]] 0.87333
>128, cfg=[[0.023324201169625292, 84, 0.4017169945431015, 42]] 0.87333
>140, cfg=[[0.020224220443108752, 52, 0.5088096815056933, 53]] 0.87367
Done!
cfg=[[0.020224220443108752, 52, 0.5088096815056933, 53]]: Mean Accuracy: 0.873667
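As a follow-on illustration (not part of the original code above), a final XGBoost model could be fit with the configuration found by the search and used to make a prediction. The hyperparameter values below are taken from the run above and will differ from run to run.
# fit a final model with the hyperparameters found above (a sketch; values depend on the run)
from sklearn.datasets import make_classification
from xgboost import XGBClassifier
# define dataset
X, y = make_classification(n_samples=1000, n_features=5, n_informative=2, n_redundant=1, random_state=1)
# configure the model with the best found configuration
model = XGBClassifier(learning_rate=0.020224220443108752, n_estimators=52, subsample=0.5088096815056933, max_depth=53)
# fit the model on the whole dataset
model.fit(X, y)
# make a prediction for one row of data
row = X[0].reshape(1, -1)
print('Predicted: %d' % model.predict(row)[0])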
Author: 沂水寒城 (Yishui Hancheng), CSDN blog expert. Research interests: machine learning, deep learning, NLP, CV.
Blog: http://yishuihancheng.blog.csdn.net