
Hyperparameter Grid Search for PyTorch Models with scikit-learn


          2023-03-04 09:16


Source: Deephub Imba
This article is about 8,500 words; suggested reading time: 10 minutes.
This article shows how to use scikit-learn's grid search feature to tune the hyperparameters of a PyTorch deep learning model.


scikit-learn is the best machine learning library in Python, while PyTorch gives us convenient tools for building models. Can we combine the strengths of both? In this article, we will show how to use scikit-learn's grid search feature to tune the hyperparameters of a PyTorch deep learning model:

• How to wrap a PyTorch model for use with scikit-learn, and how to run a grid search
• How to grid search common neural network hyperparameters such as learning rate, dropout, epochs, and number of neurons
• How to define your own hyperparameter tuning experiments for your own projects


How to Use PyTorch Models in scikit-learn


The simplest way to make a PyTorch model usable in scikit-learn is with the skorch package, which provides a scikit-learn-compatible API for PyTorch models. skorch offers NeuralNetClassifier for classification networks and NeuralNetRegressor for regression networks.

           pip install skorch



To use these wrappers, define your PyTorch model as a class derived from nn.Module, then pass the class name to the module argument when constructing NeuralNetClassifier. For example:


import torch.nn as nn
from skorch import NeuralNetClassifier

class MyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        ...

    def forward(self, x):
        ...
        return x

# create the skorch wrapper
model = NeuralNetClassifier(
    module=MyClassifier
)
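
The regression wrapper mentioned above is used the same way. A minimal sketch in the same skeleton style (the class name MyRegressor and the layer sizes here are hypothetical):

import torch.nn as nn
from skorch import NeuralNetRegressor

class MyRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(8, 12)
        self.act = nn.ReLU()
        self.output = nn.Linear(12, 1)  # raw output, no sigmoid, for regression

    def forward(self, x):
        return self.output(self.act(self.hidden(x)))

# wrap it exactly like the classifier
model = NeuralNetRegressor(module=MyRegressor)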


The NeuralNetClassifier constructor accepts parameters that would otherwise be passed to the model.fit() call (the method that invokes the training loop in scikit-learn models), such as the number of epochs and the batch size. For example:


model = NeuralNetClassifier(
    module=MyClassifier,
    max_epochs=150,
    batch_size=10
)

The NeuralNetClassifier constructor can also accept new parameters that are forwarded to your model class's constructor, as long as each one is prefixed with module__ (two underscores). These new parameters may have default values in the constructor, but they will be overridden when the wrapper instantiates the model. For example:

import torch.nn as nn
from skorch import NeuralNetClassifier

class SonarClassifier(nn.Module):
    def __init__(self, n_layers=3):
        super().__init__()
        self.layers = []
        self.acts = []
        for i in range(n_layers):
            self.layers.append(nn.Linear(60, 60))
            self.acts.append(nn.ReLU())
            # register the submodules so PyTorch can track their parameters
            self.add_module(f"layer{i}", self.layers[-1])
            self.add_module(f"act{i}", self.acts[-1])
        self.output = nn.Linear(60, 1)

    def forward(self, x):
        for layer, act in zip(self.layers, self.acts):
            x = act(layer(x))
        x = self.output(x)
        return x

# module__n_layers overrides the default n_layers=3 in SonarClassifier
model = NeuralNetClassifier(
    module=SonarClassifier,
    max_epochs=150,
    batch_size=10,
    module__n_layers=2
)

We can verify the result by initializing a model and printing it:

           print(model.initialize())
           
# Output:
           <class 'skorch.classifier.NeuralNetClassifier'>[initialized](
            module_=SonarClassifier(
              (layer0): Linear(in_features=60, out_features=60, bias=True)
              (act0): ReLU()
              (layer1): Linear(in_features=60, out_features=60, bias=True)
              (act1): ReLU()
              (output): Linear(in_features=60, out_features=1, bias=True)
            ),
           )


Using Grid Search in scikit-learn


Grid search is a model hyperparameter optimization technique: it simply exhausts all combinations of the hyperparameters and finds the combination that yields the best score. In scikit-learn, the GridSearchCV class provides this technique. When constructing this class, you must supply a dictionary of hyperparameters in the param_grid argument, mapping model parameter names to arrays of values to try.

By default, accuracy is the score being optimized, but other metrics can be specified via the scoring argument of the GridSearchCV constructor. GridSearchCV builds and evaluates one model for each combination of parameters, using cross-validation to score it; the examples in this article use 3-fold cross-validation, set through the cv argument.
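
For example, to optimize F1 instead of accuracy and use 5-fold cross-validation, the construction might look like this (a sketch that assumes model and param_grid are already defined):

from sklearn.model_selection import GridSearchCV

grid = GridSearchCV(estimator=model, param_grid=param_grid,
                    scoring='f1', cv=5)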

Here is an example of defining a simple grid search:


param_grid = {
    'max_epochs': [10, 20, 30]
}
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(X, Y)


Setting the n_jobs argument in the GridSearchCV constructor to -1 tells it to use all cores on the machine; otherwise the grid search runs in a single process, which is slower on a multi-core CPU.

Once it finishes, you can access the outcome of the grid search through the result object returned by grid.fit(). best_score_ gives the best score observed during the optimization, and best_params_ describes the parameter combination that achieved it.
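
A minimal sketch of reading these results, assuming grid_result was returned by grid.fit() as above:

print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))

# GridSearchCV refits the best parameter combination on the whole
# dataset by default, so the winning model is ready to use
best_model = grid_result.best_estimator_
predictions = best_model.predict(X)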

The Example Problem


All of our examples will be demonstrated on a small standard machine learning dataset: a diabetes-onset classification dataset (the Pima Indians diabetes dataset). It is a small dataset whose attributes are all numeric and easy to work with.

How to Tune Batch Size and Number of Epochs


In this first simple example, we look at how to tune the batch size and the number of epochs used when fitting the network.

We will simply evaluate batch sizes from 10 to 100. The full code listing is shown below:


           import numpy as np
           import torch
           import torch.nn as nn
           import torch.optim as optim
           from skorch import NeuralNetClassifier
           from sklearn.model_selection import GridSearchCV
           
           # load the dataset, split into input (X) and output (y) variables
           dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
           X = dataset[:,0:8]
           y = dataset[:,8]
           X = torch.tensor(X, dtype=torch.float32)
           y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)
           
           # PyTorch classifier
           class PimaClassifier(nn.Module):
              def __init__(self):
                  super().__init__()
                  self.layer = nn.Linear(8, 12)
                  self.act = nn.ReLU()
                  self.output = nn.Linear(12, 1)
                  self.prob = nn.Sigmoid()
           
              def forward(self, x):
                  x = self.act(self.layer(x))
                  x = self.prob(self.output(x))
                  return x
           
           # create model with skorch
           model = NeuralNetClassifier(
              PimaClassifier,
              criterion=nn.BCELoss,
              optimizer=optim.Adam,
              verbose=False
           )
           
           # define the grid search parameters
           param_grid = {
              'batch_size': [10, 20, 40, 60, 80, 100],
              'max_epochs': [10, 50, 100]
           }
           grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
           grid_result = grid.fit(X, y)
           
           # summarize results
           print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
           means = grid_result.cv_results_['mean_test_score']
           stds = grid_result.cv_results_['std_test_score']
           params = grid_result.cv_results_['params']
           for mean, stdev, param in zip(means, stds, params):
              print("%f (%f) with: %r" % (mean, stdev, param))


The results are as follows:


           Best: 0.714844 using {'batch_size': 10, 'max_epochs': 100}
           0.665365 (0.020505) with: {'batch_size': 10, 'max_epochs': 10}
           0.588542 (0.168055) with: {'batch_size': 10, 'max_epochs': 50}
           0.714844 (0.032369) with: {'batch_size': 10, 'max_epochs': 100}
           0.671875 (0.022326) with: {'batch_size': 20, 'max_epochs': 10}
           0.696615 (0.008027) with: {'batch_size': 20, 'max_epochs': 50}
           0.714844 (0.019918) with: {'batch_size': 20, 'max_epochs': 100}
           0.666667 (0.009744) with: {'batch_size': 40, 'max_epochs': 10}
           0.687500 (0.033603) with: {'batch_size': 40, 'max_epochs': 50}
           0.707031 (0.024910) with: {'batch_size': 40, 'max_epochs': 100}
           0.667969 (0.014616) with: {'batch_size': 60, 'max_epochs': 10}
           0.694010 (0.036966) with: {'batch_size': 60, 'max_epochs': 50}
           0.694010 (0.042473) with: {'batch_size': 60, 'max_epochs': 100}
           0.670573 (0.023939) with: {'batch_size': 80, 'max_epochs': 10}
           0.674479 (0.020752) with: {'batch_size': 80, 'max_epochs': 50}
           0.703125 (0.026107) with: {'batch_size': 80, 'max_epochs': 100}
           0.680990 (0.014382) with: {'batch_size': 100, 'max_epochs': 10}
           0.670573 (0.013279) with: {'batch_size': 100, 'max_epochs': 50}
           0.687500 (0.017758) with: {'batch_size': 100, 'max_epochs': 100}


You can see that a batch size of 10 with 100 epochs achieved the best result, with an accuracy of about 71%.

How to Tune the Training Optimizer


Next, let's look at how to tune the optimizer. There are many optimizers to choose from, such as SGD and Adam, so how do we pick one?

The complete code is as follows:


           import numpy as np
           import torch
           import torch.nn as nn
           import torch.optim as optim
           from skorch import NeuralNetClassifier
           from sklearn.model_selection import GridSearchCV
           
           # load the dataset, split into input (X) and output (y) variables
           dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
           X = dataset[:,0:8]
           y = dataset[:,8]
           X = torch.tensor(X, dtype=torch.float32)
           y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)
           
           # PyTorch classifier
           class PimaClassifier(nn.Module):
              def __init__(self):
                  super().__init__()
                  self.layer = nn.Linear(8, 12)
                  self.act = nn.ReLU()
                  self.output = nn.Linear(12, 1)
                  self.prob = nn.Sigmoid()
           
              def forward(self, x):
                  x = self.act(self.layer(x))
                  x = self.prob(self.output(x))
                  return x
           
           # create model with skorch
           model = NeuralNetClassifier(
              PimaClassifier,
              criterion=nn.BCELoss,
              max_epochs=100,
              batch_size=10,
              verbose=False
           )
           
           # define the grid search parameters
           param_grid = {
              'optimizer': [optim.SGD, optim.RMSprop, optim.Adagrad, optim.Adadelta,
                            optim.Adam, optim.Adamax, optim.NAdam],
           }
           grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
           grid_result = grid.fit(X, y)
           
           # summarize results
           print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
           means = grid_result.cv_results_['mean_test_score']
           stds = grid_result.cv_results_['std_test_score']
           params = grid_result.cv_results_['params']
           for mean, stdev, param in zip(means, stds, params):
              print("%f (%f) with: %r" % (mean, stdev, param))


The output is as follows:


           Best: 0.721354 using {'optimizer': <class 'torch.optim.adamax.Adamax'>}
           0.674479 (0.036828) with: {'optimizer': <class 'torch.optim.sgd.SGD'>}
           0.700521 (0.043303) with: {'optimizer': <class 'torch.optim.rmsprop.RMSprop'>}
           0.682292 (0.027126) with: {'optimizer': <class 'torch.optim.adagrad.Adagrad'>}
           0.572917 (0.051560) with: {'optimizer': <class 'torch.optim.adadelta.Adadelta'>}
           0.714844 (0.030758) with: {'optimizer': <class 'torch.optim.adam.Adam'>}
           0.721354 (0.019225) with: {'optimizer': <class 'torch.optim.adamax.Adamax'>}
           0.709635 (0.024360) with: {'optimizer': <class 'torch.optim.nadam.NAdam'>}


You can see that for our model and dataset, the Adamax optimization algorithm performs best, with an accuracy of about 72%.

How to Tune the Learning Rate


Although PyTorch's learning rate schedulers let us adjust the learning rate dynamically across epochs, for this example we will treat the learning rate and its related parameters as ordinary grid search parameters. In PyTorch, setting the learning rate and momentum looks like this:


optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
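
As an aside, the schedulers mentioned above look roughly like the sketch below in plain PyTorch; this is not used in the grid search that follows, and model here is assumed to be the plain nn.Module:

import torch.optim as optim
from torch.optim.lr_scheduler import StepLR

# halve the learning rate every 30 epochs (values are illustrative)
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = StepLR(optimizer, step_size=30, gamma=0.5)
for epoch in range(100):
    ...  # one epoch of training goes here
    scheduler.step()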


In the skorch package, the prefix optimizer__ routes a parameter to the optimizer; for example, passing optimizer__lr=0.001 to NeuralNetClassifier becomes lr=0.001 in the optimizer's constructor.


           import numpy as np
           import torch
           import torch.nn as nn
           import torch.optim as optim
           from skorch import NeuralNetClassifier
           from sklearn.model_selection import GridSearchCV
           
           # load the dataset, split into input (X) and output (y) variables
           dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
           X = dataset[:,0:8]
           y = dataset[:,8]
           X = torch.tensor(X, dtype=torch.float32)
           y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)
           
           # PyTorch classifier
           class PimaClassifier(nn.Module):
              def __init__(self):
                  super().__init__()
                  self.layer = nn.Linear(8, 12)
                  self.act = nn.ReLU()
                  self.output = nn.Linear(12, 1)
                  self.prob = nn.Sigmoid()
           
              def forward(self, x):
                  x = self.act(self.layer(x))
                  x = self.prob(self.output(x))
                  return x
           
           # create model with skorch
           model = NeuralNetClassifier(
              PimaClassifier,
              criterion=nn.BCELoss,
              optimizer=optim.SGD,
              max_epochs=100,
              batch_size=10,
              verbose=False
           )
           
           # define the grid search parameters
           param_grid = {
              'optimizer__lr': [0.001, 0.01, 0.1, 0.2, 0.3],
              'optimizer__momentum': [0.0, 0.2, 0.4, 0.6, 0.8, 0.9],
           }
           grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
           grid_result = grid.fit(X, y)
           
           # summarize results
           print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
           means = grid_result.cv_results_['mean_test_score']
           stds = grid_result.cv_results_['std_test_score']
           params = grid_result.cv_results_['params']
           for mean, stdev, param in zip(means, stds, params):
              print("%f (%f) with: %r" % (mean, stdev, param))


The results are as follows:

           Best: 0.682292 using {'optimizer__lr': 0.001, 'optimizer__momentum': 0.9}
           0.648438 (0.016877) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.0}
           0.671875 (0.017758) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.2}
           0.674479 (0.022402) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.4}
           0.677083 (0.011201) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.6}
           0.679688 (0.027621) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.8}
           0.682292 (0.026557) with: {'optimizer__lr': 0.001, 'optimizer__momentum': 0.9}
           0.671875 (0.019918) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.0}
           0.648438 (0.024910) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.2}
           0.546875 (0.143454) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.4}
           0.567708 (0.153668) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.6}
           0.552083 (0.141790) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.8}
           0.451823 (0.144561) with: {'optimizer__lr': 0.01, 'optimizer__momentum': 0.9}
           0.348958 (0.001841) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.0}
           0.450521 (0.142719) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.2}
           0.450521 (0.142719) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.4}
           0.450521 (0.142719) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.6}
           0.348958 (0.001841) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.8}
           0.348958 (0.001841) with: {'optimizer__lr': 0.1, 'optimizer__momentum': 0.9}
           0.444010 (0.136265) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.0}
           0.450521 (0.142719) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.2}
           0.348958 (0.001841) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.4}
           0.552083 (0.141790) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.6}
           0.549479 (0.142719) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.8}
           0.651042 (0.001841) with: {'optimizer__lr': 0.2, 'optimizer__momentum': 0.9}
           0.552083 (0.141790) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.0}
           0.348958 (0.001841) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.2}
           0.450521 (0.142719) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.4}
           0.552083 (0.141790) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.6}
           0.450521 (0.142719) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.8}
           0.450521 (0.142719) with: {'optimizer__lr': 0.3, 'optimizer__momentum': 0.9}


For SGD, the best result was obtained with a learning rate of 0.001 and a momentum of 0.9, with an accuracy of about 68%.

How to Tune the Activation Function


The activation function controls the nonlinearity of individual neurons. Here we evaluate some of the activation functions available in PyTorch.


           import numpy as np
           import torch
           import torch.nn as nn
           import torch.nn.init as init
           import torch.optim as optim
           from skorch import NeuralNetClassifier
           from sklearn.model_selection import GridSearchCV
           
           # load the dataset, split into input (X) and output (y) variables
           dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
           X = dataset[:,0:8]
           y = dataset[:,8]
           X = torch.tensor(X, dtype=torch.float32)
           y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)
           
           # PyTorch classifier
           class PimaClassifier(nn.Module):
              def __init__(self, activation=nn.ReLU):
                  super().__init__()
                  self.layer = nn.Linear(8, 12)
                  self.act = activation()
                  self.output = nn.Linear(12, 1)
                  self.prob = nn.Sigmoid()
                  # manually init weights
                  init.kaiming_uniform_(self.layer.weight)
                  init.kaiming_uniform_(self.output.weight)
           
              def forward(self, x):
                  x = self.act(self.layer(x))
                  x = self.prob(self.output(x))
                  return x
           
           # create model with skorch
           model = NeuralNetClassifier(
              PimaClassifier,
              criterion=nn.BCELoss,
              optimizer=optim.Adamax,
              max_epochs=100,
              batch_size=10,
              verbose=False
           )
           
           # define the grid search parameters
           param_grid = {
              'module__activation': [nn.Identity, nn.ReLU, nn.ELU, nn.ReLU6,
                                      nn.GELU, nn.Softplus, nn.Softsign, nn.Tanh,
                                      nn.Sigmoid, nn.Hardsigmoid]
           }
           grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
           grid_result = grid.fit(X, y)
           
           # summarize results
           print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
           means = grid_result.cv_results_['mean_test_score']
           stds = grid_result.cv_results_['std_test_score']
           params = grid_result.cv_results_['params']
           for mean, stdev, param in zip(means, stds, params):
              print("%f (%f) with: %r" % (mean, stdev, param))


The results are as follows:


           Best: 0.699219 using {'module__activation': <class 'torch.nn.modules.activation.ReLU'>}
           0.687500 (0.025315) with: {'module__activation': <class 'torch.nn.modules.linear.Identity'>}
           0.699219 (0.011049) with: {'module__activation': <class 'torch.nn.modules.activation.ReLU'>}
           0.674479 (0.035849) with: {'module__activation': <class 'torch.nn.modules.activation.ELU'>}
           0.621094 (0.063549) with: {'module__activation': <class 'torch.nn.modules.activation.ReLU6'>}
           0.674479 (0.017566) with: {'module__activation': <class 'torch.nn.modules.activation.GELU'>}
           0.558594 (0.149189) with: {'module__activation': <class 'torch.nn.modules.activation.Softplus'>}
           0.675781 (0.014616) with: {'module__activation': <class 'torch.nn.modules.activation.Softsign'>}
           0.619792 (0.018688) with: {'module__activation': <class 'torch.nn.modules.activation.Tanh'>}
           0.643229 (0.019225) with: {'module__activation': <class 'torch.nn.modules.activation.Sigmoid'>}
           0.636719 (0.022326) with: {'module__activation': <class 'torch.nn.modules.activation.Hardsigmoid'>}


The ReLU activation function achieved the best result, with an accuracy of about 70%.

How to Tune the Dropout Rate


In this example, we will try dropout rates between 0.0 and 0.9 (a rate of 1.0 makes no sense) together with MaxNorm weight constraint values between 1 and 5. The model below implements the MaxNorm constraint manually, clamping the layer weights at the start of each forward pass.


           import numpy as np
           import torch
           import torch.nn as nn
           import torch.nn.init as init
           import torch.optim as optim
           from skorch import NeuralNetClassifier
           from sklearn.model_selection import GridSearchCV
           
           # load the dataset, split into input (X) and output (y) variables
           dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
           X = dataset[:,0:8]
           y = dataset[:,8]
           X = torch.tensor(X, dtype=torch.float32)
           y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)
           
           # PyTorch classifier
           class PimaClassifier(nn.Module):
              def __init__(self, dropout_rate=0.5, weight_constraint=1.0):
                  super().__init__()
                  self.layer = nn.Linear(8, 12)
                  self.act = nn.ReLU()
                  self.dropout = nn.Dropout(dropout_rate)
                  self.output = nn.Linear(12, 1)
                  self.prob = nn.Sigmoid()
                  self.weight_constraint = weight_constraint
                  # manually init weights
                  init.kaiming_uniform_(self.layer.weight)
                  init.kaiming_uniform_(self.output.weight)
           
              def forward(self, x):
                  # maxnorm weight before actual forward pass
                  with torch.no_grad():
                      norm = self.layer.weight.norm(2, dim=0, keepdim=True).clamp(min=self.weight_constraint / 2)
                      desired = torch.clamp(norm, max=self.weight_constraint)
                      self.layer.weight *= (desired / norm)
                  # actual forward pass
                  x = self.act(self.layer(x))
                  x = self.dropout(x)
                  x = self.prob(self.output(x))
                  return x
           
           # create model with skorch
           model = NeuralNetClassifier(
              PimaClassifier,
              criterion=nn.BCELoss,
              optimizer=optim.Adamax,
              max_epochs=100,
              batch_size=10,
              verbose=False
           )
           
           # define the grid search parameters
           param_grid = {
              'module__weight_constraint': [1.0, 2.0, 3.0, 4.0, 5.0],
              'module__dropout_rate': [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
           }
           grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
           grid_result = grid.fit(X, y)
           
           # summarize results
           print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
           means = grid_result.cv_results_['mean_test_score']
           stds = grid_result.cv_results_['std_test_score']
           params = grid_result.cv_results_['params']
           for mean, stdev, param in zip(means, stds, params):
              print("%f (%f) with: %r" % (mean, stdev, param))


The results are as follows:


           Best: 0.701823 using {'module__dropout_rate': 0.1, 'module__weight_constraint': 2.0}
           0.669271 (0.015073) with: {'module__dropout_rate': 0.0, 'module__weight_constraint': 1.0}
           0.692708 (0.035132) with: {'module__dropout_rate': 0.0, 'module__weight_constraint': 2.0}
           0.589844 (0.170180) with: {'module__dropout_rate': 0.0, 'module__weight_constraint': 3.0}
           0.561198 (0.151131) with: {'module__dropout_rate': 0.0, 'module__weight_constraint': 4.0}
           0.688802 (0.021710) with: {'module__dropout_rate': 0.0, 'module__weight_constraint': 5.0}
           0.697917 (0.009744) with: {'module__dropout_rate': 0.1, 'module__weight_constraint': 1.0}
           0.701823 (0.016367) with: {'module__dropout_rate': 0.1, 'module__weight_constraint': 2.0}
           0.694010 (0.010253) with: {'module__dropout_rate': 0.1, 'module__weight_constraint': 3.0}
           0.686198 (0.025976) with: {'module__dropout_rate': 0.1, 'module__weight_constraint': 4.0}
           0.679688 (0.026107) with: {'module__dropout_rate': 0.1, 'module__weight_constraint': 5.0}
           0.701823 (0.029635) with: {'module__dropout_rate': 0.2, 'module__weight_constraint': 1.0}
           0.682292 (0.014731) with: {'module__dropout_rate': 0.2, 'module__weight_constraint': 2.0}
           0.701823 (0.009744) with: {'module__dropout_rate': 0.2, 'module__weight_constraint': 3.0}
           0.701823 (0.026557) with: {'module__dropout_rate': 0.2, 'module__weight_constraint': 4.0}
           0.687500 (0.015947) with: {'module__dropout_rate': 0.2, 'module__weight_constraint': 5.0}
           0.686198 (0.006639) with: {'module__dropout_rate': 0.3, 'module__weight_constraint': 1.0}
           0.656250 (0.006379) with: {'module__dropout_rate': 0.3, 'module__weight_constraint': 2.0}
           0.565104 (0.155608) with: {'module__dropout_rate': 0.3, 'module__weight_constraint': 3.0}
           0.700521 (0.028940) with: {'module__dropout_rate': 0.3, 'module__weight_constraint': 4.0}
           0.669271 (0.012890) with: {'module__dropout_rate': 0.3, 'module__weight_constraint': 5.0}
           0.661458 (0.018688) with: {'module__dropout_rate': 0.4, 'module__weight_constraint': 1.0}
           0.669271 (0.017566) with: {'module__dropout_rate': 0.4, 'module__weight_constraint': 2.0}
           0.652344 (0.006379) with: {'module__dropout_rate': 0.4, 'module__weight_constraint': 3.0}
           0.680990 (0.037783) with: {'module__dropout_rate': 0.4, 'module__weight_constraint': 4.0}
           0.692708 (0.042112) with: {'module__dropout_rate': 0.4, 'module__weight_constraint': 5.0}
           0.666667 (0.006639) with: {'module__dropout_rate': 0.5, 'module__weight_constraint': 1.0}
           0.652344 (0.011500) with: {'module__dropout_rate': 0.5, 'module__weight_constraint': 2.0}
           0.662760 (0.007366) with: {'module__dropout_rate': 0.5, 'module__weight_constraint': 3.0}
           0.558594 (0.146610) with: {'module__dropout_rate': 0.5, 'module__weight_constraint': 4.0}
           0.552083 (0.141826) with: {'module__dropout_rate': 0.5, 'module__weight_constraint': 5.0}
           0.548177 (0.141826) with: {'module__dropout_rate': 0.6, 'module__weight_constraint': 1.0}
           0.653646 (0.013279) with: {'module__dropout_rate': 0.6, 'module__weight_constraint': 2.0}
           0.661458 (0.008027) with: {'module__dropout_rate': 0.6, 'module__weight_constraint': 3.0}
           0.553385 (0.142719) with: {'module__dropout_rate': 0.6, 'module__weight_constraint': 4.0}
           0.669271 (0.035132) with: {'module__dropout_rate': 0.6, 'module__weight_constraint': 5.0}
           0.662760 (0.015733) with: {'module__dropout_rate': 0.7, 'module__weight_constraint': 1.0}
           0.636719 (0.024910) with: {'module__dropout_rate': 0.7, 'module__weight_constraint': 2.0}
           0.550781 (0.146818) with: {'module__dropout_rate': 0.7, 'module__weight_constraint': 3.0}
           0.537760 (0.140094) with: {'module__dropout_rate': 0.7, 'module__weight_constraint': 4.0}
           0.542969 (0.138144) with: {'module__dropout_rate': 0.7, 'module__weight_constraint': 5.0}
           0.565104 (0.148654) with: {'module__dropout_rate': 0.8, 'module__weight_constraint': 1.0}
           0.657552 (0.008027) with: {'module__dropout_rate': 0.8, 'module__weight_constraint': 2.0}
           0.428385 (0.111418) with: {'module__dropout_rate': 0.8, 'module__weight_constraint': 3.0}
           0.549479 (0.142719) with: {'module__dropout_rate': 0.8, 'module__weight_constraint': 4.0}
           0.648438 (0.005524) with: {'module__dropout_rate': 0.8, 'module__weight_constraint': 5.0}
           0.540365 (0.136861) with: {'module__dropout_rate': 0.9, 'module__weight_constraint': 1.0}
           0.605469 (0.053083) with: {'module__dropout_rate': 0.9, 'module__weight_constraint': 2.0}
           0.553385 (0.139948) with: {'module__dropout_rate': 0.9, 'module__weight_constraint': 3.0}
           0.549479 (0.142719) with: {'module__dropout_rate': 0.9, 'module__weight_constraint': 4.0}
           0.595052 (0.075566) with: {'module__dropout_rate': 0.9, 'module__weight_constraint': 5.0}


You can see that a dropout rate of 10% with a weight constraint of 2.0 achieved the best accuracy, about 70%.

How to Tune the Number of Neurons in the Hidden Layer


The number of neurons in a layer is an important parameter to tune. Broadly speaking, it controls the representational capacity of the network, at least at that point in the topology.

In theory, by the universal approximation theorem, a single-layer network that is large enough can approximate any other neural network.

In this example, we will try values from 1 to 30 in steps of 5. Note that a larger network needs more training: at a minimum, the batch size and number of epochs should ideally be optimized together with the number of neurons, as in the sketch below.
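
A hypothetical joint grid could look like this (the extra batch_size and max_epochs values are illustrative):

# hypothetical joint grid: 7 * 2 * 2 = 28 combinations
param_grid = {
    'module__n_neurons': [1, 5, 10, 15, 20, 25, 30],
    'batch_size': [10, 20],
    'max_epochs': [100, 150],
}

To keep the search small, the complete example below tunes only the number of neurons: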


           import numpy as np
           import torch
           import torch.nn as nn
           import torch.nn.init as init
           import torch.optim as optim
           from skorch import NeuralNetClassifier
           from sklearn.model_selection import GridSearchCV
           
           # load the dataset, split into input (X) and output (y) variables
           dataset = np.loadtxt('pima-indians-diabetes.csv', delimiter=',')
           X = dataset[:,0:8]
           y = dataset[:,8]
           X = torch.tensor(X, dtype=torch.float32)
           y = torch.tensor(y, dtype=torch.float32).reshape(-1, 1)
           
           class PimaClassifier(nn.Module):
              def __init__(self, n_neurons=12):
                  super().__init__()
                  self.layer = nn.Linear(8, n_neurons)
                  self.act = nn.ReLU()
                  self.dropout = nn.Dropout(0.1)
                  self.output = nn.Linear(n_neurons, 1)
                  self.prob = nn.Sigmoid()
                  self.weight_constraint = 2.0
                  # manually init weights
                  init.kaiming_uniform_(self.layer.weight)
                  init.kaiming_uniform_(self.output.weight)
           
              def forward(self, x):
                  # maxnorm weight before actual forward pass
                  with torch.no_grad():
                      norm = self.layer.weight.norm(2, dim=0, keepdim=True).clamp(min=self.weight_constraint / 2)
                      desired = torch.clamp(norm, max=self.weight_constraint)
                      self.layer.weight *= (desired / norm)
                  # actual forward pass
                  x = self.act(self.layer(x))
                  x = self.dropout(x)
                  x = self.prob(self.output(x))
                  return x
           
           # create model with skorch
           model = NeuralNetClassifier(
              PimaClassifier,
              criterion=nn.BCELoss,
              optimizer=optim.Adamax,
              max_epochs=100,
              batch_size=10,
              verbose=False
           )
           
           # define the grid search parameters
           param_grid = {
              'module__n_neurons': [1, 5, 10, 15, 20, 25, 30]
           }
           grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
           grid_result = grid.fit(X, y)
           
           # summarize results
           print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
           means = grid_result.cv_results_['mean_test_score']
           stds = grid_result.cv_results_['std_test_score']
           params = grid_result.cv_results_['params']
           for mean, stdev, param in zip(means, stds, params):
              print("%f (%f) with: %r" % (mean, stdev, param))


The results are as follows:


           Best: 0.708333 using {'module__n_neurons': 30}
           0.654948 (0.003683) with: {'module__n_neurons': 1}
           0.666667 (0.023073) with: {'module__n_neurons': 5}
           0.694010 (0.014382) with: {'module__n_neurons': 10}
           0.682292 (0.014382) with: {'module__n_neurons': 15}
           0.707031 (0.028705) with: {'module__n_neurons': 20}
           0.703125 (0.030758) with: {'module__n_neurons': 25}
           0.708333 (0.015733) with: {'module__n_neurons': 30}


You can see that the network with 30 neurons in the hidden layer achieved the best result, with an accuracy of about 71%.

Summary


In this article, we showed how to use PyTorch and scikit-learn to optimize the hyperparameters of a deep learning network in Python. If you are interested in skorch, take a look at its documentation:
https://skorch.readthedocs.io/en/latest/

If you are not familiar with GridSearchCV, read its documentation first:
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html

Editor: 王菁

