<kbd id="afajh"><form id="afajh"></form></kbd>
<strong id="afajh"><dl id="afajh"></dl></strong>
    <del id="afajh"><form id="afajh"></form></del>
        1. <th id="afajh"><progress id="afajh"></progress></th>
          <b id="afajh"><abbr id="afajh"></abbr></b>
          <th id="afajh"><progress id="afajh"></progress></th>

          [Machine Learning] Amazing! Just one line of Python code to automatically complete model training!


          2021-05-24 16:34

          Automated machine learning (AutoML) refers to automating the components of the data science model-development pipeline. AutoML reduces the data scientist's workload and speeds up the workflow. It can automate a variety of pipeline components, including data understanding, EDA, data processing, model training, and hyperparameter tuning.

          For an end-to-end machine learning project, the complexity of each component depends on the project. Many open-source AutoML libraries are available to speed up development. In this article, I will share a very nice Python library: LazyPredict.

          What is LazyPredict?

          LazyPredict is an open-source Python library that automates the model-training pipeline and speeds up the workflow. It can train about 30 classification models on a classification dataset and about 40 regression models on a regression dataset.

          LazyPredict returns the trained models along with their performance metrics, without requiring much code. This makes it easy to compare each model's metrics and then tune the best model to improve performance further.

          Installation

          LazyPredict can be installed from PyPI with:

          pip install lazypredict

          Once installed, the library can be imported to run automated training for both classification and regression models.

          from lazypredict.Supervised import LazyRegressor, LazyClassifier

          Usage

          LazyPredict supports both classification and regression problems, so I will illustrate it with two case studies: the Titanic dataset (classification) and the Boston housing dataset (regression).

          Classification task

          LazyPredict's usage is very intuitive, similar to scikit-learn. First, create an instance of the LazyClassifier estimator for the classification task. Models can be evaluated with a custom metric; by default, each model is evaluated on accuracy, balanced accuracy, ROC AUC, and F1 score.

          Before training models with LazyPredict, the dataset must be read in and processed so that it is suitable for training. After performing feature engineering and splitting the data into train and test sets, we can hand it to LazyPredict for model training.
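          A minimal sketch of that preparation step for the Titanic data (assuming the dataset is loaded via seaborn; the column selection and encoding below are illustrative, not from the original article):

          import seaborn as sns
          from sklearn.model_selection import train_test_split

          # Load the Titanic dataset (seaborn ships a copy for convenience)
          titanic = sns.load_dataset("titanic")

          # Minimal feature engineering: keep a few numeric/encodable columns
          df = titanic[["survived", "pclass", "sex", "age", "fare"]].dropna(subset=["age"]).copy()
          df["sex"] = (df["sex"] == "female").astype(int)  # encode sex as 0/1

          X = df.drop(columns="survived")
          y = df["survived"]
          X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)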

          # Create a LazyClassifier instance and fit the data
          cls = LazyClassifier(ignore_warnings=False, custom_metric=None)
          models, predictions = cls.fit(X_train, X_test, y_train, y_test)
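          fit returns two pandas DataFrames: models holds one row of metrics per trained model, and predictions holds per-model test-set predictions when the estimator is created with predictions=True (an option in the LazyClassifier constructor; otherwise it is empty). A quick way to inspect the results:

          # Per-model test predictions require predictions=True at construction
          cls = LazyClassifier(ignore_warnings=False, custom_metric=None, predictions=True)
          models, predictions = cls.fit(X_train, X_test, y_train, y_test)
          print(models.head())        # metric leaderboard, one row per model
          print(predictions.head())   # one column of test predictions per model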

          Regression task

          Similar to the classification workflow, LazyPredict ships with automated model training for regression datasets. The implementation mirrors the classification task, except that a LazyRegressor instance is used instead.

          reg = LazyRegressor(ignore_warnings=False, custom_metric=None)
          models, predictions = reg.fit(X_train, X_test, y_train, y_test)
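          Both estimators also accept a custom_metric argument; as I understand the API, it takes a callable of the form metric(y_true, y_pred) and reports it as an extra column in the results. For example, to add mean absolute error to the regression leaderboard:

          from sklearn.metrics import mean_absolute_error

          # The callable is applied to (y_test, y_pred) for every trained model
          reg = LazyRegressor(ignore_warnings=False, custom_metric=mean_absolute_error)
          models, predictions = reg.fit(X_train, X_test, y_train, y_test)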

          Looking at the resulting performance metrics, the AdaBoost classifier was the best-performing model for the classification task, while the GradientBoostingRegressor model performed best for the regression task.

          Complete examples

          Classification

          from lazypredict.Supervised import LazyClassifier
          from sklearn.datasets import load_breast_cancer
          from sklearn.model_selection import train_test_split

          # Load the breast cancer dataset
          data = load_breast_cancer()
          X = data.data
          y = data.target

          # Hold out half of the data for testing
          X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=123)

          # Train ~30 classifiers, each with its default parameters
          clf = LazyClassifier(verbose=0, ignore_warnings=True, custom_metric=None)
          models, predictions = clf.fit(X_train, X_test, y_train, y_test)

          print(models)


          | Model                          |   Accuracy |   Balanced Accuracy |   ROC AUC |   F1 Score |   Time Taken |
          |:-------------------------------|-----------:|--------------------:|----------:|-----------:|-------------:|
          | LinearSVC                      |   0.989474 |            0.987544 |  0.987544 |   0.989462 |    0.0150008 |
          | SGDClassifier                  |   0.989474 |            0.987544 |  0.987544 |   0.989462 |    0.0109992 |
          | MLPClassifier                  |   0.985965 |            0.986904 |  0.986904 |   0.985994 |    0.426     |
          | Perceptron                     |   0.985965 |            0.984797 |  0.984797 |   0.985965 |    0.0120046 |
          | LogisticRegression             |   0.985965 |            0.98269  |  0.98269  |   0.985934 |    0.0200036 |
          | LogisticRegressionCV           |   0.985965 |            0.98269  |  0.98269  |   0.985934 |    0.262997  |
          | SVC                            |   0.982456 |            0.979942 |  0.979942 |   0.982437 |    0.0140011 |
          | CalibratedClassifierCV         |   0.982456 |            0.975728 |  0.975728 |   0.982357 |    0.0350015 |
          | PassiveAggressiveClassifier    |   0.975439 |            0.974448 |  0.974448 |   0.975464 |    0.0130005 |
          | LabelPropagation               |   0.975439 |            0.974448 |  0.974448 |   0.975464 |    0.0429988 |
          | LabelSpreading                 |   0.975439 |            0.974448 |  0.974448 |   0.975464 |    0.0310006 |
          | RandomForestClassifier         |   0.97193  |            0.969594 |  0.969594 |   0.97193  |    0.033     |
          | GradientBoostingClassifier     |   0.97193  |            0.967486 |  0.967486 |   0.971869 |    0.166998  |
          | QuadraticDiscriminantAnalysis  |   0.964912 |            0.966206 |  0.966206 |   0.965052 |    0.0119994 |
          | HistGradientBoostingClassifier |   0.968421 |            0.964739 |  0.964739 |   0.968387 |    0.682003  |
          | RidgeClassifierCV              |   0.97193  |            0.963272 |  0.963272 |   0.971736 |    0.0130029 |
          | RidgeClassifier                |   0.968421 |            0.960525 |  0.960525 |   0.968242 |    0.0119977 |
          | AdaBoostClassifier             |   0.961404 |            0.959245 |  0.959245 |   0.961444 |    0.204998  |
          | ExtraTreesClassifier           |   0.961404 |            0.957138 |  0.957138 |   0.961362 |    0.0270066 |
          | KNeighborsClassifier           |   0.961404 |            0.95503  |  0.95503  |   0.961276 |    0.0560005 |
          | BaggingClassifier              |   0.947368 |            0.954577 |  0.954577 |   0.947882 |    0.0559971 |
          | BernoulliNB                    |   0.950877 |            0.951003 |  0.951003 |   0.951072 |    0.0169988 |
          | LinearDiscriminantAnalysis     |   0.961404 |            0.950816 |  0.950816 |   0.961089 |    0.0199995 |
          | GaussianNB                     |   0.954386 |            0.949536 |  0.949536 |   0.954337 |    0.0139935 |
          | NuSVC                          |   0.954386 |            0.943215 |  0.943215 |   0.954014 |    0.019989  |
          | DecisionTreeClassifier         |   0.936842 |            0.933693 |  0.933693 |   0.936971 |    0.0170023 |
          | NearestCentroid                |   0.947368 |            0.933506 |  0.933506 |   0.946801 |    0.0160074 |
          | ExtraTreeClassifier            |   0.922807 |            0.912168 |  0.912168 |   0.922462 |    0.0109999 |
          | CheckingClassifier             |   0.361404 |            0.5      |  0.5      |   0.191879 |    0.0170043 |
          | DummyClassifier                |   0.512281 |            0.489598 |  0.489598 |   0.518924 |    0.0119965 |
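          To work with a fitted estimator rather than just its scores, the library also exposes a provide_models method, which returns the fitted scikit-learn pipelines keyed by model name (check your installed version, as the API has evolved):

          # Returns a dict of fitted pipelines, keyed by model name
          fitted = clf.provide_models(X_train, X_test, y_train, y_test)
          best = fitted["LinearSVC"]  # top row of the leaderboard above
          print(best.score(X_test, y_test))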

          Regression

          from lazypredict.Supervised import LazyRegressor
          from sklearn import datasets
          from sklearn.utils import shuffle
          import numpy as np

          # Note: load_boston was removed in scikit-learn 1.2; use an older
          # version (or another regression dataset) to reproduce this example
          boston = datasets.load_boston()
          X, y = shuffle(boston.data, boston.target, random_state=13)
          X = X.astype(np.float32)

          # Use the first 90% of rows for training, the rest for testing
          offset = int(X.shape[0] * 0.9)
          X_train, y_train = X[:offset], y[:offset]
          X_test, y_test = X[offset:], y[offset:]

          # Train ~40 regressors, each with its default parameters
          reg = LazyRegressor(verbose=0, ignore_warnings=False, custom_metric=None)
          models, predictions = reg.fit(X_train, X_test, y_train, y_test)

          print(models)


          | Model                         | Adjusted R-Squared | R-Squared |  RMSE | Time Taken |
          |:------------------------------|-------------------:|----------:|------:|-----------:|
          | SVR                           |               0.83 |      0.88 |  2.62 |       0.01 |
          | BaggingRegressor              |               0.83 |      0.88 |  2.63 |       0.03 |
          | NuSVR                         |               0.82 |      0.86 |  2.76 |       0.03 |
          | RandomForestRegressor         |               0.81 |      0.86 |  2.78 |       0.21 |
          | XGBRegressor                  |               0.81 |      0.86 |  2.79 |       0.06 |
          | GradientBoostingRegressor     |               0.81 |      0.86 |  2.84 |       0.11 |
          | ExtraTreesRegressor           |               0.79 |      0.84 |  2.98 |       0.12 |
          | AdaBoostRegressor             |               0.78 |      0.83 |  3.04 |       0.07 |
          | HistGradientBoostingRegressor |               0.77 |      0.83 |  3.06 |       0.17 |
          | PoissonRegressor              |               0.77 |      0.83 |  3.11 |       0.01 |
          | LGBMRegressor                 |               0.77 |      0.83 |  3.11 |       0.07 |
          | KNeighborsRegressor           |               0.77 |      0.83 |  3.12 |       0.01 |
          | DecisionTreeRegressor         |               0.65 |      0.74 |  3.79 |       0.01 |
          | MLPRegressor                  |               0.65 |      0.74 |  3.80 |       1.63 |
          | HuberRegressor                |               0.64 |      0.74 |  3.84 |       0.01 |
          | GammaRegressor                |               0.64 |      0.73 |  3.88 |       0.01 |
          | LinearSVR                     |               0.62 |      0.72 |  3.96 |       0.01 |
          | RidgeCV                       |               0.62 |      0.72 |  3.97 |       0.01 |
          | BayesianRidge                 |               0.62 |      0.72 |  3.97 |       0.01 |
          | Ridge                         |               0.62 |      0.72 |  3.97 |       0.01 |
          | TransformedTargetRegressor    |               0.62 |      0.72 |  3.97 |       0.01 |
          | LinearRegression              |               0.62 |      0.72 |  3.97 |       0.01 |
          | ElasticNetCV                  |               0.62 |      0.72 |  3.98 |       0.04 |
          | LassoCV                       |               0.62 |      0.72 |  3.98 |       0.06 |
          | LassoLarsIC                   |               0.62 |      0.72 |  3.98 |       0.01 |
          | LassoLarsCV                   |               0.62 |      0.72 |  3.98 |       0.02 |
          | Lars                          |               0.61 |      0.72 |  3.99 |       0.01 |
          | LarsCV                        |               0.61 |      0.71 |  4.02 |       0.04 |
          | SGDRegressor                  |               0.60 |      0.70 |  4.07 |       0.01 |
          | TweedieRegressor              |               0.59 |      0.70 |  4.12 |       0.01 |
          | GeneralizedLinearRegressor    |               0.59 |      0.70 |  4.12 |       0.01 |
          | ElasticNet                    |               0.58 |      0.69 |  4.16 |       0.01 |
          | Lasso                         |               0.54 |      0.66 |  4.35 |       0.02 |
          | RANSACRegressor               |               0.53 |      0.65 |  4.41 |       0.04 |
          | OrthogonalMatchingPursuitCV   |               0.45 |      0.59 |  4.78 |       0.02 |
          | PassiveAggressiveRegressor    |               0.37 |      0.54 |  5.09 |       0.01 |
          | GaussianProcessRegressor      |               0.23 |      0.43 |  5.65 |       0.03 |
          | OrthogonalMatchingPursuit     |               0.16 |      0.38 |  5.89 |       0.01 |
          | ExtraTreeRegressor            |               0.08 |      0.32 |  6.17 |       0.01 |
          | DummyRegressor                |              -0.38 |     -0.02 |  7.56 |       0.01 |
          | LassoLars                     |              -0.38 |     -0.02 |  7.56 |       0.01 |
          | KernelRidge                   |             -11.50 |     -8.25 | 22.74 |       0.01 |

          Conclusion

          In this article, we walked through the LazyPredict library, which can train roughly 70 classification and regression models with a few lines of Python. It is a very handy tool because it gives an overall picture of how models perform on the data and makes it easy to compare each model's performance.

          Each model is trained with its default parameters, since LazyPredict performs no hyperparameter tuning. Once the best-performing model has been selected, it can be tuned to improve performance further.
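          As a sketch of that follow-up step (the parameter grid below is illustrative, not from the original article), the regression winner could be tuned with scikit-learn's GridSearchCV:

          from sklearn.ensemble import GradientBoostingRegressor
          from sklearn.model_selection import GridSearchCV

          # Search a small grid around the default parameters
          param_grid = {
              "n_estimators": [100, 200, 400],
              "learning_rate": [0.03, 0.1, 0.3],
              "max_depth": [2, 3, 4],
          }
          search = GridSearchCV(
              GradientBoostingRegressor(random_state=13),
              param_grid,
              cv=5,
              scoring="neg_root_mean_squared_error",
          )
          search.fit(X_train, y_train)
          print(search.best_params_, search.best_score_)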
