Automated Classifier Selection with Bayesian and ASHA Optimization
This example shows how to use fitcauto to automatically try a selection of classification model types with different hyperparameter values, given training predictor and response data. By default, the function uses Bayesian optimization to select and assess models. If your training data set contains many observations, you can use an asynchronous successive halving algorithm (ASHA) instead. After the optimization is complete, fitcauto returns the model, trained on the entire data set, that is expected to best classify new data. Check the model performance on test data.
Load Sample Data
This example uses the 1994 census data stored in census1994.mat. The data set consists of demographic information from the US Census Bureau that can be used to predict whether an individual makes over $50,000 per year.
Load the sample data census1994, which contains the training data adultdata and the test data adulttest. Preview the first few rows of the training data set.
load census1994
head(adultdata)
ans=8×15 table
    age    workclass           fnlwgt        education    education_num    marital_status           occupation           relationship     race     sex       capital_gain    capital_loss    hours_per_week    native_country    salary
    ___    ________________    __________    _________    _____________    _____________________    _________________    _____________    _____    ______    ____________    ____________    ______________    ______________    ______
    39     state-gov           77516         bachelors    13               never-married            adm-clerical         not-in-family    white    male      2174            0               40                united-states     <=50k
    50     self-emp-not-inc    83311         bachelors    13               married-civ-spouse       exec-managerial      husband          white    male      0               0               13                united-states     <=50k
    38     private             2.1565e+05    hs-grad      9                divorced                 handlers-cleaners    not-in-family    white    male      0               0               40                united-states     <=50k
    53     private             2.3472e+05    11th         7                married-civ-spouse       handlers-cleaners    husband          black    male      0               0               40                united-states     <=50k
    28     private             3.3841e+05    bachelors    13               married-civ-spouse       prof-specialty       wife             black    female    0               0               40                cuba              <=50k
    37     private             2.8458e+05    masters      14               married-civ-spouse       exec-managerial      wife             white    female    0               0               40                united-states     <=50k
    49     private             1.6019e+05    9th          5                married-spouse-absent    other-service        not-in-family    black    female    0               0               16                jamaica           <=50k
    52     self-emp-not-inc    2.0964e+05    hs-grad      9                married-civ-spouse       exec-managerial      husband          white    male      0               0               45                united-states     >50k
Each row contains the demographic information for one adult. The last column, salary, shows whether a person has a salary less than or equal to $50,000 per year or greater than $50,000 per year.
Remove observations from adultdata and adulttest that contain missing values.
adultdata = rmmissing(adultdata);
adulttest = rmmissing(adulttest);
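As a small illustration of what rmmissing does to a table, the following sketch uses a made-up three-row table (the variable names toy, cleaned, and numremoved and all values are hypothetical, not taken from the census data). Any row with at least one missing entry is dropped.

```matlab
% Hypothetical table: one complete row and two rows with a missing entry
toy = table([39; NaN; 28],["private"; "state-gov"; missing], ...
    'VariableNames',{'age','workclass'});

cleaned = rmmissing(toy);                    % drop rows that contain NaN or <missing>
numremoved = height(toy) - height(cleaned);  % count how many rows were removed
```

Comparing height before and after the call in the same way is a quick check on how many census observations the cleaning step discards.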
Use Automated Model Selection with Bayesian Optimization
Find an appropriate classifier for the data in adultdata by using fitcauto. By default, fitcauto uses Bayesian optimization to select models and their hyperparameter values, and computes the cross-validation classification error (validation loss) for each model. The function also provides, by default, a plot of the optimization and an iterative display of the optimization results. For more information on how to interpret these results, see Verbose Display.
Set the observation weights, and specify to run the Bayesian optimization in parallel, which requires Parallel Computing Toolbox™. Due to the nonreproducibility of parallel timing, parallel Bayesian optimization does not necessarily yield reproducible results. Because of the complexity of the optimization, this process can take some time, especially for larger data sets.
bayesianoptions = struct("UseParallel",true);
[bayesianmdl,bayesianresults] = fitcauto(adultdata,"salary","Weights","fnlwgt", ...
    "HyperparameterOptimizationOptions",bayesianoptions);
warning: data set has more than 10000 observations. because asha optimization often finds good solutions faster than bayesian optimization for data sets with many observations, try specifying the 'optimizer' field value as 'asha' in the 'hyperparameteroptimizationoptions' value structure.
warning: it is recommended that you first standardize all numeric predictors when optimizing the naive bayes 'width' parameter. ignore this warning if you have done that.
starting parallel pool (parpool) using the 'local' profile ...
connected to the parallel pool (number of workers: 8).
copying objective function to workers...
done copying objective function to workers.
learner types to explore: ensemble, nb, svm, tree
total iterations (maxobjectiveevaluations): 120
total time (maxtime): inf

|==============================================================================================================================================================|
| iter | active workers | eval result | validation loss | time for training & validation (sec) | observed min validation loss | estimated min validation loss | learner | hyperparameter: value |
|==============================================================================================================================================================|
| 1 | 8 | best | 0.14546 | 9.0638 | 0.14546 | 0.14546 | tree | minleafsize: 118 |
| 2 | 7 | accept | 0.24677 | 11.074 | 0.14546 | 0.14704 | svm | boxconstraint: 3.8856; kernelscale: 0.00432 |
| 3 | 7 | accept | 0.14868 | 9.9597 | 0.14546 | 0.14704 | tree | minleafsize: 26 |
| 4 | 8 | accept | 0.15457 | 156.73 | 0.14546 | 0.14704 | ensemble | method: bag; numlearningcycles: 203; minleafsize: 105 |
| 5 | 8 | accept | 0.18015 | 153.77 | 0.14546 | 0.14704 | ensemble | method: bag; numlearningcycles: 239; minleafsize: 2417 |
| 6 | 8 | accept | 0.14921 | 233.35 | 0.14546 | 0.14704 | ensemble | method: logitboost; numlearningcycles: 258; minleafsize: 542 |
| 7 | 8 | accept | 0.24677 | 113.54 | 0.14546 | 0.14704 | ensemble | method: bag; numlearningcycles: 298; minleafsize: 11733 |
| 8 | 8 | best | 0.14517 | 318.06 | 0.14517 | 0.14704 | svm | boxconstraint: 0.10777; kernelscale: 4.1321 |
| 9 | 8 | accept | 0.15414 | 196.23 | 0.14517 | 0.14704 | ensemble | method: bag; numlearningcycles: 267; minleafsize: 1 |
| 10 | 8 | accept | 0.24677 | 382.69 | 0.14517 | 0.14704 | svm | boxconstraint: 0.043508; kernelscale: 931.39 |
| 11 | 8 | accept | 0.24677 | 393.3 | 0.14517 | 0.14704 | svm | boxconstraint: 0.043508; kernelscale: 931.39 |
| 12 | 8 | accept | 0.16391 | 11.028 | 0.14517 | 0.1523 | tree | minleafsize: 5 |
| 13 | 8 | accept | 0.17328 | 2.6893 | 0.14517 | 0.15762 | tree | minleafsize: 846 |
| 14 | 8 | accept | 0.24677 | 1.4486 | 0.14517 | 0.17323 | tree | minleafsize: 8098 |
| 15 | 8 | accept | 0.24169 | 425.67 | 0.14517 | 0.17323 | svm | boxconstraint: 0.15557; kernelscale: 72.871 |
| 16 | 8 | accept | 0.24677 | 1.0011 | 0.14517 | 0.16798 | tree | minleafsize: 11604 |
| 17 | 8 | accept | 0.1511 | 88.912 | 0.14517 | 0.1511 | nb | distributionnames: kernel; width: 0.67978 |
| 18 | 8 | accept | 0.18099 | 162.32 | 0.14517 | 0.16694 | nb | distributionnames: kernel; width: 232.3 |
| 19 | 8 | accept | 0.18403 | 2.8724 | 0.14517 | 0.16363 | tree | minleafsize: 2226 |
| 20 | 8 | accept | 0.24677 | 9.3034 | 0.14517 | 0.16363 | svm | boxconstraint: 0.019712; kernelscale: 0.0047701 |
| 21 | 8 | accept | 0.15364 | 141.13 | 0.14517 | 0.16209 | nb | distributionnames: kernel; width: 3.2128 |
| 22 | 8 | accept | 0.16345 | 351.87 | 0.14517 | 0.16209 | svm | boxconstraint: 0.0017882; kernelscale: 3.4862 |
| 23 | 8 | accept | 0.1585 | 391.52 | 0.14517 | 0.16209 | svm | boxconstraint: 19.675; kernelscale: 205.34 |
| 24 | 8 | accept | 0.18472 | 180.24 | 0.14517 | 0.16363 | nb | distributionnames: kernel; width: 1676.9 |
| 25 | 8 | accept | 0.15005 | 209.17 | 0.14517 | 0.15973 | ensemble | method: logitboost; numlearningcycles: 218; minleafsize: 34 |
| 26 | 8 | accept | 0.15429 | 172.14 | 0.14517 | 0.15973 | nb | distributionnames: kernel; width: 3.6464 |
| 27 | 8 | accept | 0.168 | 1.2833 | 0.14517 | 0.15973 | nb | distributionnames: normal; width: nan |
| 28 | 8 | accept | 0.15045 | 225.03 | 0.14517 | 0.15445 | ensemble | method: logitboost; numlearningcycles: 235; minleafsize: 7 |
| 29 | 8 | accept | 0.168 | 2.4337 | 0.14517 | 0.15445 | nb | distributionnames: normal; width: nan |
| 30 | 8 | accept | 0.168 | 2.0997 | 0.14517 | 0.15445 | nb | distributionnames: normal; width: nan |
| 31 | 8 | accept | 0.15761 | 198.66 | 0.14517 | 0.15445 | nb | distributionnames: kernel; width: 5.6802 |
| 32 | 8 | accept | 0.15055 | 214.5 | 0.14517 | 0.1515 | ensemble | method: logitboost; numlearningcycles: 202; minleafsize: 2 |
| 33 | 8 | accept | 0.15429 | 143.42 | 0.14517 | 0.1515 | nb | distributionnames: kernel; width: 3.6279 |
| 34 | 8 | accept | 0.14984 | 211.73 | 0.14517 | 0.15084 | ensemble | method: logitboost; numlearningcycles: 271; minleafsize: 13 |
| 35 | 8 | accept | 0.15455 | 182.76 | 0.14517 | 0.15058 | ensemble | method: bag; numlearningcycles: 261; minleafsize: 1 |
| 36 | 8 | accept | 0.24182 | 435.16 | 0.14517 | 0.15058 | svm | boxconstraint: 0.088843; kernelscale: 60.035 |
| 37 | 8 | accept | 0.15528 | 150.14 | 0.14517 | 0.15062 | ensemble | method: bag; numlearningcycles: 244; minleafsize: 3 |
| 38 | 8 | accept | 0.15115 | 5.0809 | 0.14517 | 0.15062 | tree | minleafsize: 262 |
| 39 | 8 | accept | 0.14982 | 154.77 | 0.14517 | 0.15003 | ensemble | method: logitboost; numlearningcycles: 218; minleafsize: 19 |
| 40 | 8 | accept | 0.15044 | 145.66 | 0.14517 | 0.1501 | ensemble | method: logitboost; numlearningcycles: 201; minleafsize: 1 |
| 41 | 8 | accept | 0.24677 | 0.61497 | 0.14517 | 0.1501 | tree | minleafsize: 12837 |
| 42 | 8 | accept | 0.17342 | 8.1501 | 0.14517 | 0.1501 | tree | minleafsize: 3 |
| 43 | 8 | accept | 0.15795 | 137.62 | 0.14517 | 0.15019 | ensemble | method: logitboost; numlearningcycles: 224; minleafsize: 2353 |
| 44 | 8 | accept | 0.17696 | 1.4365 | 0.14517 | 0.15019 | tree | minleafsize: 1170 |
| 45 | 8 | accept | 0.1683 | 144.68 | 0.14517 | 0.15019 | nb | distributionnames: kernel; width: 18.063 |
| 46 | 8 | accept | 0.2011 | 125.96 | 0.14517 | 0.15019 | nb | distributionnames: kernel; width: 58214 |
| 47 | 8 | accept | 0.1883 | 134.06 | 0.14517 | 0.15019 | nb | distributionnames: kernel; width: 2612.2 |
| 48 | 8 | accept | 0.1494 | 865.49 | 0.14517 | 0.15019 | svm | boxconstraint: 0.028795; kernelscale: 1.4357 |
| 49 | 8 | accept | 0.17342 | 7.5968 | 0.14517 | 0.15019 | tree | minleafsize: 3 |
| 50 | 8 | accept | 0.15016 | 3.9429 | 0.14517 | 0.15019 | tree | minleafsize: 171 |
| 51 | 8 | accept | 0.16106 | 170.49 | 0.14517 | 0.1505 | ensemble | method: bag; numlearningcycles: 248; minleafsize: 358 |
| 52 | 8 | accept | 0.17901 | 9.2773 | 0.14517 | 0.1505 | tree | minleafsize: 2 |
| 53 | 8 | accept | 0.17487 | 146.78 | 0.14517 | 0.1505 | nb | distributionnames: kernel; width: 65.783 |
| 54 | 8 | accept | 0.15025 | 243.33 | 0.14517 | 0.15019 | ensemble | method: logitboost; numlearningcycles: 293; minleafsize: 37 |
| 55 | 8 | accept | 0.23138 | 406.36 | 0.14517 | 0.15019 | svm | boxconstraint: 0.064193; kernelscale: 38.054 |
| 56 | 8 | accept | 0.15038 | 245.69 | 0.14517 | 0.15032 | ensemble | method: logitboost; numlearningcycles: 300; minleafsize: 16 |
| 57 | 8 | accept | 0.14853 | 224.07 | 0.14517 | 0.15057 | ensemble | method: logitboost; numlearningcycles: 259; minleafsize: 684 |
| 58 | 8 | accept | 0.15075 | 210.06 | 0.14517 | 0.15016 | ensemble | method: logitboost; numlearningcycles: 296; minleafsize: 33 |
| 59 | 8 | accept | 0.14981 | 221.88 | 0.14517 | 0.15016 | ensemble | method: logitboost; numlearningcycles: 300; minleafsize: 18 |
| 60 | 8 | accept | 0.15111 | 192.41 | 0.14517 | 0.15004 | ensemble | method: logitboost; numlearningcycles: 250; minleafsize: 1571 |
| 61 | 8 | accept | 0.15018 | 219.2 | 0.14517 | 0.15006 | ensemble | method: logitboost; numlearningcycles: 297; minleafsize: 29 |
| 62 | 8 | accept | 0.14983 | 207.92 | 0.14517 | 0.14994 | ensemble | method: logitboost; numlearningcycles: 272; minleafsize: 553 |
| 63 | 8 | accept | 0.15006 | 208.62 | 0.14517 | 0.15004 | ensemble | method: logitboost; numlearningcycles: 299; minleafsize: 13 |
| 64 | 8 | accept | 0.15116 | 199.41 | 0.14517 | 0.14986 | ensemble | method: logitboost; numlearningcycles: 256; minleafsize: 1574 |
| 65 | 8 | accept | 0.14934 | 214.65 | 0.14517 | 0.14923 | ensemble | method: logitboost; numlearningcycles: 264; minleafsize: 1469 |
| 66 | 8 | accept | 0.16747 | 359.01 | 0.14517 | 0.14923 | svm | boxconstraint: 0.033521; kernelscale: 12.508 |
| 67 | 8 | accept | 0.14883 | 219.75 | 0.14517 | 0.14919 | ensemble | method: logitboost; numlearningcycles: 254; minleafsize: 986 |
| 68 | 8 | accept | 0.14959 | 207 | 0.14517 | 0.14912 | ensemble | method: logitboost; numlearningcycles: 259; minleafsize: 1173 |
| 69 | 8 | accept | 0.15016 | 207.07 | 0.14517 | 0.14918 | ensemble | method: logitboost; numlearningcycles: 290; minleafsize: 1897 |
| 70 | 8 | accept | 0.14754 | 275.92 | 0.14517 | 0.14918 | svm | boxconstraint: 0.10603; kernelscale: 5.0157 |
| 71 | 8 | accept | 0.14794 | 282.2 | 0.14517 | 0.14918 | svm | boxconstraint: 0.11374; kernelscale: 5.4724 |
| 72 | 8 | accept | 0.15093 | 190.3 | 0.14517 | 0.14914 | ensemble | method: logitboost; numlearningcycles: 267; minleafsize: 1707 |
| 73 | 8 | accept | 0.14908 | 300.63 | 0.14517 | 0.14914 | svm | boxconstraint: 0.087041; kernelscale: 6.0336 |
| 74 | 8 | accept | 0.14875 | 246.04 | 0.14517 | 0.14913 | ensemble | method: logitboost; numlearningcycles: 299; minleafsize: 934 |
| 75 | 8 | accept | 0.16303 | 345.34 | 0.14517 | 0.14913 | svm | boxconstraint: 0.10529; kernelscale: 18.53 |
| 76 | 8 | accept | 0.14828 | 274.59 | 0.14517 | 0.14913 | svm | boxconstraint: 0.060389; kernelscale: 4.7508 |
| 77 | 8 | accept | 0.15908 | 326.16 | 0.14517 | 0.14913 | svm | boxconstraint: 0.11; kernelscale: 16.289 |
| 78 | 8 | accept | 0.15976 | 297.48 | 0.14517 | 0.14852 | svm | boxconstraint: 0.26147; kernelscale: 25.792 |
| 79 | 8 | accept | 0.14857 | 279.1 | 0.14517 | 0.14802 | svm | boxconstraint: 0.048355; kernelscale: 4.4186 |
| 80 | 8 | accept | 0.14944 | 302.39 | 0.14517 | 0.14845 | svm | boxconstraint: 0.041852; kernelscale: 4.9836 |
| 81 | 8 | accept | 0.19487 | 401.11 | 0.14517 | 0.14861 | svm | boxconstraint: 0.078452; kernelscale: 30.022 |
| 82 | 8 | accept | 0.1461 | 315.95 | 0.14517 | 0.14876 | svm | boxconstraint: 0.15302; kernelscale: 4.7768 |
| 83 | 8 | accept | 0.15583 | 333.62 | 0.14517 | 0.14716 | svm | boxconstraint: 0.0071375; kernelscale: 4.1809 |
| 84 | 8 | accept | 0.14761 | 297.08 | 0.14517 | 0.1478 | svm | boxconstraint: 0.070313; kernelscale: 4.4613 |
| 85 | 8 | accept | 0.14531 | 326.83 | 0.14517 | 0.14706 | svm | boxconstraint: 0.73357; kernelscale: 5.3351 |
| 86 | 8 | accept | 0.14691 | 303.06 | 0.14517 | 0.14643 | svm | boxconstraint: 0.089237; kernelscale: 4.479 |
| 87 | 8 | accept | 0.15009 | 313.15 | 0.14517 | 0.14635 | svm | boxconstraint: 0.028044; kernelscale: 4.6944 |
| 88 | 8 | accept | 0.1578 | 212.73 | 0.14517 | 0.14635 | ensemble | method: logitboost; numlearningcycles: 283; minleafsize: 2103 |
| 89 | 8 | accept | 0.14521 | 456.62 | 0.14517 | 0.14617 | svm | boxconstraint: 1.9049; kernelscale: 4.7834 |
| 90 | 8 | accept | 0.15028 | 224.72 | 0.14517 | 0.14617 | ensemble | method: logitboost; numlearningcycles: 271; minleafsize: 1953 |
| 91 | 8 | accept | 0.15213 | 316.5 | 0.14517 | 0.14614 | svm | boxconstraint: 0.016203; kernelscale: 4.3045 |
| 92 | 8 | accept | 0.14533 | 405.52 | 0.14517 | 0.14618 | svm | boxconstraint: 2.5334; kernelscale: 5.0734 |
| 93 | 8 | accept | 0.4955 | 5808 | 0.14517 | 0.14629 | svm | boxconstraint: 0.0055052; kernelscale: 0.11946 |
| 94 | 8 | accept | 0.14599 | 421.4 | 0.14517 | 0.14552 | svm | boxconstraint: 0.1423; kernelscale: 2.5078 |
| 95 | 8 | accept | 0.15696 | 334.19 | 0.14517 | 0.1459 | svm | boxconstraint: 0.0086085; kernelscale: 4.7011 |
| 96 | 8 | accept | 0.16125 | 335.61 | 0.14517 | 0.14572 | svm | boxconstraint: 0.0043914; kernelscale: 4.3671 |
| 97 | 8 | accept | 0.24245 | 6787.2 | 0.14517 | 0.14594 | svm | boxconstraint: 11.252; kernelscale: 0.48339 |
| 98 | 8 | accept | 0.16294 | 354.47 | 0.14517 | 0.14602 | svm | boxconstraint: 0.0062158; kernelscale: 5.3367 |
| 99 | 8 | accept | 0.16271 | 339.55 | 0.14517 | 0.14604 | svm | boxconstraint: 0.0046539; kernelscale: 4.7345 |
| 100 | 8 | accept | 0.17184 | 5978.5 | 0.14517 | 0.14589 | svm | boxconstraint: 7.6879; kernelscale: 1.6339 |
| 101 | 8 | accept | 0.16142 | 355.72 | 0.14517 | 0.14624 | svm | boxconstraint: 0.0057866; kernelscale: 4.8082 |
| 102 | 8 | accept | 0.15067 | 1030.9 | 0.14517 | 0.14639 | svm | boxconstraint: 0.0050117; kernelscale: 1.0093 |
| 103 | 8 | best | 0.14513 | 501.17 | 0.14513 | 0.1457 | svm | boxconstraint: 3.1097; kernelscale: 5.013 |
| 104 | 8 | accept | 0.35434 | 7976.9 | 0.14513 | 0.14584 | svm | boxconstraint: 0.06472; kernelscale: 0.0099007 |
| 105 | 8 | accept | 0.26591 | 8022.4 | 0.14513 | 0.14609 | svm | boxconstraint: 3.0956; kernelscale: 0.013802 |
| 106 | 8 | accept | 0.35499 | 5418.2 | 0.14513 | 0.14564 | svm | boxconstraint: 0.37919; kernelscale: 0.14009 |
| 107 | 8 | accept | 0.167 | 6198.1 | 0.14513 | 0.14563 | svm | boxconstraint: 659.16; kernelscale: 4.3029 |
| 108 | 8 | accept | 0.14539 | 747.98 | 0.14513 | 0.14574 | svm | boxconstraint: 5.9153; kernelscale: 4.9842 |
| 109 | 8 | accept | 0.14543 | 586.07 | 0.14513 | 0.14532 | svm | boxconstraint: 3.9618; kernelscale: 4.9501 |
| 110 | 8 | accept | 0.17056 | 5739.7 | 0.14513 | 0.1457 | svm | boxconstraint: 0.0024422; kernelscale: 0.46361 |
| 111 | 8 | accept | 0.29426 | 4906.9 | 0.14513 | 0.14557 | svm | boxconstraint: 0.0064138; kernelscale: 0.31177 |
| 112 | 8 | accept | 0.1717 | 5907.9 | 0.14513 | 0.14571 | svm | boxconstraint: 0.10504; kernelscale: 0.82207 |
| 113 | 8 | accept | 0.17241 | 5909.3 | 0.14513 | 0.14549 | svm | boxconstraint: 0.028608; kernelscale: 0.64777 |
| 114 | 8 | accept | 0.17806 | 6444.5 | 0.14513 | 0.14558 | svm | boxconstraint: 0.34511; kernelscale: 0.72366 |
| 115 | 8 | accept | 0.16555 | 5481 | 0.14513 | 0.14533 | svm | boxconstraint: 0.17641; kernelscale: 1.0964 |
| 116 | 8 | accept | 0.14788 | 1249.1 | 0.14513 | 0.14565 | svm | boxconstraint: 11.881; kernelscale: 4.8553 |
| 117 | 8 | accept | 0.24664 | 5351.2 | 0.14513 | 0.14631 | svm | boxconstraint: 0.0048502; kernelscale: 0.38595 |
| 118 | 8 | best | 0.14483 | 563.53 | 0.14483 | 0.1454 | svm | boxconstraint: 2.2083; kernelscale: 4.5125 |
| 119 | 8 | accept | 0.15403 | 340.78 | 0.14483 | 0.14559 | svm | boxconstraint: 0.843; kernelscale: 26.671 |
| 120 | 8 | accept | 0.27575 | 4752.3 | 0.14483 | 0.14532 | svm | boxconstraint: 0.0017988; kernelscale: 0.21724 |
__________________________________________________________
optimization completed.
total iterations: 120
total elapsed time: 17271.6899 seconds
total time for training and validation: 116613.8436 seconds

best observed learner is an svm model with:
    learner: svm
    boxconstraint: 2.2083
    kernelscale: 4.5125
observed validation loss: 0.14483
time for training and validation: 563.5264 seconds

best estimated learner (returned model) is an svm model with:
    learner: svm
    boxconstraint: 3.1097
    kernelscale: 5.013
estimated validation loss: 0.14532
estimated time for training and validation: 574.6554 seconds
The total elapsed time value shows that the Bayesian optimization took a while to run (about 4.8 hours).
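The 4.8-hour figure follows directly from the displayed totals, as this short calculation sketches. The two constants are copied from the optimization summary; the variable names and the ratio are illustrative additions, not part of the original example.

```matlab
elapsedsec = 17271.6899;    % "total elapsed time" from the display, in seconds
trainsec = 116613.8436;     % "total time for training and validation", in seconds

elapsedhours = elapsedsec/3600;        % wall-clock time in hours
parallelratio = trainsec/elapsedsec;   % summed training time relative to wall-clock time

fprintf("elapsed: %.1f hours, ratio: %.2f\n",elapsedhours,parallelratio)
```

The summed training and validation time is roughly 6.75 times the wall-clock time, which is plausible for a run that kept most of the 8 workers busy.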
The final model returned by fitcauto corresponds to the best estimated learner. Before returning the model, the function retrains it using the entire training data set (adultdata), the listed learner (or model) type, and the displayed hyperparameter values.
Use Automated Model Selection with ASHA Optimization
When fitcauto with Bayesian optimization takes a long time to run because of the number of observations in your training set, consider using fitcauto with ASHA optimization instead. Given that adultdata contains over 10,000 observations, try using fitcauto with ASHA optimization to automatically find an appropriate classifier. When you use fitcauto with ASHA optimization, the function randomly chooses several models with different hyperparameter values and trains them on a small subset of the training data. If the cross-validation classification error (validation loss) of a particular model is promising, the model is promoted and trained on a larger amount of the training data. This process repeats, and successful models are trained on progressively larger amounts of data. By default, fitcauto provides a plot of the optimization and an iterative display of the optimization results. For more information on how to interpret these results, see Verbose Display.
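The promotion idea can be sketched with a toy, synchronous version of successive halving. This is a simplified illustration, not how fitcauto is implemented: the real algorithm is asynchronous, the candidate losses here are simulated, and the promotion factor eta = 4 is an assumption suggested by the training set sizes that appear in the ASHA display (378, 1509, 6033). All variable names are hypothetical.

```matlab
% Toy, synchronous successive halving; a simplified sketch, not fitcauto's internals.
rng(0)                                           % make the simulated noise repeatable
numcandidates = 16;                              % hypothetical number of random models
eta = 4;                                         % assumed promotion factor
trueloss = linspace(0.14,0.25,numcandidates)';   % hypothetical full-data losses
active = (1:numcandidates)';                     % indices of candidates still in play
rungsize = 378;                                  % smallest training set size in the display

while numel(active) > 1
    % Simulated validation loss: estimates are noisier on smaller subsets
    observed = trueloss(active) + randn(numel(active),1)./sqrt(rungsize);
    [~,order] = sort(observed);                  % most promising candidates first
    numkeep = max(1,floor(numel(active)/eta));
    active = active(order(1:numkeep));           % promote the best 1/eta of the rung
    rungsize = rungsize*eta;                     % promoted models train on more data
end

fprintf("surviving candidate: %d, next training set size: %d\n",active,rungsize)
```

Most candidates are discarded after a cheap evaluation on a small subset, so only a few models ever incur the cost of training on a large share of the data.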
Set the observation weights, and specify to run the ASHA optimization in parallel. Note that ASHA optimization often has more iterations than Bayesian optimization by default. If you have a time constraint, you can specify the MaxTime field of the HyperparameterOptimizationOptions structure to limit the number of seconds fitcauto runs.
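As a hedged sketch of the time limit just described, the options structure below adds a MaxTime field. The one-hour budget and the names maxseconds, timedoptions, timedmdl, and timedresults are illustrative assumptions; the fitcauto call runs only if adultdata is already in the workspace.

```matlab
maxseconds = 3600;   % hypothetical budget of one hour
timedoptions = struct("Optimizer","asha","UseParallel",true, ...
    "MaxTime",maxseconds);

% Run the time-limited optimization only when the census data has been loaded
if exist("adultdata","var")
    [timedmdl,timedresults] = fitcauto(adultdata,"salary","Weights","fnlwgt", ...
        "HyperparameterOptimizationOptions",timedoptions);
end
```

The example itself runs without a time limit, so that the full set of iterations appears in the display.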
ashaoptions = struct("Optimizer","asha","UseParallel",true);
[ashamdl,asharesults] = fitcauto(adultdata,"salary","Weights","fnlwgt", ...
    "HyperparameterOptimizationOptions",ashaoptions);
warning: it is recommended that you first standardize all numeric predictors when optimizing the naive bayes 'width' parameter. ignore this warning if you have done that.
copying objective function to workers...
done copying objective function to workers.
learner types to explore: ensemble, nb, svm, tree
total iterations (maxobjectiveevaluations): 425
total time (maxtime): inf

|==============================================================================================================================================================|
| iter | active workers | eval result | validation loss | time for training & validation (sec) | observed min validation loss | training set size | learner | hyperparameter: value |
|==============================================================================================================================================================|
| 1 | 8 | best | 0.1831 | 0.59595 | 0.1831 | 378 | nb | distributionnames: normal; width: nan |
| 2 | 7 | accept | 0.24677 | 0.64255 | 0.1831 | 378 | tree | minleafsize: 2865 |
| 3 | 7 | accept | 0.24677 | 0.88001 | 0.1831 | 378 | tree | minleafsize: 1629 |
| 4 | 7 | accept | 0.24677 | 0.49804 | 0.1831 | 378 | svm | boxconstraint: 93.722; kernelscale: 0.0031746 |
| 5 | 8 | best | 0.15872 | 2.6522 | 0.15872 | 378 | svm | boxconstraint: 622.14; kernelscale: 8.3947 |
| 6 | 8 | accept | 0.17662 | 0.88532 | 0.15872 | 1509 | nb | distributionnames: normal; width: nan |
| 7 | 8 | accept | 0.17844 | 0.7806 | 0.15872 | 378 | nb | distributionnames: normal; width: nan |
| 8 | 7 | accept | 0.24677 | 1.6871 | 0.15872 | 378 | svm | boxconstraint: 16.149; kernelscale: 418.84 |
| 9 | 7 | accept | 0.18401 | 0.70381 | 0.15872 | 378 | nb | distributionnames: normal; width: nan |
| 10 | 7 | accept | 0.17244 | 0.31131 | 0.15872 | 378 | tree | minleafsize: 4 |
| 11 | 8 | accept | 0.17135 | 0.45227 | 0.15872 | 378 | tree | minleafsize: 13 |
| 12 | 8 | accept | 0.24677 | 0.53034 | 0.15872 | 378 | svm | boxconstraint: 272.5; kernelscale: 0.0013306 |
| 13 | 8 | accept | 0.24677 | 0.34333 | 0.15872 | 378 | tree | minleafsize: 701 |
| 14 | 8 | best | 0.15344 | 1.0465 | 0.15344 | 1509 | tree | minleafsize: 13 |
| 15 | 8 | accept | 0.1538 | 12.371 | 0.15344 | 1509 | svm | boxconstraint: 622.14; kernelscale: 8.3947 |
| 16 | 8 | accept | 0.24677 | 0.31034 | 0.15344 | 378 | tree | minleafsize: 1773 |
| 17 | 8 | accept | 0.186 | 21.17 | 0.15344 | 378 | nb | distributionnames: kernel; width: 9030.3 |
| 18 | 8 | accept | 0.17043 | 21.308 | 0.15344 | 378 | nb | distributionnames: kernel; width: 512.37 |
| 19 | 8 | accept | 0.18322 | 20.058 | 0.15344 | 378 | nb | distributionnames: kernel; width: 30.45 |
| 20 | 8 | accept | 0.1802 | 1.1355 | 0.15344 | 378 | nb | distributionnames: normal; width: nan |
| 21 | 8 | accept | 0.18721 | 23.236 | 0.15344 | 378 | nb | distributionnames: kernel; width: 9222.9 |
| 22 | 8 | accept | 0.20066 | 21.173 | 0.15344 | 378 | nb | distributionnames: kernel; width: 129.74 |
| 23 | 8 | accept | 0.2205 | 17.802 | 0.15344 | 378 | nb | distributionnames: kernel; width: 30996 |
| 24 | 8 | accept | 0.155 | 0.81516 | 0.15344 | 1509 | tree | minleafsize: 4 |
| 25 | 8 | best | 0.14361 | 1.0991 | 0.14361 | 6033 | tree | minleafsize: 13 |
| 26 | 8 | accept | 0.19409 | 33.202 | 0.14361 | 378 | svm | boxconstraint: 259.24; kernelscale: 0.32997 |
| 27 | 8 | accept | 0.24677 | 2.6882 | 0.14361 | 378 | svm | boxconstraint: 0.0059822; kernelscale: 713.08 |
| 28 | 8 | accept | 0.16587 | 1.8275 | 0.14361 | 378 | nb | distributionnames: normal; width: nan |
| 29 | 8 | accept | 0.18015 | 16.315 | 0.14361 | 378 | svm | boxconstraint: 0.5769; kernelscale: 0.36299 |
| 30 | 8 | accept | 0.17001 | 1.1164 | 0.14361 | 1509 | nb | distributionnames: normal; width: nan |
| 31 | 8 | accept | 0.18193 | 0.55339 | 0.14361 | 378 | nb | distributionnames: normal; width: nan |
| 32 | 8 | accept | 0.16862 | 24.418 | 0.14361 | 378 | nb | distributionnames: kernel; width: 102.7 |
| 33 | 7 | accept | 0.241 | 2.5999 | 0.14361 | 378 | svm | boxconstraint: 0.6369; kernelscale: 23.843 |
| 34 | 7 | accept | 0.1821 | 1.0709 | 0.14361 | 378 | nb | distributionnames: normal; width: nan |
| 35 | 8 | accept | 0.18464 | 29.794 | 0.14361 | 378 | nb | distributionnames: kernel; width: 9256.9 |
| 36 | 8 | accept | 0.24677 | 34.988 | 0.14361 | 378 | ensemble | method: logitboost; numlearningcycles: 289; minleafsize: 929 |
| 37 | 8 | accept | 0.16073 | 50.947 | 0.14361 | 378 | ensemble | method: logitboost; numlearningcycles: 283; minleafsize: 5 |
| 38 | 8 | accept | 0.17086 | 52.897 | 0.14361 | 1509 | nb | distributionnames: kernel; width: 512.37 |
| 39 | 8 | accept | 0.17374 | 0.96688 | 0.14361 | 378 | tree | minleafsize: 7 |
| 40 | 8 | accept | 0.16598 | 37.543 | 0.14361 | 378 | ensemble | method: logitboost; numlearningcycles: 281; minleafsize: 2 |
| 41 | 8 | accept | 0.242 | 1.7295 | 0.14361 | 378 | svm | boxconstraint: 0.0093388; kernelscale: 4.7672 |
| 42 | 8 | accept | 0.20568 | 32.829 | 0.14361 | 378 | svm | boxconstraint: 0.091315; kernelscale: 0.30293 |
| 43 | 8 | accept | 0.16465 | 56.109 | 0.14361 | 378 | ensemble | method: bag; numlearningcycles: 265; minleafsize: 2 |
| 44 | 8 | accept | 0.16598 | 2.096 | 0.14361 | 378 | svm | boxconstraint: 49.164; kernelscale: 1.7376 |
| 45 | 8 | accept | 0.18456 | 52.012 | 0.14361 | 1509 | nb | distributionnames: kernel; width: 102.7 |
| 46 | 8 | accept | 0.24677 | 0.77083 | 0.14361 | 378 | tree | minleafsize: 13706 |
| 47 | 8 | accept | 0.16477 | 25.694 | 0.14361 | 378 | ensemble | method: logitboost; numlearningcycles: 218; minleafsize: 36 |
| 48 | 8 | accept | 0.17952 | 1.107 | 0.14361 | 378 | nb | distributionnames: normal; width: nan |
| 49 | 8 | accept | 0.32627 | 84.486 | 0.14361 | 378 | svm | boxconstraint: 549.6; kernelscale: 0.036915 |
| 50 | 8 | accept | 0.16346 | 14.925 | 0.14361 | 378 | nb | distributionnames: kernel; width: 0.65733 |
| 51 | 8 | accept | 0.24677 | 4.6465 | 0.14361 | 378 | svm | boxconstraint: 0.9246; kernelscale: 398.37 |
| 52 | 8 | accept | 0.15285 | 42.395 | 0.14361 | 1509 | ensemble | method: logitboost; numlearningcycles: 283; minleafsize: 5 |
| 53 | 8 | accept | 0.24677 | 2.1044 | 0.14361 | 378 | svm | boxconstraint: 0.041664; kernelscale: 29.281 |
| 54 | 8 | accept | 0.24677 | 39.272 | 0.14361 | 378 | ensemble | method: bag; numlearningcycles: 216; minleafsize: 1096 |
| 55 | 8 | accept | 0.15401 | 48.294 | 0.14361 | 1509 | ensemble | method: logitboost; numlearningcycles: 218; minleafsize: 36 |
| 56 | 8 | accept | 0.14704 | 59.661 | 0.14361 | 1509 | ensemble | method: bag; numlearningcycles: 265; minleafsize: 2 |
| 57 | 8 | accept | 0.15196 | 41.835 | 0.14361 | 1509 | nb | distributionnames: kernel; width: 0.65733 |
| 58 | 8 | accept | 0.24285 | 97.209 | 0.14361 | 378 | svm | boxconstraint: 62.688; kernelscale: 0.019392 |
| 59 | 8 | accept | 0.24677 | 0.75744 | 0.14361 | 378 | tree | minleafsize: 159 |
| 60 | 8 | accept | 0.1928 | 24.43 | 0.14361 | 378 | nb | distributionnames: kernel; width: 24.876 |
| 61 | 8 | accept | 0.20489 | 21.827 | 0.14361 | 378 | nb | distributionnames: kernel; width: 6524.9 |
| 62 | 8 | accept | 0.24677 | 0.5275 | 0.14361 | 378 | svm | boxconstraint: 0.0012686; kernelscale: 0.0019115 |
| 63 | 8 | accept | 0.24677 | 0.43222 | 0.14361 | 378 | svm | boxconstraint: 68.024; kernelscale: 0.0023633 |
| 64 | 8 | accept | 0.26755 | 94.048 | 0.14361 | 378 | svm | boxconstraint: 3.4696; kernelscale: 0.026518 |
| 65 | 8 | accept | 
0.25611 | 97.151 | 0.14361 | 378 | svm | boxconstraint: 14.395 | | | | | | | | | | kernelscale: 0.018438 | | 66 | 8 | accept | 0.24677 | 0.6123 | 0.14361 | 378 | svm | boxconstraint: 6.1487 | | | | | | | | | | kernelscale: 0.0012672 | | 67 | 8 | accept | 0.23521 | 38.994 | 0.14361 | 378 | svm | boxconstraint: 0.47847 | | | | | | | | | | kernelscale: 0.21848 | | 68 | 8 | accept | 0.14949 | 75.744 | 0.14361 | 6033 | ensemble | method: logitboost | | | | | | | | | | numlearningcycles: 283 | | | | | | | | | | minleafsize: 5 | | 69 | 8 | accept | 0.1689 | 26.118 | 0.14361 | 1509 | svm | boxconstraint: 49.164 | | | | | | | | | | kernelscale: 1.7376 | | 70 | 8 | accept | 0.17936 | 25.8 | 0.14361 | 378 | ensemble | method: logitboost | | | | | | | | | | numlearningcycles: 228 | | | | | | | | | | minleafsize: 164 | | 71 | 8 | accept | 0.2466 | 34.448 | 0.14361 | 378 | ensemble | method: bag | | | | | | | | | | numlearningcycles: 239 | | | | | | | | | | minleafsize: 88 | | 72 | 8 | accept | 0.15654 | 2.6753 | 0.14361 | 1509 | tree | minleafsize: 7 | | 73 | 8 | accept | 0.18175 | 0.87169 | 0.14361 | 378 | nb | distributionnames: normal | | | | | | | | | | width: nan | | 74 | 8 | accept | 0.16188 | 42.445 | 0.14361 | 378 | ensemble | method: bag | | | | | | | | | | numlearningcycles: 232 | | | | | | | | | | minleafsize: 14 | | 75 | 8 | accept | 0.24677 | 2.9432 | 0.14361 | 378 | svm | boxconstraint: 5.2509 | | | | | | | | | | kernelscale: 566.41 | | 76 | 8 | accept | 0.24677 | 0.68866 | 0.14361 | 378 | svm | boxconstraint: 121.38 | | | | | | | | | | kernelscale: 0.0038679 | | 77 | 8 | accept | 0.24677 | 1.8912 | 0.14361 | 378 | svm | boxconstraint: 4.2627 | | | | | | | | | | kernelscale: 139.4 | | 78 | 8 | accept | 0.21364 | 25.979 | 0.14361 | 378 | nb | distributionnames: kernel | | | | | | | | | | width: 69985 | | 79 | 8 | accept | 0.16991 | 22.652 | 0.14361 | 378 | nb | distributionnames: kernel | | | | | | | | | | width: 12.131 | | 80 | 8 | accept | 0.24677 | 27.661 | 
0.14361 | 378 | ensemble | method: bag | | | | | | | | | | numlearningcycles: 200 | | | | | | | | | | minleafsize: 285 | |====================================================================================================================================================| | iter | active | eval | validation | time for training | observed min | training set | learner | hyperparameter: value | | | workers | result | loss | & validation (sec)| validation loss | size | | | |====================================================================================================================================================| | 81 | 8 | accept | 0.16426 | 2.4413 | 0.14361 | 378 | svm | boxconstraint: 0.10011 | | | | | | | | | | kernelscale: 3.7337 | | 82 | 8 | accept | 0.17779 | 0.78782 | 0.14361 | 378 | nb | distributionnames: normal | | | | | | | | | | width: nan | | 83 | 8 | accept | 0.15315 | 42.755 | 0.14361 | 1509 | ensemble | method: logitboost | | | | | | | | | | numlearningcycles: 281 | | | | | | | | | | minleafsize: 2 | | 84 | 7 | accept | 0.17499 | 1.8123 | 0.14361 | 378 | svm | boxconstraint: 235.48 | | | | | | | | | | kernelscale: 1.5618 | | 85 | 7 | accept | 0.1702 | 1.7494 | 0.14361 | 378 | svm | boxconstraint: 1.7191 | | | | | | | | | | kernelscale: 1.7936 | | 86 | 8 | accept | 0.17428 | 0.71717 | 0.14361 | 378 | tree | minleafsize: 1 | | 87 | 8 | accept | 0.24677 | 0.35012 | 0.14361 | 378 | tree | minleafsize: 10340 | | 88 | 8 | accept | 0.15506 | 4.6537 | 0.14361 | 1509 | svm | boxconstraint: 0.10011 | | | | | | | | | | kernelscale: 3.7337 | | 89 | 8 | accept | 0.17907 | 0.93664 | 0.14361 | 378 | nb | distributionnames: normal | | | | | | | | | | width: nan | | 90 | 8 | accept | 0.15925 | 7.5152 | 0.14361 | 1509 | svm | boxconstraint: 1.7191 | | | | | | | | | | kernelscale: 1.7936 | | 91 | 8 | accept | 0.16344 | 30.309 | 0.14361 | 378 | ensemble | method: logitboost | | | | | | | | | | numlearningcycles: 213 | | | | | | | | | | minleafsize: 25 | | 92 | 8 | 
accept | 0.18234 | 0.92613 | 0.14361 | 378 | nb | distributionnames: normal | | | | | | | | | | width: nan | | 93 | 8 | accept | 0.24677 | 1.8464 | 0.14361 | 378 | svm | boxconstraint: 0.019224 | | | | | | | | | | kernelscale: 574.94 | | 94 | 8 | accept | 0.1566 | 31.765 | 0.14361 | 378 | ensemble | method: logitboost | | | | | | | | | | numlearningcycles: 209 | | | | | | | | | | minleafsize: 2 | | 95 | 8 | accept | 0.15317 | 50.355 | 0.14361 | 1509 | ensemble | method: bag | | | | | | | | | | numlearningcycles: 232 | | | | | | | | | | minleafsize: 14 | | 96 | 8 | accept | 0.24677 | 0.43111 | 0.14361 | 378 | tree | minleafsize: 95 | | 97 | 8 | accept | 0.16512 | 35.087 | 0.14361 | 378 | ensemble | method: logitboost | | | | | | | | | | numlearningcycles: 239 | | | | | | | | | | minleafsize: 37 | | 98 | 8 | accept | 0.17052 | 0.45685 | 0.14361 | 378 | tree | minleafsize: 2 | | 99 | 8 | accept | 0.17347 | 1.1812 | 0.14361 | 378 | nb | distributionnames: normal | | | | | | | | | | width: nan | | 100 | 8 | accept | 0.16896 | 54.887 | 0.14361 | 1509 | nb | distributionnames: kernel | | | | | | | | | | width: 12.131 | |====================================================================================================================================================| | iter | active | eval | validation | time for training | observed min | training set | learner | hyperparameter: value | | | workers | result | loss | & validation (sec)| validation loss | size | | | |====================================================================================================================================================| | 101 | 8 | accept | 0.18156 | 0.49723 | 0.14361 | 378 | tree | minleafsize: 59 | | 102 | 8 | accept | 0.24677 | 0.68678 | 0.14361 | 378 | svm | boxconstraint: 0.0010962 | | | | | | | | | | kernelscale: 0.0023771 | | 103 | 8 | accept | 0.24677 | 33.615 | 0.14361 | 378 | ensemble | method: bag | | | | | | | | | | numlearningcycles: 262 | | | | | | | | | | 
minleafsize: 1898 | | 104 | 8 | accept | 0.15192 | 81.549 | 0.14361 | 6033 | ensemble | method: bag | | | | | | | | | | numlearningcycles: 265 | | | | | | | | | | minleafsize: 2 | | 105 | 8 | accept | 0.15412 | 32.695 | 0.14361 | 1509 | ensemble | method: logitboost | | | | | | | | | | numlearningcycles: 209 | | | | | | | | | | minleafsize: 2 | | 106 | 8 | accept | 0.24677 | 2.1129 | 0.14361 | 378 | svm | boxconstraint: 0.51162 | | | | | | | | | | kernelscale: 901.32 | | 107 | 8 | accept | 0.24712 | 96.705 | 0.14361 | 378 | svm | boxconstraint: 0.12031 | | | | | | | | | | kernelscale: 0.03735 | | 108 | 8 | accept | 0.24677 | 0.5869 | 0.14361 | 378 | svm | boxconstraint: 878.36 | | | | | | | | | | kernelscale: 0.0032037 | | 109 | 8 | accept | 0.15335 | 46.629 | 0.14361 | 1509 | ensemble | method: logitboost | | | | | | | | | | numlearningcycles: 213 | | | | | | | | | | minleafsize: 25 | | 110 | 8 | accept | 0.24677 | 42.807 | 0.14361 | 378 | ensemble | method: bag | | | | | | | | | | numlearningcycles: 229 | | | | | | | | | | minleafsize: 133 | | 111 | 8 | accept | 0.18895 | 1.9037 | 0.14361 | 378 | nb | distributionnames: normal | | | | | | | | | | width: nan | | 112 | 8 | accept | 0.16369 | 2.104 | 0.14361 | 1509 | tree | minleafsize: 2 | | 113 | 8 | accept | 0.16353 | 1.0029 | 0.14361 | 378 | tree | minleafsize: 3 | | 114 | 8 | accept | 0.15127 | 93.189 | 0.14361 | 6033 | nb | distributionnames: kernel | | | | | | | | | | width: 0.65733 | | 115 | 8 | accept | 0.16092 | 25.041 | 0.14361 | 378 | nb | distributionnames: kernel | | | | | | | | | | width: 0.59189 | | 116 | 8 | accept | 0.24677 | 0.91976 | 0.14361 | 378 | tree | minleafsize: 123 | | 117 | 8 | accept | 0.16083 | 55.658 | 0.14361 | 378 | ensemble | method: bag | | | | | | | | | | numlearningcycles: 241 | | | | | | | | | | minleafsize: 4 | | 118 | 8 | accept | 0.17643 | 0.91472 | 0.14361 | 378 | nb | distributionnames: normal | | | | | | | | | | width: nan | | 119 | 8 | accept | 0.15053 | 9.4088 | 0.14361 
| 24130 | tree | minleafsize: 13 | | 120 | 8 | accept | 0.18671 | 1.1927 | 0.14361 | 378 | nb | distributionnames: normal | | | | | | | | | | width: nan | |====================================================================================================================================================| | iter | active | eval | validation | time for training | observed min | training set | learner | hyperparameter: value | | | workers | result | loss | & validation (sec)| validation loss | size | | | |====================================================================================================================================================| | 121 | 8 | accept | 0.15504 | 52.066 | 0.14361 | 1509 | ensemble | method: logitboost | | | | | | | | | | numlearningcycles: 239 | | | | | | | | | | minleafsize: 37 | | 122 | 7 | accept | 0.1873 | 0.88051 | 0.14361 | 378 | nb | distributionnames: normal | | | | | | | | | | width: nan | | 123 | 7 | accept | 0.18138 | 1.4255 | 0.14361 | 378 | nb | distributionnames: normal | | | | | | | | | | width: nan | | 124 | 8 | accept | 0.1836 | 1.6767 | 0.14361 | 378 | nb | distributionnames: normal | | | | | | | | | | width: nan | | 125 | 8 | accept | 0.24677 | 0.53547 | 0.14361 | 378 | svm | boxconstraint: 0.015196 | | | | | | | | | | kernelscale: 0.0050979 | | 126 | 8 | accept | 0.16782 | 76.114 | 0.14361 | 378 | ensemble | method: bag | | | | | | | | | | numlearningcycles: 270 | | | | | | | | | | minleafsize: 21 | | 127 | 8 | accept | 0.65702 | 109.5 | 0.14361 | 378 | svm | boxconstraint: 11.102 | | | | | | | | | | kernelscale: 0.02167 | | 128 | 8 | accept | 0.15785 | 0.97717 | 0.14361 | 1509 | tree | minleafsize: 3 | | 129 | 8 | accept | 0.16396 | 43.391 | 0.14361 | 378 | ensemble | method: logitboost | | | | | | | | | | numlearningcycles: 206 | | | | | | | | | | minleafsize: 52 | | 130 | 8 | accept | 0.20726 | 29.078 | 0.14361 | 378 | nb | distributionnames: kernel | | | | | | | | | | width: 71433 | | 131 | 8 | accept | 
0.15259 | 46.724 | 0.14361 | 1509 | nb | distributionnames: kernel | | | | | | | | | | width: 0.59189 | | 132 | 8 | accept | 0.19003 | 42.596 | 0.14361 | 378 | svm | boxconstraint: 6.2931 | | | | | | | | | | kernelscale: 0.33148 | | 133 | 8 | accept | 0.16091 | 28.84 | 0.14361 | 378 | nb | distributionnames: kernel | | | | | | | | | | width: 1.1628 | | 134 | 8 | accept | 0.17784 | 29.872 | 0.14361 | 378 | nb | distributionnames: kernel | | | | | | | | | | width: 886.95 | | 135 | 8 | accept | 0.15027 | 76.937 | 0.14361 | 1509 | ensemble | method: bag | | | | | | | | | | numlearningcycles: 241 | | | | | | | | | | minleafsize: 4 | | 136 | 8 | accept | 0.18327 | 29.845 | 0.14361 | 378 | nb | distributionnames: kernel | | | | | | | | | | width: 4528.8 | | 137 | 8 | accept | 0.24677 | 0.38473 | 0.14361 | 378 | tree | minleafsize: 10969 | | 138 | 8 | accept | 0.15068 | 106.4 | 0.14361 | 6033 | ensemble | method: logitboost | | | | | | | | | | numlearningcycles: 281 | | | | | | | | | | minleafsize: 2 | | 139 | 8 | accept | 0.24677 | 27.445 | 0.14361 | 378 | ensemble | method: bag | | | | | | | | | | numlearningcycles: 223 | | | | | | | | | | minleafsize: 98 | | 140 | 8 | accept | 0.24341 | 100.59 | 0.14361 | 378 | svm | boxconstraint: 2.7209 | | | | | | | | | | kernelscale: 0.010215 | |====================================================================================================================================================| | iter | active | eval | validation | time for training | observed min | training set | learner | hyperparameter: value | | | workers | result | loss | & validation (sec)| validation loss | size | | | |====================================================================================================================================================| | 141 | 8 | accept | 0.24187 | 2.6075 | 0.14361 | 378 | svm | boxconstraint: 0.0079129 | | | | | | | | | | kernelscale: 4.8667 | | 142 | 8 | accept | 0.17089 | 0.53327 | 0.14361 | 378 | tree | 
minleafsize: 3 | | 143 | 8 | accept | 0.15068 | 41.021 | 0.14361 | 1509 | nb | distributionnames: kernel | | | | | | | | | | width: 1.1628 | | 144 | 8 | accept | 0.24677 | 26.637 | 0.14361 | 378 | ensemble | method: logitboost | | | | | | | | | | numlearningcycles: 258 | | | | | | | | | | minleafsize: 6413 | | 145 | 8 | accept | 0.24677 | 26.876 | 0.14361 | 378 | ensemble | method: logitboost | | | | | | | | | | numlearningcycles: 283 | | | | | | | | | | minleafsize: 3673 | | 146 | 8 | accept | 0.16076 | 3.2005 | 0.14361 | 378 | svm | boxconstraint: 4.4667 | | | | | | | | | | kernelscale: 13.015 | | 147 | 8 | accept | 0.1567 | 2.1777 | 0.14361 | 378 | svm | boxconstraint: 0.35692 | | | | | | | | | | kernelscale: 4.2583 | | 148 | 8 | accept | 0.18396 | 0.90132 | 0.14361 | 378 | nb | distributionnames: normal | | | | | | | | | | width: nan | | 149 | 8 | accept | 0.15063 | 6.9084 | 0.14361 | 1509 | svm | boxconstraint: 0.35692 | | | | | | | | | | kernelscale: 4.2583 | | 150 | 8 | accept | 0.17674 | 1.2601 | 0.14361 | 378 | nb | distributionnames: normal | | | | | | | | | | width: nan | | 151 | 8 | accept | 0.156 | 35.953 | 0.14361 | 1509 | ensemble | method: logitboost | | | | | | | | | | numlearningcycles: 206 | | | | | | | | | | minleafsize: 52 | | 152 | 8 | accept | 0.16561 | 27.77 | 0.14361 | 378 | nb | distributionnames: kernel | | | | | | | | | | width: 905.18 | | 153 | 8 | accept | 0.24677 | 34.957 | 0.14361 | 378 | ensemble | method: logitboost | | | | | | | | | | numlearningcycles: 295 | | | | | | | | | | minleafsize: 4587 | | 154 | 8 | accept | 0.17262 | 25.102 | 0.14361 | 378 | nb | distributionnames: kernel | | | | | | | | | | width: 7.0619 | | 155 | 8 | accept | 0.1518 | 94.985 | 0.14361 | 6033 | nb | distributionnames: kernel | | | | | | | | | | width: 0.59189 | | 156 | 8 | accept | 0.15191 | 4.8239 | 0.14361 | 1509 | svm | boxconstraint: 4.4667 | | | | | | | | | | kernelscale: 13.015 | | 157 | 8 | accept | 0.70666 | 88.91 | 0.14361 | 378 | svm | 
boxconstraint: 144.03 | | | | | | | | | | kernelscale: 0.0087964 | | 158 | 8 | accept | 0.20845 | 1.2899 | 0.14361 | 378 | svm | boxconstraint: 14.266 | | | | | | | | | | kernelscale: 0.56134 | | 159 | 8 | accept | 0.16207 | 2.2155 | 0.14361 | 378 | svm | boxconstraint: 0.018808 | | | | | | | | | | kernelscale: 1.4491 | | 160 | 8 | accept | 0.17735 | 0.30364 | 0.14361 | 378 | tree | minleafsize: 16 | |====================================================================================================================================================| | iter | active | eval | validation | time for training | observed min | training set | learner | hyperparameter: value | | | workers | result | loss | & validation (sec)| validation loss | size | | | |====================================================================================================================================================| | 161 | 8 | accept | 0.14814 | 3.7088 | 0.14361 | 1509 | svm | boxconstraint: 0.018808 | | | | | | | | | | kernelscale: 1.4491 | | 162 | 8 | accept | 0.15674 | 54.506 | 0.14361 | 1509 | ensemble | method: bag | | | | | | | | | | numlearningcycles: 270 | | | | | | | | | | minleafsize: 21 | | 163 | 8 | accept | 0.18367 | 22.255 | 0.14361 | 378 | nb | distributionnames: kernel | | | | | | | | | | width: 79.973 | | 164 | 8 | accept | 0.1802 | 0.81271 | 0.14361 | 378 | nb | distributionnames: normal | | | | | | | | | | width: nan | | 165 | 8 | accept | 0.27102 | 87.837 | 0.14361 | 378 | svm | boxconstraint: 2.3737 | | | | | | | | | | kernelscale: 0.0096356 | | 166 | 8 | accept | 0.24677 | 15.367 | 0.14361 | 378 | ensemble | method: logitboost | | | | | | | | | | numlearningcycles: 211 | | | | | | | | | | minleafsize: 8387 | | 167 | 8 | accept | 0.16134 | 17.629 | 0.14361 | 378 | nb | distributionnames: kernel | | | | | | | | | | width: 1.1013 | | 168 | 8 | accept | 0.24677 | 1.5152 | 0.14361 | 378 | svm | boxconstraint: 0.0011611 | | | | | | | | | | kernelscale: 20.262 | | 169 | 8 
| accept | 0.1889 | 0.58768 | 0.14361 | 378 | nb | distributionnames: normal | | | | | | | | | | width: nan | | 170 | 8 | accept | 0.20571 | 21.075 | 0.14361 | 378 | nb | distributionnames: kernel | | | | | | | | | | width: 51434 | | 171 | 8 | accept | 0.17171 | 0.54565 | 0.14361 | 378 | nb | distributionnames: normal | | | | | | | | | | width: nan | | 172 | 8 | accept | 0.16136 | 43.429 | 0.14361 | 378 | ensemble | method: bag | | | | | | | | | | numlearningcycles: 264 | | | | | | | | | | minleafsize: 1 | | 173 | 8 | accept | 0.24677 | 0.74563 | 0.14361 | 378 | svm | boxconstraint: 82.364 | | | | | | | | | | kernelscale: 0.0048723 | | 174 | 8 | accept | 0.17774 | 0.82021 | 0.14361 | 378 | nb | distributionnames: normal | | | | | | | | | | width: ...
__________________________________________________________
optimization completed.
total iterations: 425
total elapsed time: 1225.1049 seconds
total time for training and validation: 8476.6632 seconds

best observed learner is a tree model with:
    learner: tree
    minleafsize: 14
    observed validation loss: 0.14138
    time for training and validation: 1.6545 seconds

documentation for fitcauto display
The total elapsed time value shows that the ASHA optimization took about 0.3 hours to run, which is less time than the Bayesian optimization took.
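As a quick check on the figures in the summary, the following sketch converts the reported elapsed time to hours and compares it with the total time for training and validation. The variable names are hypothetical. Reading the ratio as a rough measure of parallel work is an assumption, based on the active workers column showing up to 8 workers.

```matlab
% Values copied from the ASHA optimization summary.
elapsedseconds  = 1225.1049;   % total elapsed (wall-clock) time
trainingseconds = 8476.6632;   % total time for training and validation

% Convert the elapsed time from seconds to hours.
elapsedhours = elapsedseconds/3600;

% Ratio of summed training time to wall-clock time. A value well above 1
% is consistent with several models being trained at the same time
% (assumption: the training time is summed over the parallel workers).
parallelratio = trainingseconds/elapsedseconds;

fprintf("elapsed time: %.2f hours\n",elapsedhours)
fprintf("training time / elapsed time: %.1f\n",parallelratio)
```

With these inputs, the elapsed time is about 0.34 hours and the ratio is about 6.9.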
The final model returned by fitcauto corresponds to the best observed learner. Before returning the model, the function retrains it using the entire training data set (adultdata), the listed learner (or model) type, and the displayed hyperparameter values.
Evaluate test set performance
Evaluate the performance of the returned bayesianmdl and ashamdl models on the test set adulttest by using confusion matrices and receiver operating characteristic (ROC) curves.
For each model, find the predicted labels and score values for the test set.
[bayesianlabels,bayesianscores] = predict(bayesianmdl,adulttest);
[ashalabels,ashascores] = predict(ashamdl,adulttest);
Create confusion matrices from the test set results. The diagonal elements indicate the number of correctly classified instances of a given class. The off-diagonal elements are instances of misclassified observations. Use a 1-by-2 tiled layout to compare the results.
tiledlayout(1,2)
nexttile
confusionchart(adulttest.salary,bayesianlabels)
title("Bayesian optimization")
nexttile
confusionchart(adulttest.salary,ashalabels)
title("ASHA optimization")
Compute the test set classification accuracy for each model, where the accuracy is the percentage of correctly classified test set observations.
bayesianaccuracy = (1-loss(bayesianmdl,adulttest,"salary"))*100
bayesianaccuracy = 85.2062
ashaaccuracy = (1-loss(ashamdl,adulttest,"salary"))*100
ashaaccuracy = 84.1612
Based on the confusion matrices and the accuracy values, bayesianmdl slightly outperforms ashamdl on the test set. However, both models perform well.
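The relationship between a confusion matrix and the accuracy value can be sketched on a small example. The labels below are hypothetical and are not taken from adulttest. The sketch assumes that accuracy means the plain, unweighted percentage of correct predictions.

```matlab
% Hypothetical true and predicted labels, for illustration only.
truelabels = categorical(["<=50K" "<=50K" "<=50K" ">50K" ">50K" "<=50K" ">50K" "<=50K"])';
predlabels = categorical(["<=50K" "<=50K" ">50K" ">50K" "<=50K" "<=50K" ">50K" "<=50K"])';

% Rows correspond to true classes and columns to predicted classes.
c = confusionmat(truelabels,predlabels);

% Correct predictions lie on the diagonal, so the accuracy is the
% diagonal sum divided by the total number of observations.
accuracy = 100*sum(diag(c))/sum(c,"all");

disp(c)
fprintf("accuracy: %.1f%%\n",accuracy)
```

Here 6 of the 8 hypothetical predictions are correct, so the accuracy is 75%. The two off-diagonal entries count the two misclassified observations.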
For each model, plot the ROC curve and compute the area under the ROC curve (AUC). The ROC curve shows the true positive rate versus the false positive rate for different thresholds of classification scores. For a perfect classifier, whose true positive rate is always 1 regardless of the threshold, AUC = 1. For a binary classifier that randomly assigns observations to classes, AUC = 0.5. A large AUC value (close to 1) indicates good classifier performance.
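To make the definition concrete, the following minimal sketch builds an ROC curve and its AUC by hand from a handful of hypothetical scores. It is not how rocmetrics computes its results internally. It only illustrates the threshold sweep described above, and all variable names and values are made up for the illustration.

```matlab
% Hypothetical positive-class scores and true labels (true = positive class).
scores = [0.9 0.8 0.7 0.6 0.55 0.5 0.4 0.3]';
ispos  = logical([1 1 0 1 1 0 0 0]');

% Lowering the threshold one observation at a time is equivalent to
% walking through the observations in order of descending score.
[~,order] = sort(scores,"descend");
ispos = ispos(order);

% Cumulative true and false positive rates, starting at the point (0,0).
tpr = [0; cumsum(ispos)/sum(ispos)];
fpr = [0; cumsum(~ispos)/sum(~ispos)];

% Area under the piecewise-linear ROC curve by the trapezoidal rule.
auc = trapz(fpr,tpr);

fprintf("AUC: %.3f\n",auc)
```

For these scores the AUC is 0.875. This matches the fraction of positive-negative pairs in which the positive observation receives the higher score (14 of 16).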
For each model, compute the metrics for the ROC curve and find the AUC value by creating a rocmetrics object.
bayesianroc = rocmetrics(adulttest.salary,bayesianscores,bayesianmdl.ClassNames);
asharoc = rocmetrics(adulttest.salary,ashascores,ashamdl.ClassNames);
Plot the ROC curves for the label <=50K by using the plot function of rocmetrics.
figure
[r1,g1] = plot(bayesianroc,"ClassNames","<=50K");
hold on
[r2,g2] = plot(asharoc,"ClassNames","<=50K");
r1.DisplayName = replace(r1.DisplayName,"<=50K","Bayesian optimization");
r2.DisplayName = replace(r2.DisplayName,"<=50K","ASHA optimization");
g1(1).DisplayName = "Bayesian optimization model operating point";
g2(1).DisplayName = "ASHA optimization model operating point";
title("ROC curves for class <=50K")
hold off
Based on the AUC values, both classifiers perform well on the test data.
See also
fitcauto | BayesianOptimization