multiclass model for support vector machines (svms) and other classifiers

description

ClassificationECOC is an error-correcting output codes (ECOC) classifier for multiclass learning, where the classifier consists of multiple binary learners such as support vector machines (SVMs). Trained ClassificationECOC classifiers store training data, parameter values, prior probabilities, and coding matrices. Use these classifiers to perform tasks such as predicting labels or posterior probabilities for new data (see predict).

creation

Create a ClassificationECOC object by using fitcecoc.

If you specify linear or kernel binary learners without specifying cross-validation options, then fitcecoc returns a CompactClassificationECOC object instead.
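The following is a minimal sketch of the difference described above, assuming the Fisher iris data that ships with the toolbox. The variable names mdlsvm and mdllinear are illustrative only.

```matlab
% Sketch: compare the object types that fitcecoc returns.
load fisheriris
x = meas;
y = species;

% SVM learners (the default) give a full model that stores the training data.
mdlsvm = fitcecoc(x,y);

% Linear learners, with no cross-validation options, give a compact model.
mdllinear = fitcecoc(x,y,'Learners',templateLinear());

disp(class(mdlsvm))
disp(class(mdllinear))
```

A compact model does not keep the training data, so resubstitution functions such as resubLoss apply only to the full model.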

properties

after you create a classificationecoc model object, you can use dot notation to access its properties. for an example, see train multiclass model using svm learners.

ecoc properties

BinaryLearners

Trained binary learners, specified as a cell vector of model objects. The number of binary learners depends on the number of classes in Y and the coding design.

The software trains BinaryLearners{j} according to the binary problem specified by CodingMatrix(:,j). For example, for multiclass learning using SVM learners, each element of BinaryLearners is a CompactClassificationSVM classifier.

data types: cell

BinaryLoss

Binary learner loss function, specified as a character vector representing the loss function name.

This table identifies the default BinaryLoss value, which depends on the score ranges returned by the binary learners.

Assumption | Default value
All binary learners are any of the following: classification decision trees, discriminant analysis models, k-nearest neighbor models, linear or kernel classification models of logistic regression learners, or naive Bayes models. | 'quadratic'
All binary learners are SVMs or linear or kernel classification models of SVM learners. | 'hinge'
All binary learners are ensembles trained by AdaBoostM1 or GentleBoost. | 'exponential'
All binary learners are ensembles trained by LogitBoost. | 'binodeviance'
You specify to predict class posterior probabilities by setting 'FitPosterior',true in fitcecoc. | 'quadratic'
Binary learners are heterogeneous and use different loss functions. | 'hamming'

To check the default value, use dot notation to display the BinaryLoss property of the trained model at the command line.

To potentially increase accuracy, specify a binary loss function other than the default during a prediction or loss computation by using the BinaryLoss name-value argument of predict or loss. For more information, see binary loss.
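As a rough illustration of the two steps above, this sketch displays the stored default and then overrides it for one prediction call. It assumes the Fisher iris data and default SVM learners, and the names xnew, labelsdefault, and labelsquadratic are illustrative only.

```matlab
% Sketch: inspect the default binary loss, then override it in predict.
load fisheriris
mdl = fitcecoc(meas,species);      % SVM binary learners by default
disp(mdl.BinaryLoss)               % default loss stored in the model

xnew = meas([1 51 101],:);         % one observation from each species
labelsdefault = predict(mdl,xnew);
labelsquadratic = predict(mdl,xnew,'BinaryLoss','quadratic');
```

Whether a non-default loss improves accuracy depends on the data, so it is worth comparing the alternatives on held-out observations rather than on the training set.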

data types: char

BinaryY

Binary learner class labels, specified as a numeric matrix. BinaryY is a NumObservations-by-L matrix, where L is the number of binary learners (length(mdl.BinaryLearners)).

Elements of BinaryY are –1, 0, or 1, and the value corresponds to a dichotomous class assignment. This table describes how learner j assigns observation k to a dichotomous class corresponding to the value of BinaryY(k,j).

Value | Dichotomous class assignment
–1 | Learner j assigns observation k to a negative class.
0 | Before training, learner j removes observation k from the data set.
1 | Learner j assigns observation k to a positive class.

data types: double

BinEdges

This property is read-only.

bin edges for numeric predictors, specified as a cell array of p numeric vectors, where p is the number of predictors. each vector includes the bin edges for a numeric predictor. the element in the cell array for a categorical predictor is empty because the software does not bin categorical predictors.

the software bins numeric predictors only if you specify the 'numbins' name-value argument as a positive integer scalar when training a model with tree learners. the binedges property is empty if the 'numbins' value is empty (default).

you can reproduce the binned predictor data xbinned by using the binedges property of the trained model mdl.

x = mdl.X; % predictor data
xbinned = zeros(size(x));
edges = mdl.BinEdges;
% Find indices of binned predictors.
idxnumeric = find(~cellfun(@isempty,edges));
if iscolumn(idxnumeric)
    idxnumeric = idxnumeric';
end
for j = idxnumeric
    xj = x(:,j); % one predictor column
    % Convert xj to an array if x is a table.
    if istable(xj)
        xj = table2array(xj);
    end
    % Group xj into bins by using the discretize function.
    xjbinned = discretize(xj,[-inf; edges{j}; inf]);
    xbinned(:,j) = xjbinned;
end

xbinned contains the bin indices, ranging from 1 to the number of bins, for numeric predictors. xbinned values are 0 for categorical predictors. If x contains NaNs, then the corresponding xbinned values are NaNs.

data types: cell

CodingMatrix

Class assignment codes for the binary learners, specified as a numeric matrix. CodingMatrix is a K-by-L matrix, where K is the number of classes and L is the number of binary learners.

The elements of CodingMatrix are –1, 0, and 1, and the values correspond to dichotomous class assignments. This table describes how learner j assigns observations in class i to a dichotomous class corresponding to the value of CodingMatrix(i,j).

Value | Dichotomous class assignment
–1 | Learner j assigns observations in class i to a negative class.
0 | Before training, learner j removes observations in class i from the data set.
1 | Learner j assigns observations in class i to a positive class.
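To make the coding scheme concrete, here is a small sketch in base MATLAB that builds a one-versus-one coding matrix and derives the per-observation binary labels from it. The pair ordering and the class indices in classidx are assumptions made for illustration; the ordering was chosen so that the three-class result matches the matrix shown in the examples on this page.

```matlab
% Sketch: build a one-versus-one coding matrix for k classes.
k = 3;
pairs = nchoosek(1:k,2);            % each row is one pair of classes
l = size(pairs,1);                  % number of binary learners, k*(k-1)/2
codingmat = zeros(k,l);
for j = 1:l
    codingmat(pairs(j,1),j) = 1;    % positive class for learner j
    codingmat(pairs(j,2),j) = -1;   % negative class for learner j
end
disp(codingmat)

% Hypothetical class indices for five observations (values in 1..k).
classidx = [1; 3; 2; 2; 1];
% Each row of binaryy is the coding-matrix row of that observation's class.
binaryy = codingmat(classidx,:);
disp(binaryy)
```

A zero in binaryy(k,j) means that learner j does not see observation k, which mirrors the 0 row of the table above.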

data types: double | single | int8 | int16 | int32 | int64

CodingName

Coding design name, specified as a character vector. For more details, see coding design.

data types: char

LearnerWeights

Binary learner weights, specified as a numeric row vector. The length of LearnerWeights is equal to the number of binary learners (length(mdl.BinaryLearners)).

LearnerWeights(j) is the sum of the observation weights that binary learner j uses to train its classifier.

the software uses learnerweights to fit posterior probabilities by minimizing the kullback-leibler divergence. the software ignores learnerweights when it uses the quadratic programming method of estimating posterior probabilities.
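The definition of the learner weights can be sketched in base MATLAB as follows. This is an illustration under stated assumptions, not the toolbox implementation: the coding matrix is a three-class one-versus-one design, classidx holds hypothetical class indices, and the observation weights are uniform and normalized to sum to 1.

```matlab
% Sketch: learner weights as sums of the observation weights each learner uses.
codingmat = [1 1 0; -1 0 1; 0 -1 -1];  % one-versus-one design, 3 classes
classidx = [1; 1; 2; 3; 3; 3];         % hypothetical class index per observation
w = ones(numel(classidx),1);
w = w/sum(w);                          % normalize the weights to sum to 1

% used(k,j) is true when learner j keeps observation k (nonzero code).
used = codingmat(classidx,:) ~= 0;
learnerweights = w'*double(used);      % 1-by-l row vector
disp(learnerweights)
```

In this toy data set, the second learner trains on classes 1 and 3, which together hold five of the six observations, so it receives the largest weight.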

data types: double | single

other classification properties

CategoricalPredictors

Categorical predictor indices, specified as a vector of positive integers. CategoricalPredictors contains index values indicating that the corresponding predictors are categorical. The index values are between 1 and p, where p is the number of predictors used to train the model. If none of the predictors are categorical, then this property is empty ([]).

data types: single | double

ClassNames

Unique class labels used in training, specified as a categorical or character array, logical or numeric vector, or cell array of character vectors. ClassNames has the same data type as the class labels Y. (The software treats string arrays as cell arrays of character vectors.) ClassNames also determines the class order.

data types: categorical | char | logical | single | double | cell

Cost

This property is read-only.

misclassification costs, specified as a square numeric matrix. cost has k rows and columns, where k is the number of classes.

cost(i,j) is the cost of classifying a point into class j if its true class is i. the order of the rows and columns of cost corresponds to the order of the classes in classnames.

data types: double

ExpandedPredictorNames

Expanded predictor names, specified as a cell array of character vectors.

If the model uses encoding for categorical variables, then ExpandedPredictorNames includes the names that describe the expanded variables. Otherwise, ExpandedPredictorNames is the same as PredictorNames.

data types: cell

ModelParameters

Parameter values, such as the name-value pair argument values, used to train the ECOC classifier, specified as an object. ModelParameters does not contain estimated parameters.

Access properties of ModelParameters using dot notation. For example, list the templates containing parameters of the binary learners by using mdl.ModelParameters.BinaryLearner.

NumObservations

Number of observations in the training data, specified as a positive numeric scalar.

data types: double

PredictorNames

Predictor names in order of their appearance in the predictor data X, specified as a cell array of character vectors. The length of PredictorNames is equal to the number of columns in X.

data types: cell

Prior

This property is read-only.

prior class probabilities, specified as a numeric vector. prior has as many elements as the number of classes in classnames, and the order of the elements corresponds to the order of the classes in classnames.

fitcecoc incorporates misclassification costs differently among different types of binary learners.

data types: double

ResponseName

Response variable name, specified as a character vector.

data types: char

RowsUsed

Rows of the original training data used in fitting the ClassificationECOC model, specified as a logical vector. This property is empty if all rows are used.

data types: logical

ScoreTransform

This property is read-only.

score transformation function to apply to the predicted scores, specified as 'none'. an ecoc model does not support score transformation.

W

Observation weights used to train the ECOC classifier, specified as a numeric vector. W has NumObservations elements.

The software normalizes the weights used for training so that sum(W,'omitnan') is 1.

data types: single | double

X

Unstandardized predictor data used to train the ECOC classifier, specified as a numeric matrix or table.

each row of x corresponds to one observation, and each column corresponds to one variable.

data types: single | double | table

Y

Observed class labels used to train the ECOC classifier, specified as a categorical or character array, logical or numeric vector, or cell array of character vectors. Y has NumObservations elements and has the same data type as the input argument Y of fitcecoc. (The software treats string arrays as cell arrays of character vectors.)

each row of y represents the observed classification of the corresponding row of x.

data types: categorical | char | logical | single | double | cell

hyperparameter optimization properties

HyperparameterOptimizationResults

This property is read-only.

Cross-validation optimization of hyperparameters, specified as a BayesianOptimization object or a table of hyperparameters and associated values. This property is nonempty if the 'OptimizeHyperparameters' name-value pair argument is nonempty when you create the model. The value of HyperparameterOptimizationResults depends on the setting of the Optimizer field in the HyperparameterOptimizationOptions structure when you create the model.

Value of Optimizer field | Value of HyperparameterOptimizationResults
'bayesopt' (default) | Object of class BayesianOptimization
'gridsearch' or 'randomsearch' | Table of hyperparameters used, observed objective function values (cross-validation loss), and rank of observations from lowest (best) to highest (worst)
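The dependence on the Optimizer field can be checked with a short sketch like the one below. It assumes the Fisher iris data and uses a small evaluation budget only to keep the run short; the variable names opts and results are illustrative.

```matlab
% Sketch: a random search stores its results as a table.
load fisheriris
rng(1) % for reproducibility
opts = struct('Optimizer','randomsearch','MaxObjectiveEvaluations',5, ...
    'ShowPlots',false,'Verbose',0);
mdl = fitcecoc(meas,species,'OptimizeHyperparameters','auto', ...
    'HyperparameterOptimizationOptions',opts);
results = mdl.HyperparameterOptimizationResults;
disp(istable(results))
```

With the default 'bayesopt' optimizer, the same property would instead hold a BayesianOptimization object, so code that reads the results should check the type first.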

object functions

compact | Reduce size of multiclass error-correcting output codes (ECOC) model
compareHoldout | Compare accuracies of two classification models using new data
crossval | Cross-validate multiclass error-correcting output codes (ECOC) model
discardSupportVectors | Discard support vectors of linear SVM binary learners in ECOC model
edge | Classification edge for multiclass error-correcting output codes (ECOC) model
gather | Gather properties of Statistics and Machine Learning Toolbox object from GPU
incrementalLearner | Convert multiclass error-correcting output codes (ECOC) model to incremental learner
loss | Classification loss for multiclass error-correcting output codes (ECOC) model
margin | Classification margins for multiclass error-correcting output codes (ECOC) model
partialDependence | Compute partial dependence
plotPartialDependence | Create partial dependence plot (PDP) and individual conditional expectation (ICE) plots
predict | Classify observations using multiclass error-correcting output codes (ECOC) model
resubEdge | Resubstitution classification edge for multiclass error-correcting output codes (ECOC) model
lime | Local interpretable model-agnostic explanations (LIME)
resubLoss | Resubstitution classification loss for multiclass error-correcting output codes (ECOC) model
resubMargin | Resubstitution classification margins for multiclass error-correcting output codes (ECOC) model
resubPredict | Classify observations in multiclass error-correcting output codes (ECOC) model
shapley | Shapley values
testckfold | Compare accuracies of two classification models by repeated cross-validation

examples

train a multiclass error-correcting output codes (ecoc) model using support vector machine (svm) binary learners.

load fisher's iris data set. specify the predictor data x and the response data y.

load fisheriris
x = meas;
y = species;

train a multiclass ecoc model using the default options.

mdl = fitcecoc(x,y)
mdl = 
  classificationecoc
             responsename: 'y'
    categoricalpredictors: []
               classnames: {'setosa'  'versicolor'  'virginica'}
           scoretransform: 'none'
           binarylearners: {3x1 cell}
               codingname: 'onevsone'
  properties, methods

mdl is a classificationecoc model. by default, fitcecoc uses svm binary learners and a one-versus-one coding design. you can access mdl properties using dot notation.

display the class names and the coding design matrix.

mdl.ClassNames
ans = 3x1 cell
    {'setosa'    }
    {'versicolor'}
    {'virginica' }
codingmat = mdl.CodingMatrix
codingmat = 3×3
     1     1     0
    -1     0     1
     0    -1    -1

a one-versus-one coding design for three classes yields three binary learners. the columns of codingmat correspond to the learners, and the rows correspond to the classes. the class order is the same as the order in mdl.classnames. for example, codingmat(:,1) is [1; –1; 0] and indicates that the software trains the first svm binary learner using all observations classified as 'setosa' and 'versicolor'. because 'setosa' corresponds to 1, it is the positive class; 'versicolor' corresponds to –1, so it is the negative class.

you can access each binary learner using cell indexing and dot notation.

mdl.BinaryLearners{1}   % the first binary learner
ans = 
  compactclassificationsvm
             responsename: 'y'
    categoricalpredictors: []
               classnames: [-1 1]
           scoretransform: 'none'
                     beta: [4x1 double]
                     bias: 1.4505
         kernelparameters: [1x1 struct]
  properties, methods

compute the resubstitution classification error.

error = resubLoss(mdl)
error = 0.0067

the classification error on the training data is small, but the classifier might be an overfitted model. you can cross-validate the classifier using crossval and compute the cross-validation classification error instead.

train an ecoc classifier using svm binary learners. then, access properties of the binary learners, such as estimated parameters, by using dot notation.

load fisher's iris data set. specify the petal dimensions as the predictors and the species names as the response.

load fisheriris
x = meas(:,3:4);
y = species;

train an ecoc classifier using svm binary learners and the default coding design (one-versus-one). standardize the predictors and save the support vectors.

t = templateSVM('Standardize',true,'SaveSupportVectors',true);
predictornames = {'petallength','petalwidth'};
responsename = 'irisspecies';
classnames = {'setosa','versicolor','virginica'}; % specify class order
mdl = fitcecoc(x,y,'learners',t,'responsename',responsename,...
    'predictornames',predictornames,'classnames',classnames)
mdl = 
  classificationecoc
           predictornames: {'petallength'  'petalwidth'}
             responsename: 'irisspecies'
    categoricalpredictors: []
               classnames: {'setosa'  'versicolor'  'virginica'}
           scoretransform: 'none'
           binarylearners: {3x1 cell}
               codingname: 'onevsone'
  properties, methods

t is a template object that contains options for svm classification. the function fitcecoc uses default values for the empty ([]) properties. mdl is a classificationecoc classifier. you can access properties of mdl using dot notation.

display the class names and the coding design matrix.

mdl.ClassNames
ans = 3x1 cell
    {'setosa'    }
    {'versicolor'}
    {'virginica' }
mdl.CodingMatrix
ans = 3×3
     1     1     0
    -1     0     1
     0    -1    -1

the columns correspond to svm binary learners, and the rows correspond to the distinct classes. the row order is the same as the order in the classnames property of mdl. for each column:

  • 1 indicates that fitcecoc trains the svm using observations in the corresponding class as members of the positive group.

  • –1 indicates that fitcecoc trains the svm using observations in the corresponding class as members of the negative group.

  • 0 indicates that the svm does not use observations in the corresponding class.

in the first svm, for example, fitcecoc assigns all observations to 'setosa' or 'versicolor', but not 'virginica'.

access properties of the svms using cell subscripting and dot notation. store the standardized support vectors of each svm. unstandardize the support vectors.

l = size(mdl.CodingMatrix,2); % number of SVMs
sv = cell(l,1); % preallocate for the support vectors
for j = 1:l
    svm = mdl.BinaryLearners{j};
    sv{j} = svm.SupportVectors;         % standardized support vectors
    sv{j} = sv{j}.*svm.Sigma + svm.Mu;  % undo the standardization
end

sv is a cell array of matrices containing the unstandardized support vectors for the svms.

plot the data, and identify the support vectors.

figure
gscatter(x(:,1),x(:,2),y);
hold on
markers = {'ko','ro','bo'}; % should be of length l
for j = 1:l
    svs = sv{j};
    plot(svs(:,1),svs(:,2),markers{j},...
        'markersize',10 + (j - 1)*3);
end
title('fisher''s iris -- ecoc support vectors')
xlabel(predictornames{1})
ylabel(predictornames{2})
legend([classnames,{'support vectors - svm 1',...
    'support vectors - svm 2','support vectors - svm 3'}],...
    'location','best')
hold off

[Figure: scatter plot titled "fisher's iris -- ecoc support vectors", with petallength on the x-axis and petalwidth on the y-axis. The legend lists setosa, versicolor, virginica, and the support vectors of SVM 1, SVM 2, and SVM 3.]

you can pass mdl to these functions:

  • predict, to classify new observations

  • resubloss, to estimate the classification error on the training data

  • crossval, to perform 10-fold cross-validation

cross-validate an ecoc classifier with svm binary learners, and estimate the generalized classification error.

load fisher's iris data set. specify the predictor data x and the response data y.

load fisheriris
x = meas;
y = species;
rng(1); % for reproducibility

create an svm template, and standardize the predictors.

t = templateSVM('Standardize',true)
t = 
fit template for classification svm.
                     alpha: [0x1 double]
             boxconstraint: []
                 cachesize: []
             cachingmethod: ''
                clipalphas: []
    deltagradienttolerance: []
                   epsilon: []
              gaptolerance: []
              kkttolerance: []
            iterationlimit: []
            kernelfunction: ''
               kernelscale: []
              kerneloffset: []
     kernelpolynomialorder: []
                  numprint: []
                        nu: []
           outlierfraction: []
          removeduplicates: []
           shrinkageperiod: []
                    solver: ''
           standardizedata: 1
        savesupportvectors: []
            verbositylevel: []
                   version: 2
                    method: 'svm'
                      type: 'classification'

t is an svm template. most of the template object properties are empty. when training the ecoc classifier, the software sets the applicable properties to their default values.

train the ecoc classifier, and specify the class order.

mdl = fitcecoc(x,y,'learners',t,...
    'classnames',{'setosa','versicolor','virginica'});

mdl is a classificationecoc classifier. you can access its properties using dot notation.

cross-validate mdl using 10-fold cross-validation.

cvmdl = crossval(mdl);

cvmdl is a classificationpartitionedecoc cross-validated ecoc classifier.

estimate the generalized classification error.

generror = kfoldLoss(cvmdl)
generror = 0.0400

the generalized classification error is 4%, which indicates that the ecoc classifier generalizes fairly well.
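A single error rate hides which classes are confused with each other. As a possible follow-up, this sketch collects the out-of-fold predictions and tabulates them in a confusion matrix. It assumes the Fisher iris data, and the choice of five folds and the names predicted, c, and errorrate are illustrative.

```matlab
% Sketch: per-class view of the cross-validated predictions.
load fisheriris
rng(1) % for reproducibility
mdl = fitcecoc(meas,species,'Learners',templateSVM('Standardize',true));
cvmdl = crossval(mdl,'KFold',5);

predicted = kfoldPredict(cvmdl);       % out-of-fold predicted labels
c = confusionmat(species,predicted);   % rows: true class, columns: predicted
errorrate = 1 - sum(diag(c))/sum(c(:));
disp(c)
disp(errorrate)
```

For equal observation weights, this overall rate is expected to be close to the value that kfoldLoss reports, although the two are computed differently (kfoldLoss averages over folds by default).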

alternative functionality

you can use these alternative algorithms to train a multiclass model:

version history

introduced in r2014b
