categoricalpredictors
|
categorical predictor
indices, specified as a vector of positive integers. categoricalpredictors
contains index values indicating that the corresponding predictors are categorical. the index
values are between 1 and p , where p is the number of
predictors used to train the model. if none of the predictors are categorical, then this
property is empty ([] ).
|
categoricalsplit
| an n-by-2 cell array, where n is the number of
categorical splits in tree . each row in
categoricalsplit gives left and right values for a categorical
split. for each branch node with categorical split j based on a
categorical predictor variable z , the left child is chosen if
z is in categoricalsplit(j,1) and the right
child is chosen if z is in categoricalsplit(j,2) .
the splits are in the same order as nodes of the tree. nodes for these splits can be
found by running cuttype and selecting
'categorical' cuts from top to bottom. |
children
|
an n-by-2 array containing the numbers of the child
nodes for each node in tree , where n
is the number of nodes. leaf nodes have child node
0 .
|
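As a rough illustration of how the child numbers can be read, the sketch below fits a small tree on the `fisheriris` sample data and derives a leaf mask from `Children`. The data set and variable names are assumptions made only for this example (it needs Statistics and Machine Learning Toolbox), and property names are written with their MATLAB capitalization.

```matlab
% Illustrative setup: any trained compact classification tree would do.
load fisheriris
tree = compact(fitctree(meas, species));

kids = tree.Children;            % n-by-2; row i holds [left right] child numbers
isLeaf = all(kids == 0, 2);      % leaf nodes have 0 in both columns
leafNodes = find(isLeaf);        % node numbers of the leaves

% The same mask should be recoverable from IsBranchNode.
sameMask = isequal(isLeaf(:), ~tree.IsBranchNode(:));
fprintf('%d of %d nodes are leaves (agrees with IsBranchNode: %d)\n', ...
    numel(leafNodes), tree.NumNodes, sameMask);
```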
classcount
|
an n-by-k array of class counts for
the nodes in tree , where n is the
number of nodes and k is the number of classes. for any
node number i , the class counts
classcount(i,:) are counts of observations (from the
data used in fitting the tree) from each class satisfying the conditions for
node i .
|
classnames
|
list of the elements in y with duplicates removed.
classnames can be a numeric vector, vector of
categorical variables, logical vector, character array, or cell array of
character vectors. classnames has the same data type as
the data in the argument y . (the software treats string arrays as cell arrays of character
vectors.)
if the value of a property has at least one dimension of length
k, then classnames indicates the
order of the elements along that dimension (e.g., cost
and prior ).
|
classprobability
|
an n-by-k array of class
probabilities for the nodes in tree , where
n is the number of nodes and k is
the number of classes. for any node number i , the class
probabilities classprobability(i,:) are the estimated
probabilities for each class for a point satisfying the conditions for node
i .
|
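The following sketch compares the per-node class counts with the per-node class probabilities. It assumes a tree fitted on the `fisheriris` sample data with default priors and no observation weights; under those assumptions the probabilities would be expected to match the raw class fractions, but the code only reports the gap rather than relying on it.

```matlab
load fisheriris
tree = compact(fitctree(meas, species));

counts = tree.ClassCount;                 % n-by-k observation counts per node
fractions = counts ./ sum(counts, 2);     % per-node class fractions
probs = tree.ClassProbability;            % n-by-k estimated class probabilities

rowSums = sum(probs, 2);                  % each row is a probability distribution
maxDiff = max(abs(probs(:) - fractions(:)));

disp(tree.ClassNames')                    % column order of both arrays
fprintf('row sums within [%g, %g]\n', min(rowSums), max(rowSums));
fprintf('largest |probability - fraction| = %g\n', maxDiff);
```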
cost
|
square matrix, where cost(i,j) is the cost of classifying a point into
class j if its true class is i (the rows
correspond to the true class and the columns correspond to the predicted class). the
order of the rows and columns of cost corresponds to the order of the
classes in classnames . the number of rows and columns in
cost is the number of unique classes in the response. this
property is read-only.
|
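To make the row and column convention concrete, here is a minimal sketch that recomputes predicted labels from the posterior probabilities and the cost matrix by choosing the class with the lowest expected cost. The `fisheriris` data and the three chosen rows are assumptions for illustration only.

```matlab
load fisheriris
tree = compact(fitctree(meas, species));

X = meas([1 51 101], :);                  % three arbitrary observations
[label, posterior] = predict(tree, X);    % posterior columns follow ClassNames

% Expected cost of predicting class j is the sum over true classes i of
% posterior(i) * Cost(i,j), which is the j-th entry of posterior * Cost.
expectedCost = posterior * tree.Cost;
[~, best] = min(expectedCost, [], 2);
manualLabel = tree.ClassNames(best);

disp([label manualLabel])                 % predicted labels next to the manual ones
```

Because rows of `Cost` index the true class, the product has the posterior on the left; swapping the order would mix up true and predicted classes.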
cutcategories
|
an n-by-2 cell array of the categories used at branches
in tree , where n is the number of
nodes. for each branch node i based on a categorical
predictor variable x , the left child is chosen if
x is among the categories listed in
cutcategories{i,1} , and the right child is chosen if
x is among those listed in
cutcategories{i,2} . both columns of
cutcategories are empty for branch nodes based on
continuous predictors and for leaf nodes.
cutpoint contains the cut points for
'continuous' cuts, and
cutcategories contains the set of categories.
|
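A hedged sketch of listing the category sets at categorical branch nodes follows. It assumes the `carsmall` sample data with `Origin` converted to a categorical predictor; whether the fitted tree actually contains a categorical split depends on the data, so the code handles the case where none is found.

```matlab
load carsmall
tbl = table(categorical(cellstr(Origin)), Weight, Cylinders, ...
    'VariableNames', {'Origin', 'Weight', 'Cylinders'});
tree = compact(fitctree(tbl, 'Cylinders'));

catNodes = find(strcmp(tree.CutType, 'categorical'));
if isempty(catNodes)
    disp('This particular tree has no categorical splits.');
end
for i = catNodes(:)'
    fprintf('node %d splits on %s\n', i, tree.CutPredictor{i});
    disp(tree.CutCategories{i, 1});   % categories routed to the left child
    disp(tree.CutCategories{i, 2});   % categories routed to the right child
end
```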
cutpoint
|
an n-element vector of the values used as cut points in
tree , where n is the number of
nodes. for each branch node i based on a continuous
predictor variable x , the left child is chosen if
x<cutpoint(i) and the right child is chosen if
x>=cutpoint(i) . cutpoint is
nan for branch nodes based on categorical predictors
and for leaf nodes.
|
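The left/right rule can be sketched as a manual walk from the root to a leaf. This example assumes a tree fitted on `fisheriris`, where every predictor is continuous and the observation has no missing values, so only the cut points are needed; the final comparison with `predict` is a sanity check, not part of the rule itself.

```matlab
load fisheriris
tree = compact(fitctree(meas, species));

x = meas(1, :);                            % one observation to route
node = 1;                                  % start at the root
while tree.IsBranchNode(node)
    p = tree.CutPredictorIndex(node);      % column of x used at this node
    if x(p) < tree.CutPoint(node)
        node = tree.Children(node, 1);     % left child
    else
        node = tree.Children(node, 2);     % right child
    end
end
manualLabel = tree.NodeClass{node};        % most probable class at the leaf

predicted = predict(tree, x);
agree = strcmp(manualLabel, predicted{1});
fprintf('reached leaf %d, label %s (matches predict: %d)\n', node, manualLabel, agree);
```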
cuttype
|
an n-element cell array indicating the type of cut at
each node in tree , where n is the
number of nodes. for each node i ,
cuttype{i} is:
'continuous' — if the cut is defined in
the form x < v for a variable
x and cut point v .
'categorical' — if the cut is defined by
whether a variable x takes a value in a set of
categories.
'' — if i is a leaf
node.
cutpoint contains the cut points for
'continuous' cuts, and
cutcategories contains the set of categories.
|
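One way to use the cut types is to tally them and cross-check against the cut points. The sketch below assumes a tree fitted on `fisheriris` (so no categorical cuts are expected) and treats an empty entry as a leaf.

```matlab
load fisheriris
tree = compact(fitctree(meas, species));

types = tree.CutType;
isContinuous = strcmp(types, 'continuous');
isCategorical = strcmp(types, 'categorical');
isLeaf = cellfun(@isempty, types);         % '' marks a leaf node

fprintf('continuous: %d, categorical: %d, leaf: %d\n', ...
    nnz(isContinuous), nnz(isCategorical), nnz(isLeaf));

% Cut points are NaN wherever the cut is not continuous.
nanElsewhere = all(isnan(tree.CutPoint(~isContinuous)));
```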
cutpredictor
|
an n-element cell array of the names of the variables
used for branching in each node in tree , where
n is the number of nodes. these variables are
sometimes known as cut variables. for leaf nodes,
cutpredictor contains an empty character
vector.
|
cutpredictorindex
|
an n-element array of numeric indices for the variables
used for branching in each node in tree , where
n is the number of nodes. for more information, see
cutpredictor .
|
expandedpredictornames
|
expanded predictor names, stored as a cell array of character
vectors.
if the model uses encoding for categorical variables, then
expandedpredictornames includes the names that
describe the expanded variables. otherwise,
expandedpredictornames is the same as
predictornames .
|
isbranchnode
|
an n-element logical vector that is
true for each branch node and
false for each leaf node of
tree .
|
nodeclass
|
an n-element cell array with the names of the most
probable classes in each node of tree , where
n is the number of nodes in the tree. every element
of this array is a character vector equal to one of the class names in
classnames .
|
nodeerror
|
an n-element vector of the errors of the nodes in
tree , where n is the number of
nodes. nodeerror(i) is the misclassification probability
for node i .
|
nodeprobability
|
an n-element vector of the probabilities of the nodes
in tree , where n is the number of
nodes. the probability of a node is computed as the proportion of
observations from the original data that satisfy the conditions for the
node. this proportion is adjusted for any prior probabilities assigned to
each class.
|
noderisk
|
an n-element vector of the risk of the nodes in the
tree, where n is the number of nodes. the risk for each
node is the measure of impurity (gini index or deviance) for this node
weighted by the node probability. if the tree is grown by twoing, the risk
for each node is zero.
|
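The description above suggests that, for a tree grown with the Gini index, the node risk could be reproduced from the class probabilities and the node probabilities. The sketch below tries this on a `fisheriris` tree (an assumption for illustration, relying on the default Gini split criterion) and only reports the discrepancy, since the exact internal computation is not confirmed here.

```matlab
load fisheriris
tree = compact(fitctree(meas, species));   % default split criterion is 'gdi'

p = tree.ClassProbability;                 % n-by-k class probabilities
gini = 1 - sum(p.^2, 2);                   % Gini impurity per node
manualRisk = gini(:) .* tree.NodeProbability(:);

gap = max(abs(manualRisk - tree.NodeRisk(:)));
fprintf('largest |manual risk - NodeRisk| = %g\n', gap);
```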
nodesize
|
an n-element vector of the sizes of the nodes in
tree , where n is the number of
nodes. the size of a node is defined as the number of observations from the
data used to create the tree that satisfy the conditions for the
node.
|
numnodes
|
the number of nodes in tree .
|
parent
|
an n-element vector containing the number of the parent
node for each node in tree , where n
is the number of nodes. the parent of the root node is
0 .
|
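As an illustration, the parent numbers can be followed upward to compute the depth of every node. The example assumes a tree fitted on `fisheriris`; the helper variable names are made up for this sketch.

```matlab
load fisheriris
tree = compact(fitctree(meas, species));

n = tree.NumNodes;
depth = zeros(n, 1);
for i = 1:n
    k = i;
    while tree.Parent(k) ~= 0              % 0 marks the root's parent
        k = tree.Parent(k);
        depth(i) = depth(i) + 1;
    end
end

% Round trip: the parent of each left child is the branch node itself.
branch = find(tree.IsBranchNode);
leftKids = tree.Children(branch, 1);
parentsOfLeft = tree.Parent(leftKids);
roundTrip = isequal(parentsOfLeft(:), branch(:));
fprintf('maximum depth: %d (round trip ok: %d)\n', max(depth), roundTrip);
```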
predictornames
|
a cell array of names for the predictor variables, in the order in which
they appear in x .
|
prior
|
numeric vector of prior probabilities for each class. the order
of the elements of prior corresponds to the order
of the classes in classnames .
the number of elements of prior is the number of
unique classes in the response. this property is read-only.
|
prunealpha
|
numeric vector with one element per pruning level. if the pruning level
ranges from 0 to m, then prunealpha
has m + 1 elements sorted in ascending order.
prunealpha(1) is for pruning level 0 (no pruning),
prunealpha(2) is for pruning level 1, and so
on.
|
prunelist
|
an n-element numeric vector with the pruning levels in
each node of tree , where n is the
number of nodes. the pruning levels range from 0 (no pruning) to
m, where m is the distance between
the deepest leaf and the root node.
|
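The pruning levels can be explored without modifying the tree by passing them to `predict`. This is a sketch under the assumption that the tree was fitted on `fisheriris` with the default pruning sequence computed; the `'Subtrees'` option selects which pruning levels to evaluate.

```matlab
load fisheriris
tree = compact(fitctree(meas, species));

numLevels = numel(tree.PruneAlpha);        % one alpha per pruning level
maxLevel = max(tree.PruneList);            % deepest pruning level stored per node
fprintf('%d alpha values, largest pruning level %d\n', numLevels, maxLevel);

% Labels from the unpruned tree (level 0) and from the most pruned subtree.
levels = unique([0 maxLevel]);
labels = predict(tree, meas(1:5, :), 'Subtrees', levels);
disp(labels)                               % one column per requested level
```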
responsename
|
character vector describing the response variable
y .
|
scoretransform
|
function handle for transforming scores, or character vector representing
a built-in transformation function. 'none' means no
transformation, which is equivalent to the identity function
@(x)x . for a list of built-in transformation
functions and the syntax of custom transformation functions, see
fitctree .
add or change a scoretransform function using dot
notation:
ctree.scoretransform = 'function'
or
ctree.scoretransform = @function
|
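For instance, switching to the built-in `'logit'` transformation should turn each raw score `x` into `1/(1+exp(-x))`. The sketch below checks this on a few rows of `fisheriris`; the data and variable names are assumptions for illustration.

```matlab
load fisheriris
tree = compact(fitctree(meas, species));

X = meas(1:3, :);
[~, rawScore] = predict(tree, X);          % default 'none': untransformed scores

tree.ScoreTransform = 'logit';             % change the property with dot notation
[~, logitScore] = predict(tree, X);

expected = 1 ./ (1 + exp(-rawScore));
gap = max(abs(logitScore(:) - expected(:)));
fprintf('largest deviation from 1/(1+exp(-x)): %g\n', gap);
```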
surrogatecutcategories
|
an n-element cell array of the categories used for
surrogate splits in tree , where n is
the number of nodes in tree . for each node
k , surrogatecutcategories{k} is a
cell array. the length of surrogatecutcategories{k} is
equal to the number of surrogate predictors found at this node. every
element of surrogatecutcategories{k} is either an empty
character vector for a continuous surrogate predictor, or is a two-element
cell array with categories for a categorical surrogate predictor. the first
element of this two-element cell array lists categories assigned to the left
child by this surrogate split and the second element of this two-element
cell array lists categories assigned to the right child by this surrogate
split. the order of the surrogate split variables at each node is matched to
the order of variables in surrogatecutpredictor . the
optimal-split variable at this node does not appear. for nonbranch (leaf)
nodes, surrogatecutcategories contains an empty
cell.
|
surrogatecutflip
|
an n-element cell array of the numeric cut assignments
used for surrogate splits in tree , where
n is the number of nodes in
tree . for each node k ,
surrogatecutflip{k} is a numeric vector. the
length of surrogatecutflip{k} is equal to the number of
surrogate predictors found at this node. every element of
surrogatecutflip{k} is either zero for a categorical
surrogate predictor, or a numeric cut assignment for a continuous surrogate
predictor. the numeric cut assignment can be either –1 or 1. for every
surrogate split with a numeric cut c based on a
continuous predictor variable z, the left child is chosen
if z<c and the cut assignment for this surrogate split is 1, or
if z≥c and the cut assignment for this surrogate split is –1.
similarly, the right child is chosen if z≥c and the cut assignment for this surrogate split is 1, or
if z<c and the cut assignment for this surrogate split is –1. the
order of the surrogate split variables at each node is matched to the order
of variables in surrogatecutpredictor . the optimal-split
variable at this node does not appear. for nonbranch (leaf) nodes,
surrogatecutflip contains an empty array.
|
surrogatecutpoint
|
an n-element cell array of the numeric values used for
surrogate splits in tree , where n is
the number of nodes in tree . for each node
k , surrogatecutpoint{k} is a
numeric vector. the length of surrogatecutpoint{k} is
equal to the number of surrogate predictors found at this node. every
element of surrogatecutpoint{k} is either
nan for a categorical surrogate predictor, or a
numeric cut for a continuous surrogate predictor. for every surrogate split
with a numeric cut c based on a continuous predictor
variable z, the left child is chosen if z<c and surrogatecutflip for this surrogate
split is 1, or if z≥c and
surrogatecutflip for this surrogate split is –1.
similarly, the right child is chosen if z≥c and surrogatecutflip for this surrogate
split is 1, or if z<c and surrogatecutflip for this surrogate
split is –1. the order of the surrogate split variables at each node is
matched to the order of variables returned by
surrogatecutpredictor . the optimal-split variable at
this node does not appear. for nonbranch (leaf) nodes,
surrogatecutpoint contains an empty cell.
|
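The interplay between the surrogate cut point and the cut assignment can be sketched by printing the surrogate rules stored at one node. The example assumes a tree fitted on `fisheriris` with surrogate splits enabled and inspects the root node; the loop simply prints nothing if no surrogates were found there.

```matlab
load fisheriris
tree = compact(fitctree(meas, species, 'Surrogate', 'on'));

k = 1;                                            % inspect the root node
names = tree.SurrogateCutPredictor{k};            % surrogate predictor names
cuts  = tree.SurrogateCutPoint{k};                % NaN for categorical surrogates
flips = tree.SurrogateCutFlip{k};                 % 1, -1, or 0 (categorical)
assoc = tree.SurrogatePredictorAssociation{k};

for s = 1:numel(names)
    if flips(s) == 1
        rule = sprintf('%s < %g', names{s}, cuts(s));
    elseif flips(s) == -1
        rule = sprintf('%s >= %g', names{s}, cuts(s));
    else
        rule = sprintf('%s is in a category set (see SurrogateCutCategories)', names{s});
    end
    fprintf('surrogate %d: go left if %s (association %.3f)\n', s, rule, assoc(s));
end
```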
surrogatecuttype
|
an n-element cell array indicating types of surrogate
splits at each node in tree , where n
is the number of nodes in tree . for each node
k , surrogatecuttype{k} is a cell
array with the types of the surrogate split variables at this node. the
variables are sorted by the predictive measure of association with the
optimal predictor in descending order, and only variables with a
positive predictive measure are included. the order of the surrogate split
variables at each node is matched to the order of variables in
surrogatecutpredictor . the optimal-split variable at
this node does not appear. for nonbranch (leaf) nodes,
surrogatecuttype contains an empty cell. a surrogate
split type can be either 'continuous' if the cut is
defined in the form z < v for a
variable z and cut point v or
'categorical' if the cut is defined by whether
z takes a value in a set of categories.
|
surrogatecutpredictor
|
an n-element cell array of the names of the variables
used for surrogate splits in each node in tree , where
n is the number of nodes in
tree . every element of
surrogatecutpredictor is a cell array with the names
of the surrogate split variables at this node. the variables are sorted by
the predictive measure of association with the optimal predictor in
descending order, and only variables with a positive predictive measure
are included. the optimal-split variable at this node does not appear. for
nonbranch (leaf) nodes, surrogatecutpredictor contains an
empty cell.
|
surrogatepredictorassociation
|
an n-element cell array of the predictive measures of
association for surrogate splits in tree , where
n is the number of nodes in
tree . for each node k ,
surrogatepredictorassociation{k} is a numeric vector.
the length of surrogatepredictorassociation{k} is equal
to the number of surrogate predictors found at this node. every element of
surrogatepredictorassociation{k} gives the predictive
measure of association between the optimal split and this surrogate split.
the order of the surrogate split variables at each node is the order of
variables in surrogatecutpredictor . the optimal-split
variable at this node does not appear. for nonbranch (leaf) nodes,
surrogatepredictorassociation contains an empty
cell.
|