interpret machine learning models
this topic introduces statistics and machine learning toolbox™ features for model interpretation and shows how to interpret a machine learning model (classification and regression).
a machine learning model is often referred to as a "black box" model because it can be difficult to understand how the model makes predictions. interpretability tools help you overcome this aspect of machine learning algorithms and reveal how predictors contribute (or do not contribute) to predictions. also, you can validate whether the model uses the correct evidence for its predictions, and find model biases that are not immediately apparent.
features for model interpretation
use lime, shapley, and plotPartialDependence to explain the contribution of individual predictors to the predictions of a trained classification or regression model.
- lime: local interpretable model-agnostic explanations (lime) interpret a prediction for a query point by fitting a simple interpretable model for the query point. the simple model acts as an approximation for the trained model and explains model predictions around the query point. the simple model can be either a linear model or a decision tree model. you can use the estimated coefficients of a linear model or the estimated predictor importance of a decision tree model to explain the contribution of individual predictors to the prediction for the query point. for more details, see lime.
- shapley: the shapley value of a predictor for a query point explains the deviation of the prediction (response for regression or class scores for classification) for the query point from the average prediction, due to the predictor. for a query point, the sum of the shapley values for all features corresponds to the total deviation of the prediction from the average. for more details, see shapley values for machine learning model.
- plotPartialDependence and partialDependence: a partial dependence plot (pdp) shows the relationships between a predictor (or a pair of predictors) and the prediction (response for regression or class scores for classification) in the trained model. the partial dependence on the selected predictor is defined by the averaged prediction obtained by marginalizing out the effect of the other variables. therefore, the partial dependence is a function of the selected predictor that shows the average effect of the selected predictor over the data set. you can also create a set of individual conditional expectation (ice) plots, one per observation, each showing the effect of the selected predictor on a single observation. for more details, see more about on the plotPartialDependence reference page.
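the marginalization behind partial dependence can be sketched in a few lines of base matlab. this is only an illustration under stated assumptions: the prediction function predictfcn, the data matrix x, and the grid xgrid are hypothetical stand-ins, not toolbox objects. the loop fixes the selected predictor at each grid value, predicts for every observation (one ice curve per row), and averages over the observations to obtain the partial dependence.

```matlab
% Minimal sketch: partial dependence and ICE for a toy model (assumed names).
rng(0)                                   % for reproducibility
x = rand(50,2);                          % 50 observations, 2 predictors
predictfcn = @(X) 2*X(:,1) + X(:,2).^2;  % hypothetical stand-in for a trained model

xgrid = linspace(0,1,5);                 % values of the selected predictor (column 1)
ice = zeros(size(x,1),numel(xgrid));     % one ICE curve per observation
for k = 1:numel(xgrid)
    xk = x;
    xk(:,1) = xgrid(k);                  % fix predictor 1, keep the others as observed
    ice(:,k) = predictfcn(xk);
end
pd = mean(ice,1);                        % partial dependence = average of the ICE curves

plot(xgrid,ice','Color',[0.8 0.8 0.8]); hold on
plot(xgrid,pd,'k','LineWidth',2); hold off
xlabel('predictor 1'); ylabel('prediction')
```

because the toy model is additive, every ice curve is a vertical shift of the partial dependence curve; with interactions between predictors, the ice curves would differ in shape.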
some machine learning models support embedded type feature selection, where the model learns predictor importance as part of the model learning process. you can use the estimated predictor importance to explain model predictions. for example:
- train an ensemble (ClassificationBaggedEnsemble or RegressionBaggedEnsemble) of bagged decision trees (for example, random forest) and use the predictorImportance and oobPermutedPredictorImportance functions.
- train a linear model with lasso regularization, which shrinks the coefficients of the least important predictors. then use the estimated coefficients as measures for predictor importance. for example, use fitclinear or fitrlinear and specify the 'Regularization' name-value argument as 'lasso'.
for a list of machine learning models that support embedded type feature selection, see embedded type feature selection.
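as a rough illustration of the lasso approach, the sketch below fits a regularized linear regression to synthetic data and reads the coefficient magnitudes as importance. the data, the regularization strength 0.1, and the variable names are assumptions chosen for the example; a suitable 'Lambda' value depends on the data.

```matlab
% Minimal sketch: lasso coefficients as predictor importance (synthetic data).
rng(0)                                        % for reproducibility
n = 200;
X = randn(n,3);                               % three predictors on the same scale
y = 3*X(:,1) + 0.5*X(:,2) + 0.1*randn(n,1);   % the third predictor is irrelevant

% the default learner of fitrlinear is an SVM, so request least squares explicitly
mdl = fitrlinear(X,y,'Learner','leastsquares', ...
    'Regularization','lasso','Lambda',0.1);

importance = abs(mdl.Beta)                    % larger magnitude = more important
```

coefficient magnitudes are comparable only when the predictors share a scale, so standardize the predictors first if they do not.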
use statistics and machine learning toolbox features for three levels of model interpretation: local, cohort, and global.
level | objective | use case | statistics and machine learning toolbox feature |
---|---|---|---|
local interpretation | explain a prediction for a single query point. | | use lime and shapley for a specified query point. |
cohort interpretation | explain how a trained model makes predictions for a subset of the entire data set. | validate predictions for a particular group of samples. | |
global interpretation | explain how a trained model makes predictions for the entire data set. | | |
interpret classification model
this example trains an ensemble of bagged decision trees using the random forest algorithm, and interprets the trained model using interpretability features. use the object functions (oobPermutedPredictorImportance and predictorImportance) of the trained model to find important predictors in the model. also, use lime and shapley to interpret the predictions for specified query points. then use plotPartialDependence to create a plot that shows the relationships between an important predictor and predicted classification scores.
train classification ensemble model
load the CreditRating_Historical data set. the data set contains customer ids and their financial ratios, industry labels, and credit ratings.
tbl = readtable('CreditRating_Historical.dat');
display the first three rows of the table.
head(tbl,3)
     ID      WC_TA    RE_TA    EBIT_TA    MVE_BVTD    S_TA     Industry    Rating
    _____    _____    _____    _______    ________    _____    ________    ______
    62394    0.013    0.104     0.036      0.447      0.142        3       {'BB'}
    48608    0.232    0.335     0.062      1.969      0.281        8       {'A' }
    42444    0.311    0.367     0.074      1.935      0.366        1       {'A' }
create a table of predictor variables by removing the columns containing customer ids and ratings from tbl.
tblx = removevars(tbl,["ID","Rating"]);
train an ensemble of bagged decision trees by using the fitcensemble function and specifying the ensemble aggregation method as random forest ('Bag'). for reproducibility of the random forest algorithm, specify the 'Reproducible' name-value argument as true for tree learners. also, specify the class names to set the order of the classes in the trained model.
rng('default') % for reproducibility
t = templateTree('Reproducible',true);
blackbox = fitcensemble(tblx,tbl.Rating, ...
    'Method','Bag','Learners',t, ...
    'CategoricalPredictors','Industry', ...
    'ClassNames',{'AAA' 'AA' 'A' 'BBB' 'BB' 'B' 'CCC'});
blackbox is a ClassificationBaggedEnsemble model.
use model-specific interpretability features
ClassificationBaggedEnsemble supports two object functions, oobPermutedPredictorImportance and predictorImportance, which find important predictors in the trained model.
estimate out-of-bag predictor importance by using the oobPermutedPredictorImportance function. the function randomly permutes out-of-bag data across one predictor at a time, and estimates the increase in the out-of-bag error due to this permutation. the larger the increase, the more important the feature.
imp1 = oobPermutedPredictorImportance(blackbox);
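the permutation idea can be illustrated without an ensemble. the following sketch is not the toolbox algorithm: it uses a toy linear model fit with the backslash operator, and held-out rows stand in for out-of-bag observations. the toolbox function works per tree on out-of-bag data and aggregates across the ensemble, which this sketch does not reproduce. all names are hypothetical.

```matlab
% Minimal sketch: permutation importance on held-out data (toy linear model).
rng(0)                                       % for reproducibility
n = 400;
X = randn(n,3);
y = 4*X(:,1) + 1*X(:,2) + 0.1*randn(n,1);    % predictor 3 is pure noise

train = 1:300;  held = 301:n;                % held-out rows stand in for out-of-bag data
beta = X(train,:) \ y(train);                % hypothetical stand-in for a trained model
mse = @(Xe,ye) mean((ye - Xe*beta).^2);

base = mse(X(held,:),y(held));               % error before any permutation
imp = zeros(1,3);
for j = 1:3
    Xp = X(held,:);
    Xp(:,j) = Xp(randperm(numel(held)),j);   % permute one predictor at a time
    imp(j) = mse(Xp,y(held)) - base;         % increase in error due to the permutation
end
imp
```

permuting a predictor that the model relies on breaks its link to the response, so the error rises sharply; permuting the noise predictor leaves the error essentially unchanged.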
estimate predictor importance by using the predictorImportance function. the function estimates predictor importance by summing changes in the node risk due to splits on each predictor and dividing the sum by the number of branch nodes.
imp2 = predictorImportance(blackbox);
create a table containing the predictor importance estimates, and use the table to create horizontal bar graphs. to display an existing underscore in any predictor name, change the TickLabelInterpreter value of the axes to 'none'.
table_imp = table(imp1',imp2', ...
    'VariableNames',{'out-of-bag permuted predictor importance','predictor importance'}, ...
    'RowNames',blackbox.PredictorNames);
tiledlayout(1,2)
ax1 = nexttile;
table_imp1 = sortrows(table_imp,'out-of-bag permuted predictor importance');
barh(categorical(table_imp1.Row,table_imp1.Row),table_imp1.('out-of-bag permuted predictor importance'))
xlabel('out-of-bag permuted predictor importance')
ylabel('predictor')
ax2 = nexttile;
table_imp2 = sortrows(table_imp,'predictor importance');
barh(categorical(table_imp2.Row,table_imp2.Row),table_imp2.('predictor importance'))
xlabel('predictor importance')
ax1.TickLabelInterpreter = 'none';
ax2.TickLabelInterpreter = 'none';
both object functions identify MVE_BVTD and RE_TA as the two most important predictors.
specify query point
find the observations whose Rating is 'AAA' and choose four query points among them.
rng('default')
tblx_aaa = tblx(strcmp(tbl.Rating,'AAA'),:);
querypoint = datasample(tblx_aaa,4,'Replace',false)
querypoint=4×6 table
    WC_TA    RE_TA    EBIT_TA    MVE_BVTD    S_TA     Industry
    _____    _____    _______    ________    _____    ________
    0.283    0.715     0.069      9.612      1.066       11
    0.603    0.891     0.117      7.851      0.591        6
    0.212    0.486     0.057      3.986      0.679        2
    0.273    0.491     0.071      3.287      0.465        5
use lime with linear simple models
explain the predictions for the query points using lime with linear simple models. lime generates a synthetic data set and fits a simple model to the synthetic data set.
create a lime object using tblx_aaa so that lime generates a synthetic data set using only the observations whose Rating is 'AAA', not the entire data set.
explainer_lime = lime(blackbox,tblx_aaa);
the default value of DataLocality for lime is 'global', which implies that, by default, lime generates a global synthetic data set and uses it for any query points. lime uses different observation weights so that weight values are more focused on the observations near the query point. therefore, you can interpret each simple model as an approximation of the trained model for a specific query point.
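the effect of the observation weights can be sketched with a weighted least-squares fit in base matlab. this is an illustration of the general idea only, not the algorithm that lime implements: the black-box function f, the gaussian kernel, and its width 0.1 are assumptions made for the example.

```matlab
% Minimal sketch: locally weighted linear surrogate around a query point.
rng(0)                                        % for reproducibility
f = @(X) sin(3*X(:,1)) + X(:,2).^2;           % hypothetical black-box model
q = [0.2 0.5];                                % query point

Xs = rand(500,2);                             % "global" synthetic data set
d2 = sum((Xs - q).^2,2);                      % squared distance to the query point
w = exp(-d2/(2*0.1^2));                       % kernel weights: nearby points count more

A = [ones(size(Xs,1),1), Xs - q];             % intercept + predictors centered at q
coef = lscov(A,f(Xs),w);                      % weighted least squares
localslope = coef(2:3)'                       % local linear effect of each predictor
```

the same synthetic data set can serve any query point; only the weights change, which is why each fitted simple model describes the black-box model near one particular query point.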
fit simple models for the four query points by using the object function fit. specify the third input (the number of important predictors to use in the simple model) as 6 to use all six predictors.
explainer_lime1 = fit(explainer_lime,querypoint(1,:),6);
explainer_lime2 = fit(explainer_lime,querypoint(2,:),6);
explainer_lime3 = fit(explainer_lime,querypoint(3,:),6);
explainer_lime4 = fit(explainer_lime,querypoint(4,:),6);
plot the coefficients of the simple models by using the object function plot.
tiledlayout(2,2)
ax1 = nexttile; plot(explainer_lime1);
ax2 = nexttile; plot(explainer_lime2);
ax3 = nexttile; plot(explainer_lime3);
ax4 = nexttile; plot(explainer_lime4);
ax1.TickLabelInterpreter = 'none';
ax2.TickLabelInterpreter = 'none';
ax3.TickLabelInterpreter = 'none';
ax4.TickLabelInterpreter = 'none';
all simple models identify EBIT_TA, MVE_BVTD, RE_TA, and WC_TA as the four most important predictors. the positive coefficients for the predictors suggest that increasing the predictor values leads to an increase in the predicted scores in the simple models.
for a categorical predictor, the plot function displays only the most important dummy variable of the categorical predictor. therefore, each bar graph displays a different dummy variable.
compute shapley values
the shapley value of a predictor for a query point explains the deviation of the predicted score for the query point from the average score, due to the predictor. create a shapley object using tblx_aaa so that shapley computes the expected contribution based on the samples for 'AAA'.
explainer_shapley = shapley(blackbox,tblx_aaa);
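to make the definition concrete, the sketch below computes exact shapley values for a toy model by enumerating every coalition of predictors. it is a brute-force illustration under stated assumptions, not the algorithm used by shapley: the model f, the background data X, and the query point q are hypothetical, and the value of a coalition is taken as the average prediction with the coalition's predictors fixed to the query point values.

```matlab
% Minimal sketch: exact Shapley values for a toy model by coalition enumeration.
rng(0)                                      % for reproducibility
f = @(X) 2*X(:,1) + X(:,2).*X(:,3);         % hypothetical black-box model
X = randn(100,3);                           % background data
q = [1 2 -1];                               % query point
M = numel(q);

% v(s+1): average prediction when the predictors in coalition s are fixed to q
v = zeros(1,2^M);
for s = 0:2^M-1
    Xs = X;
    for j = find(bitget(s,1:M))
        Xs(:,j) = q(j);
    end
    v(s+1) = mean(f(Xs));
end

phi = zeros(1,M);
for i = 1:M
    for s = 0:2^M-1
        if bitget(s,i), continue, end       % only coalitions without predictor i
        k = sum(bitget(s,1:M));             % coalition size
        wgt = factorial(k)*factorial(M-k-1)/factorial(M);
        phi(i) = phi(i) + wgt*(v(bitset(s,i)+1) - v(s+1));
    end
end
phi
```

the values satisfy the property stated earlier: sum(phi) equals f(q) - mean(f(X)), the deviation of the query point prediction from the average prediction. enumeration costs 2^M model evaluations over the background data, so it is practical only for a handful of predictors.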
compute the shapley values for the query points by using the object function fit.
explainer_shapley1 = fit(explainer_shapley,querypoint(1,:));
explainer_shapley2 = fit(explainer_shapley,querypoint(2,:));
explainer_shapley3 = fit(explainer_shapley,querypoint(3,:));
explainer_shapley4 = fit(explainer_shapley,querypoint(4,:));
plot the shapley values by using the object function plot.
tiledlayout(2,2)
nexttile
plot(explainer_shapley1)
nexttile
plot(explainer_shapley2)
nexttile
plot(explainer_shapley3)
nexttile
plot(explainer_shapley4)
MVE_BVTD is the most important predictor for the first three query points. the shapley values of MVE_BVTD are positive for the three query points. the MVE_BVTD variable values are about 9.6, 7.9, 4.0, and 3.3 for the query points. according to the shapley values for the four query points, a large MVE_BVTD value leads to an increase in the predicted score, and a small MVE_BVTD value leads to a decrease in the predicted scores compared to the average.
create partial dependence plot (pdp)
a pdp shows the averaged relationships between the predictor and the predicted score in the trained model. create pdps for RE_TA and MVE_BVTD, which the other interpretability tools identify as important predictors. pass tblx_aaa to plotPartialDependence so that the function computes the expectation of the predicted scores using only the samples for 'AAA'.
figure
plotPartialDependence(blackbox,'RE_TA','AAA',tblx_aaa)
figure
plotPartialDependence(blackbox,'MVE_BVTD','AAA',tblx_aaa)
the minor ticks in the x-axis represent the unique values of the predictor in tblx_aaa. the plot for MVE_BVTD shows that the predicted score is large when the MVE_BVTD value is small. the score value decreases as the MVE_BVTD value increases until it reaches about 5, and then the score value stays unchanged as the MVE_BVTD value increases. the dependency on MVE_BVTD in the subset tblx_aaa identified by plotPartialDependence is not consistent with the local contributions of MVE_BVTD at the four query points identified by lime and shapley.
interpret regression model
the model interpretation workflow for a regression problem is similar to the workflow for a classification problem, as demonstrated in the example interpret classification model.
this example trains a gaussian process regression (gpr) model and interprets the trained model using interpretability features. use a kernel parameter of the gpr model to estimate predictor weights. also, use lime and shapley to interpret the predictions for specified query points. then use plotPartialDependence to create a plot that shows the relationships between an important predictor and predicted responses.
train gpr model
load the carbig data set, which contains measurements of cars made in the 1970s and early 1980s.
load carbig
create a table containing the predictor variables Acceleration, Cylinders, and so on.
tbl = table(Acceleration,Cylinders,Displacement,Horsepower,Model_Year,Weight);
train a gpr model of the response variable MPG by using the fitrgp function. specify KernelFunction as 'ardsquaredexponential' to use the squared exponential kernel with a separate length scale per predictor.
blackbox = fitrgp(tbl,MPG,'ResponseName','MPG','CategoricalPredictors',[2 5], ...
    'KernelFunction','ardsquaredexponential');
blackbox is a RegressionGP model.
use model-specific interpretability features
you can compute predictor weights (predictor importance) from the learned length scales of the kernel function used in the model. the length scales define how far apart a predictor can be for the response values to become uncorrelated. find the normalized predictor weights by taking the exponential of the negative learned length scales.
sigmal = blackbox.KernelInformation.KernelParameters(1:end-1); % learned length scales
weights = exp(-sigmal); % predictor weights
weights = weights/sum(weights); % normalized predictor weights
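a small numeric sketch shows why a short length scale signals an important predictor. the three length scales below are made-up values, not output of the trained model, and the correlation uses the standard form of the ard squared exponential kernel with the signal variance factored out.

```matlab
% Minimal sketch: how ARD length scales translate into predictor weights.
sigmal = [0.5; 5; 50];                     % hypothetical learned length scales
weights = exp(-sigmal);
weights = weights/sum(weights)             % short length scale -> large weight

% correlation between two points that differ by 1 unit in a single predictor
corr1 = exp(-0.5*(1./sigmal).^2)
```

for the predictor with length scale 0.5, a one-unit change drops the correlation to about 0.14, so the response varies quickly along that predictor; for length scale 50, the correlation stays near 1 and the predictor has almost no effect.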
create a table containing the normalized predictor weights, and use the table to create horizontal bar graphs. to display an existing underscore in any predictor name, change the TickLabelInterpreter value of the axes to 'none'.
tbl_weight = table(weights,'VariableNames',{'predictor weight'}, ...
    'RowNames',blackbox.ExpandedPredictorNames);
tbl_weight = sortrows(tbl_weight,'predictor weight');
b = barh(categorical(tbl_weight.Row,tbl_weight.Row),tbl_weight.('predictor weight'));
b.Parent.TickLabelInterpreter = 'none';
xlabel('predictor weight')
ylabel('predictor')
the predictor weights indicate that multiple dummy variables for the categorical predictors Model_Year and Cylinders are important.
specify query point
find the observations whose MPG values are smaller than the 0.25 quantile of MPG. from the subset, choose four query points that do not include missing values.
rng('default') % for reproducibility
idx_subset = find(MPG < quantile(MPG,0.25));
tbl_subset = tbl(idx_subset,:);
querypoint = datasample(rmmissing(tbl_subset),4,'Replace',false)
querypoint=4×6 table
    Acceleration    Cylinders    Displacement    Horsepower    Model_Year    Weight
    ____________    _________    ____________    __________    __________    ______
        13.2            8            318            150            76         3940
        14.9            8            302            130            77         4295
          14            8            360            215            70         4615
        13.7            8            318            145            77         4140
use lime with tree simple models
explain the predictions for the query points using lime with decision tree simple models. lime generates a synthetic data set and fits a simple model to the synthetic data set.
create a lime object using tbl_subset so that lime generates a synthetic data set using the subset instead of the entire data set. specify SimpleModelType as 'tree' to use a decision tree simple model.
explainer_lime = lime(blackbox,tbl_subset,'SimpleModelType','tree');
the default value of DataLocality for lime is 'global', which implies that, by default, lime generates a global synthetic data set and uses it for any query points. lime uses different observation weights so that weight values are more focused on the observations near the query point. therefore, you can interpret each simple model as an approximation of the trained model for a specific query point.
fit simple models for the four query points by using the object function fit. specify the third input (the number of important predictors to use in the simple model) as 6. with this setting, the software specifies the maximum number of decision splits (or branch nodes) as 6 so that the fitted decision tree uses at most six predictors, that is, all of them.
explainer_lime1 = fit(explainer_lime,querypoint(1,:),6);
explainer_lime2 = fit(explainer_lime,querypoint(2,:),6);
explainer_lime3 = fit(explainer_lime,querypoint(3,:),6);
explainer_lime4 = fit(explainer_lime,querypoint(4,:),6);
plot the predictor importance by using the object function plot.
tiledlayout(2,2)
ax1 = nexttile; plot(explainer_lime1);
ax2 = nexttile; plot(explainer_lime2);
ax3 = nexttile; plot(explainer_lime3);
ax4 = nexttile; plot(explainer_lime4);
ax1.TickLabelInterpreter = 'none';
ax2.TickLabelInterpreter = 'none';
ax3.TickLabelInterpreter = 'none';
ax4.TickLabelInterpreter = 'none';
all simple models identify Displacement, Model_Year, and Weight as important predictors.
compute shapley values
the shapley value of a predictor for a query point explains the deviation of the predicted response for the query point from the average response, due to the predictor. create a shapley object for the model blackbox using tbl_subset so that shapley computes the expected contribution based on the observations in tbl_subset.
explainer_shapley = shapley(blackbox,tbl_subset);
compute the shapley values for the query points by using the object function fit.
explainer_shapley1 = fit(explainer_shapley,querypoint(1,:));
explainer_shapley2 = fit(explainer_shapley,querypoint(2,:));
explainer_shapley3 = fit(explainer_shapley,querypoint(3,:));
explainer_shapley4 = fit(explainer_shapley,querypoint(4,:));
plot the shapley values by using the object function plot.
tiledlayout(2,2)
nexttile
plot(explainer_shapley1)
nexttile
plot(explainer_shapley2)
nexttile
plot(explainer_shapley3)
nexttile
plot(explainer_shapley4)
Model_Year is the most important predictor for the first, second, and fourth query points, and the shapley values of Model_Year are positive for the three query points. the Model_Year variable value is 76 or 77 for these three points, and the value for the third query point is 70. according to the shapley values for the four query points, a small Model_Year value leads to a decrease in the predicted response, and a large Model_Year value leads to an increase in the predicted response compared to the average.
create partial dependence plot (pdp)
a pdp shows the averaged relationships between the predictor and the predicted response in the trained model. create a pdp for Model_Year, which the other interpretability tools identify as an important predictor. pass tbl_subset to plotPartialDependence so that the function computes the expectation of the predicted responses using only the samples in tbl_subset.
figure
plotPartialDependence(blackbox,'Model_Year',tbl_subset)
the plot shows the same trend identified by the shapley values for the four query points. the predicted response (MPG) value increases as the Model_Year value increases.
see also
lime | shapley | plotPartialDependence
related topics
- shapley values for machine learning model
- introduction to feature selection