Grad-CAM
This example shows how to use the gradient-weighted class activation mapping (Grad-CAM) technique to understand why a deep learning network makes its classification decisions. Grad-CAM, invented by Selvaraju and coauthors [1], uses the gradient of the classification score with respect to the convolutional features computed by the network to identify which parts of the image are most important for classification. This example uses the pretrained GoogLeNet network for images.
Grad-CAM is a generalization of the class activation mapping (CAM) technique. Activation mapping techniques can also be applied to live webcam data, and Grad-CAM can be applied to nonclassification tasks such as regression or semantic segmentation, for example to investigate the predictions of a semantic segmentation network.
Load Pretrained Network
Load the GoogLeNet network.
net = googlenet;
Classify Image
Read the image input size of the GoogLeNet network.
inputSize = net.Layers(1).InputSize(1:2);
Load sherlock.jpg, an image of a golden retriever included with this example.
img = imread("sherlock.jpg");
Resize the image to the network input dimensions.
img = imresize(img,inputSize);
Classify the image and display it, along with its classification and classification score.
[classfn,score] = classify(net,img);
imshow(img);
title(sprintf("%s (%.2f)", classfn, score(classfn)));
GoogLeNet correctly classifies the image as a golden retriever. But why? What characteristics of the image cause the network to make this classification?
Grad-CAM Explains Why
The Grad-CAM technique uses the gradients of the classification score with respect to the final convolutional feature map to identify the parts of an input image that most affect the classification score. Where this gradient is large, the final score depends most strongly on the data.
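As a compact statement of the idea above, the Grad-CAM map can be written in the notation of [1]. This is a sketch of the formulation in that paper, not a description of any particular implementation. Here $y^c$ is the score for class $c$, $A^k$ is the $k$-th channel of the convolutional feature map, and $Z$ is the number of spatial locations in that map.

```latex
% Channel weights: spatial average of the gradient of the class score
\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A^k_{ij}}

% Grad-CAM map: weighted sum of feature channels, keeping positive evidence only
L^c_{\text{Grad-CAM}} = \operatorname{ReLU}\!\left( \sum_{k} \alpha_k^c \, A^k \right)
```

Channels whose activations raise the class score on average receive large positive weights, so the locations where those channels are active appear as high values in the map.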
The gradCAM function computes the importance map by taking the derivative of the reduction layer output for a given class with respect to a convolutional feature map. For classification tasks, the gradCAM function automatically selects suitable layers for computing the importance map. You can also specify the layers with the 'ReductionLayer' and 'FeatureLayer' name-value arguments.
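The following sketch shows one way to specify the layers explicitly rather than relying on automatic selection. The layer names "inception_5b-output" and "prob" are assumptions about the pretrained GoogLeNet network; check them against net.Layers or analyzeNetwork(net) before relying on them. The sketch also assumes that Deep Learning Toolbox and the GoogLeNet support package are installed and that sherlock.jpg is on the path.

```matlab
% Load the network and prepare the image as in the rest of this example.
net = googlenet;
inputSize = net.Layers(1).InputSize(1:2);
img = imresize(imread("sherlock.jpg"),inputSize);
classfn = classify(net,img);

% Assumed layer names for GoogLeNet; verify with net.Layers.
featureLayer = "inception_5b-output";   % last layer with a spatial feature map
reductionLayer = "prob";                % softmax layer that outputs class scores

% Compute the Grad-CAM map using the explicitly chosen layers.
mapExplicit = gradCAM(net,img,classfn, ...
    'FeatureLayer',featureLayer, ...
    'ReductionLayer',reductionLayer);
```

Choosing an earlier feature layer generally gives a map with finer spatial detail but less class-specific information, so the choice is a trade-off rather than a fixed rule.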
Compute the Grad-CAM map.
map = gradCAM(net,img,classfn);
Show the Grad-CAM map on top of the image by using an 'AlphaData' value of 0.5. The 'jet' colormap has deep blue as the lowest value and deep red as the highest.
imshow(img);
hold on;
imagesc(map,'AlphaData',0.5);
colormap jet
hold off;
title("Grad-CAM");
The map shows that the upper face and ear of the dog have the greatest impact on the classification.
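To back up the visual impression with numbers, the following sketch locates the pixel with the largest Grad-CAM value and marks it on the image. The variable names peakValue, peakRow, and peakCol are illustrative and not part of the original example. The same toolbox and file assumptions apply as in the rest of the example.

```matlab
% Recompute the map so that this snippet runs on its own.
net = googlenet;
inputSize = net.Layers(1).InputSize(1:2);
img = imresize(imread("sherlock.jpg"),inputSize);
classfn = classify(net,img);
map = gradCAM(net,img,classfn);

% Find the location of the largest value in the Grad-CAM map.
[peakValue,peakIndex] = max(map(:));
[peakRow,peakCol] = ind2sub(size(map),peakIndex);

% Mark the peak on the image (x is the column, y is the row).
imshow(img);
hold on
plot(peakCol,peakRow,'w+','MarkerSize',12,'LineWidth',2);
hold off
title(sprintf("Grad-CAM peak at row %d, column %d",peakRow,peakCol));
```

A single peak is only a rough summary. Thresholding the map, for example at a fraction of peakValue, gives a region rather than a point.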
Grad-CAM is one of several approaches to investigating the reasons for deep network classifications.
References
[1] Selvaraju, R. R., M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra. "Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization." In IEEE International Conference on Computer Vision (ICCV), 2017, pp. 618–626. Available on the Computer Vision Foundation Open Access website.