Deep Learning with PyTorch in a Nutshell

Interpretability


Visualizing and Understanding Convolutional Networks

Learning Deep Features for Discriminative Localization (Class activation mapping, CAM)

Class activation mapping (CAM)

  • A weakly supervised localization method.

  • If the last three layers of the network are "convolution + global average pooling + fully connected", you can apply this method to visualize object locations.

  • Simply take the FC layer's weights and compute a weighted sum of the convolutional feature maps (skipping global average pooling).

Class activation mapping

Let $a_1, \dots, a_k$ be the $k$ feature maps of the last convolutional layer, which is followed by GAP and FC layers. Let $M$ be the number of elements in one feature map, and let $w_n$ be the FC weight connecting the $n$-th feature map to one class score, denoted $s$.

$$s = \sum_{n=1}^{k} w_n \frac{\sum_{ij} (a_n)_{ij}}{M} = \frac{1}{M} \sum_{ij} \Big( \sum_{n=1}^{k} w_n a_n \Big)_{ij}$$

This rewrites the order of the last layers from "conv → GAP → FC" to "conv → FC → GAP". The weighted sum inside the parentheses is called the class activation map for the class with score $s$:

$$\text{CAM} = \sum_{n=1}^{k} w_n a_n$$
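The derivation above can be sketched in PyTorch. This is a minimal illustration, not the paper's code: `TinyNet` is a hypothetical toy model whose head has the required "conv → GAP → FC" shape, and the final assertion checks the identity above, i.e. that averaging the CAM over all $M$ positions recovers the class score (up to the FC bias).

```python
import torch
import torch.nn as nn

# Hypothetical toy network with the required head: conv -> GAP -> FC.
# Any network ending this way (e.g. ResNet) works the same.
class TinyNet(nn.Module):
    def __init__(self, k=8, num_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, k, 3, padding=1), nn.ReLU(),
            nn.Conv2d(k, k, 3, padding=1), nn.ReLU(),
        )
        self.gap = nn.AdaptiveAvgPool2d(1)    # global average pooling
        self.fc = nn.Linear(k, num_classes)

    def forward(self, x):
        a = self.features(x)                  # feature maps a_1 ... a_k
        s = self.fc(self.gap(a).flatten(1))   # class scores
        return s, a

torch.manual_seed(0)
net = TinyNet().eval()
x = torch.randn(1, 3, 32, 32)
with torch.no_grad():
    scores, a = net(x)
cls = scores.argmax(dim=1).item()             # visualize the predicted class

# CAM = sum_n w_n * a_n, using the FC weights of the chosen class.
w = net.fc.weight[cls]                        # shape (k,)
cam = torch.einsum("k,khw->hw", w, a[0])      # shape (H, W)

# Sanity check: (1/M) * sum_ij CAM_ij + bias equals the class score,
# matching the equation s = (1/M) sum_ij (sum_n w_n a_n)_ij.
s_from_cam = cam.mean() + net.fc.bias[cls]
assert torch.allclose(s_from_cam, scores[0, cls], atol=1e-5)
```

For visualization, the resulting `(H, W)` map is typically upsampled to the input image size and overlaid as a heatmap.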
https://cs.nyu.edu/~fergus/papers/zeilerECCV2014.pdf
http://cnnlocalization.csail.mit.edu/Zhou_Learning_Deep_Features_CVPR_2016_paper.pdf