Interactive prototype

Feature Map Visualization

See how different filters and CNN layers produce different activation maps that highlight edges, textures, and patterns.


Structured teaching notes

Connect the interaction to the core idea.

These notes are written to sit below the interactive prototype, preserve the same teaching flow, and help the learner name what the visualization is showing.

Background

A feature map is the spatial pattern of responses that a filter produces when it is applied to an image or to the output of a previous layer. The prototype is most effective when it shows that different channels respond to different patterns and that those responses become more abstract in deeper layers. In Layer 1, the learner should notice local edges, contrast changes, and simple textures. In the middle layer, the learner should notice grouped contours, corners, and local parts built from earlier channels. In the deeper layer, the learner should stop expecting the map to look like a filtered version of the image and instead look for broader, object-relevant regions.

The overlay interaction is especially important because it connects a selected channel back to the original image region. So the supporting text should explain not only what a feature map is, but also what the learner is supposed to notice when clicking a map, changing the layer, and comparing channels.
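The overlay idea described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the prototype's actual implementation: the function name `overlay`, the `alpha` blend weight, the grayscale image in [0, 1], and the requirement that the image size is an integer multiple of the map size are all hypothetical choices made for the example.

```python
import numpy as np

def overlay(image, fmap, alpha=0.5):
    """Blend a low-resolution feature map onto a grayscale image with values in [0, 1]."""
    # Scale the map to [0, 1]; a constant map becomes all zeros.
    span = fmap.max() - fmap.min()
    norm = (fmap - fmap.min()) / span if span > 0 else np.zeros_like(fmap)
    # Nearest-neighbor upsampling: repeat each cell so it covers its image region.
    # Assumes the image size is an integer multiple of the map size.
    ry = image.shape[0] // fmap.shape[0]
    rx = image.shape[1] // fmap.shape[1]
    up = np.repeat(np.repeat(norm, ry, axis=0), rx, axis=1)
    return (1 - alpha) * image + alpha * up

image = np.full((8, 8), 0.2)                 # toy uniform gray image
fmap = np.array([[0.0, 0.0], [0.0, 4.0]])    # channel fires only in the bottom-right cell
blended = overlay(image, fmap)
print(blended[0, 0], blended[7, 7])          # dim top-left corner, bright bottom-right corner
```

Each cell of the 2 x 2 map is stretched over a 4 x 4 block of the image, so the bright region of the overlay shows which part of the input the selected channel responded to.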

Important formulas

Feature map_k = φ(W_k * input)

The k-th feature map stores how strongly filter W_k responds at each spatial position after the nonlinearity φ is applied. Here * denotes convolution of the filter with the input.
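A minimal NumPy sketch of this formula, assuming φ is ReLU, stride 1, no padding, and the sliding-window product without kernel flipping that most CNN libraries call convolution. The function name `feature_map`, the toy image, and the edge filter are illustrative and not taken from the prototype.

```python
import numpy as np

def feature_map(image, kernel):
    """Slide `kernel` over a 2-D `image` (valid positions only) and apply ReLU."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Response of the filter at spatial position (i, j).
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return np.maximum(out, 0.0)  # phi = ReLU (assumption)

# Toy 5x5 image: dark on the left, bright on the right (one vertical edge).
image = np.array([[0, 0, 0, 1, 1]] * 5, dtype=float)

# Hypothetical filter that responds to dark-to-bright transitions, left to right.
dark_to_bright = np.array([[-1.0, 0.0, 1.0]] * 3)

fmap = feature_map(image, dark_to_bright)
opposite = feature_map(image, -dark_to_bright)  # bright-to-dark detector

print(fmap)      # nonzero only where the window covers the edge
print(opposite)  # all zeros: this image has no bright-to-dark edge
```

The two filters see the same image but produce different maps, which is the behavior the learner should look for when comparing channels.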

Feature maps_l = φ(W_l * Feature maps_(l-1))

Deeper layers do not start from raw pixels; the input to layer l is the stack of feature maps produced by layer l-1, so each deeper filter combines earlier channels into a newer, more abstract map.
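The point that deeper layers consume earlier feature maps rather than pixels can be sketched as two stacked layers. This is an illustration only: `conv_layer`, the random weights, and the layer sizes are assumptions chosen for the example, with ReLU as the nonlinearity, stride 1, and no padding.

```python
import numpy as np

def conv_layer(x, weights):
    """x: (H, W, C_in); weights: (K, kh, kw, C_in). Returns ReLU maps of shape (H', W', K)."""
    k, kh, kw, c_in = weights.shape
    assert x.shape[2] == c_in, "each filter must span all input channels"
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w, k))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i:i + kh, j:j + kw, :]  # (kh, kw, C_in)
            # Sum over the window and over all input channels, once per filter.
            out[i, j, :] = np.tensordot(weights, patch, axes=3)
    return np.maximum(out, 0.0)

rng = np.random.default_rng(0)
image = rng.random((16, 16, 3))          # raw pixels, 3 color channels
w1 = rng.standard_normal((8, 3, 3, 3))   # layer 1: 8 filters over 3 pixel channels
w2 = rng.standard_normal((4, 3, 3, 8))   # layer 2: 4 filters over 8 feature maps

maps1 = conv_layer(image, w1)            # computed from pixels
maps2 = conv_layer(maps1, w2)            # computed from maps1, never from the image
print(maps1.shape, maps2.shape)          # (14, 14, 8) (12, 12, 4)
```

Every layer-2 filter has one slice per layer-1 channel, so a single layer-2 map already mixes all eight earlier responses.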

Tensor shape = height × width × channels

A full feature-map tensor has height, width, and one channel for each filter.
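As a small worked example of the shape rule, the sketch below computes the output shape of one convolutional layer with the standard output-size formula. The helper name `feature_map_shape` and the example sizes are illustrative assumptions, not values from the prototype.

```python
def feature_map_shape(height, width, num_filters, kernel=3, stride=1, padding=0):
    """Return (height, width, channels) of the output of one convolutional layer."""
    out_h = (height + 2 * padding - kernel) // stride + 1
    out_w = (width + 2 * padding - kernel) // stride + 1
    # One output channel per filter.
    return (out_h, out_w, num_filters)

print(feature_map_shape(224, 224, num_filters=64, padding=1))  # (224, 224, 64)
print(feature_map_shape(224, 224, num_filters=64, stride=2))   # (111, 111, 64)
```

The channel count depends only on the number of filters, while height and width depend on kernel size, stride, and padding.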

Pros

Makes the internal behavior of a CNN more interpretable by showing where and how different filters respond.
Helps beginners see the hierarchy from simple local patterns to broader and more abstract representations.
Supports debugging because dead, redundant, or overly background-focused channels become easier to notice.
Connects the original image to internal activations through overlays, which makes the representation less mysterious.
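The debugging use mentioned above can be approximated with a simple check over one layer's activations. This is a rough sketch under stated assumptions: activations are post-ReLU (so a channel whose maximum is near zero never fired), redundancy is measured by Pearson correlation, and the name `channel_report` and both thresholds are hypothetical. A channel that is silent on one image may still fire on others, so a real check would aggregate over many inputs.

```python
import numpy as np

def channel_report(maps, dead_tol=1e-6, redundant_corr=0.98):
    """maps: (H, W, C) activations for one image. Returns (dead channels, redundant pairs)."""
    h, w, c = maps.shape
    flat = maps.reshape(h * w, c)
    # Dead: the channel never rises above the tolerance anywhere in the image.
    dead = [k for k in range(c) if flat[:, k].max() <= dead_tol]
    # Only channels that actually vary can be correlated with one another.
    varying = [k for k in range(c) if flat[:, k].std() > 0]
    redundant = []
    for pos, a in enumerate(varying):
        for b in varying[pos + 1:]:
            # Pearson correlation between the two channels' spatial responses.
            r = np.corrcoef(flat[:, a], flat[:, b])[0, 1]
            if r >= redundant_corr:
                redundant.append((a, b))
    return dead, redundant

rng = np.random.default_rng(0)
base = rng.random((6, 6))
maps = np.stack(
    [base, np.zeros((6, 6)), 2.0 * base + 0.1, rng.random((6, 6))], axis=-1
)
dead, redundant = channel_report(maps)
print(dead, redundant)  # [1] [(0, 2)]
```

Channel 1 never fires, and channel 2 is a rescaled copy of channel 0, so both are flagged; the independent channel 3 is not.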

Cons

Deeper maps can become abstract enough that they stop looking intuitively image-like.
A bright activation does not automatically mean that one channel alone is responsible for the final prediction.
Visualization can be misleading if per-map normalization or color scaling hides how strong one channel's response is relative to another's.
It is easy to over-interpret one selected map and forget that the network uses many channels together.
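The normalization pitfall in the list above is easy to reproduce. The sketch below compares two display scalings on a toy pair of channels; the function names `per_map_scale` and `shared_scale` and the toy values are illustrative assumptions, not the prototype's rendering code.

```python
import numpy as np

def per_map_scale(maps):
    """Scale each channel to [0, 1] on its own (hides differences between channels)."""
    lo = maps.min(axis=(0, 1), keepdims=True)
    hi = maps.max(axis=(0, 1), keepdims=True)
    return (maps - lo) / np.maximum(hi - lo, 1e-12)

def shared_scale(maps):
    """Scale all channels with one global min and max (keeps relative strength)."""
    lo, hi = maps.min(), maps.max()
    return (maps - lo) / max(hi - lo, 1e-12)

strong = np.array([[0.0, 10.0], [0.0, 10.0]])
weak = strong * 0.01                       # same pattern, 100 times weaker
maps = np.stack([strong, weak], axis=-1)   # shape (2, 2, 2)

# With per-map scaling the weak channel peaks at full brightness,
# exactly like the strong one.
print(per_map_scale(maps)[..., 1].max())
# With a shared scale the weak channel peaks at about 1% brightness.
print(shared_scale(maps)[..., 1].max())
```

Per-map scaling is useful for seeing the spatial pattern inside one channel, while a shared scale is the safer choice when comparing channels against each other.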

Quick example

On a shoe image, one shallow map may respond strongly along sole edges, another may prefer diagonal upper contours, and a deeper map may highlight a larger silhouette region. Clicking different channels reveals that the CNN is not storing one copy of the image; it is storing many specialized responses to different aspects of the same image.

Common mistake

A common mistake is to say that a feature map is 'just the filtered image.' That description fits the first layer at best; in deeper layers the map is computed from earlier feature maps, not from the image itself. Another mistake is to treat one bright overlay as the whole explanation for the prediction rather than as one useful intermediate response among many.