A feature map is the spatial pattern of responses a filter produces as it slides across an image or across the output of a previous layer. Your prototype is strongest when it shows that different channels light up for different reasons and that those responses become more abstract with depth. In Layer 1, the learner should notice local edges, contrast changes, and simple textures. In the middle layer, the learner should notice grouped contours, corners, and local parts built from combinations of earlier channels. In the deeper layer, the learner should stop expecting the map to resemble a filtered version of the image and instead look for broader, object-relevant regions of activation.
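
To make the definition concrete for the supporting text, here is a minimal sketch of how a single Layer 1 feature map could be computed. It is not the prototype's actual code: the function name `feature_map`, the toy image, and the `vertical_edge` filter are all hypothetical, and it assumes a plain "valid" sliding window followed by ReLU.

```python
def feature_map(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and apply ReLU.

    Both arguments are lists of rows of numbers. The result is the
    feature map: one response value per window position.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the image patch under the kernel.
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            # ReLU keeps only positive responses.
            row.append(max(0.0, float(s)))
        out.append(row)
    return out


# Toy 4x6 image (assumption): dark left half, bright right half.
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

# Hypothetical filter that responds to dark-to-bright vertical edges.
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

fmap = feature_map(image, vertical_edge)
for row in fmap:
    print(row)  # each row: [0.0, 3.0, 3.0, 0.0]
```

The map is nonzero only where the window straddles the edge, which is the kind of localized response the learner should expect to see in Layer 1 channels.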
The overlay interaction is especially important because it connects a selected channel back to the region of the original image that triggered it. The supporting text should therefore explain not only what a feature map is, but also what the learner should notice when clicking a map, changing the layer, and comparing channels.
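
One way the overlay could work is sketched below, as an illustration rather than a description of the prototype's implementation. It assumes a grayscale image with values in [0, 1], nearest-neighbor upsampling of the selected channel to image size, and a simple alpha blend. The names `normalize`, `upsample_nearest`, `overlay`, and the `alpha` parameter are hypothetical.

```python
def normalize(fmap):
    """Rescale a feature map to [0, 1]; a flat map becomes all zeros."""
    lo = min(min(r) for r in fmap)
    hi = max(max(r) for r in fmap)
    span = hi - lo
    if span == 0:
        return [[0.0] * len(r) for r in fmap]
    return [[(v - lo) / span for v in r] for r in fmap]


def upsample_nearest(fmap, out_h, out_w):
    """Stretch a small map to out_h x out_w by repeating its cells."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [[fmap[i * in_h // out_h][j * in_w // out_w]
             for j in range(out_w)]
            for i in range(out_h)]


def overlay(image, fmap, alpha=0.5):
    """Blend the upsampled, normalized map over a grayscale image."""
    h, w = len(image), len(image[0])
    heat = upsample_nearest(normalize(fmap), h, w)
    return [[(1 - alpha) * image[i][j] + alpha * heat[i][j]
             for j in range(w)]
            for i in range(h)]


# Toy data (assumption): a uniform 4x4 image and a 2x2 channel that
# fires only in its top-right cell.
image = [[0.2] * 4 for _ in range(4)]
channel = [[0.0, 4.0],
           [0.0, 0.0]]

blended = overlay(image, channel, alpha=0.5)
```

With this toy input, only the top-right quadrant of `blended` is brightened, which is the link the learner should see between a selected channel and the image region behind it. Deeper layers have coarser maps, so each cell covers a larger image region after upsampling.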