Comprehensive CNN in Math
Math for CNN
Related reading: Piotr Skalski on Medium
Main Formula
$$
y_{i,j,k'} = b_{k'} + \sum_{k=1}^{K} \sum_{p=1}^{P} \sum_{q=1}^{Q} W_{p,q,k,k'} \, x_{i+p-1,\,j+q-1,\,k}
$$
Why does it look so confusing?
The triple summation is simply an expanded form of a matrix multiplication (dot product), written out to show every element-wise operation. In practice, this whole operation is just a generalized matrix (or tensor) dot product.
where:
- $ x_{i,j,k} $: input feature map at position $(i, j)$, channel $k$
- $ W_{p,q,k,k'} $: kernel weight at position $(p, q)$, from input channel $k$ to output channel $k'$
- $ b_{k'} $: bias for output channel $k'$
- $ y_{i,j,k'} $: output feature map at position $(i, j)$, channel $k'$
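To make the indexing concrete, here is a minimal NumPy sketch of the triple summation for one output element, alongside the equivalent "flattened" dot product mentioned above. The function names and array shapes are illustrative assumptions, not part of the formula itself:

```python
import numpy as np

def conv_single_output(x, W, b, i, j, k_out):
    """Literal triple summation for one output element y[i, j, k_out].

    Assumed (hypothetical) shapes:
      x: (H, W_in, K)       input feature map
      W: (P, Q, K, K_out)   kernel weights
      b: (K_out,)           biases, one per output channel
    Python is 0-indexed, so the formula's (i + p - 1) becomes (i + p).
    """
    P, Q, K, _ = W.shape
    y = b[k_out]
    for k in range(K):
        for p in range(P):
            for q in range(Q):
                y += W[p, q, k, k_out] * x[i + p, j + q, k]
    return y

def conv_single_output_dot(x, W, b, i, j, k_out):
    """Same result: the triple sum collapses into a single dot product
    between the flattened input patch and the flattened kernel slice."""
    P, Q, K, _ = W.shape
    patch = x[i:i + P, j:j + Q, :]                # (P, Q, K) window of the input
    return b[k_out] + np.dot(patch.ravel(), W[..., k_out].ravel())
```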
With stride $ S $:
$$
y_{i,j,k'} = b_{k'} + \sum_{k=1}^{K} \sum_{p=1}^{P} \sum_{q=1}^{Q} W_{p,q,k,k'} \, x_{S \cdot i+p-1,\; S \cdot j+q-1,\; k}
$$
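The stride only changes which input window each output position reads. Below is a sketch of the full strided output map under the same assumed shapes, with 0-based indexing and "valid" padding (no border handling):

```python
import numpy as np

def conv_output_map_strided(x, W, b, S=1):
    """Full output feature map with stride S (a sketch, 'valid' padding).
    Output position (i, j) reads the input window starting at (S*i, S*j),
    which is the 0-indexed version of the strided formula above."""
    H, W_in, K = x.shape
    P, Q, _, K_out = W.shape
    H_out = (H - P) // S + 1
    W_out = (W_in - Q) // S + 1
    y = np.zeros((H_out, W_out, K_out))
    for i in range(H_out):
        for j in range(W_out):
            patch = x[S * i:S * i + P, S * j:S * j + Q, :]
            for k_out in range(K_out):
                y[i, j, k_out] = b[k_out] + np.sum(patch * W[..., k_out])
    return y
```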
General CNN layer in vector form:
$$
z^{(\ell)} = W^{(\ell)} * a^{(\ell-1)} + b^{(\ell)}
$$
$$
a^{(\ell)} = g(z^{(\ell)})
$$
In the context of CNNs and neural networks, $g$ is the activation function (for example ReLU), applied element-wise to the pre-activation $z^{(\ell)}$.
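Putting the two equations together, a single forward pass through one layer could be sketched as below. It reuses the hypothetical conv_output_map_strided helper from the stride example and picks ReLU as $g$; neither choice is dictated by the formulas themselves:

```python
import numpy as np

def relu(z):
    """One common choice for the activation g; any nonlinearity works here."""
    return np.maximum(0.0, z)

def cnn_layer_forward(a_prev, W, b, S=1):
    """z^(l) = W^(l) * a^(l-1) + b^(l), then a^(l) = g(z^(l)).
    Relies on the conv_output_map_strided sketch defined earlier."""
    z = conv_output_map_strided(a_prev, W, b, S)   # pre-activation z^(l)
    a = relu(z)                                    # activation a^(l)
    return a, z
```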
Kernel Functions
Common Kernel Types in CNNs
There are different types of kernels (also called filters) in CNNs, each serving a specific purpose to extract important features from input data. Here are some main types and why they’re preferred for different tasks:
1. Edge Detection Kernels
- Examples: Sobel, Prewitt, Scharr, Roberts
- Why They’re Good: They highlight edges in images by calculating differences in pixel intensities. Useful for identifying boundaries, shapes, and object outlines.
- Typical Use: Early convolutional layers, to capture structure and shapes.
| Name | Kernel Example | Detects |
|---|---|---|
| Sobel X | $\begin{bmatrix}-1 & 0 & 1\\ -2 & 0 & 2\\ -1 & 0 & 1\end{bmatrix}$ | Vertical edges (horizontal intensity changes) |
| Sobel Y | $\begin{bmatrix}-1 & -2 & -1\\ 0 & 0 & 0\\ 1 & 2 & 1\end{bmatrix}$ | Horizontal edges (vertical intensity changes) |
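As a concrete illustration, the two Sobel kernels from the table can be applied to a grayscale image with a plain 2-D convolution. The sketch below assumes a 2-D NumPy array as input and uses SciPy's convolve2d:

```python
import numpy as np
from scipy.signal import convolve2d

# Hand-crafted edge kernels from the table above.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])
sobel_y = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]])

def edge_maps(image):
    """Return the two gradient responses and the combined edge magnitude.
    mode='same' keeps the output the same size as the input."""
    gx = convolve2d(image, sobel_x, mode="same", boundary="symm")
    gy = convolve2d(image, sobel_y, mode="same", boundary="symm")
    return gx, gy, np.hypot(gx, gy)
```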
2. Blur/Smoothing Kernels
- Examples: Average (Box), Gaussian Blur
- Why They’re Good: Reduce noise and detail, making feature extraction more robust by focusing on larger-scale patterns rather than tiny irrelevant details.
- Typical Use: Preprocessing, denoising, sometimes within networks before strong feature detection.
- Average Blur: Each value is replaced by the average inside the kernel window.
- Gaussian Blur: Weights values by a Gaussian function (both kernels are sketched in code after this list).
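A minimal sketch of how these two blur kernels are typically built (the kernel size and sigma below are arbitrary illustrative defaults):

```python
import numpy as np

def box_kernel(size=3):
    """Average (box) blur: all weights equal and summing to 1."""
    return np.full((size, size), 1.0 / (size * size))

def gaussian_kernel(size=5, sigma=1.0):
    """Gaussian blur: weights follow a 2-D Gaussian, normalized to sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return kernel / kernel.sum()
```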
3. Sharpening Kernels
- Examples: Laplacian, Unsharp Mask
- Why They’re Good: Emphasize transitions in intensity, enhancing details so the model can pick out fine structure (a small example follows this list).
- Typical Use: Image enhancement, sometimes as part of feature engineering.
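A small sketch of sharpening with a common Laplacian-style kernel, assuming an 8-bit grayscale NumPy array:

```python
import numpy as np
from scipy.signal import convolve2d

# A common Laplacian-based sharpening kernel (center-weighted variant).
sharpen_kernel = np.array([[ 0, -1,  0],
                           [-1,  5, -1],
                           [ 0, -1,  0]])

def sharpen_image(image):
    """Emphasize intensity transitions; clipping keeps values in 0..255."""
    out = convolve2d(image, sharpen_kernel, mode="same", boundary="symm")
    return np.clip(out, 0, 255)
```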
4. Emboss/Outline Kernels
- Why They’re Good: Emphasize specific patterns (e.g., outlines, textures), helpful for texture analysis, stylization, and making geometric features stand out.
5. Learned Kernels in CNNs
- Why They’re Good: During training, CNNs automatically learn the kernels/filters best suited for the task:
- Early layers: tend to learn edge/texture-like filters (as above).
- Deeper layers: learn more complex patterns, such as parts of objects, motifs, textures, or even whole objects.
6. Specialized Kernels
- Dilated Kernels: Increase receptive field without increasing computation, good for context aggregation (e.g., semantic segmentation).
- Depthwise Separable Kernels: Used in efficient architectures (e.g., MobileNet), separate spatial and depth-wise filtering to save computation.
- Grouped Convolutions: Split feature processing across groups of channels; used in ResNeXt and AlexNet. (A parameter-count sketch for these specialized kernels follows this list.)
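The efficiency claims above can be sanity-checked with simple arithmetic. The sketch below compares weight counts for a standard versus a depthwise separable convolution and computes the effective extent of a dilated kernel; the layer sizes are hypothetical:

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise (k x k per input channel) followed by pointwise (1 x 1)."""
    return k * k * c_in + c_in * c_out

def dilated_extent(k, dilation):
    """Spatial extent covered by a k x k kernel at a given dilation rate;
    the number of weights stays k*k."""
    return dilation * (k - 1) + 1

# Hypothetical layer sizes:
print(standard_conv_params(3, 128, 256))        # 294912
print(depthwise_separable_params(3, 128, 256))  # 33920  (~8.7x fewer weights)
print(dilated_extent(3, 2))                     # 5
```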
Summary Table:
| Kernel Type | Typical Purpose | Example Layer/Use |
|---|---|---|
| Edge Detection | Outline objects/features | Early feature extraction |
| Blur (Smoothing) | Noise reduction, abstraction | Preprocessing, denoising |
| Sharpening | Feature enhancement | Enhancement, texture extraction |
| Emboss/Outline | Make outlines/textures stand out | Texture/style analysis |
| Dilated | Large context with fewer parameters | Segmentation, ASPP modules |
| Depthwise/Grouped | Computational efficiency | Mobile/efficient CNNs |
| Learned (Generic) | Task-adaptive (via training) | All CNN layers (esp. deep ones) |
In summary:
- Early CNN layers often end up learning kernels similar to classic edge or blur kernels, because these are useful primitives for interpreting raw visual data.
- Deeper layers learn more complex, task-specific kernels.
- Kernel choice/learning enables CNNs to adapt to different tasks in computer vision—object recognition, segmentation, style transfer, etc.