Digital Image¶
What is a Digital Image?¶
A digital image is a 2D function \(I(x, y)\) mapping pixel coordinates to intensity values, making images computable entities that can be stored, processed, and analyzed as matrices.
Digital Image Formats¶
Raster image formats store a grid of colored dots (pixels). The number of bits per pixel determines color depth:
1 bpp: 2 colors (binary — black or white)
4 bpp: 16 colors
8 bpp: 256 colors
24 bpp: 8 bits per channel (RGB) — 16,777,216 colors
32 bpp: 24-bit color + 8-bit alpha channel
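The color counts above follow from the fact that \(n\) bits can encode \(2^n\) distinct values. A minimal sketch that reproduces them (the helper name `num_colors` is illustrative, not from any library):

```python
def num_colors(bits_per_pixel: int) -> int:
    """Number of distinct values representable with the given bit depth."""
    return 2 ** bits_per_pixel


# Print the color count for each depth listed above.
for bpp in (1, 4, 8, 24):
    print(f"{bpp:>2} bpp: {num_colors(bpp):,} colors")
```

At 32 bpp the extra 8 bits carry transparency rather than color, so the number of distinct colors stays at the 24-bit figure.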
Common formats: GIF, JPG, PPM, TIF, BMP, camera RAW.
Tools: OpenCV/Python, MATLAB/Octave, Processing.org
Point Processes¶
Point processes operate on individual pixels independently — the output at \((x, y)\) depends only on the input at \((x, y)\).
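As a rough illustration, a point process can be sketched as a function applied to each pixel in isolation. The example below uses plain nested lists as a stand-in for an 8-bit grayscale image; the helper names (`apply_point_op`, `negative`, `brighten`) and the offset of 40 are assumptions made for the example.

```python
def apply_point_op(image, op):
    """Apply op to every pixel; the output at (x, y) uses only the input at (x, y)."""
    return [[op(pixel) for pixel in row] for row in image]


def negative(pixel):
    """Invert an 8-bit intensity."""
    return 255 - pixel


def brighten(pixel, offset=40):
    """Add a constant offset, clamping at the 8-bit maximum."""
    return min(255, pixel + offset)


image = [
    [0, 100],
    [200, 255],
]
inverted = apply_point_op(image, negative)   # [[255, 155], [55, 0]]
brighter = apply_point_op(image, brighten)   # [[40, 140], [240, 255]]
```

In practice the same operations would usually be written as whole-array expressions in NumPy or OpenCV; the loop form is used here only to make the per-pixel independence explicit.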
Operations on Images¶
Add/Subtract: Combine or difference two images pixel-wise
Alpha blending: Weighted combination of images using transparency parameter \(\alpha \in [0, 1]\)
\(\alpha = 0\) → invisible, \(\alpha = 1\) → fully visible
With premultiplied alpha, the stored color \((R, G, B)\) becomes \((\alpha R, \alpha G, \alpha B)\)
Blend: \(I_{out} = \alpha \cdot I_A + (1 - \alpha) \cdot I_B\)
Image histograms: Distribution of pixel intensity values; useful for contrast analysis and equalization
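The blend formula and the histogram can be sketched in a few lines. This is a minimal example on nested lists of 8-bit grayscale values, assuming both images have the same size; the function names and the rounding to integers are choices made for the example.

```python
def alpha_blend(img_a, img_b, alpha):
    """Pixel-wise I_out = alpha * I_A + (1 - alpha) * I_B, rounded to integers."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [
        [round(alpha * a + (1 - alpha) * b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(img_a, img_b)
    ]


def histogram(image, levels=256):
    """Count how many pixels take each intensity value in 0..levels-1."""
    counts = [0] * levels
    for row in image:
        for pixel in row:
            counts[pixel] += 1
    return counts


img_a = [[200, 200], [0, 0]]
img_b = [[100, 0], [100, 0]]

blended = alpha_blend(img_a, img_b, 0.25)   # [[125, 50], [75, 0]]
hist = histogram(blended)                    # one pixel each at 0, 50, 75, 125
```

With \(\alpha = 1\) the result is image A unchanged, and with \(\alpha = 0\) it is image B, which matches the "fully visible" and "invisible" cases above.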
Blending Modes¶
Blending modes define how two layers of pixels are combined:
Arithmetic modes:
Average: \(f(a, b) = (a + b) / 2\)
Normal: \(f(a, b) = b\)
Addition: Tends to produce whites (overexposure)
Subtraction: Tends to produce blacks (underexposure)
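Difference: Absolute difference, \(f(a, b) = |a - b|\), so the result is never negative
Divide: Divides one layer by the other; brightens because normalized channel values are at most 1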
Darken: \(f(a, b) = \min(a, b)\) per channel
Lighten: \(f(a, b) = \max(a, b)\) per channel
Advanced modes:
Multiply: Darkens — \(f(a, b) = a \cdot b\)
Screen: Brightens — \(f(a, b) = 1 - (1 - a)(1 - b)\)
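Several of these modes can be sketched as per-channel functions on values normalized to \([0, 1]\), with \(a\) the lower layer and \(b\) the upper one. The clamping used for addition and subtraction and the absolute-value form of difference are common conventions assumed here; individual image editors may define them differently.

```python
# Per-channel blend functions on normalized values in [0, 1].
BLEND_MODES = {
    "average": lambda a, b: (a + b) / 2,
    "normal": lambda a, b: b,
    "addition": lambda a, b: min(1.0, a + b),      # clamps toward white
    "subtraction": lambda a, b: max(0.0, a - b),   # clamps toward black
    "difference": lambda a, b: abs(a - b),
    "darken": min,
    "lighten": max,
    "multiply": lambda a, b: a * b,
    "screen": lambda a, b: 1 - (1 - a) * (1 - b),
}


def blend(layer_a, layer_b, mode):
    """Combine two equally sized layers pixel by pixel with the named mode."""
    f = BLEND_MODES[mode]
    return [
        [f(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]


a = [[0.5, 0.75]]
b = [[0.25, 0.5]]
multiplied = blend(a, b, "multiply")   # [[0.125, 0.375]]
screened = blend(a, b, "screen")       # [[0.625, 0.875]]
```

Multiply never exceeds the darker of the two inputs and screen never falls below the brighter one, which is why they darken and brighten respectively.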
Smoothing¶
Smoothing reduces noise by averaging pixel values over a neighborhood.
Box filter (averaging): Replaces each pixel with the uniform average of its kernel neighborhood (e.g., 21×21).
Gaussian filter: Weights neighbors by a Gaussian distribution — closer pixels contribute more. Produces smoother results than box filtering.
Median filtering: A non-linear operation that replaces each pixel with the median of all pixels in the kernel area.
Reduces noise effectively
Preserves edges (sharp lines) — unlike averaging filters which blur edges
Main idea: use median instead of mean
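The difference between mean and median filtering shows up on two small cases: an isolated outlier and a step edge. The sketch below uses a 3×3 window and copies border pixels unchanged, which is a simplification chosen for the example; `filter_3x3` and `mean` are illustrative helpers.

```python
from statistics import median


def filter_3x3(image, reducer):
    """Apply reducer to each 3x3 neighborhood; border pixels are copied unchanged."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [
                image[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = reducer(window)
    return out


def mean(values):
    return sum(values) / len(values)


# Flat patch with a single "salt" outlier in the center.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
median_center = filter_3x3(noisy, median)[1][1]   # 10: the outlier is removed
mean_center = filter_3x3(noisy, mean)[1][1]       # about 37.2: the outlier is smeared

# Vertical step edge between columns 1 and 2.
edge = [
    [0, 0, 100, 100],
    [0, 0, 100, 100],
    [0, 0, 100, 100],
]
median_row = filter_3x3(edge, median)[1]   # [0, 0, 100, 100]: edge stays sharp
mean_row = filter_3x3(edge, mean)[1]       # interior values near 33.3 and 66.7
```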
Convolution and Cross-Correlation¶
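Cross-correlation: Sliding dot product of a kernel \(h\) over an image \(F\). For a \((2k+1) \times (2k+1)\) kernel:
\(G[i, j] = \sum_{u=-k}^{k} \sum_{v=-k}^{k} h[u, v] \, F[i + u, j + v]\)
Denoted \(G = h \otimes F\).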
Replaces each pixel with a linear combination of its neighbors
The kernel \(h[u, v]\) specifies the weights
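Convolution: Same as cross-correlation but with the kernel flipped both horizontally and vertically, commonly denoted \(G = h * F\). For kernels that are symmetric under this flip, such as the box and Gaussian filters, cross-correlation and convolution produce identical results.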
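The relationship between the two operations is easiest to see on an impulse image (a single 1 surrounded by zeros). The sketch below is one possible implementation under simplifying assumptions: the kernel is indexed from its top-left corner rather than its center, and only the "valid" region where the kernel fits entirely inside the image is computed.

```python
def cross_correlate(image, kernel):
    """Valid-mode cross-correlation: G[i][j] = sum over u, v of h[u][v] * F[i+u][j+v]."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                kernel[u][v] * image[i + u][j + v]
                for u in range(kh)
                for v in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def flip(kernel):
    """Flip a kernel both vertically and horizontally (a 180 degree rotation)."""
    return [row[::-1] for row in kernel[::-1]]


def convolve(image, kernel):
    """Convolution expressed as cross-correlation with the flipped kernel."""
    return cross_correlate(image, flip(kernel))


# 5x5 impulse image: a single 1 at the center.
impulse = [[0] * 5 for _ in range(5)]
impulse[2][2] = 1

asymmetric = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

conv_out = convolve(impulse, asymmetric)         # reproduces the kernel
corr_out = cross_correlate(impulse, asymmetric)  # reproduces the flipped kernel
```

Convolving with an impulse returns a copy of the kernel, while cross-correlating returns the flipped copy. For the symmetric `box` kernel the two results coincide.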
Common filters:
Box filter: Uniform weights (e.g., 21×21) — averaging
Gaussian filter: Weights follow normal distribution — smoother falloff
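The two kernels can be built explicitly to compare their weights. This is a minimal sketch: the Gaussian is sampled at integer offsets and then normalized so the weights sum to 1, which is one common construction; the parameter names `size` and `sigma` are chosen for the example.

```python
import math


def box_kernel(size):
    """size x size kernel with uniform weights that sum to 1."""
    weight = 1.0 / (size * size)
    return [[weight] * size for _ in range(size)]


def gaussian_kernel(size, sigma):
    """Sampled 2D Gaussian on an odd-sized grid, normalized to sum to 1."""
    center = size // 2
    raw = [
        [
            math.exp(-((x - center) ** 2 + (y - center) ** 2) / (2 * sigma ** 2))
            for x in range(size)
        ]
        for y in range(size)
    ]
    total = sum(map(sum, raw))
    return [[value / total for value in row] for row in raw]


box = box_kernel(5)
gauss = gaussian_kernel(5, sigma=1.0)

# The box kernel weights every neighbor equally; the Gaussian peaks at the
# center and falls off with distance.
print(f"box center {box[2][2]:.4f}, corner {box[0][0]:.4f}")
print(f"gaussian center {gauss[2][2]:.4f}, corner {gauss[0][0]:.4f}")
```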
Gradients and Edge Detection¶
Edges are locations of rapid change in the image intensity function \(F(x, y)\). When the image is viewed as a 3D height map, with intensity as height, edges appear as steep cliffs.
Discontinuities arise from changes in:
Surface normal
Depth
Surface color
Illumination
Edge detection approach:
Look for neighborhoods with strong signs of change
Considerations: neighborhood size, change metric, threshold
Compute gradients (discrete derivatives) using kernels
Gradient kernels (operators for computing discrete derivatives):
Prewitt: Equal-weight gradient approximation
Sobel: Weighted gradient approximation (emphasizes center row/column)
Roberts: 2×2 diagonal difference operator
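To make the kernels concrete, the sketch below applies the Sobel pair at one pixel of a vertical step edge and derives the gradient magnitude and direction. The sign and orientation conventions for these kernels vary between references, so the matrices shown are one common choice; `response` and `gradient` are illustrative helpers.

```python
import math

# One common convention: x increases to the right, y increases downward.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]
PREWITT_X = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]


def response(image, kernel, y, x):
    """Cross-correlate a 3x3 kernel with the neighborhood centered at (y, x)."""
    return sum(
        kernel[dy + 1][dx + 1] * image[y + dy][x + dx]
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    )


def gradient(image, y, x):
    """Return (magnitude, direction in radians) of the Sobel gradient at (y, x)."""
    gx = response(image, SOBEL_X, y, x)
    gy = response(image, SOBEL_Y, y, x)
    return math.hypot(gx, gy), math.atan2(gy, gx)


# Vertical step edge: dark on the left, bright on the right.
edge = [
    [0, 0, 100, 100],
    [0, 0, 100, 100],
    [0, 0, 100, 100],
]
magnitude, direction = gradient(edge, 1, 1)    # 400.0 and 0.0 (pointing in +x)
prewitt_gx = response(edge, PREWITT_X, 1, 1)   # 300
```

The Sobel response is larger than the Prewitt response on the same edge because Sobel doubles the weight of the center row.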
Canny Edge Detector: A widely used multi-stage edge detection algorithm:
Smooth with Gaussian filter
Compute gradient magnitude and direction
Non-maximum suppression (thin edges)
Hysteresis thresholding (strong/weak edge linking)
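The final stage is the least obvious of the four, so here is a rough sketch of hysteresis thresholding on its own. It assumes the input is a gradient-magnitude grid that has already been through non-maximum suppression, uses 8-connectivity, and treats values at or above a threshold as passing it; these details and the function name `hysteresis` are assumptions for the example, not a reference implementation.

```python
from collections import deque


def hysteresis(magnitude, low, high):
    """Keep strong pixels (>= high) plus weak pixels (>= low) connected to them."""
    height, width = len(magnitude), len(magnitude[0])
    edges = [[False] * width for _ in range(height)]
    queue = deque()

    # Seed the search with every strong pixel.
    for y in range(height):
        for x in range(width):
            if magnitude[y][x] >= high:
                edges[y][x] = True
                queue.append((y, x))

    # Grow outward through 8-connected pixels that pass the low threshold.
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (
                    0 <= ny < height
                    and 0 <= nx < width
                    and not edges[ny][nx]
                    and magnitude[ny][nx] >= low
                ):
                    edges[ny][nx] = True
                    queue.append((ny, nx))
    return edges


magnitude = [
    [0, 0, 0, 0, 0],
    [90, 50, 40, 0, 45],
    [0, 0, 0, 0, 0],
]
result = hysteresis(magnitude, low=30, high=80)
```

In the middle row, the pixels with magnitude 50 and 40 are kept because they chain back to the strong pixel at 90, while the isolated weak pixel at 45 is discarded.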