Digital Image Processing

Chapter 1: Introduction to Digital Image Processing

This chapter defines digital image processing as the use of computers to manipulate images, tracing its origins to early space missions (e.g., Ranger 7’s lunar images) and medical applications like CT scans. It categorizes processes into low-level (e.g., noise reduction), mid-level (e.g., object segmentation), and high-level (e.g., computer vision). Examples span astronomy (e.g., Cygnus Loop in gamma rays), medical imaging (PET scans), and industrial inspection, illustrating the field’s breadth. The chapter emphasizes the interplay of hardware (e.g., CCD sensors) and software, with key applications in enhancing, restoring, and analyzing images for human and machine interpretation.

Chapter 2: Digital Image Fundamentals

Focused on the physics and math behind digital images, this chapter explains how images are formed as the product of illumination i(x, y) and reflectance r(x, y), and how human vision perceives brightness logarithmically, with rods handling low-light vision and cones handling color. Sampling (spatial discretization) and quantization (intensity discretization) convert continuous images to digital form; reducing resolution from 930 to 72 dpi blurs details, while insufficient intensity levels (e.g., fewer than 16) cause “false contouring.” Interpolation techniques like bilinear resampling improve zoom quality. The chapter also introduces pixel adjacency (4-, 8-, and m-connectivity) and distance metrics (e.g., Euclidean, city-block), essential for analyzing pixel relationships.
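
The false-contouring effect is easy to reproduce: requantizing an 8-bit image to a handful of levels makes smooth gradients break into visible bands. A minimal sketch (illustrative; the function name `requantize` is made up here):

```python
import numpy as np

def requantize(img, levels):
    """Reduce a uint8 grayscale image to `levels` equally spaced
    intensity levels; with levels < 16, false contouring appears."""
    step = 256 // levels
    return ((img // step) * step).astype(np.uint8)
```

Calling `requantize(img, 8)` on any smoothly shaded image shows the banding the chapter describes.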

Chapter 3: Intensity Transformations and Spatial Filtering

This chapter explores techniques to enhance image appearance. Histogram equalization automatically spreads intensity levels to maximize contrast, as seen in improving low-contrast medical images. Spatial filtering uses kernels: Gaussian kernels (e.g., 43x43 with σ=7) blur images smoothly, while median filters (e.g., 7x7) reduce salt-and-pepper noise better than linear filters. The Laplacian operator enhances edges by highlighting intensity transitions, and unsharp masking (subtracting a blurred version from the original to form a mask that is added back) sharpens details. Examples include enhancing X-ray contrast and restoring faded historical photos, demonstrating how transformations and filters address specific visual limitations.
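
Histogram equalization amounts to remapping each pixel through the image's own cumulative distribution. A minimal NumPy sketch, assuming an 8-bit grayscale array (illustrative, not the book's implementation):

```python
import numpy as np

def equalize(img):
    """Global histogram equalization for a uint8 grayscale image."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum() / img.size            # cumulative distribution
    lut = np.round(255 * cdf).astype(np.uint8)
    return lut[img]                           # map each pixel through the CDF
```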

Chapter 4: Filtering in the Frequency Domain

Bridging spatial and frequency domains, this chapter explains how the Fourier transform decomposes images into frequency components. Lowpass filters (e.g., Gaussian) smooth images by attenuating high frequencies, while highpass filters emphasize edges. For instance, the sharp cutoff of an ideal lowpass filter creates ringing artifacts, whereas a smooth transfer function such as the Gaussian avoids them. The Fast Fourier Transform (FFT) accelerates computations, enabling applications like removing periodic noise from satellite imagery. The chapter highlights duality: convolution in the spatial domain corresponds to multiplication in the frequency domain, and vice versa, which is crucial for tasks like image compression and restoration.
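
A sketch of Gaussian lowpass filtering in the frequency domain with NumPy's FFT (illustrative; `d0` is an assumed cutoff parameter, and careful use would zero-pad the image first to avoid wraparound error):

```python
import numpy as np

def gaussian_lowpass(img, d0=30.0):
    """Attenuate high frequencies with a Gaussian transfer function
    of cutoff d0 (in frequency-plane pixels)."""
    F = np.fft.fftshift(np.fft.fft2(img))        # center the spectrum
    rows, cols = img.shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2       # squared distance from center
    H = np.exp(-D2 / (2 * d0 ** 2))              # Gaussian transfer function
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * H)))
```

Replacing `H` with `1 - H` turns the same routine into a Gaussian highpass filter, illustrating the lowpass/highpass relationship.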

Chapter 5: Image Restoration and Reconstruction

Focused on reversing degradation, this chapter models image blur and noise (e.g., Gaussian noise, motion blur). Wiener filtering minimizes mean squared error by balancing noise suppression against detail preservation, while inverse filtering requires estimating the degradation function (e.g., from a blurry photo). Reconstruction from projections (e.g., in CT scanning) uses filtered back-projection to synthesize cross-sectional images. One example restores a degraded X-ray by modeling the degradation and applying inverse filtering. The chapter also discusses noise reduction via image averaging, showing that averaging 50 noisy images reduces the noise variance to 1/50 of that of a single image, improving clarity in low-light astronomical images.
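
The 1/50 variance claim can be checked numerically; a toy sketch with synthetic Gaussian noise (the scene and noise parameters are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
clean = np.full((128, 128), 100.0)                     # flat synthetic scene
noisy = clean + rng.normal(0.0, 20.0, (50, 128, 128))  # 50 noisy captures
average = noisy.mean(axis=0)

print(noisy[0].var())   # ~400, i.e., sigma^2
print(average.var())    # ~8,   i.e., sigma^2 / 50
```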

Chapter 6: Color Image Processing

This chapter delves into color models (RGB, CMYK, HSI) and their applications. Pseudocolor mapping assigns colors to grayscale images for enhanced interpretation, as in MRI scans where colors highlight tissues. Color segmentation isolates objects by hue, while color correction adjusts for lighting biases. For example, false-color satellite images distinguish vegetation (near-infrared) from water. The chapter also covers color filtering and compression, noting that an RGB image decomposes into three 8-bit channels, each of which can be processed separately for tasks like noise reduction or edge detection.
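
A sketch of pseudocolor via intensity slicing, one of the simpler mapping schemes; the color table passed in is an arbitrary example:

```python
import numpy as np

def intensity_slice(gray, colors):
    """Map a uint8 grayscale image through a small color table:
    intensities are sliced into len(colors) equal bands."""
    colors = np.asarray(colors, dtype=np.uint8)     # (k, 3) RGB triples
    band = (gray.astype(int) * len(colors)) // 256  # band index per pixel
    return colors[band]                             # (H, W, 3) RGB image

# `scan` is a hypothetical uint8 image; four bands from blue to red
# rgb = intensity_slice(scan, [(0, 0, 128), (0, 128, 0), (255, 255, 0), (255, 0, 0)])
```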

Chapter 7: Wavelet and Other Image Transforms

Wavelet transforms offer a powerful approach in image processing. They divide an image into different scales and orientations, providing a multiresolution analysis. The Haar wavelet, as a basic example, decomposes an image into four sub-images: an approximation sub-image and three detail sub-images (horizontal, vertical, and diagonal). This decomposition allows for efficient compression, as most of the image’s energy is concentrated in the approximation sub-image. For instance, in a medical X-ray image, wavelet compression can reduce the file size significantly while maintaining important diagnostic details.

The discrete wavelet transform (DWT) is a key concept here. It can be computed using a filter bank approach, where highpass and lowpass filters are applied to the image data. Compared to the Fourier transform, the DWT is better at capturing local features, making it suitable for edge detection and texture analysis. For example, in a satellite image of a forest, wavelet analysis can accurately identify the boundaries of different tree clusters and the unique textures of the forest canopy.
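
A minimal single-level 2-D Haar decomposition, written as pairwise averaging and differencing, which is exactly the Haar analysis filter bank (sub-band naming conventions vary; this sketch is unnormalized):

```python
import numpy as np

def haar_level(img):
    """One level of the 2-D Haar DWT; dimensions must be even."""
    lo = (img[:, ::2] + img[:, 1::2]) / 2   # lowpass along columns
    hi = (img[:, ::2] - img[:, 1::2]) / 2   # highpass along columns
    LL = (lo[::2] + lo[1::2]) / 2           # approximation sub-image
    LH = (lo[::2] - lo[1::2]) / 2           # horizontal detail
    HL = (hi[::2] + hi[1::2]) / 2           # vertical detail
    HH = (hi[::2] - hi[1::2]) / 2           # diagonal detail
    return LL, LH, HL, HH
```

Recursing on `LL` yields the multiresolution pyramid; most of the energy stays in `LL`, which is why discarding small detail coefficients compresses well.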

Chapter 8: Image Compression

Image compression is crucial for efficient storage and transmission of images. There are two main types: lossless and lossy compression. Lossless compression, such as Huffman coding and Lempel-Ziv-Welch (LZW) coding, reduces redundancy in the image data without losing any information. For example, in a barcode image, lossless compression ensures that the barcode can be accurately decoded after decompression.
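
A compact Huffman code builder, as a sketch of the idea (not a full image codec; it only derives the variable-length code table from symbol frequencies):

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Build a Huffman code table for the symbols in `data`
    (assumes at least two distinct symbols)."""
    heap = [[weight, [symbol, ""]] for symbol, weight in Counter(data).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)             # two least frequent subtrees
        hi = heapq.heappop(heap)
        for pair in lo[1:]:
            pair[1] = "0" + pair[1]          # prepend a bit to every leaf
        for pair in hi[1:]:
            pair[1] = "1" + pair[1]
        heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
    return dict(heap[0][1:])

print(huffman_codes("AAAABBBCCD"))  # more frequent symbols get shorter codes
```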

Lossy compression, on the other hand, sacrifices some information to achieve higher compression ratios. The best-known lossy compression standard is JPEG, which uses the discrete cosine transform (DCT). In JPEG, the image is divided into 8x8 blocks and the DCT is applied to each block. The coefficients are then quantized, which discards most of the high-frequency information and reduces the data size. For a large landscape photo, JPEG compression can reduce the file size by up to 90% while still maintaining acceptable visual quality.
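
A sketch of the per-block transform-and-quantize step using SciPy's DCT; a single scalar step size `q` stands in for the 8x8 quantization table real JPEG uses:

```python
import numpy as np
from scipy.fft import dctn, idctn

def jpeg_like_block(block, q=16):
    """Quantize one 8x8 block in the DCT domain and reconstruct it."""
    shifted = block.astype(float) - 128           # level shift, as in JPEG
    coeffs = dctn(shifted, norm="ortho")          # 2-D DCT of the block
    quantized = np.round(coeffs / q)              # the lossy step
    return idctn(quantized * q, norm="ortho") + 128
```

Applying this to every 8x8 tile and varying `q` trades file size against visual quality, which is essentially JPEG's quality knob.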

Chapter 9: Morphological Image Processing

Morphological operations are used to analyze and process the shape and structure of objects in an image. Erosion and dilation are the two fundamental operations. Erosion shrinks the objects in an image, while dilation expands them. For example, in a binary image of a fingerprint, erosion can be used to remove small spurious lines, and dilation can be used to fill in small gaps.

Opening and closing are derived operations. Opening is erosion followed by dilation, which is useful for removing small objects and smoothing the boundaries of larger objects. Closing is dilation followed by erosion, which can fill in small holes and connect broken parts of an object. In a satellite image of a city, morphological operations can be used to separate buildings from roads and to clean up the image for further analysis.
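
A sketch of all four operations on binary (0/1) uint8 arrays, using a square structuring element and the erosion/dilation duality (illustrative, not optimized):

```python
import numpy as np

def dilate(img, k=3):
    """Binary dilation with a k x k square structuring element."""
    r = k // 2
    pad = np.pad(img, r)
    out = np.zeros_like(img)
    for dr in range(-r, r + 1):                  # OR together all shifts
        for dc in range(-r, r + 1):
            out |= pad[r + dr:r + dr + img.shape[0],
                       r + dc:r + dc + img.shape[1]]
    return out

def erode(img, k=3):
    # erosion of a set = dilation of its complement (duality, symmetric SE)
    return 1 - dilate(1 - img, k)

def opening(img, k=3):                           # erosion then dilation
    return dilate(erode(img, k), k)

def closing(img, k=3):                           # dilation then erosion
    return erode(dilate(img, k), k)
```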

Chapter 10: Image Segmentation

Image segmentation is the process of dividing an image into multiple regions or objects. There are several segmentation methods, including thresholding, edge-based segmentation, and region-based segmentation. Thresholding is a simple and widely used method that separates an image into foreground and background based on pixel intensities. For example, in a medical image of a tumor, a suitable threshold can be used to isolate the tumor from the surrounding tissue.
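
A "suitable threshold" can be chosen automatically; a sketch of Otsu's method, which picks the threshold that maximizes the between-class variance of the resulting foreground and background:

```python
import numpy as np

def otsu_threshold(img):
    """Threshold maximizing between-class variance (uint8 input)."""
    p = np.bincount(img.ravel(), minlength=256) / img.size
    w = np.cumsum(p)                          # probability of class 1
    m = np.cumsum(p * np.arange(256))         # cumulative mean
    mg = m[-1]                                # global mean intensity
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mg * w - m) ** 2 / (w * (1 - w))
    return int(np.argmax(np.nan_to_num(sigma_b)))

# mask = img > otsu_threshold(img)
```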

Edge-based segmentation detects the boundaries of objects in an image using edge detectors such as the Sobel or Canny detectors. In an image of a car, edge-based segmentation can identify the outline of the car. Region-based segmentation groups pixels into regions based on their similarity in color, texture, or intensity. In a natural scene, region-based segmentation can group all the grass pixels together and all the sky pixels together.
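
A sketch of the Sobel gradient magnitude (correlation rather than convolution is used; the sign flip does not change the magnitude):

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude from the two 3x3 Sobel kernels."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    pad = np.pad(img.astype(float), 1, mode="edge")
    gx = np.zeros(img.shape)
    gy = np.zeros(img.shape)
    for r in range(3):                       # accumulate the 3x3 window
        for c in range(3):
            win = pad[r:r + img.shape[0], c:c + img.shape[1]]
            gx += kx[r, c] * win
            gy += ky[r, c] * win
    return np.hypot(gx, gy)
```

Thresholding this magnitude gives a basic edge map; Canny adds Gaussian smoothing, nonmaxima suppression, and hysteresis thresholding on top of the same gradient.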

Chapter 11: Feature Extraction

This chapter focuses on deriving descriptive attributes from segmented images to enable object analysis and classification. It begins by distinguishing between boundary and region features. Boundary descriptors, such as chain codes and Fourier descriptors, capture shape characteristics; for example, Fourier descriptors can reduce a complex shape to a few coefficients, allowing recognition of handwritten digits by encoding their contour shapes. Region features include texture (measured via co-occurrence matrices, which quantify pixel spatial relationships) and color histograms, which are vital for differentiating materials in satellite imagery or cancerous tissues in medical scans.
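
A sketch of a gray-level co-occurrence matrix and one common descriptor derived from it (contrast); the offset and level count are parameters:

```python
import numpy as np

def cooccurrence(img, dr=0, dc=1, levels=256):
    """Co-occurrence matrix for offset (dr, dc), with dr, dc >= 0;
    entry (i, j) counts pixel pairs with intensities i and j."""
    a = img[:img.shape[0] - dr, :img.shape[1] - dc].ravel()
    b = img[dr:, dc:].ravel()
    G = np.zeros((levels, levels), dtype=np.int64)
    np.add.at(G, (a, b), 1)                  # accumulate pair counts
    return G

def contrast(G):
    """Texture contrast: large when co-occurring intensities differ."""
    i, j = np.indices(G.shape)
    return ((i - j) ** 2 * G).sum() / G.sum()
```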

A key concept is the Scale-Invariant Feature Transform (SIFT), which detects keypoints invariant to scale and rotation. For instance, in a wildlife photo, SIFT can identify a bird regardless of its size or orientation by extracting local gradient patterns. The chapter also introduces principal component analysis (PCA) for dimensionality reduction, allowing efficient representation of high-dimensional feature vectors, which is critical for handling large datasets like facial recognition databases.
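
A minimal PCA sketch via SVD of the centered data matrix (illustrative, not tied to any particular dataset):

```python
import numpy as np

def pca_project(X, k):
    """Project feature vectors X (n_samples x n_features) onto the
    top-k principal components."""
    mean = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    components = Vt[:k]                      # top-k directions of variance
    return (X - mean) @ components.T, components, mean
```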

Chapter 12: Image Pattern Classification

This chapter explores algorithms to assign labels to image regions or objects based on extracted features. Bayesian classifiers use probability theory to minimize misclassification error, often assuming Gaussian distributions of feature vectors. For example, in a medical diagnostic system, a Bayesian classifier can distinguish benign from malignant tumors by analyzing texture and shape features.
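
A minimal Gaussian Bayes classifier sketch, assuming one multivariate Gaussian per class with several samples each (the small diagonal term is an assumed regularizer):

```python
import numpy as np

class GaussianBayes:
    """Bayes classifier with a Gaussian class-conditional density per class."""

    def fit(self, X, y):
        self.classes = np.unique(y)
        self.params = []
        for c in self.classes:
            Xc = X[y == c]
            mu = Xc.mean(axis=0)
            cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1])
            self.params.append((mu, np.linalg.inv(cov),
                                np.log(np.linalg.det(cov)),
                                np.log(len(Xc) / len(X))))
        return self

    def predict(self, X):
        scores = []
        for mu, inv, logdet, logprior in self.params:
            d = X - mu
            # log Gaussian density (up to a constant) plus log prior
            scores.append(-0.5 * np.einsum("ij,jk,ik->i", d, inv, d)
                          - 0.5 * logdet + logprior)
        return self.classes[np.argmax(scores, axis=0)]
```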

Neural networks and deep learning are central here. Traditional neural networks with fully connected layers learn nonlinear feature relationships, while deep convolutional neural networks (CNNs) excel at image tasks by leveraging convolutional layers to detect hierarchical features (e.g., edges → textures → objects). For instance, a CNN trained on satellite imagery can classify land use (agricultural, urban) by learning to recognize patterns in pixel grids . The chapter emphasizes backpropagation for error correction and demonstrates applications in handwritten digit recognition and autonomous vehicle navigation, highlighting the power of data-driven learning in image understanding.
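
A full CNN is too large to sketch here, but the backpropagation mechanics can be shown with a tiny fully connected network on synthetic 2-D "features" (everything below is an illustrative toy, not the book's code):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                       # toy 2-D feature vectors
y = (X[:, 0] * X[:, 1] > 0).astype(float)[:, None]  # XOR-like class labels

W1 = rng.normal(scale=0.5, size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for step in range(5000):
    h = sigmoid(X @ W1 + b1)                 # forward: hidden layer
    p = sigmoid(h @ W2 + b2)                 # forward: class probability
    dz2 = (p - y) / len(X)                   # grad of cross-entropy w.r.t. logit
    dW2 = h.T @ dz2; db2 = dz2.sum(0)
    dz1 = dz2 @ W2.T * h * (1 - h)           # chain rule through the sigmoid
    dW1 = X.T @ dz1; db1 = dz1.sum(0)
    for param, grad in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        param -= 1.0 * grad                  # gradient-descent update

print("training accuracy:", ((p > 0.5) == y).mean())
```

A CNN replaces the fully connected first layer with small shared convolution kernels, but the same backward pass of gradients drives the learning.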

Throughout, the book underscores the pipeline from raw pixels to actionable insights—from low-level processing (Chapter 2) to high-level cognition (Chapter 12), unified by mathematical rigor and practical applications.