Image Segmentation Explained

Image segmentation is a method in which a digital image is broken into various subgroups called image segments, which help reduce the complexity of the image to make processing or analysis of the image simpler. In other words, segmentation involves assigning labels to pixels. All picture elements or pixels belonging to the same category have a common label assigned to them.

An example of different types of image segmentation. | Image: Mrinal Tyagi

For example, let’s look at a problem where the picture has to be provided as input for object detection. Rather than processing the whole image, the detector can be inputted with a region selected by a segmentation algorithm. This will prevent the detector from processing the whole image thereby reducing inference time.

An illustration of how image segmentation works. | Image: Mrinal Tyagi

Approaches in Image Segmentation

There are two common approaches in image segmentation:

Similarity approach: This approach involves detecting similarity between image pixels to form a segment based on a given threshold. Machine learning algorithms like clustering are based on this type of approach to segment an image.
Discontinuity approach: This approach relies on the discontinuity of pixel intensity values of the image. Line, point and edge detection techniques use this type of approach for obtaining intermediate segmentation results that can later be processed to obtain the final segmented image.

More on Machine Learning: Beginner’s Guide to VGG16 Implementation in Keras

Image Segmentation Techniques

There are five common image segmentation techniques.

5 Image Segmentation Techniques to Know

Threshold-based segmentation
Edge-based image segmentation
Region-based image segmentation
Clustering-based image segmentation
Artificial neural network-based segmentation

In this article, we will cover threshold-based, edge-based, region-based and clustering-based image segmentation techniques.

More on Machine Learning: Understanding and Building Neural Network (NN) Models

Threshold-Based Image Segmentation

Image thresholding segmentation is a simple form of image segmentation. It’s a way of creating a binary or multi-color image based on setting a threshold value on the pixel intensity of the original image.

In this thresholding process, we will consider the intensity histogram of all the pixels in the image. Then we will set a threshold to divide the image into sections. For example, consider image pixels ranging from 0 to 255, we’ll set a threshold of 60. So, all the pixels with values less than or equal to 60 will be provided with a value of 0 (black), and all the pixels with a value greater than 60 will be provided with a value of 255 (white).

Consider an image with a background and an object, we can divide an image into regions based on the intensity of the object and the background. But this threshold has to be perfectly set to segment an image into an object and a background.

Various thresholding techniques include:

1. Global Thresholding

In this method, we use a bimodal image. A bimodal image is an image with two peaks of intensities in the intensity distribution plot. One for the object and one for the background. Then, we deduce the threshold value for the entire image and use that global threshold for the whole image. A disadvantage of this type of threshold is that it performs really poorly when there’s poor illumination in the image.

2. Manual Thresholding

The following process goes as follows:

The process for manual thresholding. | Image: Mrinal Tyagi

3. Adaptive Thresholding

To overcome the effect of illumination, the image is divided into various subregions, and each region is segmented using the threshold value calculated for all of them. Then these subregions are combined to image the complete segmented image. This helps reduce the effect of illumination to a certain extent.

4. Optimal Thresholding

Optimal thresholding technique can be used to minimize the misclassification of pixels performed by segmentation.

Optimal thresholding graphs for image segmentation. | Image: Mrinal Tyagi

Calculation of an optimal threshold uses an iterative method by minimizing the misclassification loss of a pixel.

The probability of a pixel value is given by the following formulas :

Probability of a pixel value formula. | Image: Mrinal Tyagi

We consider a threshold value T as the initial threshold value. If the pixel value of the pixel to be determined is less or equal to T, then it belongs to the background, else it belongs to the object.

Threshold value formula. | Image: Mrinal Tyagi

Error if the pixel is misclassified as an object pixel. | Image: Mrinal Tyagi.

E1: Error if a background pixel is misclassified as an object pixel.

E2 error if the object pixel is misclassified as a background pixel. | Image: Mrinal Tyagi

E2: Error if an object pixel is misclassified as a background pixel.

Total error in the classification of the pixels are background or object. | Image: Mrinal Tyagi

E(T): Total error in the classification of pixels as background or object. To obtain great segmentation, we have to minimize this error. We do so by taking the derivative of the following equation and equating it to zero.

Consider a gaussian pixel density, the value of P(Z) can be calculated as:

Gaussian pixel density equation. | Image: Mrinal Tyagi

Standard deviation of density region. | Image: Mrinal Tyagi

Mean of intensity of pixels in region. | Image: Mrinal Tyagi

A priori probability of background and object pixels. | Image: Mrinal Tyagi

After this, the new value of T can be calculated by inputting it into the following equation:

T calculation equation. | Image: Mrinal Tyagi

Calculation for A variable. | Image: Mrinal Tyagi

Calculation for B variable. | Image: Mrinal Tyagi

C variable equation. | Image: Mrinal Tyagi

5. Local Adaptive Thresholding

Due to a variation in the illumination of pixels in the image, global thresholding might have difficulty in segmenting the image. Hence, the image is divided into smaller subgroups and then adaptive thresholding of those individual groups is conducted. After completing the individual segmentation of these subgroups, all of them are combined to form the completed segmented image of the original image. Hence, the histogram of subgroups helps in providing better segmentation of the image.

Practical Implementation of Thresholding-Based Segmentation

Importing libraries

import numpy as np
import cv2
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
     

warnings.filterwarnings("ignore")
     

image = cv2.imread("/content/1.jpg", 0)
     

plt.figure(figsize=(8, 8))
plt.imshow(image, cmap="gray")
plt.axis("off")
plt.show()
     


flattened_image = image.reshape((image.shape[0] * image.shape[1],))
     

flattened_image.shape
     
(6553600,)
PDF of image intensities

plt.figure()
sns.distplot(flattened_image, kde=True)
plt.show()
     

Applying Otsu Thresholding to the image for segmentation

ret, thresh1 = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)   
     

plt.figure(figsize=(8, 8))
plt.imshow(thresh1, cmap="binary")
plt.axis("off")
plt.show()

Edge-Based Image Segmentation

Edge-based segmentation relies on edges found in an image using various edge detection operators. These edges mark image locations of discontinuity in gray levels, color, texture, etc. When we move from one region to another, the gray level may change. So, if we can find that discontinuity, we can find that edge. A variety of edge detection operators are available, but the resulting image is an intermediate segmentation result and should not be confused with the final segmented image. We have to perform further processing on the image to segment it.

Additional steps include combining edges segments obtained into one segment in order to reduce the number of segments rather than chunks of small borders which might hinder the process of region filling. This is done to obtain a seamless border of the object. The goal of edge segmentation is to get an intermediate segmentation result to which we can apply region-based or any other type of segmentation to get the final segmented image.

Types of Edges

Edges are usually associated with magnitude and direction. Some edge detectors give both directions and magnitude. We can use various edge detectors like the Sobel edge operator, canny edge detector, Kirsch edge operator, Prewitt edge operator and Robert’s edge operator, etc.

Different types of edge detectors. | Image: Mrinal Tyagi

Any of the three formulas can be used to calculate the value of g. After calculation of g and theta, we obtain the edge vector, with both magnitudes as well as direction.

Calculating the value of g. | Image: Mrinal Tyagi

Theta calculation. | Image: Mrinal Tyagi

Practical Implementation of Edge-Based Segmentation

Importing Library

import numpy as np
import cv2
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
from skimage import filters
import skimage
     

warnings.filterwarnings("ignore")
     

image = cv2.imread("/content/1.jpg", 0)
     

plt.figure(figsize=(8, 8))
plt.imshow(image)
plt.axis("off")
plt.show()
     

Applying Sobel Edge operator

sobel_image = filters.sobel(image)
     

# cmap while displaying is not changed to gray for better visualisation
plt.figure(figsize=(8, 8))
plt.imshow(sobel_image)
plt.axis("off")
plt.show()
     

Applying Roberts Edge operator

roberts_image = filters.roberts(image)
     

# cmap while displaying is not changed to gray for better visualisation
plt.figure(figsize=(8, 8))
plt.imshow(roberts_image)
plt.axis("off")
plt.show()
     

Applying Prewitt edge operator

prewitt_image = filters.prewitt(image)
     

# cmap while displaying is not changed to gray for better visualisation
plt.figure(figsize=(8, 8))
plt.imshow(prewitt_image)
plt.axis("off")
plt.show()

Region-Based Image Segmentation

A region can be classified as a group of connected pixels exhibiting similar properties. Pixel similarity can be in terms of intensity and color, etc. In this type of segmentation, some predefined rules are present that pixels have to obey in order to be classified into similar pixel regions. Region-based segmentation methods are preferred over edge-based segmentation methods if it’s a noisy image. Region-based techniques are further classified into two types based on the approaches they follow:

Region growing method.
Region splitting and merging method.

Region Growing Technique

With the region growing method, we start with a pixel as the seed pixel, and then check the adjacent pixels. If the adjacent pixels abide by the predefined rules, then that pixel is added to the region of the seed pixel and the following process continues till there is no similarity left.

This method follows the bottom-up approach. If a region is growing, the preferred rule can be set as a threshold. For example, consider a seed pixel of two in the given image and a threshold value of three. If a pixel has a value greater than three, then it will be considered inside the seed pixel region. Otherwise, it will be considered in another region. As a result, two regions are formed in the following image based on a threshold value of three.

Region growing workflow. | Image: Mrinal Tyagi

Region Splitting and Merging Technique

In region splitting, the whole image is first taken as a single region. If the region does not follow the predefined rules, then it is divided into multiple regions (usually four quadrants), and then the pre-defined rules are carried out on those regions in order to decide whether to further subdivide or to classify that as a region. The following process continues until every region follows the pre-defined rules.

In the region merging technique, we consider every pixel as an individual region. We select a region as the seed region to check if adjacent regions are similar based on our predefined rules. If they are similar, we merge them into a single region and move ahead in order to build the segmented regions of the whole image. Both region splitting and region merging are iterative processes. Usually, the first region splitting is done on an image so as to split an image into maximum regions, and then these regions are merged in order to form a good segmented image of the original image.

In region splitting, the following condition can be checked in order to determine whether to subdivide a region or not. If the absolute value of the difference of the maximum and minimum pixel intensities in a region is less than or equal to a threshold value decided by the user, then the region does not require further splitting.

Region splitting and merging workflow. | Image: Mrinal Tyagi

Region splitting equation. | Image: Mrinal Tyagi

Clustering-Based Segmentation

Clustering is a type of unsupervised machine learning algorithm. It’s often used for image segmentation. One of the most dominant clustering-based algorithms used for segmentation is K-means clustering. This type of clustering can be used to make segments in a colored image.

An introduction to K-Means Clustering and image segmentation. | Video: Computerphile

More on Data Science: C-Means Clustering Explained

K-Means Clustering

Let’s imagine a two-dimensional dataset for better visualization.

First, in the data set, centroids (chosen by the user) are randomly initialized. Then the distance of all the points to all the clusters is calculated and the point is assigned to the cluster with the least distance. Then centroids of all the clusters are recalculated by taking the mean of that cluster as the centroid. Then data points are assigned to those clusters. And the process continues until the algorithm converges to a good solution. Usually, the algorithm takes a very small number of iterations to converge to a solution and does not bounce.

K-means clustering workflow. | Image: Mrinal Tyagi

Clustering is performed on an image with a color resembling the number of clusters inputted in the K-means algorithm. | Image: Mrinal Tyagi

I hope that you find this article and explanation useful.