Welcome to the series of blogs on Digital Image Processing using MATLAB. If you are looking for complete guidance in understanding digital images and image processing, you're at the right place! In this series, we will discuss the concepts of image processing along with their implementation, from scratch to the advanced level. So let's start by first understanding the concepts of digital images.
Fundamentals of Digital Images
These days we encounter hundreds of digital images daily on our smartphones, laptops, iPads, etc. But have you ever wondered how these images are generated, stored, and transferred through various networks? Let's explore the fundamentals of digital images by first addressing the most basic question: what constitutes a digital image?
What is a Digital Image?
A digital image, in its most basic form, is a collection of numbers arranged in two or three dimensions. Each of these numbers is mapped to some shade of a color palette, and the numbers are called the pixels of the image. So a 1-megapixel image contains 1 million pixels, each of which takes a value within a certain range. To illustrate this, let's analyze the following images carefully.
The left image is a typical black & white pixel image showing some structure. The grid on the image is only there to illustrate the positions of the pixels; you can imagine it without the grid for better visualization. But how does your computer or phone store this image? It does not remember the colors at the respective positions. Instead, it stores a number for every pixel of the image, as you can see in the right image. These numbers represent the equivalent intensities of the pixels.
How are pixel values chosen?
Conventionally, the brightest pixel assumes the highest value, while the lowest value represents the darkest pixel. In this example, since we have only two intensities (black and white), two numbers suffice for pixel representation: 0 (for a dark pixel) and 1 (for a bright pixel).
Now take a close look at both images (Fig. 1.1) to work out the respective coding of the pixels. Electronic devices have registers which store these numbers, serially or in parallel, in binary form. When you open the image, the pixels appear on the screen after conversion from pixel values to pixel color intensities.
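To make the idea concrete, here is a minimal Python sketch (we'll switch to MATLAB in later posts) using a small hypothetical pattern in place of Fig. 1.1. It lays a 4×4 binary image out as the serial bit stream a device register might hold:

```python
# A tiny 4x4 binary image: 1 = bright (white), 0 = dark (black).
# The pattern is a made-up example, not the actual Fig. 1.1 image.
image = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]

# A binary image needs exactly one bit per pixel, so these
# 16 pixels fit into a 16-bit (2-byte) serial stream.
bits = "".join(str(p) for row in image for p in row)
print(bits)               # -> 0110100110010110
print(len(bits), "bits")  # -> 16 bits
```

Displaying the image is just the reverse mapping: each stored bit is converted back to a pure-black or pure-white intensity on screen.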
Before we go much into the details of a digital image structure, let’s quickly go through the types of digital images classified according to their color distribution. There are three classes of images based on color intensities:
Types of Images
Binary (B/W) image
Binary images assume only two values, 0 and 1, for each of their pixels (as in the example of Fig. 1.1). Pick any pixel from the image and it will be either 0 (coded as pure black) or 1 (coded as pure white). Therefore a single binary bit is sufficient to encode a pixel of a binary image. You barely see these images now, as they are almost obsolete because of their impoverished representation. You can find them in some old newspapers, billboards, or archived video games. The reason they were used earlier is the only advantage they have: low storage requirements. And that low memory consumption comes at the cost of image quality.
Until the early '90s, storage cost for images was considerably high. That is why binary images were frequently seen at the time. One can easily relate to this by observing images from the famous childhood video game Mario (Fig. 1.2). Fig. 1.2(a) shows an example of a binary image where each pixel is either pure black or pure white; there is no gray present in the image.
Grayscale image
In this class of images, pixels are encoded into a larger range of values, so we perceive many shades of gray instead of just two. To be precise, each pixel of a grayscale image has a value in the range 0-255. We discussed earlier that computers store pixel values as binary numbers. So it is important to know: how many binary bits is one image pixel equivalent to?
Usually, the maximum possible value of the image pixels decides the number of binary bits needed to encode one pixel. The formula for obtaining this number is

Nb = log2(max_pix_val + 1)
where Nb is the number of binary bits and max_pix_val is the maximum pixel value. Since the maximum value of a grayscale image pixel is 255, one pixel requires Nb = log2(255 + 1) = 8 bits of storage.
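The same formula is easy to verify in code. Here is a small Python sketch (MATLAB comes later in the series) of the Nb = log2(max_pix_val + 1) calculation:

```python
import math

def bits_per_pixel(max_pix_val):
    """Bits needed to encode one pixel: Nb = log2(max_pix_val + 1)."""
    return int(math.log2(max_pix_val + 1))

print(bits_per_pixel(1))    # binary image (max value 1)    -> 1 bit
print(bits_per_pixel(255))  # grayscale image (max value 255) -> 8 bits
```

Plugging in the two image types we have met so far reproduces the numbers in the text: 1 bit for a binary pixel and 8 bits for a grayscale pixel.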
You might still see grayscale images here and there these days, but you will certainly be familiar with them if you were a child in the '90s or earlier. Yes, you guessed it right: the 'so-called' black & white TV! You were actually referring to a grayscale-display television as a B/W TV. So what is a B/W image then? You've just learned it a while ago: the binary image.
Therefore, the difference between a B/W image and a grayscale image is simple. Binary (B/W) image pixels assume only two values (0 and 1), which makes the picture dull, with just two intensities. Grayscale image pixels assume 256 values (0-255) and therefore offer many more shades of gray.
In terms of binary equivalent, each pixel of a B/W image requires only one binary bit for storage, whereas a grayscale image pixel requires 8 bits [log2(256)]. An example of a grayscale image is shown in Fig. 1.2(b).
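That one-bit versus eight-bit difference translates directly into storage cost. A quick Python sketch, using hypothetical 640×480 dimensions (not from the text), shows the gap for two images of the same size:

```python
width, height = 640, 480   # hypothetical image dimensions

binary_bits = width * height * 1   # 1 bit per B/W pixel
gray_bits   = width * height * 8   # 8 bits per grayscale pixel

print(binary_bits // 8, "bytes for the binary image")     # -> 38400
print(gray_bits // 8, "bytes for the grayscale image")    # -> 307200
```

The grayscale version costs exactly eight times as much memory, which is why binary images made sense when storage was expensive.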
Color image
Most of the images that we see today on our smartphones or computers belong to this category. Fig. 1.2(c) is an example of a color image from the latest version of the game. Color images are predominantly used because of their pleasant appearance and detailed content. Before we delve into the details of color digital images, you must be familiar with some basics of color theory. Let's go through it quickly!
Humans perceive colors through light waves, and a few basic colors mix to produce a broad range of colors. A color model is an abstract way of describing colors using some basic color components. The additive and subtractive color models are two well-known schemes that capture this basic understanding of color theory.
The additive (or RGB) color model produces colors by mixing three fundamental colors, Red, Green, and Blue, in appropriate proportions. In the subtractive (CMYK) color model, a mixture of Cyan, Magenta, and Yellow (plus a Key/black component, the K) produces the different colors. Fig. 1.3 depicts both color models graphically.
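Additive mixing is easy to simulate: light intensities add, capped at the maximum channel value of 255. Here is a small Python sketch (the function name and cap-at-255 rule are our own simplification of the model in Fig. 1.3):

```python
def add_mix(c1, c2):
    """Additive (RGB) mixing: channel intensities add, capped at 255."""
    return tuple(min(a + b, 255) for a, b in zip(c1, c2))

RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)

print(add_mix(RED, GREEN))                  # (255, 255, 0)   -> yellow
print(add_mix(GREEN, BLUE))                 # (0, 255, 255)   -> cyan
print(add_mix(add_mix(RED, GREEN), BLUE))   # (255, 255, 255) -> white
```

This reproduces the classic additive-model result shown graphically in Fig. 1.3: red plus green gives yellow, and all three primaries together give white.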
Most electronic display gadgets, like TVs, smartphones, and projectors, use the RGB color model to produce colors on the screen. On the other hand, devices like printers use the CMYK color model. Therefore, any color you perceive on a printed physical surface is based on the subtractive color model.
Now that we have an insight into color theory and understand that digital images use the RGB color model, we will concentrate on this model and set the other one aside. OK, we now have enough background to understand the concept of color images.
Color Images (cont..)
A color digital image consists of three grayscale planes, Red, Green, and Blue, making it a three-dimensional grid of pixels (Fig. 1.4). Looking at a color image two-dimensionally, each pixel consists of three numbers (or sub-pixels) carrying the values for red, green, and blue respectively. Each plane has pixel values in the range 0-255.
A lower pixel value in any of the (R, G, B) planes lessens the impact of that particular color; similarly, a higher pixel value makes that color dominate. For example, pick a pixel position in the image and you get three values, one for each of R, G, and B. If the R value is high (close to 255) and the other two values are relatively low, you'll get a reddish hue because of the red domination. The same holds for the other two planes (Table 1.1).
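The "dominant channel" idea above can be sketched in a few lines of Python. This is a deliberately simplistic illustration (picking the largest channel, not a real hue computation), with made-up pixel values rather than the ones in Table 1.1:

```python
def dominant_hue(pixel):
    """Name the channel with the highest value in an (R, G, B) pixel.
    A simplistic illustration of color domination, not a real hue formula."""
    r, g, b = pixel
    # max() picks the tuple with the largest first element (the intensity)
    return max((r, "reddish"), (g, "greenish"), (b, "bluish"))[1]

print(dominant_hue((240, 40, 30)))   # red dominates   -> reddish
print(dominant_hue((20, 200, 90)))   # green dominates -> greenish
print(dominant_hue((10, 60, 250)))   # blue dominates  -> bluish
```

For a pixel like (240, 40, 30), red is close to 255 while green and blue are low, so the perceived color leans strongly toward red, exactly the behavior described above.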
Color Image Pixel Size
So how many binary bits does a color image pixel comprise? And what is the overall range of a color image pixel? It's simple. Each of the color planes requires the same amount of memory as a grayscale image of the same dimensions. Therefore, each pixel of a color image comprises 8×3 = 24 bits.
There is no single overall range for a color image pixel, so we consider the individual pixel range of each plane (0-255). However, for depiction purposes, we usually normalize the pixel range between 0 and 1, corresponding to the darkest and the brightest pixel respectively (Fig. 1.2(c)).
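Both facts, the 24-bit pixel size and the 0-1 normalization, fit in a short Python sketch (the rounding to three decimals is our own choice for readability):

```python
def normalize(pixel):
    """Map an 8-bit (R, G, B) pixel into the 0-1 range used for depiction."""
    return tuple(round(v / 255, 3) for v in pixel)

# Three 8-bit planes per pixel -> 24 bits, as derived above.
bits_per_color_pixel = 3 * 8
print(bits_per_color_pixel)        # -> 24

print(normalize((255, 128, 0)))    # -> (1.0, 0.502, 0.0)
print(normalize((0, 0, 0)))        # darkest pixel  -> (0.0, 0.0, 0.0)
print(normalize((255, 255, 255)))  # brightest pixel -> (1.0, 1.0, 1.0)
```

Note that normalization only rescales the values for display and comparison; the stored data is still three 8-bit planes.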
Fig. 1.4(a) shows the three grayscale planes with their different pixel values (not visible at this scale). Combining these planes, we get a nice-looking color image (Fig. 1.4(b)). Fig. 1.4(c) depicts a small section of the image, zoomed in from the original, with the corresponding RGB values shown. Take a few examples of RGB values and their equivalent pixel colors in the image: you can observe that the colors of the image pixels are in accordance with the RGB values of the additive color model.
How to implement it?
Now that we have developed sufficient background on digital images, let's move on to the next step. Here we'll actually observe image characteristics and perform various operations on images. For this, we need software that can read, display, and perform some basic operations on images. Your PC or smartphone has basic software to read and display images, but you can't see the pixel values or the number of planes, or perform arbitrary operations on them. (Although there are a few add-on applications available for some operations, like cropping and filtering.)
MATLAB is excellent software for reading, displaying, and performing a number of interesting operations on images. This includes converting images from one type to another, changing the shape of images, altering or modifying the geometry and intensities of an image, applying various filters, etc. MATLAB's Image Processing Toolbox offers a vast variety of functions and commands to perform these operations. I'll explain all of these in the next sections of the blog. But before that, it is preferable to learn a little of the mathematics behind the software in order to understand it effortlessly. And the topic to learn is the matrix, as MATLAB is an abbreviation of Matrix Laboratory! So we will learn the concepts of matrices in the next blog.