Fundamentals of Digital Images
These days we encounter hundreds of digital images daily on our smartphones, laptops, iPads, and other devices. But have you ever wondered how these images are generated, stored, and transferred across networks? Let's explore the fundamentals of digital images by first addressing the most basic question: what actually constitutes a digital image?
A digital image, in its most basic form, is a collection of numbers arranged in two or three dimensions (called a matrix) such that each number encodes some shade of a color palette. These numbers are called the pixels of the image. So a 1-megapixel image contains 1 million pixels, each coded as a number varying within a certain range. To illustrate this, let's analyze the following images carefully.
The left image is a typical black & white pixel image showing some structure. The grid lines are drawn only to indicate the positions of the pixels; you can imagine the image without them, as it would normally appear. But how does your computer or phone store this image? It does not remember the colors at their respective positions. Instead, it stores a number (or digit) for every position of the image, as you can see in the right image next to it. These numbers are a function of the intensities in the image. Conventionally, the brightest pixel is assigned the highest number and the darkest pixel the lowest value. In this example, since we have only two intensities, two numbers are used for the representation: 0 (for a dark pixel) and 1 (for a bright pixel).
Now you can take a close look at both images to work out the coding of each pixel. Electronic devices have registers that store these numbers, serially or in parallel, in binary form; when you open the image, these numbers are converted back to their respective color intensities on the screen. Before we go further into the structure of a digital image, let's quickly go through the types of digital images classified by color distribution. There are three classes of images based on color intensities:
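To make this idea concrete, here is a minimal sketch in Python (pure standard library; the pattern and the `render` helper are hypothetical, purely for illustration) of a tiny binary image stored as a matrix of numbers and "displayed" by mapping each number back to an intensity, just as a screen does:

```python
# A tiny 5x5 binary "image": 1 = bright pixel, 0 = dark pixel.
# (A made-up pattern, just to illustrate pixels as stored numbers.)
image = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 1, 1, 1, 0],
]

def render(img):
    # "Display" the image: convert each stored number back to an
    # intensity, '#' for a bright pixel (1) and '.' for a dark one (0).
    return "\n".join(
        "".join("#" if px == 1 else "." for px in row) for row in img
    )

print(render(image))
```

Reading the printed grid against the matrix above shows exactly the 0/1-to-intensity mapping described in the text.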
Types of Images
- Binary (B/W) image: As the name suggests, each pixel assumes only one of two values, 0 or 1 (as in the previous example). Pick any pixel from the image and it will be either 0 (coded as pure black) or 1 (coded as pure white). Therefore a single binary bit is sufficient to encode one pixel of a binary image. You rarely see these images now, as they are almost obsolete because of their impoverished representation; you can still find them in some old newspapers, billboards, or archived video games. The reason they were used earlier is the only advantage they possess, i.e., low storage requirements. That low memory consumption comes at the cost of image quality.
- Grayscale image: In this class of image, pixels are encoded into a larger range of values, so many shades of gray can be observed instead of just two. To be precise, each pixel of a grayscale image is coded into a value in the range 0-255. The number of binary bits needed to store each pixel is decided by the maximum value a pixel can take; in this case, 8 binary bits are required to store one grayscale pixel. You might still see this kind of image occasionally, and you will certainly be familiar with it if you were a child in the '90s. Yes, you guessed it right: the 'so-called' black & white TV! You were actually referring to a grayscale display when you called it a B/W TV. So what is a B/W image then? You've just learned it a moment ago: the binary image.
Therefore the difference between a B/W image and a grayscale image is this: binary (B/W) image pixels assume only two values (0 and 1), making the picture quality dull with just two intensities, while grayscale image pixels assume 256 values (0-255) and therefore offer many shades of gray. In terms of storage, each pixel of a B/W image requires only one binary bit, while a grayscale image pixel requires 8 bits [log2(256) = 8].
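The bit counts above follow directly from the number of intensity levels. A short sketch (standard library only; the `bits_per_pixel` helper is hypothetical, named here just for illustration):

```python
import math

def bits_per_pixel(num_levels):
    # Number of binary bits needed to encode `num_levels` distinct
    # intensity levels: the ceiling of log2 of the level count.
    return math.ceil(math.log2(num_levels))

print(bits_per_pixel(2))    # binary (B/W) image: values 0 and 1 -> 1 bit
print(bits_per_pixel(256))  # grayscale image: values 0-255 -> 8 bits
```

This matches the comparison in the text: one bit per binary pixel, eight bits per grayscale pixel.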
- Color Image: Most of the images we see these days on our smartphones or computers belong to this category. Before we delve into the details of color digital images, you should be familiar with some basics of color theory. Let's go through it quickly!
Humans perceive colors through light waves. A few basic colors mix together to produce a broad range of colors, and an abstract way of describing colors using basic color components is called a color model. The additive and subtractive color models are the two well-known schemes underlying basic color theory. The additive (or RGB) color model produces colors by mixing three fundamental colors, Red, Green, and Blue, in appropriate proportions, while the subtractive (CMYK) color model uses Cyan, Magenta, and Yellow (plus a Key, i.e., black, component) to produce different colors. Most electronic display gadgets like TVs, smartphones, and projectors use the RGB color model to produce colors on the screen. On the other hand, devices like printers use the CMYK color model, so any color you perceive on a printed surface is based on the subtractive color model.
Now that we have some insight into color theory and understand that digital displays (and hence digital images) use the additive, or RGB, color model, we can concentrate on the RGB model and set the other one aside. We now have enough background to understand the concept of color images.
A color digital image consists of three grayscale planes, Red, Green, and Blue, making it a three-dimensional grid of pixels as shown in Fig. 1.4. Viewed two-dimensionally, each pixel of a color image consists of three numbers (or sub-pixels) carrying the values for red, green, and blue. Each plane has pixel values in the range 0-255. A lower pixel value in any of the RGB planes lessens the impact of that particular color, while a higher value makes that color dominate. For example, pick a pixel position in the image and you get three values corresponding to R, G, and B. If the R value is high (close to 255) and the other two are relatively low, you'll see a reddish hue because of the red domination. The same holds for the other two planes (refer to Table 1.1).
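The red-domination example can be sketched in a few lines of Python (the `dominant_channel` helper and the sample values are hypothetical, chosen only to mirror the kind of entries Table 1.1 describes):

```python
def dominant_channel(r, g, b):
    # Return which of the three planes contributes most to a pixel,
    # i.e., which primary color will dominate the perceived hue.
    channels = {"red": r, "green": g, "blue": b}
    return max(channels, key=channels.get)

# A pixel with a high R value and low G and B values appears reddish:
print(dominant_channel(230, 40, 30))   # red dominates
# A pixel with a high B value appears bluish:
print(dominant_channel(20, 50, 240))   # blue dominates
```

Real hues depend on all three proportions together, of course; this only captures the "whichever plane is highest dominates" intuition from the text.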
So how many binary bits are required to store a color image pixel? And what is the overall range of a color image pixel? It's simple. Each color plane requires exactly the same amount of memory as a grayscale image of the same dimensions, so each pixel of a color image is coded into 8 × 3 = 24 bits. As far as the pixel range is concerned, there is no single overall range defined for a color image pixel; rather, the individual range of 0-255 applies to each plane. However, for depiction, the range is usually normalized between 0 and 1, corresponding to the darkest and the brightest pixel present in the image, as shown in Fig. 1.2(c).
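The 24-bit arithmetic and the 0-1 normalization can both be sketched briefly (standard library only; the `normalize` helper and the sample values are hypothetical, for illustration):

```python
# Storage for one color pixel: three planes at 8 bits each.
BITS_PER_PLANE = 8
bits_per_color_pixel = BITS_PER_PLANE * 3
print(bits_per_color_pixel)   # 24

def normalize(pixels):
    # Rescale so the darkest pixel present in the image maps to 0
    # and the brightest to 1, as described for depiction.
    lo, hi = min(pixels), max(pixels)
    return [(p - lo) / (hi - lo) for p in pixels]

# A made-up plane of four pixel values from one image:
print(normalize([50, 100, 150, 200]))
```

Note that this normalization is relative to the image's own darkest and brightest pixels, not to the fixed 0-255 range.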
Fig. 1.4(a) shows the three grayscale planes with different pixel values (not visible here) according to the color proportions of the image. Combining these planes gives a nice-looking color image (Fig. 1.4(b)), and Fig. 1.4(c) shows a small section obtained by zooming into the original image, with the RGB values of each pixel displayed. Let's take a few examples of RGB values and their equivalent pixel colors in the image. You can observe that the colors of the image pixels are in accordance with the RGB values described by the additive color model.
Now that we have developed a sufficient background on digital images, let's move on to the next step, where we'll actually observe image characteristics and perform various operations on them. For this, we need software that can read, display, and manipulate images. Your PC or smartphone has basic software to read and display images, but it won't let you inspect pixel values or the number of planes, or perform arbitrary operations (although a few add-on applications are available for operations like cropping and filtering).
MATLAB is an excellent software package for reading and displaying images and for performing a number of interesting operations, such as converting images from one type to another, changing their shape, and altering or modifying their geometry and intensities, using its Image Processing Toolbox. I'll explain all of this in the next sections of the blog. But before that, it is worth learning a little of the mathematics behind the software, specifically the concept of a matrix from linear algebra, to understand it more easily. Fittingly, MATLAB is an abbreviation of Matrix Laboratory!