Fundamentals of Digital Images
These days we encounter hundreds of digital images every day on our smartphones, laptops, tablets and so on. But have you ever wondered how these images are generated, stored and transferred across networks? Let’s explore the fundamentals of digital images by first addressing the most basic question: what actually constitutes a digital image?
A digital image, in its most basic form, is a collection of numbers arranged in two or three dimensions (a matrix), where each number is encoded as a shade from some color palette. These numbers are called the pixels of the image. So a 1-megapixel image contains 1 million pixels, each coded as a number within a certain range. To illustrate this, let’s analyze the following images carefully.
The left image is a typical black & white pixel image showing a simple structure. The grid lines are only there to mark the positions of the pixels; you can imagine the image without them, as it would normally appear. So how does your computer or phone store this image? It does not remember the colors at their respective positions. Instead, it stores a number for every position of the image, as you can see in the right image next to it. These numbers are a function of the intensities in the image. Conventionally, the brightest pixel is assigned the highest number and the darkest one the lowest. In this example, since we have only two intensities, two numbers are used for the representation: 0 (for a dark pixel) and 1 (for a bright pixel).
Now you can take a close look at both images to work out how each pixel is coded. Electronic devices have registers that store these numbers, serially or in parallel, in binary form; when you click on the image to view it, the numbers are converted back to the respective intensities that appear on the screen.

Before we go much further into the details of digital image structure, let’s quickly go through the types of digital images classified according to their color distribution. There are three classes of images based on color intensities:
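The idea above can be sketched in a few lines of code. This is a minimal illustration (the array values and the 0/255 display mapping are illustrative, not taken from the figure): a tiny binary image stored as a grid of 0s and 1s, and the lookup that turns each stored number back into a display intensity when the image is rendered.

```python
# A tiny 5x5 binary image: 1 = bright pixel, 0 = dark pixel.
binary_image = [
    [0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]

# Conventional display mapping: 0 -> pure black, 1 -> pure white
# (expressed here as 8-bit display intensities).
intensity = {0: 0, 1: 255}

# "Rendering" the image = converting each stored number back to intensity.
rendered = [[intensity[p] for p in row] for row in binary_image]
print(rendered[2])  # middle row: all bright pixels -> [255, 255, 255, 255, 255]
```

Storing the numbers and regenerating the intensities on demand is exactly what the device's registers and display pipeline do, just at a much larger scale.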
Types of Images
- Binary (B/W) image: As the name suggests, each pixel assumes only one of two values, 0 or 1 (as in the previous example). Pick any pixel from the image and it will be either 0 (coded as pure black) or 1 (coded as pure white). Therefore a single binary bit is sufficient to encode one pixel of a binary image. You barely see these images now, as they are almost obsolete because of their impoverished representation; you may still find them in old newspapers, billboards or archived video games. The reason they were used earlier is the only advantage they possess: low storage requirements. That low memory consumption, however, comes at the cost of image quality.
- Grayscale image: In this class of image, pixels are encoded into a larger range of values, so many shades of gray can be observed instead of just two. To be precise, each pixel of a grayscale image is coded into a value in the range 0–255. The number of binary bits needed to store each pixel value is decided by the maximum pixel value; in this case, 8 bits are required to store one grayscale pixel. You might still see this kind of image occasionally, but you will certainly be familiar with it if you were a child in the 90s. Yes, you guessed it right: the so-called black & white TV! What you were calling a B/W TV was actually a grayscale display. So what is a B/W image then? As you’ve just learned, it is the binary image.
Therefore the difference between a B/W image and a grayscale image is this: binary (B/W) image pixels assume only two values (0 and 1), making the picture dull with just two intensities, while grayscale image pixels assume 256 values (0–255) and therefore offer many shades of gray. In terms of storage, each pixel of a B/W image requires only one binary bit, while a grayscale pixel requires 8 bits [log2(256) = 8].
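The bits-per-pixel arithmetic above can be checked directly. A minimal sketch (the function name is illustrative): the number of bits needed per pixel is the base-2 logarithm of the number of distinct intensity levels.

```python
import math

def bits_per_pixel(levels):
    # Bits needed to distinguish `levels` distinct pixel values.
    return int(math.log2(levels))

print(bits_per_pixel(2))    # binary image: 2 levels  -> 1 bit
print(bits_per_pixel(256))  # grayscale image: 256 levels -> 8 bits
```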
- Color image: Most of the images we see on smartphones and computers these days belong to this category. Before we delve into the details of color digital images, you must be familiar with some basics of color theory. So let’s go through it quickly!
Humans perceive colors through light waves. A few basic colors mix together to produce a broad range of colors, and an abstract way of describing colors using such basic components is called a color model. The additive and subtractive color models are two well-known schemes that capture the basics of color theory. The additive (RGB) model produces colors by mixing three fundamental colors, red, green and blue, in appropriate proportions, while the subtractive (CMYK) model uses cyan, magenta and yellow, along with black (the “key”), to produce different colors. Most electronic display devices, such as TVs, smartphones and projectors, use the RGB color model to produce colors on the screen. Devices like printers, on the other hand, use the CMYK model, so any color you perceive on a printed surface is based on the subtractive model.
Now that we have some insight into color theory and understand that digital displays (and hence digital images) use the additive RGB color model, we can concentrate on the RGB model and set the other one aside. We now have enough background to understand the concept of color images.
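Additive mixing can be sketched with a few lines of code. This is a simplified illustration (the `mix` helper and the clipped channel addition are illustrative assumptions, not a colorimetric model): mixing the three primaries at full intensity reproduces the familiar secondary colors.

```python
def mix(c1, c2):
    # Additive mixing: channel values add, clipped to the 0-255 range.
    return tuple(min(a + b, 255) for a, b in zip(c1, c2))

RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)

print(mix(RED, GREEN))   # (255, 255, 0)  -> yellow
print(mix(RED, BLUE))    # (255, 0, 255)  -> magenta
print(mix(GREEN, BLUE))  # (0, 255, 255)  -> cyan
```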
A color digital image consists of three grayscale planes, red, green and blue, making it a three-dimensional grid of pixels as shown in the figure. Viewed two-dimensionally, each pixel consists of three numbers (or sub-pixels) carrying the values for red, green and blue, and each plane has pixel values in the range 0–255. A lower pixel value in any of the RGB planes lessens the impact of that color, while a higher value makes that color dominate. For example, pick a pixel position in the (2D) image and you get three values, one for each of R, G and B. If the R value is high (close to 255) and the other two are relatively low, you’ll see a reddish hue because of the red domination. The same holds for the other two planes (refer to Table 1.1).
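The domination effect described above can be illustrated with a short sketch (the `dominant_channel` helper is a hypothetical name introduced here for illustration): given one (R, G, B) pixel, report which plane dominates the hue.

```python
def dominant_channel(pixel):
    # Return the name of the plane with the largest value; a value
    # close to 255 in one plane pulls the perceived hue toward it.
    names = ('red', 'green', 'blue')
    return names[max(range(3), key=lambda i: pixel[i])]

print(dominant_channel((230, 40, 60)))   # red dominates -> reddish hue
print(dominant_channel((30, 200, 90)))   # green dominates
print(dominant_channel((10, 20, 250)))   # blue dominates
```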
Now, how many binary bits are required to store a color image pixel, and what is the overall range of a color image pixel? It’s simple: each color plane requires exactly the same amount of memory as a grayscale image of the same dimensions, so each pixel of a color image is coded into 8 × 3 = 24 bits. As far as the pixel range is concerned, there is no single overall range defined for a color pixel; rather, the individual 0–255 range of each plane is considered. For the purpose of depiction, however, the range is usually normalized between 0 and 1, corresponding to black and white, as shown in figure 1.2(c).
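The storage arithmetic and the normalization step can be sketched as follows (the function names are illustrative): a color pixel needs 8 bits per plane across 3 planes, and a 0–255 value is normalized by dividing by the maximum value.

```python
BITS_PER_PLANE = 8   # one grayscale plane: values 0-255
NUM_PLANES = 3       # red, green and blue

def color_bits_per_pixel():
    # Total bits for one color pixel: 8 bits x 3 planes.
    return BITS_PER_PLANE * NUM_PLANES

def normalize(value, max_value=255):
    # Map 0-255 onto 0.0-1.0 (0 = black, 1 = full intensity).
    return value / max_value

print(color_bits_per_pixel())  # 24
print(normalize(255))          # 1.0
print(normalize(0))            # 0.0
```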
As shown in the figure, the three grayscale planes, with pixel values reflecting the color proportions of the image, appear on the left. Combining these planes, we get a nice-looking color image (center), and a small section obtained by zooming into the original image is shown on the right, with the RGB values of each pixel displayed. Taking a few examples of RGB values and their equivalent pixel colors, you can observe that the colors of the image pixels are in accordance with the RGB values described by the additive color model.
MATLAB is an excellent software package that can be used to read, display and perform a number of interesting operations on images using its Image Processing Toolbox, such as converting images from one type to another, reshaping them, and altering or modifying their geometry and intensities. I’ll explain all of this in the next section of the blog.