Matlab

The following is taken from the MATLAB Image Processing Toolbox User's Guide. A complete online manual is available in PDF form (about 5 MB). You should also be able to access this manual online from within MATLAB on your desktop. I will summarize only the more relevant information here.

The Image Processing Toolbox is a collection of functions that extend the capability of the MATLAB® numeric computing environment. The toolbox supports a wide range of image processing operations.

Many of the toolbox functions are MATLAB M-files, series of MATLAB statements that implement specialized image processing algorithms. You can view the MATLAB code for these functions using the type command followed by the function name.
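As a small illustration of viewing a function's source, the sketch below locates an M-file with which and prints it with type. The name fliplr is only a stand-in chosen because it is distributed with base MATLAB as an M-file; substitute the name of any toolbox function you want to inspect.

```matlab
% Hypothetical example name; replace with a toolbox function of interest.
fname = 'fliplr';

% which returns the full path of the file that implements the function.
location = which(fname);
disp(location)

% type prints the contents of the M-file to the Command Window.
type(fname)
```

Functions that are compiled into MATLAB rather than written as M-files have no source to display, so type works only for the M-file case described above.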

You can extend the capabilities of the Image Processing Toolbox by writing your own M-files, or by using the toolbox in combination with other toolboxes, such as the Signal Processing Toolbox and the Wavelet Toolbox.

The basic data structure in MATLAB is the array, an ordered set of real or complex elements. This object is naturally suited to the representation of images, which are real-valued, ordered sets of color or intensity data. (MATLAB does not support complex-valued images.)

MATLAB stores most images as two-dimensional arrays (i.e., matrices), in which each element of the matrix corresponds to a single pixel in the displayed image. (Pixel is derived from picture element and usually denotes a single dot on a computer display.) For example, an image composed of 200 rows and 300 columns of different colored dots would be stored in MATLAB as a 200-by-300 matrix.

This convention makes working with images in MATLAB similar to working with any other type of matrix data, and makes the full power of MATLAB available for image processing applications. For example, you can select a single pixel from an image matrix using normal matrix subscripting: the expression I(2,15) returns the value of the pixel at row 2, column 15 of the image I.
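The subscripting described above can be tried on any matrix. The following minimal sketch uses magic(4) as a stand-in for image data; the variable names are illustrative only.

```matlab
% A 4-by-4 matrix standing in for a small image.
I = magic(4);

% Select a single pixel: row 2, column 3.
p = I(2,3);

% Select a rectangular region: rows 1-2, columns 1-2.
block = I(1:2, 1:2);

disp(p)
disp(block)
```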

By default, MATLAB stores most data in arrays of class double. The data in these arrays is stored as double precision (64-bit) floating-point numbers. All of MATLAB’s functions and capabilities work with these arrays. For image processing, however, this data representation is not always ideal.

The number of pixels in an image may be very large; for example, a 1000-by-1000 image has a million pixels. Since each pixel is represented by at least one array element, this image would require about 8 megabytes of memory if stored as class double.

In order to reduce memory requirements, MATLAB supports storing image data in arrays of class uint8. The data in these arrays is stored as 8-bit unsigned integers. Data stored in uint8 arrays requires one eighth as much memory as data in double arrays.
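The memory figures above can be checked directly. This is a minimal sketch, assuming a MATLAB release in which zeros accepts a class name argument; the variable names are illustrative.

```matlab
% A 1000-by-1000 array of class double (the default) and one of class uint8.
D = zeros(1000, 1000);
U = zeros(1000, 1000, 'uint8');

% whos reports the storage used by each variable, in bytes.
infoD = whos('D');
infoU = whos('U');

fprintf('double: %d bytes\n', infoD.bytes);
fprintf('uint8:  %d bytes\n', infoU.bytes);
fprintf('ratio:  %d\n', infoD.bytes / infoU.bytes);
```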

Because the types of values that can be stored in uint8 arrays and double arrays differ, the Image Processing Toolbox uses different conventions for interpreting the values in these arrays. (Noninteger values cannot be stored in uint8 arrays, for example, but they can be stored in double arrays.) The next section discusses how the toolbox interprets image data, depending on the class of the data array.

In addition to differences in the types of data values they store, uint8 arrays and double arrays differ in the operations that MATLAB supports. See the user's guide for information about the operations MATLAB supports for uint8 arrays.
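The sketch below illustrates one practical consequence of the difference between the two classes. It assumes a recent MATLAB release, in which arithmetic on uint8 values is supported and saturates at the ends of the range; the release described by the guide was more restrictive, so converting to double before computing is the portable approach.

```matlab
a = uint8(200);
b = uint8(100);

% In recent releases, uint8 arithmetic saturates at 255 instead of overflowing.
s8 = a + b;

% Converting to double first preserves the true result.
sd = double(a) + double(b);

% Noninteger values are rounded when stored in a uint8 array.
frac = uint8(3.7);

disp(s8)
disp(sd)
disp(frac)
```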

This section discusses how MATLAB and the Image Processing Toolbox represent each of four image types: indexed, intensity, binary, and RGB.

An indexed image consists of two arrays, an image matrix and a colormap. The colormap is an ordered set of values that represent the colors in the image. For each image pixel, the image matrix contains a value that is an index into the colormap.

The colormap is an m-by-3 matrix of class double. Each row of the colormap matrix specifies the red, green, and blue (RGB) values for a single color, as real values that range from 0 (black) to 1.0 (full intensity).

The pixels in the image are represented by integers, which are pointers (indices) to color values stored in the colormap. The relationship between the values in the image matrix and the colormap depends on whether the image matrix is of class double or uint8. If the image matrix is of class double, the value 1 points to the first row in the colormap, the value 2 points to the second row, and so on. If the image matrix is of class uint8, there is an offset; the value 0 points to the first row in the colormap, the value 1 points to the second row, and so on. The uint8 convention is also used in graphics file formats, and enables 8-bit indexed images to support up to 256 colors. When the image matrix is of class double there is no offset, so, for example, the value 5 points to the fifth row of the colormap.
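The indexing offset described above can be made concrete with a tiny indexed image. This is a sketch with made-up data; the colormap and variable names are assumptions for illustration.

```matlab
% A five-color colormap: black, red, green, blue, white.
map = [0 0 0; 1 0 0; 0 1 0; 0 0 1; 1 1 1];

% The same 2-by-2 indexed image in both storage classes.
Xd = [5 2; 1 3];          % class double: indices start at 1
Xu = uint8(Xd - 1);       % class uint8: indices start at 0

% Look up the color of the pixel at row 1, column 1.
colorD = map(Xd(1,1), :);               % no offset for double
colorU = map(double(Xu(1,1)) + 1, :);   % add 1 for uint8

disp(colorD)
disp(colorU)
```

Both lookups return the fifth row of the colormap, even though the stored values differ (5 for double, 4 for uint8).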

MATLAB stores an intensity image as a single matrix, with each element of the matrix corresponding to one image pixel. The matrix can be of class double, in which case it contains values in the range [0,1], or of class uint8, in which case the data range is [0,255]. The elements in the intensity matrix represent various intensities, or gray levels, where the intensity 0 represents black and the intensity 1 (or 255) represents full intensity, or white.
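The two data ranges are related by a factor of 255. The sketch below does the conversion by hand using only base MATLAB; the toolbox also offers im2uint8 and im2double for this purpose, which are not used here so that the example stays self-contained.

```matlab
% A small intensity image of class double, with values in [0,1].
Id = [0 0.25; 0.5 1];

% Scale to [0,255] and store as uint8.
Iu = uint8(round(Id * 255));

% Scale back to [0,1]; small rounding differences are expected.
Iback = double(Iu) / 255;

disp(Iu)
disp(max(abs(Iback(:) - Id(:))))
```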

In a binary image, each pixel assumes one of only two discrete values. Essentially, these two values correspond to on and off. A binary image is stored as a two-dimensional matrix of 0’s (off pixels) and 1’s (on pixels).

A binary image can be considered a special kind of intensity image, containing only black and white. Other interpretations are possible, however; you can also think of a binary image as an indexed image with only two colors.

A binary image can be stored in an array of class double or uint8. However, a uint8 array is preferable, because it uses far less memory. In the Image Processing Toolbox, any function that returns a binary image returns it as a uint8 logical array. The toolbox uses the presence of the logical flag to signify that the data range is [0,1]. (If the logical flag is off, the toolbox assumes the data range is [0,255].)
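A binary image is often produced by thresholding an intensity image. The sketch below uses a plain comparison rather than a toolbox function, and the threshold 0.5 is an arbitrary choice. Note that in recent MATLAB releases the result of a comparison has class logical, which is a class of its own rather than a flag on a uint8 array as in the release described above.

```matlab
% A small intensity image of class double.
I = [0.1 0.8; 0.6 0.3];

% Pixels brighter than the threshold become on (1); the rest are off (0).
BW = I > 0.5;

% Count the on pixels.
n = nnz(BW);

disp(BW)
disp(islogical(BW))
disp(n)
```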

Like an indexed image, an RGB image represents each pixel color as a set of three values, representing the red, green, and blue intensities that make up the color. Unlike an indexed image, however, these intensity values are stored directly in the image array, not indirectly in a colormap.

In MATLAB, the red, green, and blue components of an RGB image reside in a single m-by-n-by-3 array. m and n are the numbers of rows and columns of pixels in the image, and the third dimension consists of three planes, containing red, green, and blue intensity values. For each pixel in the image, the red, green, and blue elements combine to create the pixel’s actual color.

For example, to determine the color of the pixel (112,86), look at the RGB triplet stored in (112,86,1:3). Suppose (112,86,1) contains the value 0.1238, (112,86,2) contains 0.9874, and (112,86,3) contains 0.2543. The color for the pixel at (112,86) is then the triplet 0.1238 0.9874 0.2543.
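The pixel lookup in this example can be reproduced with a synthetic array. This is a sketch only; the 200-by-300 image size is an assumption borrowed from the earlier example, and all other pixels are left black.

```matlab
% An all-black 200-by-300 RGB image of class double.
RGB = zeros(200, 300, 3);

% Store the example triplet at row 112, column 86.
RGB(112, 86, :) = [0.1238 0.9874 0.2543];

% Read the triplet back as a 1-by-3 row vector.
pixelColor = squeeze(RGB(112, 86, :))';

% Each color plane is an ordinary m-by-n matrix.
greenPlane = RGB(:, :, 2);

disp(pixelColor)
disp(size(greenPlane))
```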

An RGB array can be of class double, in which case it contains values in the range [0,1], or of class uint8, in which case the data range is [0,255]. The figure below shows an RGB image of class double:

This section discusses ways of working with the data arrays that represent images, including reading and writing image files and converting between image types.

You can use the MATLAB imread function to read image data from files. imread can read several graphics file formats, such as BMP, JPEG, and TIFF.

To write image data from MATLAB to a file, use the imwrite function. imwrite can write the same file formats that imread reads.

In addition, you can use the imfinfo function to return information about the image data in a file.

See the reference entries for imread, imwrite, and imfinfo for more information about these functions.
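The three functions can be exercised together with a round trip through a temporary file. This is a minimal sketch: the PNG format and the synthetic test image are assumptions made for the example (PNG is supported in recent releases and is lossless, so the data read back matches what was written).

```matlab
% A small synthetic grayscale image of class uint8.
I = uint8(magic(8) * 3);

% Write it to a temporary file; the format is inferred from the extension.
fname = [tempname '.png'];
imwrite(I, fname);

% Query the file, then read the data back.
info = imfinfo(fname);
J = imread(fname);

fprintf('%s image, %d-by-%d\n', info.Format, info.Height, info.Width);
disp(isequal(I, J))

% Clean up the temporary file.
delete(fname);
```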

For certain operations, it is helpful to convert an image to a different image type. For example, if you want to filter a color image that is stored as an indexed image, you should first convert it to RGB format. When you apply the filter to the RGB image, MATLAB filters the intensity values in the image, as is appropriate. If you attempt to filter the indexed image, MATLAB simply applies the filter to the indices in the indexed image matrix, and the results may not be meaningful.

The Image Processing Toolbox provides several functions that enable you to convert any image to another image type. These functions have mnemonic names; for example, ind2gray converts an indexed image to a grayscale intensity format.

Note that when you convert an image from one format to another, the resulting image may look different from the original. For example, if you convert a color indexed image to an intensity image, the resulting image is grayscale, not color.

Function	Purpose
dither	Create a binary image from a grayscale intensity image by dithering; create an indexed image from an RGB image by dithering
gray2ind	Create an indexed image from a grayscale intensity image
grayslice	Create an indexed image from a grayscale intensity image by thresholding
im2bw	Create a binary image from an intensity image, indexed image, or RGB image, based on a luminance threshold
ind2gray	Create a grayscale intensity image from an indexed image
ind2rgb	Create an RGB image from an indexed image
mat2gray	Create a grayscale intensity image from data in a matrix, by scaling the data
rgb2gray	Create a grayscale intensity image from an RGB image
rgb2ind	Create an indexed image from an RGB image
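As a rough sketch of two of the conversions listed above, the following converts a tiny indexed image to RGB with ind2rgb and then to grayscale with rgb2gray. The image data and colormap are made up for illustration, and the example assumes a release in which both functions are available.

```matlab
% A 2-by-2 indexed image (class double) with a three-color colormap.
X = [1 2; 3 1];
map = [0 0 0; 1 0 0; 1 1 1];   % black, red, white

% Indexed to RGB: each index is replaced by its colormap row.
RGB = ind2rgb(X, map);

% RGB to grayscale: the color information is discarded.
G = rgb2gray(RGB);

disp(size(RGB))
disp(G)
```

The red pixel becomes a dark gray (roughly 0.3) in G, which shows why a converted image may look different from the original.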

The Image Processing Toolbox represents colors as RGB values, either directly (in an RGB image) or indirectly (in an indexed image). However, there are other methods for representing colors. For example, a color can be represented by its hue, saturation, and value components (HSV). Different methods for representing colors are called color spaces. The toolbox provides a set of routines for converting between RGB and other color spaces. The image processing functions themselves assume all color data is RGB, but you can process an image that uses a different color space by first converting it to RGB, and then converting the processed image back to the original color space.
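The convert, process, and convert back pattern described above can be sketched with hsv2rgb and rgb2hsv, which operate on colormap-style m-by-3 matrices as well as on images. The input colors and the darkening step are assumptions chosen to keep the example small.

```matlab
% Two colors expressed in HSV: pure red, and a half-bright green.
hsvIn = [0 1 1; 1/3 1 0.5];

% Convert to RGB so that RGB-based processing can be applied.
rgb = hsv2rgb(hsvIn);

% Stand-in processing step: darken by halving every RGB component.
processed = rgb * 0.5;

% Convert the result back to the original color space.
hsvOut = rgb2hsv(processed);

disp(rgb)
disp(hsvOut)
```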

Locations in an image can be expressed in various coordinate systems, depending on context. This section discusses the two main coordinate systems used in the Image Processing Toolbox, and the relationship between them. These systems are pixel coordinates and spatial coordinates.

Generally, the most convenient method for expressing locations in an image is to use pixel coordinates. In this coordinate system, the image is treated as a grid of discrete elements, ordered from top to bottom and left to right.

For pixel coordinates, the first component r (the row) increases downward, while the second component c (the column) increases to the right. Pixel coordinates are integer values and range between 1 and the length of the row or column.

There is a one-to-one correspondence between pixel coordinates and the coordinates MATLAB uses for matrix subscripting. This correspondence makes the relationship between an image’s data matrix and the way the image displays easy to understand. For example, the data for the pixel in the fifth row, second column is stored in the matrix element (5,2).

In the pixel coordinate system, a pixel is treated as a discrete unit, uniquely identified by a single coordinate pair, such as (5,2). From this perspective, a location such as (5.3,2.2) is not meaningful.

At times, however, it is useful to think of a pixel as a square patch, having area. From this perspective, a location such as (5.3,2.2) is meaningful, and is distinct from (5,2). In this spatial coordinate system, locations in an image are positions on a plane, and they are described in terms of x and y.

In the spatial coordinate system used for images, x increases to the right and y increases downward.

This spatial coordinate system corresponds quite closely to the pixel coordinate system in many ways. For example, the spatial coordinates of the center point of any pixel are identical to the pixel coordinates for that pixel.

There are some important differences, however. In pixel coordinates, the upper-left corner of an image is (1,1), while in spatial coordinates, this location by default is (0.5,0.5). This difference is due to the pixel coordinate system being discrete, while the spatial coordinate system is continuous. Also, the upper-left corner is always (1,1) in pixel coordinates, but you can specify a nondefault origin for the spatial coordinate system. See "Using a Nondefault Spatial Coordinate System" in the user's guide for more information.

Another potentially confusing difference is largely a matter of convention: the order of the horizontal and vertical components is reversed in the notation for these two systems. Pixel coordinates are expressed as (r,c), while spatial coordinates are expressed as (x,y).
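The relationship between the two systems can be written out in a few lines. This sketch assumes the default spatial coordinate system, in which pixel (r,c) is centered at x = c, y = r and extends half a unit in each direction; the sample point is arbitrary.

```matlab
% Pixel coordinates: row 5, column 2.
r = 5;
c = 2;

% Spatial coordinates of the pixel center; note the reversed order.
x = c;
y = r;

% Extent of the pixel's square patch in spatial coordinates.
xEdges = [c - 0.5, c + 0.5];
yEdges = [r - 0.5, r + 0.5];

% Going the other way: find the pixel that contains a spatial point.
xq = 2.2;
yq = 5.3;
rq = round(yq);
cq = round(xq);

fprintf('center (x,y) = (%g,%g)\n', x, y);
fprintf('point (%g,%g) lies in pixel (r,c) = (%d,%d)\n', xq, yq, rq, cq);
```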
