Classification
Objectives
- To define and describe digital image classification
- To distinguish between supervised and unsupervised classification
and describe how each is produced
- To discuss the application of digital image classification data
to remote sensing
Introduction
An important part of image analysis is identifying groups
of pixels that have similar spectral characteristics and determining
the various features or land cover classes represented by these groups.
This form of analysis is known as classification. Visual classification
relies on the analyst's ability to use visual elements (tone, contrast,
shape, etc.) to classify an image. Digital image classification is based
on the spectral information used to create the image and classifies each
individual pixel based on its spectral characteristics. The result of
a classification is that all pixels in an image are assigned to particular
classes or themes (e.g. water, coniferous forest, deciduous forest, corn,
wheat, etc.), resulting in a classified image that is essentially a thematic
map of the original image. The theme of the classification is selectable;
thus, a classification can be performed to observe land use patterns, geology,
vegetation types, or rainfall.
The analyst classifying an image must distinguish between
spectral classes and information classes. Spectral classes are groups
of pixels that have nearly uniform spectral characteristics. Information
classes are the various themes or groups the analyst is attempting to
identify in an image. Information classes may include such classes as
deciduous and coniferous forests, various agricultural crop types, or
inland bodies of water. The objective of image classification is to match
the spectral classes in the data to the information classes of interest.
Though any image can be classified, multispectral imagery
tends to be used most often. A single-band image is usually very difficult
to classify since more than one surface type will exhibit the same digital
number. Thus, any spectral classes in a single-band classification will
likely contain several information classes, and distinguishing between
them would be difficult. Normally two or more bands are used for classification,
and their combined digital numbers are used to identify the spectral signatures
of the spectral classes present in the image. The more bands used in
a classification, the more likely the analyst is to obtain a set of unique
land cover classes.
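The single-band ambiguity described above can be illustrated with a small sketch. The digital numbers, the band layout, and the helper class_separation are all made up for illustration; they are not taken from any real scene. Two surface types share nearly the same digital number in band 1, so their class means almost coincide when only that band is used, but they separate clearly once a second band is added.

```python
import numpy as np

# Hypothetical digital numbers (DNs); columns are band 1 and band 2.
pixels = np.array([
    [60, 20],   # water
    [60, 90],   # forest
    [61, 22],   # water
    [59, 88],   # forest
])
labels = np.array(["water", "forest", "water", "forest"])

def class_separation(pixels, labels, bands):
    """Distance between the two class means using only the given bands."""
    sub = pixels[:, bands].astype(float)
    mean_water = sub[labels == "water"].mean(axis=0)
    mean_forest = sub[labels == "forest"].mean(axis=0)
    return float(np.linalg.norm(mean_water - mean_forest))

# Band 1 alone: the class means are almost identical.
sep_one_band = class_separation(pixels, labels, [0])
# Bands 1 and 2 together: the class means are far apart.
sep_two_bands = class_separation(pixels, labels, [0, 1])
print(sep_one_band, sep_two_bands)
```

In this toy data the one-band separation is about one digital number, while the two-band separation is dozens of digital numbers, which is why the combined values are more useful as a spectral signature.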
Supervised Classification
A supervised classification is performed when some prior
or acquired knowledge of the classes in a scene is used to identify representative
samples of different surface cover types. These samples, known as training
sites, are set up to identify the spectral characteristics of each class
of interest. The determination of training sites is based on the analyst's
knowledge of the geographical region and the surface cover types present
in the image. Once the training sites have been established, the numerical
information in all of the image's spectral bands is used to define the
spectral "signature" of each class. Once the computer has determined
the signatures for each class, it will compare every pixel to the signatures
and label it as the class that it is mathematically closest to. Thus,
in a supervised classification, the analyst starts with information classes
and uses these to define spectral classes. Each pixel in the image is
then assigned to the class which it most closely resembles.
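The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming that "mathematically closest" means the smallest Euclidean distance to each class's mean digital number vector (a minimum-distance-to-means rule); other decision rules exist, and the tutorial does not say which one is intended. The function names, class names, and pixel values are hypothetical.

```python
import numpy as np

def train_signatures(training_pixels):
    """Compute one mean DN vector (the "signature") per class.

    training_pixels: dict mapping class name -> list of [band1, band2, ...]
    values sampled from that class's training sites.
    """
    names = list(training_pixels)
    means = np.array([
        np.asarray(training_pixels[name], dtype=float).mean(axis=0)
        for name in names
    ])
    return names, means

def classify_min_distance(image, means):
    """Label every pixel with the index of the nearest class signature.

    image: array of shape (rows, cols, n_bands).
    """
    rows, cols, n_bands = image.shape
    flat = image.reshape(-1, n_bands).astype(float)
    # dists[i, k] is the distance from pixel i to the signature of class k.
    dists = np.linalg.norm(flat[:, None, :] - means[None, :, :], axis=2)
    return dists.argmin(axis=1).reshape(rows, cols)

# Hypothetical two-band training sites for three information classes.
training = {
    "water": [[20, 10], [22, 12], [18, 11]],
    "forest": [[40, 90], [42, 88], [38, 92]],
    "crop": [[80, 120], [82, 118], [78, 122]],
}
names, means = train_signatures(training)

# A tiny 2 x 2 two-band image to classify.
image = np.array([
    [[21, 11], [41, 89]],
    [[79, 121], [30, 40]],
])
class_map = classify_min_distance(image, means)
named_map = np.array(names)[class_map]
print(named_map)
```

Note that the last pixel, [30, 40], does not resemble any training site closely, yet it is still labeled with the nearest class. Every pixel receives a label under this rule, so the quality of the result depends heavily on how representative the training sites are.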
Unsupervised Classification
An unsupervised classification is essentially the opposite
of a supervised classification. The pixels in an image are examined by
the computer and classified into spectral classes. The grouping is based
solely on the numerical information in the data and the spectral classes
are later matched by the analyst to information classes. To create
an unsupervised classification, the analyst typically specifies the number
of spectral classes to identify, and a computer algorithm finds pixels
with similar spectral properties and groups them accordingly. The following
image is an example of a 40-level unsupervised classification of a portion
of Howard County, Maryland. This means that there are 40 distinct spectral
classes in this image, each of which is assigned a gray tone value ranging
from black to white, with intermediate shades of gray.

[40-level unsupervised classification]
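One way to display such a result is to spread the class numbers evenly over the available gray tones. The sketch below assumes an 8-bit display (0 is black, 255 is white) and class indices running from 0 to 39; the function name classes_to_gray is hypothetical, and the image above was not necessarily rendered this way.

```python
import numpy as np

def classes_to_gray(class_map, n_classes):
    """Map class indices 0..n_classes-1 to evenly spaced 8-bit gray tones."""
    tones = np.linspace(0, 255, n_classes).round().astype(np.uint8)
    # Indexing the tone table with the class map recolors every pixel.
    return tones[class_map]

# A tiny hypothetical class map containing four of the 40 classes.
class_map = np.array([[0, 39], [20, 10]])
gray = classes_to_gray(class_map, 40)
print(gray)
```

Class 0 is drawn in black and class 39 in white, with the remaining classes in intermediate shades of gray.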
Programs called clustering algorithms are used to determine
the statistical groupings in the data. Usually, the analyst specifies
how the initial classification should proceed. In addition to specifying
the desired number of classes, the analyst may specify parameters to determine
how close pixels' digital numbers must be to be considered in the same
class. Once the clustering process has run, the analyst may want to combine
or further break down some clusters. Thus, contrary to what its name suggests, an
unsupervised classification often requires interaction with an analyst.
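The workflow described in this section can be sketched as follows. The example assumes a simple k-means style procedure as the clustering algorithm, which is only one of several possibilities and is not named in the tutorial. The starting centers are chosen deterministically by pixel brightness to keep the example repeatable. All function names, pixel values, and class names are hypothetical. The last step shows the analyst combining two spectral classes into a single information class with a lookup table.

```python
import numpy as np

def assign(pixels, centers):
    """Index of the nearest cluster center for every pixel."""
    dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def initial_centers(pixels, n_classes):
    """Pick starting centers evenly spaced along the brightness-sorted pixels."""
    order = np.argsort(pixels.sum(axis=1), kind="stable")
    picks = np.linspace(0, len(pixels) - 1, n_classes).round().astype(int)
    return pixels[order[picks]].copy()

def cluster_pixels(pixels, n_classes, n_iter=20):
    """Group pixels into n_classes spectral classes (k-means style)."""
    pixels = np.asarray(pixels, dtype=float)
    centers = initial_centers(pixels, n_classes)
    for _ in range(n_iter):
        labels = assign(pixels, centers)
        for k in range(n_classes):
            members = pixels[labels == k]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                centers[k] = members.mean(axis=0)
    return assign(pixels, centers), centers

# Hypothetical two-band pixels forming three spectral groups.
pixels = np.array([
    [10, 5], [12, 6], [11, 4], [9, 5], [10, 6],
    [50, 80], [52, 82], [49, 79], [51, 81], [50, 78],
    [55, 120], [57, 122], [54, 119], [56, 121], [55, 118],
])

# The analyst asks for three spectral classes.
spectral, centers = cluster_pixels(pixels, n_classes=3)

# Analyst decision (made up here): spectral classes 1 and 2 are both
# vegetation, so they are merged into one information class.
merge_table = np.array([0, 1, 1])
info_names = np.array(["water", "vegetation"])
merged = merge_table[spectral]
print(info_names[merged])
```

The clustering step uses only the numbers in the data. The merge table and the class names come from the analyst, which is where the interaction described above enters the process.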