note 1.2
AISRP workshop '97 presentation, E. Merényi,
U of Arizona, LPL
Even moderately high dimensionality causes some traditional techniques of data exploration, classification and visualization to fail or become impractical. For example, the number of pairwise scattergrams or Principal Component plots grows too large to evaluate by eye. Linear methods (such as PCA) may not be able to separate classes with subtle distinguishing features, which is often the situation with spectral signatures of surface cover materials.
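To give a feel for how quickly the number of pairwise plots grows, here is a minimal sketch that counts the distinct two-axis views of an n-dimensional data set, n(n-1)/2. The function name and the channel counts other than 13 are illustrative assumptions, not taken from the presentation.

```python
import math

def pairwise_plot_count(n_dims):
    """Number of distinct pairwise scattergrams for n_dims axes: n(n-1)/2."""
    return math.comb(n_dims, 2)

# 13 matches the asteroid data set discussed in this note;
# 50 and 200 are illustrative channel counts only.
for n in (13, 50, 200):
    print(n, pairwise_plot_count(n))
# prints:
# 13 78
# 50 1225
# 200 19900
```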
Real compositional/mineralogical differences may be manifested in slight spectral differences, which are resolved in hyperspectral data but may be lost if dimensionality reduction is performed just to accommodate some of our old-time favorite classifiers. For example, the Maximum Likelihood classifier (along with any covariance-based method) requires at least N+1 training samples per class, where N is the number of spectral channels, which may be a prohibitively large number in many real field applications. Some small but interesting classes may not even contain that many pixels. Reducing the number of channels, however, can cause a loss of distinction among classes, and thus would negate the advantage of high spectral resolution sensors. See note 3.1 for more illustration.
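The N+1 requirement comes from the class covariance matrix: with m training samples its rank is at most m-1, so it cannot be inverted in N dimensions unless m is at least N+1. The sketch below checks this on toy data. The helper names and the training set (the origin plus unit vectors, chosen so the ranks are exact) are assumptions for illustration, not real spectra.

```python
from fractions import Fraction

def sample_covariance(samples):
    """Unbiased sample covariance matrix of a list of equal-length vectors."""
    m, n = len(samples), len(samples[0])
    mean = [sum(Fraction(s[j]) for s in samples) / m for j in range(n)]
    centered = [[Fraction(s[j]) - mean[j] for j in range(n)] for s in samples]
    return [[sum(row[i] * row[j] for row in centered) / (m - 1)
             for j in range(n)] for i in range(n)]

def matrix_rank(matrix):
    """Rank by exact Gaussian elimination (Fractions avoid round-off)."""
    rows = [list(r) for r in matrix]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def toy_training_set(n_channels, n_samples):
    """Hypothetical class: the origin plus (n_samples - 1) unit vectors."""
    points = [[0] * n_channels]
    for k in range(n_samples - 1):
        p = [0] * n_channels
        p[k] = 1
        points.append(p)
    return points

N = 4  # number of channels in this toy example
for m in (N, N + 1):
    cov = sample_covariance(toy_training_set(N, m))
    print(m, "samples -> covariance rank", matrix_rank(cov), "of", N)
# prints:
# 4 samples -> covariance rank 3 of 4
# 5 samples -> covariance rank 4 of 4
```

With only N samples the covariance matrix is singular, so the Maximum Likelihood discriminant, which needs its inverse, cannot be evaluated for that class.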
Figure at left: Principal Component
views of a 13-dimensional spectral data set of asteroids. The different colors
and shapes indicate known compositional types. Notice that the clusters of
the various types do not separate in any of these PCA plots. They do not
separate in the rest of the 78 pairwise PCA plots either.
Figure at right: Cluster map of the same data as above, prepared by a Self-Organizing Artificial Neural Net. The clusters are separated by the white fences; the fence width indicates the measure of dissimilarity between groups. Here, all clusters can be seen in one 2-dimensional representation, clearly separated. Olivine-rich (So) and pyroxene-rich (Sp) subclasses of S asteroids, for example, were detected with this technique while formal identification eluded conventional methods such as PCA or minimum tree clustering. Figure and details in Merényi et al., Prediction of water in asteroids from spectral data shortward of 3 microns, ICARUS, 129, pp 421-439, 1997.
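The fence idea in the cluster map can be sketched as follows. This assumes that the dissimilarity between two neighboring map cells is the Euclidean distance between their prototype (weight) vectors; the presentation does not state the exact measure. The function name and the tiny hand-made lattice are hypothetical, and no network training is shown.

```python
import math

def fence_widths(grid):
    """Dissimilarity between adjacent cells of a 2-D map lattice.

    grid[r][c] is the prototype vector of cell (r, c). Returns the fences
    between horizontal neighbors and between vertical neighbors.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    horizontal = [[math.dist(grid[r][c], grid[r][c + 1])
                   for c in range(n_cols - 1)] for r in range(n_rows)]
    vertical = [[math.dist(grid[r][c], grid[r + 1][c])
                 for c in range(n_cols)] for r in range(n_rows - 1)]
    return horizontal, vertical

# Hypothetical 2x3 lattice of 2-channel prototypes: the two left columns
# hold similar prototypes, the right column a clearly different one.
grid = [
    [(0.0, 0.0), (0.0, 1.0), (3.0, 5.0)],
    [(0.0, 0.0), (0.0, 1.0), (3.0, 5.0)],
]
horizontal, vertical = fence_widths(grid)
print(horizontal)  # wide fence (5.0) before the last column
print(vertical)    # no fence between the two rows (all 0.0)
```

Drawing each fence with a width proportional to these values would give a wide fence where neighboring prototypes differ strongly and no visible fence inside a homogeneous cluster.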