What is (fuzzy) cluster analysis?

Cluster analysis divides data into groups (clusters) such that
similar data objects belong to the same cluster and dissimilar data
objects to different clusters. The resulting data partition improves
data understanding and reveals its internal structure. Partitional
clustering algorithms divide up a data set into clusters or classes,
where similar data objects are assigned to the same cluster whereas
dissimilar data objects should belong to different clusters. In real
applications there is very often no sharp boundary between clusters so
that fuzzy clustering is often better suited for the data. Membership
degrees between zero and one are used in fuzzy clustering instead of
crisp assigments of the data to clusters. The most prominent fuzzy
clustering algorithm is the fuzzy c-means, a fuzzification of k-Means
or ISODATA.
Areas of application of fuzzy cluster analysis include for example
data analysis, pattern recognition, and image segmentation. The
detection of special geometrical shapes like circles and ellipses can
be achieved by so-called shell clustering algorithms. Fuzzy clustering
belongs to the group of soft computing techniques (which include
neural nets, fuzzy systems, and genetic algorithms).
The family of objective function-based fuzzy clustering
algorithms includes, amongst others, the ...
- fuzzy c-means algorithm (FCM): spherical clusters of
approximately the same size
- Gustafson-Kessel algorithm (GK): ellipsoidal clusters
with approx. the same size; there are also axis-parallel
variants of this algorithm; can also be used to detect lines
(to some extent)
- Gath-Geva algorithm (GG) / Gaussian mixture decomposition
(GMD): ellipsoidal clusters with varying size; there are also
axis-parallel variants of this algorithm; can also be used to
detect lines (to some extent)
- fuzzy c-varieties algorithm (FCV): detection of linear
manifolds (infinite lines in 2D)
- adaptive fuzzy c-varieties algorithm (AFC): detection of
line segments in 2D data
- fuzzy c-shells algorithm (FCS): detection of circles (no
closed form solution for prototypes)
- fuzzy c-spherical shells algorithm (FCSS): detection of
circles
- fuzzy c-rings algorithm (FCR): detection of circles
- fuzzy c-quadric shells algorithm (FCQS): detection of
ellipsoids
- fuzzy c-rectangular shells algorithm (FCRS): detection of
rectangles (and variants thereof)
|
To probe further...
You can find some of my publications related to fuzzy clustering
below, a few of them are available on-line (g'zipped postscript
(.ps.gz) and portable document format (.pdf)).
- Introduction to Fuzzy Clustering: Many popular fuzzy
clustering and shell clustering algorithms are discussed in
this book.
- Efficient Implementation: For large data sets a
simple data organisation can help to reduce the runtime of the
fuzzy c-means algorithm significantly.
- Convergence of Fuzzy Clustering: A paper about the
relationship of fixed points and saddle points of FCM and
convergence of (probabilistic as well as possibilistic)
fuzzy clustering algorithms in general.
- Learning Fuzzy Systems from Data: The advantage of
fuzzy systems is their interpretability, which is based on the
fact that the membership functions partition the data space
appropriately. Many approaches in the literature, however, do
not have this property.
- F. Höppner, F. Klawonn, P. Eklund:
Learning Indistinguishability from Data.
Soft Computing Journal 6(1), pp. 6-13, 2002
- F. Höppner, F. Klawonn:
Obtaining Interpretable Fuzzy Models from Fuzzy Clustering and Fuzzy Regression.
Proc. of the 4th Int. Conf. on Knowledge-Based Intelligent Engineering
Systems and Allied Technologies (KES), Brighton, UK, pp. 162-165, 2000.
[ .ps.gz ]
[ .pdf ]
- Shell Clustering: A paper about the detection of
rectangles and similar shapes in images.
- Fuzzification: The fuzzification in fuzzy
clustering is obtained by introducing a so called "fuzzifier"
in the objective function of crisp k-means. Although it is
necessary to have such a concept, otherwise the membership
degrees will stay crisp, there are also some drawbacks of this
way of fuzzification. The following paper shows an alternative
approach.
|