Models for Spectral Clustering and Their Applications
- Citation Author(s):
- Submitted by:
- Andrew Knyazev
- Last updated:
- 23 February 2016 - 1:44pm
- Document Type:
- Student Thesis
- Document Year:
Ph.D. Thesis by Donald McCuan (advisor Andrew Knyazev), Department of Mathematical and Statistical Sciences, University of Colorado Denver, 2012, originally posted at http://math.ucdenver.edu/theses/McCuan_PhdThesis.pdf (1018)
In this dissertation the concept of spectral clustering will be examined. We will
start by discussing biclustering of images via spectral clustering and give a justification
for this technique by analogy to vibrational problems that is independent of that given
by relaxation of a combinatorial optimization problem. The importance in clustering
of the Fiedler vector and some theorems by Fiedler are emphasized. We will extend
Fiedler's theorem to the case of the generalized eigenvalue problem.
By examining the application of these theories to the clustering problem we hope
to develop a better understanding of the eigenfunctions and their use in clustering.
Practical problems with clustering that occur due to construction of edge weights are
Courant's Nodal Domains Theorem (CNLT) as an analog of the Fiedler vector for
eigenvectors of higher dimension are studied and the literature for discrete CNLTs
are reviewed. A new definition for r-weak sign graphs is presented and a modified
discrete CNLT theorem for r-weak sign graphs is introduced. The application of these
to spectral clustering is discussed.
The discussion of spectral clustering is continued via an examination of clustering
on DNA micro arrays. This allows us to develop an algorithm for successive biclustering.
In this we develop a new technique and theorem for dealing with disconnected
graph components. All of this is incorporated in new MATLAB software. Results of
clustering using real micro array data is presented.
The last section deals with the software package Block Locally Optimal Preconditioned
Eigenvalue Xolver (BLOPEX) which as part of the Authors graduate work was
upgraded to Version 1.1. BLOPEX implements the Locally Optimal BLock Preconditioned
Conjugate Gradient (LOBPCG) method for solution of very large, sparse,
symmetric (or Hermitian) generalized eigenvalue problems.
Version 1.1 of BLOPEX adds (amongst other things) support for complex matrices
and 64bit integers. BLOPEX provides interfaces to independently developed
software packages such as PETSc and Hypre which provide high quality preconditioners
and parallel MPI-based processing support. There is also support for execution
of BLOPEX from MATLAB and stand alone C programs. This was a multi person
eort. The Authors contribution was in recoding the abstract routines and PETSc
and Hypre interfaces for the new functionality, the development of a new complex
driver test program, and aid and assistance in testing, debugging and documentation
of all interfaces.
We will describe the BLOPEX software, design decisions, the LOBPCG algorithm
as implemented, and all of the various interfaces. Some numerical results will be