Mathematical Medicine and Biology Advance Access published online on November 28, 2006
Mathematical Medicine and Biology, doi:10.1093/imammb/dql030
1 Department of Mathematics, University of Strathclyde, Glasgow G1 1XH, UK
* To whom correspondence should be addressed.

We give a simple and informative derivation of a spectral algorithm for clustering and reordering complementary DNA microarray expression data. Here, expression levels of a set of genes are recorded simultaneously across a number of samples, with a positive weight reflecting up-regulation and a negative weight reflecting down-regulation. We give theoretical support for the algorithm based on a biologically justified hypothesis about the structure of the data, and illustrate its use on public domain data in the context of unsupervised tumour classification. The algorithm is derived by considering a discrete optimization problem and then relaxing to the continuous realm. We prove that in the case where the data have an inherent checkerboard sign pattern, the algorithm will automatically reveal that pattern. Further, our derivation shows that the algorithm may be regarded as imposing a random graph model on the expression levels and then clustering from a maximum likelihood perspective. This indicates that the output will be tolerant to perturbations and will reveal near-checkerboard patterns when these are present in the data. It is interesting to note that the checkerboard structure is revealed by the first (dominant) singular vectors; previous work on spectral methods has focussed on the case of nonnegative edge weights, where only the second and higher singular vectors are relevant. We illustrate the algorithm on real and synthetic data, and then use it in a tumour classification context on three different cancer data sets. Our results show that respecting the two-signed nature of the data (thereby distinguishing between up-regulation and down-regulation) reveals structures that cannot be gleaned from the absolute value data (where up- and down-regulation are both regarded as changes).
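The role of the dominant singular vectors can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that genes and samples are reordered by sorting the entries of the first left and right singular vectors of the raw, unscaled data matrix. The function name `spectral_reorder`, the synthetic data, and the absence of any normalization step are all assumptions made for this example.

```python
import numpy as np


def spectral_reorder(W):
    """Sort rows and columns of a two-signed matrix by its dominant singular vectors.

    Hypothetical helper for illustration; no scaling of W is applied (an assumption).
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    u = U[:, 0]   # dominant left singular vector (one entry per gene)
    v = Vt[0, :]  # dominant right singular vector (one entry per sample)
    return np.argsort(u), np.argsort(v), u, v


# Synthetic data (assumed for illustration): a checkerboard sign pattern
# with positive magnitudes, then randomly scrambled rows and columns.
rng = np.random.default_rng(0)
gene_group = np.array([1] * 6 + [-1] * 4)
sample_group = np.array([1] * 5 + [-1] * 3)
magnitudes = rng.uniform(0.5, 1.5, size=(10, 8))
W_clean = np.outer(gene_group, sample_group) * magnitudes
W = W_clean[rng.permutation(10)][:, rng.permutation(8)]

row_order, col_order, u, v = spectral_reorder(W)
W_sorted = W[row_order][:, col_order]

# The sign pattern of the reordered matrix shows the four blocks.
print(np.sign(W_sorted).astype(int))
```

On exact checkerboard input of this kind, the signs of the two dominant singular vectors split genes and samples into two groups each, so the reordered matrix shows a two-by-two block sign pattern. Noisy data would only approximate this.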
Received June 22, 2005
Revised June 23, 2006
Article
Spectral analysis of two-signed microarray expression data
Desmond J. Higham 1 *, Gabriela Kalna 1, and J. Keith Vass 2
2 The Beatson Institute for Cancer Research, Glasgow G61 1BD, UK
Desmond J. Higham, E-mail: djh{at}maths.strath.ac.uk