Computational Proteomics: Algorithms for Classifying Prostate Cancer
Collaborators:
Professors Dayanand Naik, Michael Wagner (Math and Statistics, ODU);
Srinivas Kasukurti, Raghu Ram Devineni (Computer Science, ODU);
Professors George Wright, Jr., John Semmes, Bao-Ling Adam (Eastern Virginia
Medical School).
Objective:
One of the goals of proteomics is to develop
fast, automatable, and inexpensive techniques to identify
proteins from mixtures, to match the recent developments
in genomics. One novel technology for accomplishing this
is Surface-Enhanced Laser Desorption/Ionization (SELDI)
mass spectrometry.
We are collaborating with Prof. Wright's group in analyzing
protein mass spectral data obtained from SELDI on
serum and prostate fluid samples.
The goal is to use the protein signatures from mass spectra to
classify patient samples into four groups: healthy, early prostate cancer,
late prostate cancer, and benign prostatic hyperplasia (BPH).
Approach:
The data obtained from a SELDI mass spectrum form
an array of intensities recorded at varying values of
the mass-to-charge ratio (hereafter called "mass").
We process these data to extract peaks,
and then assign each peak to an interval on the mass axis,
so that every sample is represented by intensities measured
over the same set of mass intervals.
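The peak extraction and alignment step described above can be sketched as follows. This is only a minimal illustration, not the procedure used in the study: the local-maximum rule, the `min_height` threshold, the fixed `bin_edges`, and the function names `find_peaks` and `bin_peaks` are all assumptions made for the example.

```python
import bisect


def find_peaks(masses, intensities, min_height=0.0):
    """Return (mass, intensity) pairs at local maxima above min_height.

    Assumed rule: a point is a peak if it exceeds its left neighbor and
    is at least as large as its right neighbor.
    """
    peaks = []
    for i in range(1, len(intensities) - 1):
        y = intensities[i]
        if y > min_height and y > intensities[i - 1] and y >= intensities[i + 1]:
            peaks.append((masses[i], y))
    return peaks


def bin_peaks(peaks, bin_edges):
    """Map peaks onto fixed mass intervals shared by all samples.

    bin_edges is an ascending list [e0, e1, ..., en]; interval j covers
    [e_j, e_{j+1}). Each interval keeps its largest peak intensity, and
    intervals without a peak get 0.0. Peaks outside the edges are dropped.
    """
    features = [0.0] * (len(bin_edges) - 1)
    for mass, height in peaks:
        j = bisect.bisect_right(bin_edges, mass) - 1
        if 0 <= j < len(features):
            features[j] = max(features[j], height)
    return features
```

Applying `bin_peaks` with the same `bin_edges` to every spectrum yields feature vectors of equal length, which is what the classification procedures need as input.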
Using samples known to belong to each of the four groups as training data,
we have employed several statistical classification procedures
to classify samples whose group is unknown.
Among these are Fisher's linear discriminant analysis,
quadratic discriminant analysis, non-parametric methods,
k-nearest neighbor classifiers, and support vector machines.
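As an illustration of one of the listed procedures, here is a minimal sketch of a k-nearest neighbor classifier acting on binned intensity vectors. The Euclidean distance, the default `k=3`, and the name `knn_predict` are assumptions for the example; the source does not specify these choices.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the group of sample x by majority vote of its k nearest
    training samples (Euclidean distance is an assumed choice)."""
    # Pair every training sample's distance to x with its group label.
    dists = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    # Count the labels among the k closest samples and take the most common.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```

Here `train_X` is a list of equal-length intensity vectors and `train_y` holds the known group of each training sample.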
We have also employed protein selection and principal component analysis
to reduce the dimension of the data and to improve the classification.
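To make the data reduction step concrete, the sketch below computes the first principal component of a set of feature vectors by power iteration and projects each sample onto it. It is an illustration under assumptions, not the study's implementation: the function names, the fixed starting vector, and the iteration count are hypothetical, and a real analysis would normally keep several components and use a numerical library.

```python
import math


def first_principal_component(X, iterations=200):
    """Return (mean, direction), where direction is a unit vector along
    the direction of largest sample variance, found by power iteration."""
    n, d = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in X]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Assumed starting vector; power iteration fails in the special case
    # where the start is exactly orthogonal to the leading direction.
    start = [float(j + 1) for j in range(d)]
    norm = math.sqrt(sum(c * c for c in start))
    v = [c / norm for c in start]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    return mean, v


def project(X, mean, direction):
    """Replace each sample by its score along the given direction."""
    return [
        sum((row[j] - mean[j]) * direction[j] for j in range(len(mean)))
        for row in X
    ]
```

Each sample is thereby reduced from a full intensity vector to a single score; classifiers can then be trained on the scores instead of the raw intensities.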
Currently we are able to classify normal and late cancer samples
with small error rates.
With continued development of our methods together with
suitable normalization techniques, we expect to distinguish
the early cancer and BPH samples from each other.
We are completing a study that compares the error rates
of the various discrimination techniques in classifying prostate cancer.
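One common way to compare error rates across classifiers is leave-one-out cross-validation, sketched below. The source does not say how the error rates in the study are estimated, so the use of leave-one-out here is an assumption, as are the two simple stand-in classifiers (`nearest_neighbor` and `nearest_centroid`) and all function names.

```python
import math


def nearest_neighbor(train_X, train_y, x):
    """Assign x the label of its single closest training sample."""
    best = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return train_y[best]


def nearest_centroid(train_X, train_y, x):
    """Assign x the label of the closest group mean."""
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {
        label: [s / counts[label] for s in acc] for label, acc in sums.items()
    }
    return min(centroids, key=lambda label: math.dist(centroids[label], x))


def loo_error_rate(classify, X, y):
    """Hold out each sample in turn, train on the rest, and return the
    fraction of held-out samples that are misclassified."""
    errors = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if classify(train_X, train_y, X[i]) != y[i]:
            errors += 1
    return errors / len(X)
```

Calling `loo_error_rate` once per classifier on the same labeled samples gives directly comparable error estimates, since every sample is tested by a model that never saw it during training.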
We plan to apply our methods to SELDI mass spectral data from
other cancers. Once the error rates in classification are low
enough, SELDI mass spectral analysis could become a
convenient, low-cost diagnostic tool for cancer.
We also plan a longitudinal study, spanning patients of different
ages and races, of how protein signatures change as patients
are treated for cancer.