Classification and Computational Methods in Gene Expression Data Analysis

Author

Cecilia Ritz

Summary, in English

The technology of cDNA microarrays has given us the possibility to monitor the state of cells by measuring the activity of thousands of genes simultaneously. In cancer research, this high-throughput technique has allowed exploratory studies of the molecular mechanisms behind, for example, metastasis and response to therapy. This increased knowledge can hopefully result in new therapies and improved prognostic and predictive tools. These tools, however, have to be properly validated in large cohorts and subjected to large-scale trials before use in the clinic.

One aim of this thesis is to evaluate the performance of classifiers of clinical outcome for breast cancer based on gene expression data, as compared to conventional clinical markers. Additionally, we develop computational methods for analysis and classification using gene expression data. Our results suggest that clinical markers and molecular profiling have similar power in breast cancer prognosis. Further studies using larger cohorts are thus needed to validate and refine molecular prognostic profiles. We have also performed multicategory classification of leukemia into genetic subtypes and have predicted response to therapy in a subgroup. The main contribution to the computational analysis is our development of a method that improves missing value imputation for two-dye cDNA microarray data. Recognizing that some categories of missing values are over- or underestimated by a kNN-based imputation method, we suggest a linear model that corrects for this bias and improves imputation of these spots.
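As an illustration of the kind of kNN-based imputation the summary refers to, the following is a minimal sketch in Python with NumPy. It is not the method developed in the thesis: the function name `knn_impute`, the root-mean-square distance over shared observed samples, the unweighted averaging of donor genes, and the toy matrix are all assumptions made for this example. The linear bias-correction model is not shown, since the summary does not specify its form.

```python
import numpy as np


def knn_impute(X, k=3):
    """Fill NaNs in a genes x samples matrix from the k most similar genes.

    Hypothetical helper for illustration only; not the thesis implementation.
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    missing = np.isnan(X)

    # Only genes (rows) with at least one missing value need work.
    for i in np.where(missing.any(axis=1))[0]:
        observed = ~missing[i]
        neighbours = []
        for j in range(X.shape[0]):
            if j == i:
                continue
            # Compare the two genes on samples observed in both.
            shared = observed & ~missing[j]
            if not shared.any():
                continue
            dist = np.sqrt(np.mean((X[i, shared] - X[j, shared]) ** 2))
            neighbours.append((dist, j))
        neighbours.sort()  # nearest genes first

        for c in np.where(missing[i])[0]:
            # Take the k nearest genes that have a value in this sample.
            donors = [X[j, c] for _, j in neighbours if not missing[j, c]][:k]
            if donors:
                filled[i, c] = np.mean(donors)
    return filled


# Toy data (made up): 4 genes x 4 samples, one missing spot.
demo = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [1.0, 2.0, np.nan, 4.0],
    [1.1, 2.1, 3.2, 4.1],
    [10.0, 20.0, 30.0, 40.0],
])
demo_filled = knn_impute(demo, k=2)
```

In this toy case the missing spot is filled with the mean of the values from the two most similar genes. A bias correction of the kind described above would then adjust such estimates for the categories of spots where the kNN estimate is systematically too high or too low.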

Department/s

Computational Biology and Biological Physics - Undergoing reorganization

Publishing year

2007

Language

English

Links

Publication in Lund University research portal

Document type

Dissertation

Publisher

Department of Theoretical Physics, Lund University

Topic

Biophysics

Keywords

Bioinformatics
medical informatics
biomathematics
biometrics
missing values
leukemia
cDNA microarray data
supervised classification
breast cancer
prognostic markers

Status

Published

Supervisor

Patrik Edén

ISBN/ISSN/Other

ISBN: 978-91-628-7159-8

Defence date

11 May 2007

Defence time

10:15

Defence place

Lecture Hall F, Dept. of Physics

Opponent

Carlos Caldas (Professor)
