Neural Network Approaches To Survival Analysis
Author
Summary, in English
Predicting the probable survival for a patient can be very challenging
for many diseases. In many forms of cancer, the choice of treatment
can be directly impacted by the estimated risk for the patient. This
thesis explores different methods to predict the patient's survival
chances using artificial neural networks (ANN).
An ANN is a machine learning technique inspired by how
neurons in the brain function. It is capable of learning to recognize
patterns by looking at labeled examples, so-called supervised
learning. Certain characteristics of medical data make it difficult to
use ANN methods, and the articles in this thesis investigate different
methods of overcoming those difficulties.
One of the most prominent difficulties is a form of missing
data known as censoring. Survival data usually originates from medical
studies, which are conducted only during a limited time period, for
example five years. During this time, some patients will leave
the study for various reasons, such as death from unrelated causes. Some
patients will also survive the study without experiencing cancer
recurrence or death. These patients provide partial information about
the survival characteristics of the disease but are challenging to
include in statistical models.
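
As an illustration of the censoring described above, the following is a minimal sketch of how one patient's follow-up could be turned into an (observed time, event observed) pair. The function name `observe`, its parameters, and the example patients are hypothetical and are not taken from the thesis.

```python
from typing import Optional, Tuple


def observe(true_event_time: Optional[float],
            dropout_time: Optional[float],
            study_length: float) -> Tuple[float, bool]:
    """Return (observed_time, event_observed) for one patient.

    true_event_time: time of recurrence or death, None if it never occurs
    dropout_time:    time the patient leaves the study early, None if they stay
    study_length:    length of the follow-up window, e.g. 5.0 years
    (All three names are assumptions made for this illustration.)
    """
    # The patient is followed until the study ends or they leave it.
    if dropout_time is None:
        last_follow_up = study_length
    else:
        last_follow_up = min(dropout_time, study_length)

    if true_event_time is not None and true_event_time <= last_follow_up:
        return true_event_time, True   # event seen: exact time is known
    return last_follow_up, False       # censored: only a lower bound is known


# Hypothetical patients: (true event time, dropout time, study length)
patients = [
    (2.0, None, 5.0),   # event observed after 2 years
    (7.5, None, 5.0),   # event happens after the study has ended
    (4.0, 3.0, 5.0),    # patient leaves the study before the event
]
records = [observe(*p) for p in patients]
```

A censored record only says that the patient was event-free up to the recorded time, which is the partial information the paragraph above refers to.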
Articles 1-3 and 5 utilize a genetic algorithm to train ANN
models to maximize (or minimize) non-differentiable functions, which
cannot be combined with traditional ANN training techniques
that rely on gradient information. One of these functions is the
concordance index, which compares survival predictions in a pair-wise
fashion. This function is often used to compare prognostic models in
survival analysis, and is maximized directly using the genetic
algorithm approach. In contrast, Article 5
tries to produce the best grouping of the patients into low,
intermediate, or high risk by maximizing or minimizing the area under
the survival curve.
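
The pair-wise comparison behind the concordance index can be sketched as follows. This is a minimal illustration of the commonly used definition (Harrell's C), not the implementation used in the articles. It assumes that a pair is comparable only when the patient with the shorter time had an observed event, that pairs with tied times are skipped, and that tied predictions count as half. The function and variable names are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def concordance_index(times: Sequence[float],
                      events: Sequence[bool],
                      predicted: Sequence[float]) -> float:
    """Fraction of comparable pairs whose predicted order matches the data.

    times:     observed follow-up times
    events:    True if the event was observed, False if censored
    predicted: predicted survival times (larger means longer survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # shorter time is censored, so the true order is unknown
        comparable += 1
        if predicted[first] < predicted[second]:
            concordant += 1.0   # prediction agrees with the observed order
        elif predicted[first] == predicted[second]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: four patients, one of them censored.
times = [2.0, 5.0, 3.0, 6.0]
events = [True, False, True, True]
predicted = [1.0, 4.0, 5.0, 3.0]
c_index = concordance_index(times, events, predicted)
```

Because the value depends only on the ordering of the predictions, it changes in steps rather than smoothly as the model outputs change. This is why it provides no useful gradient and is a natural target for a gradient-free method such as a genetic algorithm.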
Article 4 does not use a genetic
algorithm but instead modifies the
underlying data. Regular gradient methods are used to train ANNs on
survival data where censored times are estimated in a maximum
likelihood framework.
Publishing year
2015
Language
English
Full text
- Available as PDF - 403 kB
Document type
Dissertation
Publisher
Department of Astronomy and Theoretical Physics, Lund University
Topic
- Biophysics
- Physical Sciences
Keywords
- Survival Analysis
- Artificial Neural Networks
- Machine Learning
- Genetic Algorithms
- Evolutionary Algorithms
- Fysicumarkivet:2015:Kalderstam
Status
Published
Supervisor
ISBN/ISSN/Other
- ISBN: 978-91-7623-307-8
- ISBN: 978-91-7623-308-5
Defence date
29 May 2015
Defence time
13:15
Defence place
Sal F, Fysikum, Sölvegatan 14A, 221 00 Lund
Opponent
- Azzam Taktak