Genetic Algorithms as Tool for Statistical Analysis of High-Dimensional Data Structures
Rüdiger Krause
ISBN 978-3-8325-0661-2
213 pages, year of publication: 2004
Price: 40.50 €
In regression, the objective is to determine an appropriate function
that reflects reality as accurately as possible while smoothing out
irregularities caused by noise in the data, so that it remains easy to interpret.
A popular and flexible approach for estimating the true underlying function is the additive model. One possible approach for fitting additive models is the expansion in B-splines, which allows direct calculation of the estimators. If the number of B-splines is too large, the estimated functions become wiggly and follow the observed data too closely. To avoid this problem of overfitting, we use a penalization approach characterized by smoothing parameters. In this thesis we propose the use of genetic algorithms for smoothing parameter optimization. Genetic algorithms, which are rarely applied in statistics, are based on the principle that better-adapted individuals prevail over their competitors under equal conditions. Apart from smoothing parameter optimization, the user often faces datasets containing large numbers of relevant and irrelevant explanatory variables. Appropriate variable selection approaches make it possible to reduce the set of variables to a subset of relevant ones.
We propose addressing the problems of variable selection and choice of smoothing parameters simultaneously by using genetic algorithms. Our approach is based on an appropriate combination of the genetic algorithms for smoothing parameter optimization and for variable selection.
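
The idea of searching over variable subsets and smoothing parameters at the same time can be illustrated with a minimal sketch. It is not the algorithm of the thesis: the second-order difference penalty, the GCV criterion, the chromosome layout (one inclusion bit and one log10 smoothing parameter per variable), and all names and tuning values (bspline_basis, gcv, genetic_search, N_BASIS, population size, mutation rates) are assumptions made here for illustration only.

```python
import numpy as np

N_BASIS, DEGREE = 8, 3  # illustrative choices, not taken from the thesis


def bspline_basis(x, n_basis=N_BASIS, degree=DEGREE):
    """B-spline design matrix on equally spaced knots (Cox-de Boor recursion)."""
    n_seg = n_basis - degree
    h = (x.max() - x.min()) / n_seg
    knots = x.min() + h * np.arange(-degree, n_seg + degree + 1)
    xc = x[:, None]
    # Degree 0: indicator of the knot interval containing each observation.
    B = ((xc >= knots[:-1]) & (xc < knots[1:])).astype(float)
    for d in range(1, degree + 1):
        left = (xc - knots[: -d - 1]) / (d * h)
        right = (knots[d + 1:] - xc) / (d * h)
        B = left * B[:, :-1] + right * B[:, 1:]
    return B


def gcv(mask, log_lam, bases, y, DtD):
    """Generalized cross-validation score of the penalized additive fit (assumed criterion)."""
    n = len(y)
    yc = y - y.mean()                      # the intercept is handled by centering
    idx = np.flatnonzero(mask)
    if idx.size == 0:                      # intercept-only model
        return n * float(yc @ yc) / (n - 1.0) ** 2
    X = np.hstack([bases[j] for j in idx])
    # Block-diagonal penalty: one smoothing parameter per selected variable.
    P = np.kron(np.diag(10.0 ** log_lam[idx]), DtD)
    # Tiny ridge term (a numerical assumption) keeps the system nonsingular.
    A = X.T @ X + P + 1e-6 * np.eye(X.shape[1])
    S = np.linalg.solve(A, X.T)
    rss = float(np.sum((yc - X @ (S @ yc)) ** 2))
    edf = 1.0 + float(np.sum(X * S.T))     # 1 for the intercept + trace of the hat matrix
    return n * rss / (n - edf) ** 2


def genetic_search(X, y, pop_size=30, n_gen=40, seed=0):
    """Evolve (inclusion bits, log10 smoothing parameters) to minimize GCV."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    bases = []
    for j in range(p):
        B = bspline_basis(X[:, j])
        bases.append(B - B.mean(axis=0))   # centered basis for identifiability
    D = np.diff(np.eye(N_BASIS), n=2, axis=0)
    DtD = D.T @ D
    masks = rng.integers(0, 2, size=(pop_size, p))
    lams = rng.uniform(-3, 3, size=(pop_size, p))

    def evaluate():
        return np.array([gcv(masks[i], lams[i], bases, y, DtD) for i in range(pop_size)])

    for _ in range(n_gen):
        fit = evaluate()
        best = int(np.argmin(fit))
        new_masks, new_lams = [masks[best].copy()], [lams[best].copy()]  # elitism
        while len(new_masks) < pop_size:
            # Binary tournament selection of two parents.
            a, b = (min(rng.choice(pop_size, 2, replace=False), key=lambda i: fit[i])
                    for _ in range(2))
            take = rng.random(p) < 0.5     # uniform crossover
            m = np.where(take, masks[a], masks[b])
            lam = np.where(take, lams[a], lams[b])
            flip = rng.random(p) < 0.1     # bit-flip mutation on the variable mask
            m = np.where(flip, 1 - m, m)
            lam = np.clip(lam + rng.normal(0, 0.3, p), -4, 4)  # Gaussian mutation
            new_masks.append(m)
            new_lams.append(lam)
        masks, lams = np.array(new_masks), np.array(new_lams)
    fit = evaluate()
    best = int(np.argmin(fit))
    return masks[best].astype(bool), 10.0 ** lams[best], float(fit[best])


# Synthetic demo: variables 0 and 1 carry signal, variables 2 and 3 are pure noise.
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.uniform(size=(200, 4))
y_demo = (2 * np.sin(2 * np.pi * X_demo[:, 0]) + 8 * (X_demo[:, 1] - 0.5) ** 2
          + demo_rng.normal(0, 0.2, 200))
mask, lam, score = genetic_search(X_demo, y_demo)
print("selected variables:", np.flatnonzero(mask))
print("smoothing parameters:", lam, "GCV:", score)
```

Encoding both decisions in one chromosome lets crossover and mutation trade off model size against smoothness within a single search, which is the kind of combination the abstract describes. The smoothing parameter of an excluded variable simply has no effect on the fit.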