Robustness and generalisation : tangent hyperplanes and classification trees
MetadataShow full item record
The issue of robust training is tackled for fixed multilayer feedforward architectures. Several researchers have proved the theoretical capabilities of Multilayer Feedforward networks but in practice the robust convergence of standard methods like standard backpropagation, conjugate gradient descent and Quasi-Newton methods may be poor for various problems. It is suggested that the common assumptions about the overall surface shape break down when many individual component surfaces are combined and robustness suffers accordingly. A new method to train Multilayer Feedforward networks is presented in which no particular shape is assumed for the surface and where an attempt is made to optimally combine the individual components of a solution for the overall solution. The method is based on computing Tangent Hyperplanes to the non-linear solution manifolds. At the core of the method is a mechanism to minimise the sum of squared errors and as such its use is not limited to Neural Networks. The set of tests performed for Neural Networks show that the method is very robust regarding convergence of training and has a powerful ability to find good directions in weight space. Generalisation is also a very important issue in Neural Networks and elsewhere. Neural Networks are expected to provide sensible outputs for unseen inputs. A framework for hyperplane based classifiers is presented for improving average generalisation. The framework attempts to establish a trained boundary so that there is an optimal overall spacing from the boundary to training points closest to this boundary. The framework is shown to provide results consistent with the theoretical expectations.
Thesis, PhD Doctor of Philosophy
Items in the St Andrews Research Repository are protected by copyright, with all rights reserved, unless otherwise indicated.