Conventional applications of neural networks usually predict a single value as a function of given inputs. In forecasting, for example, a standard objective is to predict the future value of some entity of interest on the basis of a time series of past measurements or observations. Typical training schemes aim to minimise the sum of squared deviations between predicted and actual values (the 'targets'), by which, ideally, the network learns the conditional mean of the target given the input. If the underlying conditional distribution is Gaussian or at least unimodal, this may be a satisfactory approach. However, for a multimodal distribution, the conditional mean does not capture the relevant features of the system, and the prediction performance will, in general, be very poor. This calls for a more powerful and sophisticated model, which can learn the whole conditional probability distribution. Chapter 1 demonstrates that even for a deterministic system and 'benign' Gaussian observational noise, the conditional distribution of a future observation, conditional on a set of past observations, can become strongly skewed and multimodal. In Chapter 2, a general neural network structure for modelling conditional probability densities is derived, and it is shown that a universal approximator for this extended task requires at least two hidden layers. A training scheme is developed from a maximum likelihood approach in Chapter 3, and the performance of this method is demonstrated on three stochastic time series in Chapters 4 and 5.
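To make the contrast concrete, the following sketch (not the book's DSM or GM-RVFL architecture, just a generic Gaussian-mixture conditional-density network under assumed toy data) trains by maximising the likelihood, i.e. by minimising the negative log-likelihood rather than the squared error. The network maps each input to mixing coefficients, means and widths of a two-component mixture, so it can represent a bimodal conditional density whose conditional mean would be a useless point prediction. All names, sizes and the data-generating process below are illustrative assumptions.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Toy bimodal data: for each x the target lies near +sin(x) or -sin(x),
# so the conditional mean (roughly zero) is a poor predictor.
N, K, H = 400, 2, 8                      # samples, mixture components, hidden units
x = rng.uniform(-3, 3, size=(N, 1))
branch = rng.integers(0, 2, size=(N, 1))
y = np.where(branch == 0, np.sin(x), -np.sin(x)) + 0.1 * rng.normal(size=(N, 1))

def unpack(theta):
    """Split the flat parameter vector into the network weights."""
    i = 0
    W1 = theta[i:i + H].reshape(1, H); i += H
    b1 = theta[i:i + H]; i += H
    W2 = theta[i:i + H * 3 * K].reshape(H, 3 * K); i += H * 3 * K
    b2 = theta[i:i + 3 * K]
    return W1, b1, W2, b2

def mixture_params(theta, x):
    """Map inputs to mixing coefficients, means and standard deviations."""
    W1, b1, W2, b2 = unpack(theta)
    h = np.tanh(x @ W1 + b1)             # hidden layer
    out = h @ W2 + b2                    # K logits, K means, K log-sigmas
    logits, mu, log_sig = out[:, :K], out[:, K:2 * K], out[:, 2 * K:]
    pi = np.exp(logits - logits.max(axis=1, keepdims=True))
    pi /= pi.sum(axis=1, keepdims=True)  # softmax mixing coefficients
    return pi, mu, np.exp(log_sig)

def neg_log_likelihood(theta):
    """Cost function: -sum_n log p(y_n | x_n), the maximum-likelihood objective."""
    pi, mu, sig = mixture_params(theta, x)
    comp = pi * np.exp(-0.5 * ((y - mu) / sig) ** 2) / (np.sqrt(2 * np.pi) * sig)
    return -np.sum(np.log(comp.sum(axis=1) + 1e-12))

n_params = H + H + H * 3 * K + 3 * K
theta0 = 0.1 * rng.normal(size=n_params)
res = minimize(neg_log_likelihood, theta0, method="L-BFGS-B",
               options={"maxiter": 300})

# Inspect the predicted conditional density at a single test input.
pi, mu, sig = mixture_params(res.x, np.array([[1.5]]))
print("mixing coefficients:", pi)        # ideally close to 0.5 / 0.5
print("component means:", mu)            # ideally near +sin(1.5) and -sin(1.5)

A squared-error network fitted to the same data would instead converge towards the conditional mean, averaging the two branches away; the mixture output retains both modes, which is the point the book develops with its own DSM and Gaussian-mixture networks.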
1. Introduction.- 1.1 Conventional forecasting and Takens' embedding theorem.- 1.2 Implications of observational noise.- 1.3 Implications of dynamic noise.- 1.4 Example.- 1.5 Conclusion.- 1.6 Objective of this book.
2. A Universal Approximator Network for Predicting Conditional Probability Densities.- 2.1 Introduction.- 2.2 A single-hidden-layer network.- 2.3 An additional hidden layer.- 2.4 Regaining the conditional probability density.- 2.5 Moments of the conditional probability density.- 2.6 Interpretation of the network parameters.- 2.7 Gaussian mixture model.- 2.8 Derivative-of-sigmoid versus Gaussian mixture model.- 2.9 Comparison with other approaches.- 2.10 Summary.- 2.11 Appendix: The moment generating function for the DSM network.
3. A Maximum Likelihood Training Scheme.- 3.1 The cost function.- 3.2 A gradient-descent training scheme.- 3.3 Summary.- 3.4 Appendix.
4. Benchmark Problems.- 4.1 Logistic map with intrinsic noise.- 4.2 Stochastic combination of two stochastic dynamical systems.- 4.3 Brownian motion in a double-well potential.- 4.4 Summary.
5. Demonstration of the Model Performance on the Benchmark Problems.- 5.1 Introduction.- 5.2 Logistic map with intrinsic noise.- 5.3 Stochastic coupling between two stochastic dynamical systems.- 5.4 Brownian motion in a double-well potential.- 5.5 Conclusions.- 5.6 Discussion.
6. Random Vector Functional Link (RVFL) Networks.- 6.1 The RVFL theorem.- 6.2 Proof of the RVFL theorem.- 6.3 Comparison with the multilayer perceptron.- 6.4 A simple illustration.- 6.5 Summary.
7. Improved Training Scheme Combining the Expectation Maximisation (EM) Algorithm with the RVFL Approach.- 7.1 Review of the Expectation Maximisation (EM) algorithm.- 7.2 Simulation: Application of the GM network trained with the EM algorithm.- 7.3 Combining EM and RVFL.- 7.4 Preventing numerical instability.- 7.5 Regularisation.- 7.6 Summary.- 7.7 Appendix.
8. Empirical Demonstration: Combining EM and RVFL.- 8.1 Method.- 8.2 Application of the GM-RVFL network to predicting the stochastic logistic-kappa map.- 8.3 Application of the GM-RVFL network to the double-well problem.- 8.4 Discussion.
9. A Simple Bayesian Regularisation Scheme.- 9.1 A Bayesian approach to regularisation.- 9.2 A simple example: repeated coin flips.- 9.3 A conjugate prior.- 9.4 EM algorithm with regularisation.- 9.5 The posterior mode.- 9.6 Discussion.
10. The Bayesian Evidence Scheme for Regularisation.- 10.1 Introduction.- 10.2 A simple illustration of the evidence idea.- 10.3 Overview of the evidence scheme.- 10.4 Implementation of the evidence scheme.- 10.5 Discussion.
11. The Bayesian Evidence Scheme for Model Selection.- 11.1 The evidence for the model.- 11.2 An uninformative prior.- 11.3 Comparison with MacKay's work.- 11.4 Interpretation of the model evidence.- 11.5 Discussion.
12. Demonstration of the Bayesian Evidence Scheme for Regularisation.- 12.1 Method and objective.- 12.2 Large data set.- 12.3 Small data set.- 12.4 Number of well-determined parameters and pruning.- 12.5 Summary and conclusion.
13. Network Committees and Weighting Schemes.- 13.1 Network committees for interpolation.- 13.2 Network committees for modelling conditional probability densities.- 13.3 Weighting schemes for predictors.
14. Demonstration: Committees of Networks Trained with Different Regularisation Schemes.- 14.1 Method and objective.- 14.2 Single-model prediction.- 14.3 Committee prediction.- 14.4 Conclusions.
15. Automatic Relevance Determination (ARD).- 15.1 Introduction.- 15.2 Two alternative ARD schemes.- 15.3 Mathematical implementation.- 15.4 Empirical demonstration.
16. A Real-World Application: The Boston Housing Data.- 16.1 A real-world regression problem: The Boston house-price data.- 16.2 Prediction with a single model.- 16.3 Test of the ARD scheme.- 16.4 Prediction with network committees.- 16.5 Discussion: How overfitting can be useful.- 16.6 Increasing diversity.- 16.7 Comparison with Neal's results.- 16.8 Conclusions.
17. Summary.
18. Appendix: Derivation of the Hessian for the Bayesian Evidence Scheme.- 18.1 Introduction and notation.- 18.2 A decomposition of the Hessian using EM.- 18.3 Explicit calculation of the Hessian.- 18.4 Discussion.
References.