 
  
  
  
  
 
-  neural network learning provides a practical method for learning real-valued
and vector-valued functions over both continuous and discrete-valued attributes
-  robust to noise in the training data
-  the Backprop algorithm is the most common network learning method
-  hypothesis space: all functions that can be represented by
assigning weights to a fixed network of interconnected units
-  feedforward networks containing 3 layers can approximate any
function to arbitrary accuracy, given a sufficient number of units in
each layer
-  networks of practical size are capable of representing a rich
space of highly nonlinear functions
-  Backprop searches the space of possible hypotheses using
gradient descent (GD) to iteratively reduce the error of the network's outputs
on the training data (see the gradient-descent sketch after this list).
-  GD converges to a local minimum in the training error with
respect to the network weights.
-  Backprop has the ability to invent new features that are not
explicit in the input
-  hidden units of multilayer networks learn to represent
intermediate features (e.g., face recognition)
-  Overfitting is an important issue (in my opinion, caused by optimising
training-set accuracy too aggressively).
-  Cross-validation can be used to estimate an appropriate stopping
point for gradient descent, and thus reduce the risk of overfitting
(see the early-stopping sketch after this list).
-  Many other algorithms and extensions exist.
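
Below is a minimal sketch (in Python with NumPy) of the gradient-descent weight
updates that Backprop performs, for a small feedforward network with one hidden
layer of sigmoid units. The XOR training set, network size, learning rate, and
random seed are illustrative assumptions, not details from these notes.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_bias(a):
    # append a constant 1 input so each unit gets a bias (threshold) weight
    return np.hstack([a, np.ones((a.shape[0], 1))])

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
t = np.array([[0], [1], [1], [0]], dtype=float)              # targets (XOR)

n_hidden, eta = 3, 0.5
W1 = rng.normal(scale=0.5, size=(3, n_hidden))      # (2 inputs + bias) -> hidden
W2 = rng.normal(scale=0.5, size=(n_hidden + 1, 1))  # (hidden + bias)  -> output

for epoch in range(20000):
    # forward pass
    h = sigmoid(add_bias(X) @ W1)        # hidden-unit activations
    o = sigmoid(add_bias(h) @ W2)        # network outputs

    # backward pass: error terms for sigmoid units
    delta_o = (o - t) * o * (1 - o)                    # output-layer error
    delta_h = (delta_o @ W2[:-1].T) * h * (1 - h)      # hidden-layer error

    # gradient-descent step: move the weights downhill on the training error
    W2 -= eta * add_bias(h).T @ delta_o
    W1 -= eta * add_bias(X).T @ delta_h

print(np.round(o, 2))   # outputs should approach [0, 1, 1, 0] as training error falls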
 
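One way to realise the cross-validation stopping rule is to hold out part of the
data as a validation set (a single fold), run gradient descent, and remember the
epoch with the lowest validation error, as in the sketch below. The synthetic
data, single sigmoid unit, and learning rate are illustrative assumptions.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
t = (X[:, 0] + 0.2 * rng.normal(size=100) > 0).astype(float)  # noisy labels

X_train, t_train = X[:70], t[:70]    # used for the weight updates
X_val,   t_val   = X[70:], t[70:]    # used only to choose the stopping point

w, eta = np.zeros(5), 0.05
best_err, best_w, best_epoch = np.inf, w.copy(), 0

for epoch in range(1, 501):
    o = sigmoid(X_train @ w)
    # gradient-descent step on the training-set squared error
    w -= eta * X_train.T @ ((o - t_train) * o * (1 - o)) / len(X_train)

    # error on the held-out data; keep the weights from the best epoch
    val_err = np.mean((sigmoid(X_val @ w) - t_val) ** 2)
    if val_err < best_err:
        best_err, best_w, best_epoch = val_err, w.copy(), epoch

print("stop after epoch", best_epoch, "validation error", round(best_err, 4))
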
Patricia Riddle 
Fri May 15 13:00:36 NZST 1998