 
  
  
  
  
 
-  Gradient descent is an important general paradigm for learning whenever:
-  the hypothesis is continuously parameterized
-  the error can be differentiated with respect to the hypothesis
parameters (a sketch of one such update appears after this list)
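
As a concrete sketch (not part of the original notes; the data and
learning rate are made-up values), the Python fragment below fits a
linear hypothesis h(x) = w0 + w1*x by gradient descent on the
sum-of-squares error. The weights are the continuous parameters, and
the gradient of the error with respect to them drives each update.

  import numpy as np

  # Illustrative training data (assumed): y = 2x + 1 plus a little noise
  rng = np.random.default_rng(0)
  x = rng.uniform(-1.0, 1.0, size=50)
  y = 2.0 * x + 1.0 + 0.1 * rng.standard_normal(50)

  w = np.zeros(2)    # continuous parameters: w[0] = bias, w[1] = slope
  eta = 0.1          # learning rate (assumed value)

  for step in range(500):
      pred = w[0] + w[1] * x                      # hypothesis output
      err = pred - y
      E = 0.5 * np.sum(err ** 2)                  # sum-of-squares error
      # gradient of E with respect to each parameter
      grad = np.array([np.sum(err), np.sum(err * x)])
      w -= eta * grad / len(x)                    # gradient-descent update

  print("learned parameters:", w)                 # close to [1.0, 2.0]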
 
-  The key practical problems are:
-  converging to a local minimum can be quite slow
-  if there are multiple local minima, there is no guarantee that the
procedure will find the global minimum (Note: gradient descent can be
used with other error definitions, whose error surfaces may not have a
single global minimum; with the sum-of-squares error this is not a
problem.) A small illustration of this failure mode follows this list.
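
To illustrate the multiple-minima problem, the sketch below (again not
from the original notes; the error surface and starting points are
assumptions chosen for illustration) runs gradient descent on a
one-dimensional error function with two minima. The run that starts on
the wrong side of the intervening hill settles into the local minimum
and never reaches the global one.

  # An assumed one-dimensional "error surface" with two minima:
  # a local minimum near w = +0.96 and the global minimum near w = -1.04.
  def error(w):
      return w**4 - 2.0 * w**2 + 0.3 * w

  def gradient(w):
      return 4.0 * w**3 - 4.0 * w + 0.3

  def descend(w, eta=0.01, steps=2000):
      for _ in range(steps):
          w -= eta * gradient(w)       # standard gradient-descent update
      return w

  for start in (-2.0, 2.0):            # two assumed starting points
      w = descend(start)
      print(f"start {start:+.1f} -> w = {w:+.3f}, error = {error(w):+.3f}")

  # Only the run started at -2.0 reaches the lower (global) minimum;
  # the run started at +2.0 gets stuck in the nearby local minimum.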
 
 
Patricia Riddle 
Fri May 15 13:00:36 NZST 1998