Background
Type: Article

PATLIS: Fast and reliable backpropagation based learning algorithm using Parallel Tangent and heuristic Line Search

Journal: WSEAS Transactions on Computers (discontinued) (ISSN 1109-2750)
Year: September 2006
Volume: 5
Issue:
Pages: 2098-2105
Moallem P., Monadjemi S.A., Ashourian M.
Language: English

Abstract

We introduce PATLIS (PArallel Tangent and heuristic LIne Search), a gradient-based optimization method that can be used as a learning algorithm for MLP neural networks. In typical gradient-based learning algorithms, the momentum term usually improves the convergence rate and reduces zigzagging; however, it sometimes slows convergence instead. The Parallel Tangent (ParTan) gradient can be used as a deflecting method to improve convergence, and from an implementation point of view it is as simple as momentum; in fact, it is one of the more practical implementations of conjugate gradient. ParTan tries to overcome the zigzagging of conventional backpropagation by deflecting the gradient through an acceleration phase. In this paper, we use two learning rates: η for the gradient search direction and μ for the accelerating direction along the parallel tangent. Moreover, a heuristic line search based on an improved version of dynamic self-adaptation of η and μ is used to improve the proposed learning method. In dynamic self-adaptation, each learning rate is adapted locally to the landscape of the cost function and to its own previous value. Finally, we test the proposed algorithm on minimizing the Rosenbrock function and on various MLP neural networks, namely XOR 2×2×1, Encoder 16×4×16, and Parity 4×4×1, and compare the results with dynamic self-adaptation of the gradient learning rate and momentum (DSη-α) and with parallel tangent with dynamic self-adaptation (PTDSη-μ). In the Rosenbrock experiments, PATLIS converged considerably faster than the other tested methods over the first 100 iterations. Furthermore, for the MLP problems, the average number of epochs required by PATLIS decreased to around 50% and 66% of those of PTDSη-μ and DSη-α, respectively. The proposed algorithm also shows a good ability to avoid local minima.
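The abstract describes the mechanics only at a high level; as a rough illustration, the Python sketch below combines a ParTan-style acceleration step with one common form of dynamic self-adaptation (trying an increased and a decreased rate and keeping whichever gives the lower cost) on the Rosenbrock function. Everything here is an assumption for illustration: the names partan_descent and adapt_rate, the adaptation factor xi, the initial rates and starting point, and the exact ordering of gradient and acceleration steps are ours, not the authors' PATLIS.

import numpy as np

def rosenbrock(x):
    """Rosenbrock test function f(x, y) = 100*(y - x^2)^2 + (1 - x)^2."""
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2

def rosenbrock_grad(x):
    """Analytic gradient of the Rosenbrock function."""
    return np.array([
        -400.0 * x[0] * (x[1] - x[0] ** 2) - 2.0 * (1.0 - x[0]),
        200.0 * (x[1] - x[0] ** 2),
    ])

def adapt_rate(f, x, direction, rate, xi=1.7):
    """Dynamic self-adaptation of one learning rate: evaluate the cost at
    rate*xi and rate/xi along the given direction and keep whichever rate
    yields the lower cost (the factor xi is an assumption, not a value
    taken from the paper)."""
    up, down = rate * xi, rate / xi
    return up if f(x + up * direction) < f(x + down * direction) else down

def partan_descent(f, grad, x0, eta=1e-3, mu=1e-3, iters=100):
    """ParTan-style descent: plain gradient steps (rate eta) deflected by
    acceleration steps (rate mu) along the direction joining the current
    point to the iterate from two steps back."""
    x_prev = x0.copy()          # iterate from two steps back
    x = x0 - eta * grad(x0)     # first plain gradient step
    for _ in range(iters):
        # gradient step with self-adapted eta
        d = -grad(x)
        eta = adapt_rate(f, x, d, eta)
        y = x + eta * d
        # acceleration ("parallel tangent") step with self-adapted mu
        a = y - x_prev
        mu = adapt_rate(f, y, a, mu)
        x_prev, x = x, y + mu * a
    return x

if __name__ == "__main__":
    x = partan_descent(rosenbrock, rosenbrock_grad, np.array([-1.2, 1.0]))
    print("final point:", x, "cost:", rosenbrock(x))

Running the script prints the iterate reached after 100 iterations. The candidate-comparison rule re-scales each of η and μ at every step according to the local shape of the cost function and the rate's previous value, which is the local adaptation behaviour the abstract refers to; the acceleration along y - x_prev is the deflection that ParTan uses to damp zigzagging.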