VOLUME 4, APRIL 2011
Fundamentals of Neural Networks: Multi-Layer Perceptrons (MLPs)
By: Khalid Isa (PhD Student)
Multi-layer perceptrons (MLPs) are a generalization of the single-layer perceptron (SLP) described in the previous article. MLPs can form arbitrarily complex decision regions and can separate various input patterns. The capability of an MLP stems from the non-linearities used within its nodes: if the nodes were linear elements, a single-layer network with appropriate weights could be used instead of a two- or three-layer perceptron. Figure 1 shows a typical MLP neural network structure.
The input/output mapping of a network is established by the weights and the activation functions of the neurons in the input, hidden and output layers. The number of input neurons corresponds to the number of input variables of the neural network, and the number of output neurons is the same as the number of desired output variables. The number of neurons in the hidden layer(s) depends upon the particular NN application. For example, consider the following two-layer feed-forward network with three neurons in the hidden layer and two neurons in the second layer:
Figure 1 Example of MLP
As shown, the inputs are connected to each neuron in the hidden layer via their corresponding weights. A zero weight indicates no connection; for example, if w_32 = 0, no connection exists between the second input x_2 and the third neuron n_3. The outputs of the last layer are taken as the outputs of the network. The activation function of one neuron could differ from those of the other neurons within a layer, but for structural simplicity, similar neurons are commonly chosen within a layer. The input data sets (or sensory information) are presented to the input layer, which is connected to the first hidden layer. If there is more than one hidden layer, the last hidden layer is connected to the output layer of the network. In the first phase, we have the following linear relationship for each layer:
A = W X

where A is a column vector consisting of m elements, W is an m × n weight matrix, and X is a column input vector of dimension n. For the above example, the linear activity level of the hidden layer (neurons n_1 to n_3) can be calculated as follows:

A_1 = W_1 X,   i.e.   a_i = Σ_j w_ij x_j   for i = 1, 2, 3
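As a concrete illustration, the hidden-layer linear activity can be computed with a few lines of NumPy. The two-input shape and all weight values below are assumptions chosen only for this sketch; they are not taken from Figure 1.

```python
import numpy as np

# Assumed example: 2 inputs, 3 hidden neurons (values chosen for illustration only).
X = np.array([0.5, -1.0])            # input vector (x_1, x_2)

W1 = np.array([[ 0.2, 0.4],          # weights into neuron n_1
               [-0.6, 0.1],          # weights into neuron n_2
               [ 0.3, 0.0]])         # weights into neuron n_3 (w_32 = 0: no link from x_2)

A1 = W1 @ X                          # linear activity a_i = sum_j w_ij * x_j
print(A1)                            # [-0.3  -0.4   0.15]
```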
The output vector of the hidden layer can then be calculated by the following formula:

O_1 = F_1 · A_1

where F_1 is a diagonal matrix comprising the non-linear activation functions of the first hidden layer:

F_1 = diag[ f_1(·), f_2(·), f_3(·) ]
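Continuing the sketch above, the complete forward pass of the example two-layer network can be written as follows. The sigmoid activation and all weight values are assumptions made for illustration; the article does not specify them.

```python
import numpy as np

def sigmoid(a):
    # Logistic activation f(a) = 1 / (1 + exp(-a)), applied element-wise.
    return 1.0 / (1.0 + np.exp(-a))

# Assumed dimensions: 2 inputs -> 3 hidden neurons -> 2 output neurons.
X  = np.array([0.5, -1.0])                               # input vector
W1 = np.array([[ 0.2, 0.4], [-0.6, 0.1], [ 0.3, 0.0]])   # 3 x 2 hidden-layer weights
W2 = np.array([[ 0.7, -0.2, 0.5], [ 0.1, 0.9, -0.3]])    # 2 x 3 output-layer weights

A1 = W1 @ X          # linear activity of the hidden layer
O1 = sigmoid(A1)     # hidden-layer output, O_1 = F_1(A_1)
A2 = W2 @ O1         # linear activity of the output layer
O2 = sigmoid(A2)     # network outputs (two values)

print(O1, O2)
```

Here the same sigmoid is applied in every neuron of a layer, matching the structural-simplicity remark above; in general, each f_i could be a different non-linearity.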