C# Neural Networks with Encog
- by JoshReuben
Neural Networks
· I recently read Introduction to Neural Networks for C#, by Jeff Heaton: http://www.amazon.com/Introduction-Neural-Networks-C-2nd/dp/1604390093/ref=sr_1_2?ie=UTF8&s=books&qid=1296821004&sr=8-2-spell . It is not the first ANN book I've perused, but it was a nice revision.
· Artificial Neural Networks (ANNs) are a mechanism of machine learning – see http://en.wikipedia.org/wiki/Artificial_neural_network , http://en.wikipedia.org/wiki/Category:Machine_learning
· Problems not suited to a neural network solution: programs that are easily written out as flowcharts of well-defined steps, program logic that is unlikely to change, and problems in which you must know exactly how the solution was derived.
· Problems suited to a neural network: pattern recognition, classification, series prediction, and data mining. In pattern recognition, the network attempts to determine whether the input data matches a pattern that it has been trained to recognize. In classification, the network takes input samples and classifies them into fuzzy groups.
· As far as machine learning approaches go, I think SVMs are superior (see http://en.wikipedia.org/wiki/Support_vector_machine ). A neural network has certain disadvantages in comparison: an ANN can be overtrained, the weights it ends up with are not deterministic and vary with the training set, and it is not possible to discern the underlying decision function of an ANN from its weight matrix. ANNs are black boxes.
· In this post, I'm not going to go into internals (believe me I know them). An autoassociative network (e.g. a Hopfield network) will echo back a pattern if it is recognized.
· Under the hood, there is very little maths. In a nutshell, some simple matrix operations occur when training a Hopfield network: the input array is normalized into bipolar values (1, -1), the resulting column vector is multiplied by its transpose (a row vector), and the identity matrix is subtracted from the product to get a contribution matrix, which is added to the weight matrix. To recognize a pattern, the dot product of the input is taken against the weight matrix to yield a boolean match result. For backpropagation training, the derivative of the activation function is required. In learning, search techniques such as Genetic Algorithms and Simulated Annealing are used to escape local minima. For unsupervised training, such as that found in the Self-Organizing Maps used for OCR, Hebb's rule is applied.
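· The Hopfield-style matrix operations described above can be sketched in plain C#, without Encog. This is only a minimal illustration: the class and method names (HopfieldSketch, ToBipolar, Contribution, Recall) are made up for this post and are not part of any library, and a single stored pattern is assumed.

```csharp
using System;

public static class HopfieldSketch
{
    // Map a 0/1 pattern to bipolar values: 0 -> -1, 1 -> +1.
    public static int[] ToBipolar(int[] bits)
    {
        var result = new int[bits.Length];
        for (int i = 0; i < bits.Length; i++)
            result[i] = bits[i] == 0 ? -1 : 1;
        return result;
    }

    // Contribution matrix: the pattern (as a column vector) multiplied by
    // its transpose (a row vector), minus the identity matrix.
    public static int[,] Contribution(int[] bipolar)
    {
        int n = bipolar.Length;
        var m = new int[n, n];
        for (int r = 0; r < n; r++)
            for (int c = 0; c < n; c++)
                m[r, c] = bipolar[r] * bipolar[c] - (r == c ? 1 : 0);
        return m;
    }

    // Recall: dot each row of the weight matrix with the bipolar input
    // and output 1 where the sum is positive, 0 otherwise.
    public static int[] Recall(int[,] weights, int[] bits)
    {
        int[] p = ToBipolar(bits);
        var output = new int[p.Length];
        for (int r = 0; r < p.Length; r++)
        {
            int sum = 0;
            for (int c = 0; c < p.Length; c++)
                sum += weights[r, c] * p[c];
            output[r] = sum > 0 ? 1 : 0;
        }
        return output;
    }

    public static void Main()
    {
        int[] pattern = { 0, 1, 0, 1 };
        // With one stored pattern, the weight matrix is just its contribution matrix.
        int[,] weights = Contribution(ToBipolar(pattern));
        Console.WriteLine(string.Join(",", Recall(weights, pattern))); // prints 0,1,0,1
    }
}
```

· With several stored patterns, the contribution matrices would be summed into one weight matrix. The trained pattern is echoed back unchanged, which is the autoassociative behaviour mentioned earlier.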
· The purpose of this post is not to mire you in technical and conceptual details, but to show you how to leverage neural networks via an abstraction API: Encog.
Encog
· Encog is a neural network API
· Links to Encog: http://www.encog.org , http://www.heatonresearch.com/encog, http://www.heatonresearch.com/forum
· Encog requires .NET 3.5 or higher, and there is also a Silverlight version. Third-party libraries: log4net and NUnit.
· Encog supports feedforward, recurrent, self-organizing maps, radial basis function and Hopfield neural networks.
· Encog neural networks, and related data, can be stored in .EG XML files.
· Encog Workbench allows you to edit, train and visualize neural networks. The Encog Workbench can generate code.
Synapses and layers
· Layers and synapses are the primary building blocks. Almost every neural network will have, at a minimum, an input and output layer. In some cases, the same layer will function as both input and output layer.
· To adapt a problem to a neural network, you must determine how to feed the problem into the input layer of a neural network, and receive the solution through the output layer of a neural network.
· The Input Layer - For each input neuron, one double value is stored. An array is passed as input to a layer. Encog uses the interface INeuralData to hold these arrays. The class BasicNeuralData implements the INeuralData interface. Once the neural network processes the input, an INeuralData based class will be returned from the neural network's output layer.
· To convert a double array into an INeuralData object:
INeuralData data = new BasicNeuralData(new double[10]);
· The Output Layer - The neural network outputs an array of doubles, wrapped in a class based on the INeuralData interface.
· The real power of a neural network comes from its pattern recognition capabilities. The neural network should be able to produce the desired output even if the input has been slightly distorted.
· Hidden Layers – These are optional and sit between the input and output layers. They are very much a “black box”. If the structure of the hidden layer is too simple, it may not learn the problem. If the structure is too complex, it will learn the problem but will be very slow to train and execute. Some neural networks have no hidden layers, and the input layer may be directly connected to the output layer. Further, some neural networks have only a single layer, in which case that layer is self-connected.
· Connections between layers, called synapses, contain individual weight matrices. These values are changed as the neural network learns.
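· To make the role of a synapse's weight matrix concrete, here is a rough sketch of how one layer's values could be pushed through a weight matrix to the next layer. This is not Encog code: the names (LayerSketch, Forward), the sigmoid activation and the per-neuron threshold term are assumptions made for illustration.

```csharp
using System;

public static class LayerSketch
{
    // Sigmoid activation, assumed here for illustration.
    public static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    // weights[i, j] is the weight from source neuron i to target neuron j.
    // thresholds[j] is added to the weighted sum of target neuron j.
    public static double[] Forward(double[] input, double[,] weights, double[] thresholds)
    {
        var result = new double[thresholds.Length];
        for (int j = 0; j < thresholds.Length; j++)
        {
            double sum = thresholds[j];
            for (int i = 0; i < input.Length; i++)
                sum += input[i] * weights[i, j];
            result[j] = Sigmoid(sum);
        }
        return result;
    }

    public static void Main()
    {
        // Two source neurons feeding one target neuron.
        double[,] weights = { { 2.0 }, { 3.0 } };
        double[] output = Forward(new[] { 1.0, 0.0 }, weights, new[] { -2.0 });
        Console.WriteLine(output[0]); // weighted sum is 0, so the sigmoid gives 0.5
    }
}
```

· Learning then amounts to nudging the entries of the weights array (and the thresholds) so that the outputs move closer to the expected values.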
Constructing a Neural Network
· The XOR operator is a frequent “first example”: the “Hello World” application for neural networks.
· The XOR operator returns true only when its two inputs differ.
0 XOR 0 = 0
1 XOR 0 = 1
0 XOR 1 = 1
1 XOR 1 = 0
· Structuring a Neural Network for XOR - there are two inputs to the XOR operator and one output.
· Input:
0.0,0.0
1.0,0.0
0.0,1.0
1.0,1.0
· Expected output:
0.0
1.0
1.0
0.0
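· As a small sketch, the input rows and expected outputs listed above can be generated from C#'s built-in ^ operator rather than typed by hand. The class and method names (XorData, Build) are hypothetical and chosen just for this example.

```csharp
using System;

public static class XorData
{
    // Builds the four XOR input rows and the matching expected outputs,
    // in the same order as the tables above.
    public static (double[][] Input, double[][] Ideal) Build()
    {
        var input = new double[4][];
        var ideal = new double[4][];
        int row = 0;
        for (int b = 0; b <= 1; b++)
        {
            for (int a = 0; a <= 1; a++)
            {
                input[row] = new double[] { a, b };
                ideal[row] = new double[] { a ^ b }; // integer XOR, stored as a double
                row++;
            }
        }
        return (input, ideal);
    }

    public static void Main()
    {
        var (input, ideal) = Build();
        for (int i = 0; i < input.Length; i++)
            Console.WriteLine(input[i][0] + "," + input[i][1] + " -> " + ideal[i][0]);
    }
}
```

· Each expected output is kept in its own one-element array, so the ideal data is two-dimensional even though there is only one output neuron.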
· A Perceptron - a simple feedforward neural network (here a multilayer one, since it has a hidden layer) that will learn the XOR operator.
· Because the XOR operator has two inputs and one output, the neural network will follow suit. Additionally, the neural network will have a single hidden layer, with two neurons to help process the data. The choice for 2 neurons in the hidden layer is arbitrary, and often comes down to trial and error.
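· To see that two hidden neurons are enough for XOR, here is a hand-wired 2-2-1 network in plain C#. The weights below were picked by hand for illustration (they are not what Encog would learn), and the sigmoid activation is an assumption. One hidden neuron behaves roughly like OR, the other like NAND, and the output neuron like AND.

```csharp
using System;

public static class HandWiredXor
{
    static double Sigmoid(double x) => 1.0 / (1.0 + Math.Exp(-x));

    // A 2-2-1 feedforward network with hand-picked weights and thresholds.
    public static double Compute(double x1, double x2)
    {
        double h1 = Sigmoid(20 * x1 + 20 * x2 - 10);  // hidden neuron 1: roughly OR
        double h2 = Sigmoid(-20 * x1 - 20 * x2 + 30); // hidden neuron 2: roughly NAND
        return Sigmoid(20 * h1 + 20 * h2 - 30);       // output neuron: roughly AND
    }

    public static void Main()
    {
        double[][] inputs =
        {
            new[] { 0.0, 0.0 },
            new[] { 1.0, 0.0 },
            new[] { 0.0, 1.0 },
            new[] { 1.0, 1.0 }
        };
        foreach (double[] p in inputs)
            Console.WriteLine(p[0] + "," + p[1] + " -> " + Compute(p[0], p[1]).ToString("F4"));
    }
}
```

· The outputs are very close to 0 or 1 but never exactly so, because the sigmoid only approaches those values. Training searches for weights like these automatically.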
· [Figure: Neuron Diagram for the XOR Network]
· The Encog workbench displays neural networks on a layer-by-layer basis.
· [Figure: Encog Layer Diagram for the XOR Network]
· Create a BasicNetwork and add three layers to it. The FinalizeStructure method must then be called to inform the network that no more layers are to be added. The call to Reset randomizes the weights in the connections between these layers.
var network = new BasicNetwork();
network.AddLayer(new BasicLayer(2));
network.AddLayer(new BasicLayer(2));
network.AddLayer(new BasicLayer(1));
network.Structure.FinalizeStructure();
network.Reset();
· Neural networks frequently start with a random weight matrix. This provides a starting point for the training methods. These random values will be tested and refined into an acceptable solution. However, sometimes the initial random values are too far off. Sometimes it may be necessary to reset the weights again, if training is ineffective. These weights make up the long-term memory of the neural network. Additionally, some layers have threshold values that also contribute to the long-term memory of the neural network. Some neural networks also contain context layers, which give the neural network a short-term memory as well. The neural network learns by modifying these weight and threshold values.
· Now that the neural network has been created, it must be trained.
Training a Neural Network
· Construct an INeuralDataSet object, which contains the input arrays and the corresponding expected output arrays (one output row per input row). Even though there is only one output value, we must still use a two-dimensional array to represent the output.
public static double[][] XOR_INPUT ={
new double[2] { 0.0, 0.0 },
new double[2] { 1.0, 0.0 },
new double[2] { 0.0, 1.0 },
new double[2] { 1.0, 1.0 } };
public static double[][] XOR_IDEAL = {
new double[1] { 0.0 },
new double[1] { 1.0 },
new double[1] { 1.0 },
new double[1] { 0.0 } };
INeuralDataSet trainingSet = new BasicNeuralDataSet(XOR_INPUT, XOR_IDEAL);
· Training is the process where the neural network's weights are adjusted to better produce the expected output. Training will continue for many iterations, until the error rate of the network is below an acceptable level. Encog supports many different types of training. Resilient Propagation (RPROP) is a general-purpose training algorithm, implemented by the ResilientPropagation class. All training classes implement the ITrain interface. Training the neural network involves calling the Iteration method on the ITrain object until the error is below a specific value. The code below loops through as many iterations, or epochs, as it takes (up to a cap of 10000) to get the error rate for the neural network below 1%. Once the neural network has been trained, it is ready for use.
// Debug.Print requires: using System.Diagnostics;
ITrain train = new ResilientPropagation(network, trainingSet);
for (int epoch = 0; epoch < 10000; epoch++)
{
    train.Iteration();
    Debug.Print("Epoch #" + epoch + " Error:" + train.Error);
    // Stop as soon as the error drops below 1%.
    if (train.Error < 0.01) break;
}
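· The "iterate until the error is low enough, but give up after a maximum number of epochs" pattern can be pulled out into a small helper. The sketch below is independent of Encog: TrainUntil and its parameters are hypothetical names, and the "trainer" in Main is a stand-in whose error simply halves on each pass.

```csharp
using System;

public static class TrainingLoopSketch
{
    // Calls 'iterate' (one training pass that returns the current error)
    // until the error is below 'targetError' or 'maxEpochs' passes have run.
    // Returns the number of passes used (maxEpochs if the target was never reached).
    public static int TrainUntil(Func<double> iterate, double targetError, int maxEpochs)
    {
        for (int epoch = 1; epoch <= maxEpochs; epoch++)
        {
            double error = iterate();
            if (error < targetError)
                return epoch;
        }
        return maxEpochs;
    }

    public static void Main()
    {
        // Stand-in for a real trainer: the error halves on every pass.
        double error = 1.0;
        int epochs = TrainUntil(() => error /= 2, 0.01, 10000);
        Console.WriteLine("Stopped after " + epochs + " epochs, error " + error);
    }
}
```

· With a real trainer, the lambda would run one training iteration and return the trainer's current error. The epoch cap matters because training is not guaranteed to converge from every set of random starting weights.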
Executing a Neural Network
· Call the Compute method on the BasicNetwork class.
Console.WriteLine("Neural Network Results:");
foreach (INeuralDataPair pair in trainingSet)
{
INeuralData output = network.Compute(pair.Input);
Console.WriteLine(pair.Input[0] + "," + pair.Input[1] + ", actual=" + output[0] + ",ideal=" + pair.Ideal[0]);
}
· The Compute method accepts an INeuralData object and also returns an INeuralData object. The output looks like this:
Neural Network Results:
0.0,0.0, actual=0.002782538818034049,ideal=0.0
1.0,0.0, actual=0.9903741937121177,ideal=1.0
0.0,1.0, actual=0.9836807956566187,ideal=1.0
1.0,1.0, actual=0.0011646072586172778,ideal=0.0
· The network has not been trained to give the exact results. This is normal. Because the network was trained to an overall error of 1%, each of the results will generally be close to the expected value, typically within a percent or two.
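· To get a feel for what a 1% overall error means, the four results above can be combined into a single figure. This post does not say how Encog calculates its error, so the root-mean-square measure below is an assumption used only for illustration, and the names (ErrorSketch, RootMeanSquare) are made up.

```csharp
using System;

public static class ErrorSketch
{
    // Root-mean-square difference between actual and ideal outputs.
    public static double RootMeanSquare(double[] actual, double[] ideal)
    {
        double sum = 0;
        for (int i = 0; i < actual.Length; i++)
        {
            double diff = actual[i] - ideal[i];
            sum += diff * diff;
        }
        return Math.Sqrt(sum / actual.Length);
    }

    public static void Main()
    {
        // The actual and ideal values from the results listed above.
        double[] actual =
        {
            0.002782538818034049,
            0.9903741937121177,
            0.9836807956566187,
            0.0011646072586172778
        };
        double[] ideal = { 0.0, 1.0, 1.0, 0.0 };
        Console.WriteLine(RootMeanSquare(actual, ideal)); // a little under 0.01
    }
}
```

· Under this measure, individual results can miss by more than 1% (the third row is off by about 1.6%) while the combined error still stays under the 1% target.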