Saturday, January 30, 2010

Practical Implementation of Neural Network based time series (stock) prediction - PART 2

As a brief follow-up to the series, I want to take a moment to describe Weka, the machine learning tool that we will be using to implement the neural network. It is a fantastic open source Java-based tool that was developed at the University of Waikato, New Zealand. Users who are not all that experienced with programming have access to a GUI shell that makes running a regression or classification scenario a snap. More advanced Java programmers may opt to use the command line or customize their own classes. In addition, there are numerous support options, including a fantastic Weka thread on Nabble that you may subscribe to. I have found that questions are answered very promptly and there is a lot of activity at the site, so you don't have to wait a long time to get a response. There are also some great books by Ian Witten and Eibe Frank that guide you through practical data mining with a minimum of mathematical theory, such as Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. I have the first edition and have found it an immensely useful reference.

There are a variety of built-in learning modules included in the free utility (Weka), such as linear regression, neural networks (a.k.a. multilayer perceptrons), decision trees, support vector machines, and even genetic algorithms.

Fig 1. Using the Weka Gui

In Fig 1, we see that the Weka GUI Chooser has been opened and the Explorer option selected. Weka's native file format is .ARFF. Fortunately for us, it also reads .CSV files, which are easily created with the save option in Excel. The first file we will train on is sim_training_set_perfect_sin.csv. Once it is loaded, you will see all of the relevant variables in the Weka Explorer shell.

Fig 2. Loaded Excel csv training source file for Weka

We notice some new variables have been introduced that were not in part 1.
To understand why, let's show the CSV file that is used here.

Fig 3. Training set variables.

What we see is that the original perfect sine wave signal has been preserved in the column labeled signal. The additional signals, s-1, s-2, s-3, s-4, are often called delayed or embedded (dimension) variables. They are simply lagged values of the signal that are used to train the neural network. There is no exact method to determine the number of lagged values, although a number of different methods exist. For now, we will simply accept that four delayed values of the signal are useful. The last column, called bias, is common to neural networks. The bias node gives the network a constant input whose weight it can adjust during training, which lets it shift its output up or down. For instance, imagine the signal we were learning had an average of 2.0. The neural network needs some input that will track that constant value, or it will have large offset errors that obstruct convergence. The bias node accomplishes that. Those familiar with engineering theory will recognize this node as a DC bias.
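To make the layout of the training file concrete, here is a minimal Python sketch of how the lagged columns could be built from a raw signal. The function name make_lagged_rows, the constant bias value of 1.0, and the choice to skip the first rows that lack a full history are all my assumptions; the post does not say how the spreadsheet handled them.

```python
# Column names as they appear in the training file described above.
HEADER = ["signal", "s-1", "s-2", "s-3", "s-4", "bias"]


def make_lagged_rows(signal, n_lags=4, bias=1.0):
    """Build rows of [signal[t], signal[t-1], ..., signal[t-n_lags], bias].

    The first n_lags samples have no complete history, so they are
    skipped (an assumption; the spreadsheet may treat them differently).
    The bias value of 1.0 is also an assumption.
    """
    rows = []
    for t in range(n_lags, len(signal)):
        lags = [signal[t - k] for k in range(1, n_lags + 1)]
        rows.append([signal[t]] + lags + [bias])
    return rows


rows = make_lagged_rows([0, 1, 2, 3, 4, 5])
```

Each row pairs the current value with its four predecessors, so a six-sample input yields two usable rows.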

OK, so one other thing we notice in the GUI is that Class:signal(num) is selected on the bottom right. This is because we are predicting a numerical class, rather than a nominal one (which is the typical default for classification schemes).

Next, we select the Classify tab to choose our learning scheme, which in this case will be the MultilayerPerceptron.

We then want to make sure certain options are selected.

We set nominalToBinaryFilter and normalizeAttributes to False, as we are not using nominal attributes and don't wish to have the input data converted to binary. However, we want normalizeNumericClass set to True. As mentioned in Part 1, this has Weka normalize the numeric class to its own internal limiting range, so we don't have to. Also, we will train for 1000 epochs.

Fig 6. Preferences for MLP training model.

We will build the model by training on 66% of the data. We want to store and output the predictions so that we can see what they look like. Lastly, we will select Preserve order for split, as it allows us to display the predicted out-of-sample time series in the original order. With all of these options set, we simply click OK and then the Start button, and Weka will quickly build our first neural network model!
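For readers who want to see what an order-preserving percentage split amounts to, here is a rough Python sketch. The function name ordered_split is my own, and Weka's exact rounding of the split point may differ from Python's round, so treat the boundary as approximate.

```python
def ordered_split(rows, train_percent=66.0):
    """Split rows into (train, test) without shuffling.

    The first train_percent of the rows are used for training and the
    remainder, still in time order, are held out for testing.
    """
    n_train = round(len(rows) * train_percent / 100.0)
    return rows[:n_train], rows[n_train:]


train, test = ordered_split(list(range(100)))
```

Because nothing is shuffled, the test portion is simply the tail of the series, which is what lets us plot the predictions as a continuous time series later on.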

Fig 7. Results with summary of statistics console.

If we scroll up, we can see the actual weights that the Multilayer Perceptron converged upon, which will be used to predict the out-of-sample data.
We can see that there is a nice printout of the last 34% of the results (271 out-of-sample data points) along with the predicted value and error, as well as a useful summary of statistics at the bottom of the console. We often use root mean squared error (RMSE) as a performance metric for neural net regressions. In this case, the value of .0005 is quite good. But let's use a little trick to see visually just how good. We can grab the data from the console (by selecting it with the left mouse button and dragging), then copy it back into Excel. We can then plot the actual versus predicted out-of-sample results inside Excel.
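Since root mean squared error is the metric we lean on, here is a small Python sketch of how it is computed from the actual and predicted columns. The function name rmse is my own; the input values in the example are made up purely to exercise the function and are not taken from the Weka run.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Made-up values, only to show the call.
example = rmse([0.0, 0.0], [3.0, 4.0])
```

Squaring penalizes large misses more heavily than small ones, and taking the root puts the result back in the same units as the signal.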

Fig 8. Importing prediction results back into Excel.

Notice that after we copy and paste the data from the Weka console into Excel, we must use Text to Columns to separate the data back into columns.
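If you would rather skip the Text to Columns step, the same separation can be sketched in a few lines of Python. I am assuming the copied console text has four whitespace-separated fields per data row (instance number, actual, predicted, error) under a header line; the exact layout may vary between Weka versions. The sample text below is made up for illustration and is not output from the run in this post.

```python
# Hypothetical sample of copied console text (values are made up).
SAMPLE = """
 inst#,    actual, predicted, error
     1      0.208      0.208      0
     2      0.407      0.406     -0.001
"""


def parse_predictions(console_text):
    """Return (inst, actual, predicted, error) tuples from pasted text.

    Lines that do not contain exactly four fields, or whose fields are
    not numeric (such as the header), are skipped.
    """
    rows = []
    for line in console_text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue
        try:
            inst = int(fields[0])
            actual, predicted, error = (float(f) for f in fields[1:])
        except ValueError:
            continue  # header or other non-numeric line
        rows.append((inst, actual, predicted, error))
    return rows


parsed = parse_predictions(SAMPLE)
```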

Fig 9. Selecting the regions to separate as columns.

And tada! We can now plot the predicted vs. actual values, and look how nicely they line up. The errors are extremely small on the out-of-sample set: some are 0 and others are .001, which is imperceptible to the eye without zooming way in on that point.
It actually found a perfect model for this time series (we will expand a bit later on why), and the errors can be attributed to numerical precision.
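Without pre-empting the fuller explanation promised above, one well-known property of a sampled sine wave is worth a quick check: each value is an exact linear combination of the two values before it, s[t] = 2*cos(w)*s[t-1] - s[t-2], where w = 2*pi/T. This follows from the identity sin(a+b) + sin(a-b) = 2*sin(a)*cos(b). The Python sketch below is my own illustration of that identity, not the model that Weka built.

```python
import math

PERIOD = 30                      # T in days, as in Part 1
omega = 2 * math.pi / PERIOD     # angular step per day

signal = [math.sin(omega * t) for t in range(120)]

# Predict each sample from its two predecessors using the identity.
predicted = [
    2 * math.cos(omega) * signal[t - 1] - signal[t - 2]
    for t in range(2, len(signal))
]

# Largest gap between the recurrence and the true signal.
max_error = max(abs(p - a) for p, a in zip(predicted, signal[2:]))
```

The remaining error is at the level of floating point round-off, which suggests why lagged inputs can describe a pure sine wave so well.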

Fig 10. Resulting plot of predicted vs. actual data.

We have now built a basic neural network for a simple sine wave time series using Weka and Excel. The predicted out-of-sample results were extremely good.
However, as we will see, the simple sine wave we used is a very easy signal to learn, as it is perfectly repetitive and stationary. As the signal gets increasingly complex, the prediction results will not be as good.
That's it for Part 2. Comments are welcome.

Friday, January 29, 2010

Practical Implementation of Neural Network based time series (stock) prediction - PART 1

The following introduction is meant to help readers understand the basic concepts and practical implementation of neural nets applied to a financial time series. I will not go too deep into the mathematics behind the neural net at the moment. My goal is to get you to understand the practical details of how to actually implement a neural net using simple tools and models. We will start with a simple model to understand a basic time series. The time series waveform is a simple sine wave with the period set to 30 days. It is implemented in Excel as a source file that can be processed by any machine learning software. For this example I will be using a very good Java-based GUI program called Weka.

Fig 1. Shows a simple sine wave set to a period (T) of 30 days.

It is a very simple time series based upon the well known sine wave model.
We can see that one complete cycle occurs over a period of 30 days. Each time step is set to 1 unit or day per step.
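The same series is easy to generate outside a spreadsheet. Here is a minimal Python sketch, assuming unit amplitude and zero phase (the post does not state either); the function name sine_series is my own.

```python
import math

PERIOD_DAYS = 30  # T, one full cycle every 30 days


def sine_series(n_days, period=PERIOD_DAYS):
    """Return sin(2*pi*t/period) for t = 0 .. n_days-1, one sample per day.

    Unit amplitude and zero phase are assumptions.
    """
    return [math.sin(2 * math.pi * t / period) for t in range(n_days)]


signal = sine_series(90)  # three full cycles
```

With a step of one day, every sample repeats exactly 30 steps later, which is the periodicity visible in Fig 1.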

Fig 2. A complex sinusoidal signal with f1 set to 1/T, where T=30 days.

Anyone who has worked with financial time series knows that they can be far more complicated than simple sine-based models. However, it is often better to learn from basic principles and move up in complexity in order to have a good grasp of what we are doing. The second figure is a bit more complicated, as it is the sum of three different sine-based signals, each with a different amplitude and frequency. We could use Fourier analysis to show the spectrum of the three different tones if we wished, but for now we'll just accept that it is a complex signal. Notice that one property of this signal that is also a bit optimistic is that it is stationary. Essentially, a stationary signal has statistical properties that do not change over time. For example, if we were to take the average over different slices, it would not change much. We can also see visually that the time series is mean reverting. Financial time series differ in that they are not stationary; they typically have a unit root and must often be transformed before the neural network can process them. The purpose of the complex signal, however, is to show how we can move from a very simple model to an increasingly complex signal.
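To illustrate the slice-average idea, here is a rough Python sketch that sums three sine tones and compares the mean of each 30-day slice. Only f1 = 1/T with T = 30 comes from the figure caption; the other two frequencies and all three amplitudes are hypothetical values I picked for the example.

```python
import math

T = 30.0  # days, from the figure caption

# (amplitude, frequency) pairs. Only the first frequency, 1/T, is from
# the post; the remaining values are hypothetical.
COMPONENTS = [(1.0, 1 / T), (0.5, 2 / T), (0.25, 3 / T)]


def complex_signal(n_days, components=COMPONENTS):
    """Sum of sine tones sampled once per day."""
    return [
        sum(a * math.sin(2 * math.pi * f * t) for a, f in components)
        for t in range(n_days)
    ]


def slice_means(series, width):
    """Mean of each consecutive, non-overlapping slice of the series."""
    return [
        sum(series[i:i + width]) / width
        for i in range(0, len(series) - width + 1, width)
    ]


signal = complex_signal(300)
means = slice_means(signal, 30)
```

With these particular tones, every 30-day slice covers whole cycles of each component, so the slice means all sit at essentially zero. A series with a unit root would instead show slice means that drift from one window to the next.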

Fig 3. Normalized Complex Signal

The final step is simply to normalize the time series so that it is constrained to the vertical range (what we call the rails) of minus 1 to plus 1. A typical neural net is limited by an internal function, sometimes called a squashing function. It is a non-linear function, usually a sigmoid or tanh (hyperbolic tangent), which saturate at (0,1) and (-1,1), respectively.
A simple transformation is xnew = xold*(vmaxn-vminn)/(vmaxo-vmino).
Here vmax and vmin are the maximum and minimum values of the time series, with the suffix n marking the new range and o the old one. This pure scaling lands exactly on the new rails only when both the old and new ranges are centered on zero; otherwise the offsets must be handled too, giving xnew = (xold-vmino)*(vmaxn-vminn)/(vmaxo-vmino) + vminn. In this case we will use -.9 and +.9 as the limiting rails so as to avoid saturation effects. Often software will do the normalizing for you. In the case of Weka, you can choose to have it do this operation, in which case no manual normalization is necessary, although we should understand it for future reference.
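As a worked example of the rescaling, here is a short Python sketch. It uses the general min-max form, which subtracts the old minimum and adds the new one, so it also works for a series that is not centered on zero; for a zero-centered series and symmetric rails it reduces to the simple scaling above. The function name rescale is my own.

```python
def rescale(series, new_min=-0.9, new_max=0.9):
    """Linearly map series from [min(series), max(series)] to
    [new_min, new_max]. The default rails of -0.9 and +0.9 follow
    the text."""
    old_min, old_max = min(series), max(series)
    if old_max == old_min:
        raise ValueError("a constant series cannot be rescaled")
    scale = (new_max - new_min) / (old_max - old_min)
    return [(x - old_min) * scale + new_min for x in series]


# A series with an average of 4.0 is mapped onto the rails.
scaled = rescale([2.0, 4.0, 6.0])
```

Keeping the rails slightly inside the squashing function's limits leaves some headroom, so the extreme values of the series do not sit in the flat, saturated part of the curve.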

That's it for Part 1. Next we will investigate how to bring the data into Weka and have it build a model and predict the out-of-sample signal set!

Please add any comments on where I can improve my tutorial as I am new to the blogger scene and appreciate any feedback.