Saturday, February 20, 2010

Genetic Algorithm Systematic Trading Development -- Part 3 (Python/VBA)

As mentioned in prior posts, it is not possible to use the standard Weka GUI to instantiate a Genetic Algorithm, other than for feature selection. Part of the reason is that there is no generic algorithm to instantiate a fitness function. The same flexibility that allows an infinite possible range of fitnesses also requires custom scripting. Although it is possible to write a custom class for Weka/JAVA, I chose to utilize Python for this example, along with an older VBA tool I developed for the back-end results summary. Hopefully, you'll see that there are many tools that may be utilized to prototype various systems and augment the development process.

The essential GA uses a 17 bit string length to encode the following rule:

{if ma(m) binop ma(n) then buy}

The first 8 bits are used to encode the 1st ma value. Note there are 2^n = 2^8 = 256 potential decimal values that can be used for the parameter argument. The 9th bit is a 2 bit encoded value of the > or < binary operator as discussed in prior posts. The last 8 bits are used for the 2nd moving average parameter value. A simple fitness of the net dollar return was used for this example (Note Sharpe ratio, and other fitness metrics could have been used). The input series is SPY, using the range from 1993-2005 daily to optimize.

The python script was essentially set up to run 40 generations of a population of size 20 using elitism and tournament selection. Although this is by no means optimal (it is quite small), it was set up using these values for illustrative purposes. When you watch the video, what you'll see is the initial population in binary encoded strings each time a generation is passed. In addition, the decoded moving average rule is shown for each selection change. Although the video has been truncated for brevity, you should notice that the fitness number is improving each generation. The final solution was designed to halt after a fitness did not improve over five generations. In addition, you can see the final encoded result and a plot of the fitness convergence.

Video 1. Optimization of MA parameters using Python GA

Fig 1. Final Fitness result output to console

In fig 1. we see that the final rule converged to {if ma(220) > ma(221)) then Buy.
In addition, the final binary string is shown along with the final fitness value.
We can decode the binary string with relative ease.
[110110111110111100] is the 17 bit string representing the optimal fitness.
ma1 is 1st 8 bits = 11011011 = 219 decimal a +1 offset was used (so as not to have 0 day moving average) to get a resulting parameter argument of 220.
The next bit is = 1 corresponding to >
The final 8 bits represent the 221 argument by similar reasoning as the first.
So the resulting rule with parameters is:
if ma(220) > ma(221) then Buy
fitness = net$gain =$316.12

fig 2. Fitness Convergence

In fig 2. We see how the fitness continued to climb over successive generations until early convergence caused a halt at the fitness value that did not change over the prior 5 generations.

In order to verify the results, we will also show how other tools may be used. In this case, I used an older VBA simulator that I wrote a few years back.

Video 2. Summary of optimized parameters using VBA/Excel

Fig 3. Summary of Back Test Results

Above is a capture of the summary statistics using the back test program. Total net profit is slightly higher than the python results. This is due to the fact that the python simulation truncated the series length of the moving average data, so as to avoid zero front padded values, while the excel program did not. However, they are still in close agreement. It's often useful to use several different programs to force yourself to double check results.

Now, as an astute commenter already pointed out... this method is indeed curve fitting. What we found was the best possible pair of parameters(or at least one of the best; there are superior parameters, but I didn't run the example generation set too long) for our particular rule set we set out to investigate. Or as I mentioned in the first thread, we zeroed in on the region of the distribution curve with the most profitable candidates. Now, for those of you not familiar with curve fitting, it is not a happy concept amongst developers. In fact, it suffers from almost the same egregious problems as cherry picking examples, as I mentioned earlier on.

That being said, however, it is not done in vain. Our goal here is to quantitatively augment common development (the part where you create and verify) tools beyond mere guessing, intuition, and cherry picking. Firstly, it is possible that this particular rule set will not fare as well out of sample, which is true. However, in the same sense that we can not just take one cherry picked example for granted, we must also evaluate how things actually do perform out of sample. I say this because I've used similar techniques that looked very good, and did indeed perform very well out of sample for several periods out into the future. By honing in on the best candidates, we help to narrow down the set of candidates that are worthy of out of sample investigation. There are other additional techniques (some mentioned earlier, such as ensemble methods, different objective/fitness functions, and even different optimization criteria) that can be used to enhance this method, and in addition, verify robustness out of sample.

edit: Just for giggles, I decided to actually run the Out of Sample performance on this optimized in sample trained rule. The following chart illustrates how it performed 'out of sample' for the years 2005-today(2010).

Fig 4. Out of Sample Test Performance on optimized training rule parameters.

Not all that shabby for that curve fitted simple system during the worst meltdown in recent history, eh (much easier on the gut)?

To be frank, I have run so many evaluations on simple SMA systems, that I would say that they are not the most superior parameters to optimize around. Obviously, however, it really depends on what your objective is. There are some long term studies that have shown using the fitness objective of reduced volatility as the goal is quite beneficial with this simple rule set (you can verify that this simple system had far less volatility over the down periods, than the actual market-- in and out of sample) . It is up to you to find those parameters that are worthy of optimizing further. See commentary on A Quantitative Approach to Tactical Asset Allocation for a related example.

And please continue to give your feedback on what you like or don't like and areas you want to explore.
---------------------------------------------------------------------------------
If you are new to Python and would like to order a fantastic textbook, I highly recommend the following (applications geared a bit towards science and engineering): A Primer on Scientific Programming with Python

In addition, users who are interested in learning a bit more about VBA with a Financial Oriented slant will find great practical examples in the text: Financial Modeling, 3rd Edition

1. What you're getting here is curve-fit system. Try it in out-of-sample data and it will fail miserably. Curve fitting is very easy.

2. Thank You,

Yes, it is curve fit, and I was planning to discuss that before this segment is finished. That does not preclude it from being a useless exercise, however.

It's not so easy, however, to build and back-test systems=) Hopefully, you understand the scope of the site is to show you how to augment successful system building using modern techniques.

3. What python package did you use to create the outputs in this example? I couldn't find a package called "Python GA" with a quick Google search. Thank you!

I wrote my own custom scripts for this example. There are a few Python GA algorithms available on the web; pygene comes to mind. However, you'll have to write your own custom scripts for the fitness function as there aren't many examples with this type of application.

5. Hi,

I know this is an old post, but it's very relevant to the challenges I face at the moment.

I've been working on a high-frequency intra-day model for some time now. I use neuro-evolution (NEAT to be specific) as a vehicle.

To combat curve fit, I've adopted a strategy whereby I back-test evolved candidates over my OOS data as soon as they are born. When a newborn candidate performs similarly over the OOS(compared to in sample) data, the candidate is set aside for later use. When the evolution process stagnates, I use these candidates which I set aside to seed a new population, and start the evolutionary process again.

My question is : Am I curve fitting to all of my data(OOS and in sample combined) here?

I'm sure the answer is not that simple, but I'd appreciate any thoughts.

6. It seems like a reasonable approach assuming you are walking forward the data sets. I.e. the new culled population is evolved, but the results of that evolution are tested on new out of sample data.

One could argue that any type of GA is curve fitting; but if the OOS data performs comparably well to IS, and you have some understanding as to what behavior it's capturing, then it should give you sufficient confidence to continue live. You can also monitor conditions and see if they are behaving as expected (given your back-testing information).

Cheers,
IT