Comments on Intelligent Trading: Quantitative Candlestick Pattern Recognition (HMM, Baum Welch, and all that)

Your blog provided us with valuable information to...

2018-05-01T13:43:04.045-07:00

Your blog provided us with valuable information to work with. Each and every tip in your post is awesome. Thanks a lot for sharing. Keep blogging.

Very interesting blog, actually I have been workin...

2016-03-08T01:57:29.608-08:00

Very interesting blog. I have actually been working with a related approach, using candlestick charts based on the three parts of a candlestick. I am looking for more beneficial strategies and would welcome any further tips you can suggest.

Introduction to Japanese candlestick chart is inco...

2016-03-04T02:09:41.146-08:00

An introduction to Japanese candlestick charts is incomplete without a discussion of the different terminology involved. For more, see here.

Hi Nammik, It's been a while since I published...

2016-02-10T22:49:54.809-08:00

Hi Nammik,
It's been a while since I published this study. I would guess it is about 10 years on 1 instrument.

Best,
IT

May I know how many days are used for the clusteri...

2016-02-07T06:45:48.987-08:00

May I know how many days are used for the clustering?

Brilliant, I was playing with minitab myself.. try...

2013-12-27T10:06:40.743-08:00

Brilliant. I was playing with Minitab myself, trying to figure out classifications. Glad I bumped into your post. Some really nice ideas!

Hi William, That is absolutely correct. What is i...

2012-07-23T20:51:14.632-07:00

Hi William,
That is absolutely correct. What is important is to make sure the validation set uses the same set of constraints as the test set. At the end of the day, unless you have very well-separated regions of information (in which case even random initializations should converge to the same means), you're simply using the algorithm to reduce the classification regions to a much smaller set.

If you truly wanted a deterministic cluster identifier, given such a randomly spaced field of data, you could simply force regions of interest by preselecting boundaries. For instance, given a two-dimensional grid, you can slice the grid up into n equally spaced partitions and use those partitions as your reference coordinates. This guarantees that the reference coordinates will match in both sets.

Hope that makes sense.
IT
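
A minimal sketch of the grid idea from the reply above, in Python. The bounds, the grid size, and the name `grid_cell` are assumptions made for illustration; nothing here comes from the original study.

```python
import math

def grid_cell(x, y, x_min, x_max, y_min, y_max, n):
    """Return a deterministic cell id in [0, n*n) for a point in a 2-D grid.

    The bounds and n are fixed in advance (hypothetical values), so the same
    point always maps to the same cell in every data set.
    """
    def bin_index(v, lo, hi):
        # Clamp so points on or beyond the boundary fall into the edge bins.
        frac = (v - lo) / (hi - lo)
        return min(n - 1, max(0, int(math.floor(frac * n))))

    return bin_index(y, y_min, y_max) * n + bin_index(x, x_min, x_max)

# Example: a 4 x 4 grid over [-1, 1] x [-1, 1].
bounds = dict(x_min=-1.0, x_max=1.0, y_min=-1.0, y_max=1.0, n=4)
in_sample = [(-0.9, -0.9), (0.1, 0.1), (0.95, 0.95)]
cells = [grid_cell(x, y, **bounds) for x, y in in_sample]
print(cells)
```

Because the partition never depends on the data, in-sample and out-of-sample points share the same cell ids by construction. The price is that the cells ignore where the data is actually dense.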

Unless I am mistaken, k-means is non-deterministic...

2012-07-21T00:27:12.933-07:00

Unless I am mistaken, k-means is non-deterministic. It starts with random centroids and, depending on the data, will often result in different outcomes when run multiple times.

Thank you for the comments, marketdust.

2011-12-05T16:15:23.847-08:00

Thank you for the comments, marketdust.

IT I wanted to leave a quick note to thank you se...

2011-12-03T20:21:17.707-08:00

IT

I wanted to leave a quick note to thank you for setting up your blog. I accidentally bumped into it and am very excited to read your posts. For a guy who was lost in TA, you have given me new ideas to pursue, and I thank you for sharing them.

Regards
marketdust

Thank you for the comment on distance.

2011-01-07T08:21:38.121-08:00

Thank you for the comment on distance.

In this case, each candle attribute is expressed a...

2011-01-06T17:26:09.355-08:00

In this case, each candle attribute is expressed as a coordinate defined by its distance relative to the open (i.e. H to O, L to O, Cl to O). Once the three-dimensional coordinate is defined, the cluster algorithm in R uses its default distance measure (I'd have to check, but I think it's typically Euclidean) to group similar candlesticks by the distance between them.
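
As a rough illustration of that answer, here is a small Python sketch (the post itself used R). It assumes raw price differences from the open and plain Euclidean distance; the function names and the two candles are made up, and dividing each difference by the open would give a percentage version instead.

```python
import math

def candle_features(o, h, l, c):
    """Express a candle as (H-O, L-O, C-O), i.e. distances relative to the open."""
    return (h - o, l - o, c - o)

def euclidean(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two made-up candles (open, high, low, close); illustrative values only.
bullish = candle_features(100.0, 103.0, 99.0, 102.0)
bearish = candle_features(100.0, 101.0, 97.0, 98.0)
print(bullish, bearish, euclidean(bullish, bearish))
```

A small distance means two candles have a similar shape relative to their opens, which is what a distance-based clustering groups on.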

So how would you define a distance metric between ...

2011-01-05T05:44:58.796-08:00

So how would you define a distance metric between the two candles? Thanks.

nicolas, Thanks for your reply, I've been a bi...

2010-11-19T17:56:22.729-08:00

nicolas,
Thanks for your reply, I've been a bit busy lately, so apologies for the delay.

I tried to give some details (such as transition matrices). Once you have assigned clusters, the cluster states become the HMM states. With regard to evaluating from a statistical point of view, it's a bit of an art to get some OOS behavior that rejects the null of chance-based returns.

Hopefully, you'll get some inspiration on the NLP thread.

Cheers,
IT
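
To make the "cluster states become the HMM states" step concrete, here is a sketch of how a first-order transition matrix could be estimated from a sequence of cluster labels. The label sequence and the function name are hypothetical; this is plain transition counting, not Baum-Welch training.

```python
def transition_matrix(labels, k):
    """Row-normalised counts of moves from state i (this bar) to state j (next bar)."""
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(labels, labels[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state that is never left (e.g. it only occurs last) gets a row of zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

labels = [0, 1, 1, 2, 0, 1, 2, 2, 0]   # hypothetical cluster id per bar
P = transition_matrix(labels, 3)
for row in P:
    print([round(p, 2) for p in row])
```

Row i then reads as the empirical probability of each next-bar cluster given that the current bar is in cluster i.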

I have to say that I got inspired by your approach...

2010-11-05T15:27:23.601-07:00

I have to say that I got inspired by your approach, since I had heard about k-means at university. So I spent a couple of days doing my own analysis, but I got a bit stuck on how to use it further.
I applied it to hourly forex data (EURUSD, as an example) and tried feature spaces of different dimensions, mainly built from the last 3 candles.
I then did some statistical analysis on the returns (mostly 1-5 bars later) conditioned on the cluster classification. Unfortunately, it got a bit frustrating: I haven't found statistical evidence that holds over different sets of samples, so I couldn't reject the null that the returns are not different.
So I would appreciate it if you could elaborate in a bit more detail on how you further process your data once it is divided into the different clusters.
I just can't believe that there is no difference in the return distribution depending on the pattern. ...

Many thanks again for the post

Cheers
Nic
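
One way to run the kind of test described above is a permutation test on forward returns grouped by cluster. This is a sketch under assumptions: the statistic (spread between cluster mean returns), the function names, and the synthetic data are all illustrative choices, not the method used in the post or by the commenter.

```python
import random
from collections import defaultdict

def spread_of_means(returns, labels):
    """Test statistic: largest minus smallest mean return across clusters."""
    groups = defaultdict(list)
    for r, lab in zip(returns, labels):
        groups[lab].append(r)
    means = [sum(g) / len(g) for g in groups.values()]
    return max(means) - min(means)

def permutation_p_value(returns, labels, n_perm=2000, seed=0):
    """Share of label shuffles whose spread is at least the observed one."""
    rng = random.Random(seed)
    observed = spread_of_means(returns, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if spread_of_means(returns, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Synthetic data in which the returns ignore the labels.
rng = random.Random(1)
labels = [rng.randrange(3) for _ in range(300)]
returns = [rng.gauss(0.0, 0.01) for _ in labels]
p = permutation_p_value(returns, labels)
print(p)
```

A small p-value would suggest the cluster labels carry information about the forward returns. Repeating the test on separate sample periods, as described above, guards against a single lucky split.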

Dear IT Just wanted to let you know I did my "...

2010-07-25T13:02:16.709-07:00

Dear IT

Just wanted to let you know I did my "chaos analysis" using K-means clustering and unfortunately improvement was slim to none. I'm becoming increasingly sceptical that there will be an easy way to forecast financial data using chaos theory.

Cheers

Anton

Hi IT, Thanks for your reply. That is pretty muc...

2010-07-12T20:35:42.206-07:00

Hi IT,

Thanks for your reply.

That is pretty much what I figured, I just wanted to make sure that I was not doing something stupid.

Adam

Adam, Even if you run it on the same data set ove...

2010-07-11T10:15:29.798-07:00

Adam,

Even if you run it on the same data set over and over, there is no guarantee that it will generate the global best solution or even the same centroids. This is similar to training a neural network, in that the initialization of any learner typically starts with random values; since it will not likely evaluate every possible arrangement and each initialization is random, you can get different results per run.

In the case of k-means, each run starts out with a random location for each cluster centroid. Several approaches have been developed to deal with this. In 2 dimensions, you can visualize the way it divided the candlestick clusters and see which solution makes intuitive sense to you.

Another issue to consider is that outliers may skew an entire cluster, so k-medoids or similar learners are often preferred.

Going back to my earlier comment, even if you do not have a guaranteed optimal cluster arrangement, as long as you treat the arrangement you found (and accepted) as a reference, a supervised learner applied out of sample is guaranteed to use those particular clusters as a fixed reference against which you can compare transition matrices for the features you identified in sample. Hope that makes sense.

IT
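
A sketch of the two points in the reply above, in plain Python rather than the Matlab `kmeans` Adam used: different random starts can end in different solutions, and once one solution is accepted it can be frozen and used to classify new data. Keeping the restart with the lowest within-cluster sum of squares is just one common remedy and is my choice for the example; the data and names are synthetic.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))

def kmeans(points, k, seed, n_iter=50):
    """Plain Lloyd's algorithm started from k randomly chosen data points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    inertia = sum(sq_dist(p, centroids[nearest(p, centroids)]) for p in points)
    return centroids, inertia

# Synthetic, weakly separated 2-D data (illustrative only).
rng = random.Random(42)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]

runs = [kmeans(data, k=4, seed=s) for s in range(10)]
print(sorted(round(inertia, 2) for _, inertia in runs))  # may differ by seed

# Keep the best restart and freeze it as the reference for later data.
reference, best_inertia = min(runs, key=lambda r: r[1])
new_candle = (0.3, -0.2)
print(nearest(new_candle, reference))
```

With the centroids frozen, assigning an out-of-sample point is a deterministic nearest-centroid lookup, so the cluster definitions cannot drift between the in-sample and out-of-sample steps.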

Hi IT, Now that I think of it I may not have comp...

2010-07-10T18:44:14.024-07:00

Hi IT,

Now that I think of it I may not have completely worded my question correctly.

What I meant to ask was: if you run the k-means clustering on a data set and then run it again on the same data set, should you necessarily get the same probabilities in the transition matrix? I understand that the rows/columns will interchange, as you are not forcing one candlestick pattern to be a certain cluster number.

Thanks

Adam
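
On the row/column interchange mentioned above: one way to compare runs is to renumber the clusters in a canonical order before building the transition matrix. The sketch below sorts centroids lexicographically, which is an arbitrary convention chosen for illustration; the centroids and labels are hypothetical.

```python
def canonical_order(centroids):
    """Map old label -> new label by sorting centroids lexicographically."""
    order = sorted(range(len(centroids)), key=lambda j: centroids[j])
    return {old: new for new, old in enumerate(order)}

def relabel(labels, mapping):
    return [mapping[lab] for lab in labels]

# Two hypothetical runs that found the same clusters under different numbering.
centroids_a = [(0.5, 1.0), (-1.0, 0.2), (2.0, -0.3)]
labels_a = [0, 1, 1, 2, 0]
centroids_b = [(2.0, -0.3), (0.5, 1.0), (-1.0, 0.2)]
labels_b = [1, 2, 2, 0, 1]

canon_a = relabel(labels_a, canonical_order(centroids_a))
canon_b = relabel(labels_b, canonical_order(centroids_b))
print(canon_a, canon_b)
```

This only removes the numbering ambiguity. If two runs converge to genuinely different centroids, the transition probabilities will still differ after relabelling.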

Thanks for getting back to me. That does give me ...

2010-07-10T17:15:33.981-07:00

Thanks for getting back to me.

That does give me some ideas.

I have come to the same conclusions as you did in your chaos post regarding order appearing and disappearing. It is good to see others finding this and to know that I am not missing something.

Anyway, keep up the good work on the blog. I will read all of it eventually.

Adam

Adam, That's been my experience as well. I a...

2010-07-08T19:22:38.012-07:00

Adam,

That's been my experience as well. I alluded to this a bit in my chaos article. Have a look at the final excerpt at the end, regarding stability and pockets of order coalescing and dispersing.

One final comment: you might want to experiment with segments of data, using k-means to identify clusters in one segment and then a supervised learner (kNN, for example) in the next. Then you can look into various cross-validation types of experiments (walk forward is preferable, IMO). Hope that gives you some ideas.

IT
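
The walk-forward scheme suggested above could be laid out like this. It is a sketch of the index bookkeeping only; the window sizes and the function name are assumptions, and the clustering and kNN steps are left as comments.

```python
def walk_forward_splits(n_obs, train_size, test_size):
    """Yield (train_range, test_range) index pairs that roll forward in time."""
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size

for train, test in walk_forward_splits(n_obs=1000, train_size=500, test_size=100):
    # In-sample: fit k-means on `train`; out of sample: classify `test`
    # against those frozen clusters (for example with kNN) and evaluate.
    print(train.start, train.stop, test.start, test.stop)
```

Each test window lies strictly after its training window, so no future data leaks into the cluster definitions.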

Hi IT, I have been experimenting with k means and...

2010-07-08T19:08:16.868-07:00

Hi IT,

I have been experimenting with k-means and am finding that the transition matrix is not stable from run to run, i.e. the probabilities keep changing. There seems to be convergence when I have a low number of days to classify, but as the length of the series increases I see this convergence disappear.
After searching, I have found numerous references on convergence to the cluster means, but not so much on uniqueness (I do have a paper to read that might be helpful).
So my question is whether you know if there is a unique set of clusters for a data set?
Just for reference, I am using the kmeans function in Matlab.

Adam

Anton, No problem on typos; I do the same myself a...

2010-07-05T12:03:11.737-07:00

Anton,
No problem on typos; I do the same myself all the time. I think one of the useful things about clustering is that it allows us to re-generate the features of a given set of data into something that might prove more useful than the conventional data representation. For instance, in chaos, using lags and embedding with percent changes is the common method most people think of, and it often yields nothing beyond a cloud of random data. However, by transforming the data in creative ways, we might be better able to observe some type of existing order.

Ultimately, I've always said that formulating the problem is the most challenging part of math applied to trading; much more than understanding the actual math or machine learning.

IT

Dear IT First, I would like to apologize for all ...

2010-07-05T11:31:22.112-07:00

Dear IT

First, I would like to apologize for all the typos in my last post; it was really late when I was writing it. In fact, I'm stunned that you were able to make sense of some of the things I wrote.

Concerning chaos theory in forecasting, my opinion is that these methods (embedding the data etc.) are nothing more than sophisticated pseudo pattern-recognition methods. For example, suppose you have an up move of 2%, then a down move of 1.5% (embedding dimension is 2): you find similar up-down moves in the past, project these past points into "the future", mesh the projected points and hope you can come up with something useful.
I actually observed (visually) the embedded data and couldn't find much order, so I was thinking cluster analysis could be useful here, since it would allow me to separate the data according to some rule.
Just a short comment: I have found articles in which the authors claim that there is evidence of low-dimensional chaos in financial data, usually indices (think 2 to 5 dimensions), while some suggest embedding dimensions of up to 100.
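
For readers unfamiliar with the procedure, the up-down example above can be sketched in a few lines of Python. The series of percent changes is made up, the names are hypothetical, and averaging the successors of the nearest past windows is only one simple way to "mesh" the projected points.

```python
def embed(series, dim):
    """Return overlapping windows of length `dim` (delay of one step)."""
    return [tuple(series[i:i + dim]) for i in range(len(series) - dim + 1)]

def analogue_forecast(series, dim, n_neighbours):
    """Average the values that followed the past windows closest to the latest one."""
    windows = embed(series, dim)
    query = windows[-1]
    # Only windows with a known successor can be projected forward.
    candidates = [(w, series[i + dim]) for i, w in enumerate(windows[:-1])]
    candidates.sort(key=lambda c: sum((a - b) ** 2 for a, b in zip(c[0], query)))
    nearest = candidates[:n_neighbours]
    return sum(nxt for _, nxt in nearest) / len(nearest)

# Made-up percent changes; the last two entries are the +2%, -1.5% pattern.
changes = [2.1, -1.4, 0.5, 0.3, -0.2, 1.9, -1.6, 0.7, 0.1, 2.0, -1.5]
forecast = analogue_forecast(changes, dim=2, n_neighbours=2)
print(forecast)
```

Here the two closest past windows are (2.1, -1.4) and (1.9, -1.6), so the forecast is the average of the moves that followed them.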

I don't think visual inspection is going to work in my case, since my algos are completely "automated", I just pop in the data and let them work.
I did find something useful: it's called a "silhouette plot". It gives each point (in each cluster) a number between -1 and 1. A value of 1 indicates that the point is distant from neighbouring clusters, while -1 indicates that the point is probably in the wrong cluster. The mean of these values could be used as a benchmark of how well the clustering worked.

Cheers

Anton
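
The silhouette value described above can be computed by hand for small data sets. This sketch uses the standard definition s = (b - a) / max(a, b), where a is the mean distance to the other points in the same cluster and b is the smallest mean distance to any other cluster. The toy points and the function name are assumptions for illustration.

```python
import math

def silhouette_values(points, labels):
    """Silhouette s = (b - a) / max(a, b) for each point, with Euclidean distance."""
    def mean_dist(p, members):
        return sum(math.dist(p, q) for q in members) / len(members)

    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    values = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        own = [q for j, (q, l2) in enumerate(zip(points, labels)) if l2 == lab and j != i]
        if not own:
            values.append(0.0)  # convention for single-point clusters
            continue
        a = mean_dist(p, own)
        b = min(mean_dist(p, members) for other, members in clusters.items() if other != lab)
        values.append((b - a) / max(a, b))
    return values

# Two tight, well-separated toy clusters plus one deliberately mislabelled point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (0.5, 0.5)]
labels = [0, 0, 0, 1, 1, 1, 1]
s = silhouette_values(points, labels)
print([round(v, 2) for v in s], round(sum(s) / len(s), 2))
```

The mislabelled last point gets a strongly negative value, and the mean over all points gives the single benchmark number mentioned above.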

Hi Anton, Thanks for the kind comments. I've d...

2010-07-04T23:52:57.596-07:00

Hi Anton,
Thanks for the kind comments. I've done some work with chaos, and although there's a lot to discuss, in most cases I've found that stock market time series are not all that chaotic in the deterministic sense. If you are familiar with phase space, you can easily see order in two-dimensional embedded attractors for, say, the simple logistic equation (or think of the Lorenz attractor). It is much more difficult to see such simple order in financial series, however. In fact, I've been thinking about doing a blog post on this very concept soon.
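
As a quick illustration of the logistic-equation remark, the sketch below iterates the logistic map and embeds it in two dimensions. The parameter r = 4 and the starting value are my own choices for the example.

```python
def logistic_series(r, x0, n):
    """Iterate x_{t+1} = r * x_t * (1 - x_t)."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

xs = logistic_series(r=4.0, x0=0.2, n=500)

# Two-dimensional delay embedding: the points (x_t, x_{t+1}).
embedded = list(zip(xs[:-1], xs[1:]))

# The series looks erratic, yet every embedded point sits on one parabola.
max_gap = max(abs(y - 4.0 * x * (1.0 - x)) for x, y in embedded)
print(max_gap)
```

Plotting `embedded` as a scatter shows the parabola directly; the same plot for percent changes of a price series typically looks like a structureless cloud.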

Regarding dimensions, as I mentioned earlier to Craig, it's part art, part science. I might expand on this at some point, but as you pointed out, the 3 dimensions make it very intuitive to visualize how it is capturing features that relate to common visual perspectives of candlesticks. However, even when using 10 dimensions to cluster, you can still look at how it is placing the candlestick information in a lower (3D) space, because you can see what features it is identifying by sorting by cluster and observing the candlesticks in normal 2D space, as I've done in the example. One other aspect I haven't covered is that the cluster centroids will change depending on the sample space and size, so you have to be careful about how you approach in-sample/out-of-sample data.
Also, regarding higher dimensions (like 10), you have to be aware of the curse of dimensionality: data points tend to spread far apart and concentrate near the edges of the hypercube. It's far less intuitive, so you might focus on different sample sets and on what measure you are trying to capture, to determine whether it's capturing relevant information (although you can't see it intuitively in higher space, you can measure the information metrics in lower space).

IT
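
One common symptom of the curse of dimensionality mentioned above is that distances become less informative: the nearest and farthest neighbours of a point end up almost equally far away. The sketch below measures that on uniform random data. The contrast measure, sample size, and function name are assumptions chosen for the illustration.

```python
import math
import random

def distance_contrast(dim, n_points=200, seed=0):
    """(max - min) / min of distances from one point to the rest, uniform in [0, 1]^dim."""
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    dists = [math.dist(pts[0], p) for p in pts[1:]]
    return (max(dists) - min(dists)) / min(dists)

for dim in (3, 10, 100):
    print(dim, round(distance_contrast(dim), 2))
```

The contrast shrinks as the dimension grows, which is one reason a distance-based method such as k-means gets harder to interpret with many candle features.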