Tuesday, May 10, 2011

High Low Clustering on intraday high frequency sampled data

Nothing unusually exciting in this post, but I've been working with some particle-based methods recently and made a few simple visual observations while setting up the sampling environment in R. I'm also using RKWard on Ubuntu to generate the plots, so everything here (including the graphics) comes from that environment.

Fig 1. Parallel plot of half-hour samples of intraday high and low data points vs. time (highs are purple dots, lows are red).
Fig 2. Cumulative count of high and low events per interval (blue = total of highs and lows).

The plot shows intraday data sampled at half-hour increments.
The highs and lows of each sample interval are overlaid, with purple denoting an intraday high and red an intraday low.
Two observations stand out:

1) The high and low samples tend to be clustered at open, midday, and close.
2) High and low events do not appear to be uniformly and randomly distributed over time.
This kind of data processing is useful for generating, exploring, and evaluating pattern-based setups.
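
The counting behind Fig 2 can be sketched roughly as follows. This is a minimal illustration in Python rather than the R code used for the plots, and several things are assumptions: a "high event" is taken to be the half-hour bar in which the day's high was set (likewise for lows), each day is a list of (high, low) pairs in time order, and the function name `tally_extremes` is made up for the example.

```python
from collections import Counter

def tally_extremes(days):
    """Count, per half-hour bar index, how often the day's high and low fall there.

    `days` is assumed to be a list of trading days, each a list of
    (bar_high, bar_low) tuples in time order (hypothetical layout).
    """
    highs, lows = Counter(), Counter()
    for bars in days:
        # Index of the bar holding the day's highest high (first bar wins ties)
        hi_idx = max(range(len(bars)), key=lambda i: bars[i][0])
        # Index of the bar holding the day's lowest low (first bar wins ties)
        lo_idx = min(range(len(bars)), key=lambda i: bars[i][1])
        highs[hi_idx] += 1
        lows[lo_idx] += 1
    return highs, lows

# Two toy days with four bars each (made-up numbers, not real market data)
days = [
    [(10.5, 10.0), (10.4, 10.1), (10.3, 9.8), (10.2, 9.9)],
    [(20.1, 19.7), (20.0, 19.5), (20.3, 19.9), (20.6, 20.2)],
]
highs, lows = tally_extremes(days)
total = highs + lows  # combined high and low events per bar
print(sorted(total.items()))
```

Plotting `highs`, `lows`, and `total` against the bar index gives the kind of per-interval count shown in the second figure.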

The study is by no means complete or conclusive; I'm just stopping by to show more of the data processing and visualization that R is capable of. If anyone has done a more conclusive study, I'd be interested to hear about it.

P.S. If you notice any odd changes: Google was having some issues over the last few days, and the post appears to have reverted to my original (not ready to launch) draft.

1. Hi IT,

If I remember my stochastic processes correctly, the time at which a stopped random walk/Brownian motion reaches its high/low has a U-shaped distribution, with the highest density at the beginning and end. Therefore, it is not obvious (to me at least) whether the clustering of highs/lows at the open/close represents a special "pattern", since even a random walk exhibits the same pattern. Of course, the clustering near midday is still interesting.

Thanks for keeping up the blog by the way.

2. Hi ezbentley,

Thanks for the comments. I think it is the arcsine law you are thinking of. It's been a while since I've looked at this, but my first thought is that the U-shaped pattern of the arcsine law is a cumulative property, whereas these are discrete interval steps. Second, I think there are some documented stylized facts about volatility at the open and close not being uniform (the 'smile') that are worth looking into. Maybe one of these days I'll do a more thorough post on this.

IT

3. Hi IT,

It appears that the arc sine law applies to the extrema as well. See http://www.stat.berkeley.edu/~aldous/Real-World/arc-sine.html. You can also easily obtain the U-shaped distribution of highs/lows by just simulating random walks. I definitely agree that there is volatility clustering near the open and close. As such, it does not seem obvious to me how to disentangle the effect of the true stylized facts from the arc sine law to come up with a useful analysis. Do you have any thoughts on how to analyze the "intrinsic" volatility at the open/close when it is likely to be "contaminated" by the arc sine law?
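
A minimal sketch of that simulation, in Python for illustration. The setup is an assumption rather than anything from the post: a driftless Gaussian random walk with 78 steps per "day" (13 half-hour bins of six steps each), and the function name is made up.

```python
import random

def time_of_max_histogram(n_steps=78, n_walks=10000, n_bins=13, seed=1):
    """Histogram of the bin in which a driftless Gaussian random walk sets its maximum."""
    rng = random.Random(seed)
    counts = [0] * n_bins
    for _ in range(n_walks):
        level, best, best_step = 0.0, 0.0, 0  # step 0 is the starting point
        for step in range(1, n_steps + 1):
            level += rng.gauss(0.0, 1.0)
            if level > best:
                best, best_step = level, step
        # Group steps into equal-width bins; the last step is folded into the final bin
        counts[min(best_step * n_bins // n_steps, n_bins - 1)] += 1
    return counts

counts = time_of_max_histogram()
for b, c in enumerate(counts):
    print(f"bin {b:2d} {'#' * (c // 50)}")
```

The first and last bins should come out visibly taller than the middle ones, which is the U shape under discussion. Running the same loop with `level < best` gives the corresponding picture for lows.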

Thanks,

4. ezbentley,

Interesting link. I'd have to look more; I'm still not certain that covers discrete steps. I've always understood it as the frequency of the leader's position over many separate trials, with positive (above 0) and negative (below 0) bunching up near the endpoints.

One way to disentangle them would be to model something like a GBM N(mu, sigma) walk and compare the envelope against real data. But before we even think about that, it's probably better to search the literature for intraday volatility patterns.
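
One possible reading of the envelope idea, sketched in Python. Everything specific here is an assumption: zero drift, 1% daily volatility, 13 half-hour bins of six steps each, a 95% band, and the helper names. The "observed" series at the end is only a simulated stand-in for where real per-bin shares would go.

```python
import math
import random

def high_bin_shares(rng, n_days, n_bins=13, steps_per_bin=6, mu=0.0, sigma=0.01):
    """Share of simulated GBM days whose high falls in each half-hour bin."""
    n_steps = n_bins * steps_per_bin
    dt = 1.0 / n_steps                      # time unit: one trading day
    drift = (mu - 0.5 * sigma ** 2) * dt    # drift of the log price per step
    vol = sigma * math.sqrt(dt)
    counts = [0] * n_bins
    for _ in range(n_days):
        log_p, best, best_step = 0.0, 0.0, 0  # the open is credited to bin 0
        for step in range(n_steps):
            log_p += drift + vol * rng.gauss(0.0, 1.0)
            if log_p > best:
                best, best_step = log_p, step
        counts[best_step // steps_per_bin] += 1
    return [c / n_days for c in counts]

def envelope(n_reps=100, n_days=100, lo=0.025, hi=0.975, seed=7):
    """Per-bin (lower, upper) band of high shares across repeated simulations."""
    rng = random.Random(seed)
    runs = [high_bin_shares(rng, n_days) for _ in range(n_reps)]
    bands = []
    for b in range(len(runs[0])):
        col = sorted(run[b] for run in runs)
        bands.append((col[int(lo * (n_reps - 1))], col[int(hi * (n_reps - 1))]))
    return bands

def outside_band(observed, bands):
    """Indices of bins whose observed share lies outside the simulated band."""
    return [b for b, (share, (low, high)) in enumerate(zip(observed, bands))
            if not low <= share <= high]

bands = envelope()
# Stand-in for real data: replace with per-bin shares computed from actual highs
observed = high_bin_shares(random.Random(99), 100)
print(outside_band(observed, bands))
```

Bins where real data fall outside the band would be candidates for clustering beyond what a plain random walk produces, though a constant-volatility GBM is a crude baseline given the intraday volatility patterns mentioned above.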

IT