New Pattern Data

Started by BitPlayer, July 02, 2004, 12:24:16 PM

BitPlayer

Do you intend to post any kind of overall compilation of the new pattern data?  For example, the compilation might show, for each pattern (by my count, there should be 64 for dirt and 64 for turf), the outcomes for different age groups (for example: Jan-Mar 3yos; Apr-Jun 3yos; etc.).  Where the sample size shown on a given sheet is small, such a posting would allow people to evaluate the strength of a pattern using a larger sample.

An alternative approach would be to show, for featured age groups, the outcomes of the various possible patterns.  That would allow people to evaluate the importance of the third race back in looking at a pattern.  For example, if you see an 0-2 for the last two races, how much do you care what preceded the 0?
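
To illustrate the question above, here is a minimal sketch of collapsing 3-race pattern counts down to the last two races. The pattern labels, the counts, and the notion of a "good outcome" are all made up for illustration; none of it is actual Thorograph data or notation.

```python
from collections import defaultdict

# Hypothetical counts, keyed by (third race back, second race back, last race).
# Values are (starts, good_outcomes). Labels and numbers are invented.
three_race = {
    ("a", "0", "2"): (40, 12),
    ("b", "0", "2"): (25, 9),
    ("c", "0", "2"): (35, 9),
    ("a", "2", "0"): (20, 5),
}

def collapse_to_last_two(counts):
    """Sum starts and good outcomes over the third race back."""
    pooled = defaultdict(lambda: [0, 0])
    for (_, second, last), (starts, hits) in counts.items():
        pooled[(second, last)][0] += starts
        pooled[(second, last)][1] += hits
    return {key: tuple(value) for key, value in pooled.items()}

two_race = collapse_to_last_two(three_race)
for pattern, (starts, hits) in sorted(two_race.items()):
    print(pattern, starts, hits, round(hits / starts, 3))
```

Comparing each 3-race rate with the pooled 2-race rate is one rough way to see how much the third race back adds, at the price of smaller samples per pattern.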

I can understand that you might not want to post this kind of data for competitive or other business reasons.  On the other hand, one might argue that the patterns are specific to Thorograph figures and wouldn't be useful to people not buying Thorograph data.

At the very least, I would think that you would want to look at these kinds of reports internally.  A report of the first type would be helpful to you in deciding whether a 3-month window is appropriate or whether you can use a larger window and get larger sample sizes without diluting the significance of the data.


TGJB

You are right about the business issues-- we changed our RBR rules because of them. I haven't got time to address the other points right now, but my guess is that others might want to. Catalin? Derby 1592? Mandown?

TGJB

TGJB

We are looking at expanding the time window to increase the sample, especially with turf horses, where the sample size is smaller. We will probably lump all 5 and up studies together rather than going by season, since development is pretty much done, and go 2 months in each direction with the younger horses. At that point we will assess the situation again.
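
As a rough illustration of why a wider window helps, the sketch below computes the standard error of an observed rate under a simple binomial assumption. The 25% rate and the sample sizes are hypothetical, not figures from the studies.

```python
import math

def rate_standard_error(rate, starts):
    """Standard error of a proportion under a simple binomial model."""
    return math.sqrt(rate * (1 - rate) / starts)

# Hypothetical: the same 25% rate observed on samples of different sizes.
for starts in (25, 100, 400):
    se = rate_standard_error(0.25, starts)
    print(starts, round(se, 4))
```

Under this model the uncertainty shrinks with the square root of the sample, so quadrupling the sample only halves the standard error. That gain is only real if the added horses behave like the ones already in the window.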

TGJB

derby1592

It is very cool to see the pattern data on the sheets.

The combination of fig-based trainer stats and the new pattern stats is a real edge.

I know how much thought and work went into producing those numbers and I appreciate the effort.

Regarding BitPlayer's comments: there are so many ways to rack and stack the data and, as several people have posted, you can only take in so much new info at one time.

However, I do think putting in 2-race pattern info (with a larger sample size) along with the 3-race pattern info might be useful, particularly if the 3-race sample sizes are small.

In the opposite direction, I think separating out cheap claimers might also improve the pattern info but you may run into the sample size problem.

Keep up the good work.

Cheers.

Chris

TGJB

Chris-- I know that you and Catalin ran some earlier studies using our figures. How were the studies different, and did you get similar results? I ask because my vague recollection, from things George told me, is that they did in fact come up differently. So far the studies we have done have pretty much confirmed my feelings (roughly) for the patterns that I have developed over the years.

TGJB

BitPlayer

I agree with everything derby1592 said.

With regard to the possibility of combining age groups to increase sample size, I think you have to let the data guide you.  If the outcome for a given pattern or set of patterns is the same for a series of age groups, there's no reason not to combine them.  That would allow you more flexibility to break down the data in other ways (such as by breaking out cheaper horses, as derby1592 suggests) without overly shrinking sample size.  On the other hand, you may find that you need to separate 6yos and up from 5yos.
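
One way to let the data guide that decision is a two-proportion z-test on the outcome rates of two age groups. This is only a sketch: the counts are hypothetical, and the test is a standard textbook method, not something the studies are known to use.

```python
import math

def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """Return (z, two-sided p-value) for the difference between two rates.

    Assumes the pooled rate is strictly between 0 and 1.
    """
    rate_a, rate_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

# Hypothetical counts for one pattern: 5yos vs. 6yos and up.
z, p = two_proportion_z(30, 120, 27, 100)
print(round(z, 2), round(p, 2))
```

A large p-value does not prove the groups are the same, especially with small samples; it only says the data give no strong reason to keep them apart.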

The same logic would apply to combining pattern data where the last two races in the pattern are the same.

I'm not really suggesting adding anything new to the sheets themselves, just that by looking at the data you may find more revealing ways to "rack and stack" it (and that I'd love to be able to look at the data myself).  I read one book by an author named Cleveland in which, among other things, he bemoans people's tendency to spend a lot more time on data gathering than they do on data analysis.  He argues that such priorities are misguided, particularly since data analysis is usually much cheaper.

My compliments on what you've accomplished so far.


TGJB

One of the reasons for my question to Chris is that my recollection is that he came up with different results, possibly because he was only using 2 races. He and I talked at length about this, and the question has always been how to get a meaningful sample size without combining apples and oranges. Anyway, it's a work in progress, and we'll all have a better idea of what it means, how to use it, and possibly what to do next, after we get the bugs out and use it for a while.

TGJB

SJU5

JB,

I used the stats yesterday at chalky Belmont. They were very good. Congrats. Why, after all these years with all your data, did you guys only recently think to compile the 3-race trends and look at the results?  BTW, as I sat there in the clubhouse between races with my two boys, I noticed the track maintenance staff watering the dirt track surface, and I thought back to your ongoing discussion about track moisture, etc. Yesterday was a hot and humid day with a nice breeze, and my boys asked me why they watered the dirt like they do prior to a baseball game...even the kids were wondering LOL!!!