New Data

Started by JimP, June 28, 2004, 11:04:39 AM

JimP

What does the sample size represent in the new data?

mandown

The number of starts shown is for the total number of patterns we found for that age group for the 3-month date range shown.

The criteria were: the same trainer for all four starts (the three of the pattern plus the run being evaluated), all routes or all sprints, 42 days or fewer between each race, no surface change, and at least three races prior to the start of the pattern. If there were more than six races prior to the start of the pattern, the top was taken to be the best on the surface in the last six.
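For anyone who finds it easier to read as code, here is a rough Python sketch of those criteria. It is only an illustration, not the code we actually run. The `Start` record, its field names, and the two function names are made up for the example. It also assumes that a lower figure is better, that starts are listed oldest first, and that with six or fewer prior races the top is taken from all of them on the pattern's surface.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Start:
    """One race start (hypothetical record layout for illustration)."""
    race_date: date
    trainer: str
    surface: str     # e.g. "dirt" or "turf"
    is_route: bool   # True for a route, False for a sprint
    figure: float    # performance figure; lower is ASSUMED to be better


MAX_GAP_DAYS = 42      # 42 days or fewer between each race
MIN_PRIOR_STARTS = 3   # at least three races before the pattern starts
TOP_LOOKBACK = 6       # the top comes from the last six prior races


def qualifies(prior: List[Start], window: List[Start]) -> bool:
    """Check the pattern criteria.

    `window` is the three pattern starts plus the run being evaluated,
    oldest first. `prior` is every start before the pattern, oldest first.
    """
    if len(window) != 4 or len(prior) < MIN_PRIOR_STARTS:
        return False
    first = window[0]
    same_trainer = all(s.trainer == first.trainer for s in window)
    same_distance_type = all(s.is_route == first.is_route for s in window)
    same_surface = all(s.surface == first.surface for s in window)
    gaps_ok = all(
        (later.race_date - earlier.race_date).days <= MAX_GAP_DAYS
        for earlier, later in zip(window, window[1:])
    )
    return same_trainer and same_distance_type and same_surface and gaps_ok


def top_before_pattern(prior: List[Start], surface: str) -> Optional[float]:
    """Best figure on `surface` among the last six prior starts.

    Returns None when none of those starts were on that surface.
    """
    recent = prior[-TOP_LOOKBACK:]
    on_surface = [s.figure for s in recent if s.surface == surface]
    return min(on_surface) if on_surface else None
```

A horse counts toward the sample only when `qualifies` returns True for its four-start window. The top used for comparison then comes from `top_before_pattern`, called with the surface of the pattern.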

One side-effect of limiting the patterns with these criteria is that it reduces the number of turf patterns. For various reasons - races being taken off the turf because of the weather is one that springs to mind - only about a quarter of the horses that ran in a turf race had also run on turf in their previous three. This is something we may look at again.

JimP

Are you going to keep the data current going forward? A rolling 3-month range? Or continue to accumulate new instances in the sample going forward so that the sample size increases? I guess I'm wondering about the choice of 3 months for the sample period.

mandown

At the moment Jerry wants to see how the 3-month range works out, although for older horses we may simply lump them all together. Obviously there is a difference between the way a 3-y-o runs in May and the way a 3-y-o runs in October. On the other hand you would think that most 5-y-os would be fairly consistent across the year, as by that time they should have stopped developing.

We will be adding instances as they occur each year, though I have to confess I haven't implemented that yet.

JimP

I interpret that to mean that you're going to maintain a rolling 3-month sample. Will you recompute the sample daily? Weekly? Just trying to understand the methodology for compiling the sample.

mandown

Hi Jim,

When I get round to it the re-computation will be daily. There will be no fixed time scale though, as all the figure-based stats are computed only once the track days have been corrected with the final variant. This is usually within 4-5 days but depends on workload and when we get data from trackmen - and whether JB goes golfing or takes yet another vacation.