Ask the Experts

Topic started by: elkurzhal on April 22, 2009, 12:18:26 PM

Title: Lies, damned lies, and statistics
Post by: elkurzhal on April 22, 2009, 12:18:26 PM
So over the last couple of months I've been keeping a database of PP running lines (up to about 7,000 horses and 100k races) from DRF's Formulator export, with the hope of being able to sort out the synthetic tracks and some other questions I have. I wish I had better figs (T-Graph) to work with, but DRF makes it easy and cheap to build a large database without carpal tunnel syndrome.

So today's topic is the Proride surface, and the popular opinion that POTN and others will far outrun their slow Proride numbers on the move to the dirt track at CD.

Proride has only been around since last September, so the sample is even more limited and a grain or two of salt is probably required.

Anyway....

I've got 28 horses that have run both at CD and on Proride. Using each horse's best Beyer on each surface, the following statistics were obtained:
Proride average best BSF for the group = 86.7 (12.6 std dev)
CD dirt = 78.5 (18.5 std dev)
All Synthetics avg BSF = 89.0 (9.9 std dev)
All Dirt avg = 88.9 (13.4 std dev)

Expanding a bit to all horses who have run on Proride and on any dirt surface, we get a 307-horse sample:
Proride average best BSF for the group = 77.0 (14.3 std dev)
All Synthetics = 81.4 (13.1 std dev)
All Dirt  = 73.0 (19.7 std dev)
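
For anyone who wants to reproduce this kind of summary from their own running lines, here is a minimal sketch. The horse names, surface labels, and figures are made up for illustration, the function names are hypothetical, and it assumes the sample standard deviation (the post does not say which flavor was used).

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical running lines: (horse, surface, Beyer). Not real data.
running_lines = [
    ("Horse A", "proride", 80),
    ("Horse A", "proride", 86),
    ("Horse A", "dirt", 78),
    ("Horse B", "proride", 70),
    ("Horse B", "dirt", 75),
    ("Horse B", "dirt", 90),
    ("Horse C", "proride", 92),  # no dirt start, so excluded below
    ("Horse D", "dirt", 60),     # no Proride start, so excluded below
]

def best_by_surface(lines):
    """Return {horse: {surface: best figure on that surface}}."""
    best = defaultdict(dict)
    for horse, surface, fig in lines:
        prev = best[horse].get(surface)
        if prev is None or fig > prev:
            best[horse][surface] = fig
    return best

def summarize(lines, surfaces):
    """Mean and sample std dev of best figures, using only horses
    that have at least one start on every listed surface."""
    best = best_by_surface(lines)
    group = {h: s for h, s in best.items() if all(x in s for x in surfaces)}
    summary = {}
    for surface in surfaces:
        tops = [s[surface] for s in group.values()]
        summary[surface] = (mean(tops), stdev(tops))
    return len(group), summary

n, summary = summarize(running_lines, ["proride", "dirt"])
for surface, (avg, sd) in summary.items():
    print(f"{surface}: avg best = {avg:.1f} ({sd:.1f} std dev), n = {n}")
```

Restricting the group to horses with starts on both surfaces is what keeps the two averages comparable, since each horse contributes one top figure per surface.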

The large std dev on dirt really sticks out. That's not what I was hoping for, and it probably means it is going to be difficult to predict which horses will move up from Proride to dirt. But the data does show pretty well that it's not a matter of the numbers being low on poly and all horses running better numbers on dirt. If anything it is the opposite: the better numbers on average are being run on synthetic, but with a tighter distribution.

Back to POTN: his best Beyer on Proride is 96. That is 1.3 std deviations better than the average top on Proride from the 307-horse sample. The same 1.3 std deviations better than the dirt average top would be 99.2. That's pretty good, and since he's a spring 3yo, there is no reason to think he couldn't jump 5-10 points in his next start, whether it were at SA or at CD. Not that I'm interested in betting he will at a short price.
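
The arithmetic behind that 99.2 can be sketched as follows. This is just a standard-score conversion using the means and std devs quoted for the 307-horse sample; the function name is made up for illustration, and the approach assumes the two distributions are comparable enough for the mapping to mean anything.

```python
def equivalent_figure(fig, src_mean, src_sd, dst_mean, dst_sd):
    """Express a figure as std deviations above the source-surface average,
    then place it the same number of std deviations above the target average."""
    z = (fig - src_mean) / src_sd
    return z, dst_mean + z * dst_sd

# Values quoted in the post: Proride 77.0 (14.3), All Dirt 73.0 (19.7)
z, dirt_equiv = equivalent_figure(96, 77.0, 14.3, 73.0, 19.7)
print(f"z = {z:.2f}, dirt equivalent = {dirt_equiv:.1f}")
# prints: z = 1.33, dirt equivalent = 99.2
```

Note that the 99.2 comes from the unrounded standard score (about 1.33); plugging in a flat 1.3 gives roughly 98.6.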

If he does jump up and run a 109 at CD though, everyone is going to be raving that he moved up because of the dirt.... They may be right, but they may not be.... I'll be on the hunt for poly-to-dirt shippers to bet against for a while, because they will be getting hammered.
Title: Re: Lies, damned lies, and statistics
Post by: covelj70 on April 22, 2009, 12:31:15 PM
awesome analysis, thanks for posting this.

As you say, need to base bets on this very cautiously but great analysis either way.

thank you very much for sharing
Title: Re: Lies, damned lies, and statistics
Post by: Uncle Buck on April 22, 2009, 01:12:36 PM
We know of two 3YOs running in this year's Derby that made 6-point forward jumps switching from Pro Ride to dirt. I feel in my gut that POTN and Choc C will do the same. They might not. I might get hit by a bus tonight. I might not :-)
Title: Re: Lies, damned lies, and statistics
Post by: elkurzhal on April 22, 2009, 01:44:14 PM
Uncle Buck, did you just leave off the other synthetic horses [Mr. Hot Stuff, Gen 1/4ers, HMB, Sq Ed], or do you feel they won't get the same jump up?

Just checked the archives from some pre-poly days, and sure enough 3yos were making big jumps going from dirt to DIRT.... [GREELEY'S GALAXY, BELLAMY ROAD, BANDINI, CASTLEDALE, TEN MOST WANTED, INDIAN EXPRESS, MILLENNIUM WIND] Now 2 horses come out of the same race and do it, and the talking heads have convinced us that we should expect this from all of the horses out of that race, or off that surface?

If you get hit by that bus two days in a row, what is the probability your brother gets hit by a bus on the third day?
Title: Re: Lies, damned lies, and statistics
Post by: beazley on April 22, 2009, 02:27:07 PM
Great data! Thanks. This also confirms what I've been playing around with.

The 307-horse sample will obviously contain some horses that prefer dirt, others that prefer synthetic, and others that don't have a preference. So the shift you are seeing is for the average horse.

The data also shows the compression of the Beyer speed figure on synthetic versus dirt (stdev is smaller).  So the winning horse in a synthetic race probably has a figure that is slightly depressed while the losing horse in a synthetic race has a figure that is inflated (all relative to dirt).  I think someone posted that this depression/inflation for TG figs is 2-3 points.

So we know to shift a winning figure up by 2-3 points when going from synthetic to dirt for an average horse with no preference. It can then vary more depending on the horse's actual preference, which we should be able to guesstimate from pedigree and sire data.

My opinion is that POTN will not move up but in the best case will run equally well on dirt. This is based on his female family, but I will allow others to form their own conclusions. In any case, without a move up on dirt POTN is unlikely to contend at a short price.
Title: Re: Lies, damned lies, and statistics
Post by: elkurzhal on April 23, 2009, 09:43:30 AM
Beazley, well, the data is a little questionable (Beyers) and the sample is small, but the conclusion seems to be the one many are reaching.

I'm wondering if the root of the "problem"/difference, or maybe the answer to it, isn't in the sliding scale: 1L = 1 pt @ 5f, 2L = 1 pt @ 10f.
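
To make the sliding scale concrete, here is a rough sketch that converts beaten lengths into figure points. It takes only the two anchor values quoted above (1 length per point at 5f, 2 lengths per point at 10f) and assumes a straight-line scale in between, which is an illustration only; actual figure makers use their own beaten-length charts, and the function names here are hypothetical.

```python
def lengths_per_point(distance_furlongs):
    """Lengths worth one figure point, linearly interpolated between the two
    quoted anchors (5f -> 1 length, 10f -> 2 lengths). The linear form is an
    assumption; outside 5f-10f it simply extrapolates the same line."""
    return 1.0 + (distance_furlongs - 5.0) / 5.0

def points_for_lengths(beaten_lengths, distance_furlongs):
    """Figure points separating a horse from the winner."""
    return beaten_lengths / lengths_per_point(distance_furlongs)

for dist in (5, 7.5, 10):
    pts = points_for_lengths(4, dist)
    print(f"4 lengths beaten at {dist}f = {pts:.1f} points")
```

Under this scale the same 4-length margin costs half as many points at 10f as at 5f, so a bunched-up finish in a route race compresses the figures of the also-rans toward the winner's number.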

With the races being run more slow early/fast late, the horses are running faster at the end of the race than they do on dirt. Maybe not really the ones finishing well, but the middle/back-of-the-packers. On dirt those horses are exhausted and easing up; on synthetic, more often they run on but don't have the kick to close it out.

It could of course be argued that this is the same for most turf races. I've read a number of people here say they make a 2-3 point adjustment there too.