Underlying Data

Started by TGJB, November 05, 2005, 04:55:12 PM

TGJB

Recently Steve Plever posted here about the differences in figure making methods between us and Ragozin. Steve and I have had our differences, but he is one of the more articulate and informed posters, and he raised some important issues, so while we wait for Friedman to post their BC figures (I fear we may be waiting quite a while, and note that none of their customers are calling them on it), I'm going to address a few of the issues Steve raised. It will probably be done over a series of posts.

"Your own answer also shows why those who say your figs are too self-fulfilling may have a case. Your strong feeling that it would be totally improbable for the whole Distaff field except PH to run more than 3 points off their tops comes from looking at your own figs-- figs generated by tweaking the variant, albeit sometimes slightly, in almost every race".

1-- All figures are made by looking at one's own figures, no matter who makes them, and are therefore self-fulfilling, since by definition we try to make today's figures match up to past ones. The difference between the way Ragozin makes them-- as opposed to the way Andy and I make them-- is in WHICH figures he looks at. Len looks at unrelated events-- the figures in both one and two turn races, or those before and after sealing the track, just to choose two extreme examples-- and uses them in "tweaking" his figures (trying to get them to match up with historical figures). Because there are no figures without "tweaking"-- we don't have a machine that measures track speed (thank God).

The main point is that THERE IS ABSOLUTELY NO BASIS FOR DOING THIS. It is pure dogma. There is no logic to tying together a race run over a sealed track and one run over a harrowed one-- and there is a lot of science that shows that even in much more ordinary circumstances the track is changing speed (again, see the work done by the physicists who actually studied racing surfaces. You can find it in "Changing Track Speeds" on this site).

2-- In his analysis of what I am doing, Steve misses two big points-- the horses are coming out of different races, and the relationships between the horses within each race (both in their prior races and now) are FIXED, by beaten lengths, weight, and ground. I CAN'T have a whole field run within 3 points of their tops unless BOTH their figure histories in independent events AND their relationships to each other in the current race bear it out. If I have a horse with a 1 at Keeneland, and another with a 3 at Belmont and another with a 5 at CD, I can't give them those figures again today unless the relationships bear it out. The "tweaking" has to be done to all the horses in a race-- I can't just give them "what I expect them to run", to quote that nonsense from the other board.
TGJB
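
Editor's illustration of the point above, as a minimal sketch rather than anyone's actual figure-making procedure. It assumes each horse has a "raw" figure already fixed relative to the others by beaten lengths, weight, and ground, and that the figure maker's only freedom is one shift applied to the whole race. The horse names, the raw numbers, and the function names are hypothetical; the tops of 1, 3, and 5 echo the example in the post.

```python
# Hypothetical sketch: horse names and raw numbers are invented for illustration.
# Lower figure = faster performance, as on the sheets discussed in this thread.

def apply_race_shift(raw_figures, shift):
    """Apply one race-wide variant shift to every horse's raw figure."""
    return {horse: fig + shift for horse, fig in raw_figures.items()}

def gap_spread(raw_figures, tops):
    """Range of (top - raw) across the field; 0 means one shift pairs everyone."""
    gaps = [tops[horse] - raw_figures[horse] for horse in raw_figures]
    return max(gaps) - min(gaps)

def all_can_pair(raw_figures, tops, tolerance=0.0):
    """True if a single shift puts every horse within `tolerance` of its top."""
    return gap_spread(raw_figures, tops) <= 2 * tolerance

# Previous tops earned in independent events (1, 3 and 5, as in the post).
tops = {"A": 1.0, "B": 3.0, "C": 5.0}

# Today's within-race relationships, fixed before any variant is chosen.
spread_matches = {"A": 0.0, "B": 2.0, "C": 4.0}  # same gaps as the tops
spread_differs = {"A": 0.0, "B": 6.0, "C": 9.0}  # horses finished further apart

print(all_can_pair(spread_matches, tops))                  # True
print(all_can_pair(spread_differs, tops, tolerance=1.0))   # False
print(apply_race_shift(spread_differs, 1.0))               # gaps between horses are unchanged
```

Whatever shift is chosen, the differences between horses stay the same, so the whole field can only land near its tops when the finishing relationships already agree with the horses' histories.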

TGJB

More from Steve's post:

"For me, the most logical thing to do would be to look at Belmont days with similar weather over many seasons and see what impact the watering had in the race or two following. I know they wouldn't drive exactly the same speed in those water trucks every time, but they're pretty close, and if we looked at enough races, we could probably get a general view of whether and approximately how much that water mattered. I'm only an outsider with limited figure making experience looking in, but to me that approach is closer to science/probability than trying to use one field's worth of horses' performances to determine exactly how extremely one filly freaked or how slowly a dozen others plodded".

First of all, Steve obviously has not seen "Changing Track Speeds". This is very similar to an idea put forth in an e-mail to me once, and guys, there are more things in heaven and earth than are dreamt of in your philosophy.

1-- Among the conditions that would have to be identical, because all affect track "speed"--

Cushion composition (clay/sand ratio, type of each), depth of cushion. Possibly the same for the base. (There are long-term drainage issues).

Amount of rain the track had seen recently, condition of track when it rained (sealed or not). In general, whether it had been sealed often, harrowed often, or "flaked" recently-- how "packed down" it is. All these things can affect moisture content going INTO the day, and as the science shows, moisture content is a determinant in track speed, although the relationship is not a direct one-- the addition of water might make a track faster, but additional water might slow it down. And another surface might react entirely differently.

Humidity, temperature, wind, cloud cover, shade from the grandstand on some parts of the track (which varies at different times of the year). All these things affect evaporation.

And we haven't even gotten to the direct stuff-- track maintenance on the day (sealing, harrowing or flaking between races, or between some, or none) and watering of the track. Sometimes they water it before every race, sometimes not at all, sometimes just at certain points-- on BC day they just watered it once all day. And these variables are all interrelated-- watering it all day might have one effect if the track starts with a high moisture content, another with a lower level, another if it has been sealed the night before. And it might be completely different with a different track, or the same track a year later if the composition or depth has changed. Or if the sun is shining and the wind blowing-- or not blowing. Or if it is dry or humid.

2-- Assuming you actually could get moisture content readings for every single race, all around the track, AND all parts of the track were the same (no chance, according to the science), AND determine that all the other variables were identical, for the kind of approach Steve suggests to have meaning, you would have to have a meaningful sample to study-- many days with exactly the same circumstances, for all the different combinations. Good luck with that.

3-- But beyond that-- even if you had identical circumstances to work with, there is a little problem-- YOUR PREVIOUS VARIANTS WERE MADE SUBJECTIVELY. There is no other way to make figures-- that's the whole point of this exercise, to discuss whose judgment is more correct. So you would be looking at today using as an assumption that previous judgments were correct. If they were, looking at the results from before is a good idea. But if not, if someone is using bad assumptions in making their figures (as one of us must be, since we so fundamentally disagree), then you would just be reinforcing bad earlier conclusions.

So the only approach that makes sense is to recognize that circumstances change, and try to work through the variables to piece together the puzzle. Which is why one of the scientists said that the way we make figures (regression analysis) is the best way to determine track speed ("Changing Track Speeds").


TGJB

SP

Hey Jerry. I actually did go and read the stuff on changing track speed and learned some from it. I'm not at all dismissive of it, or of your approach.

I guess if I'm being blind here, I'm not going to realize my own misperception, but I'm not trying to be dogmatic or ignore the complexities of how weather and racing surfaces interact. Agreeing that conditions for each race may be different doesn't change my skepticism about anyone's ability to infer the precise changes in variant from race to race. That's why I suggested a little statistical research to test whether cutting a race off due to watering is valid.

I understand your objections to using averages of larger samples of races run under similar, albeit, I concede, far from identical conditions. But didn't you, like all figure makers, start out by generating pars from large samples of races run over many months, years, under very different conditions?

If the original pars that led to the original variants that led to the figures that enabled you to make the first projections that generated variants that eventually led to today's figures, projections & variants... if these were originally based on large samples with varying conditions, then couldn't that same "start with a rough estimate and keep refining it" approach be used to solve similar problems today? (how's that for a sentence?!)

Might there not be something to be learned if you could find a sample of, say, 100 late afternoon, temp in the 40s Belmont fast/dry track races that took place after the day's first watering? Add harrowing as a factor for analysis. Admit that each event was unique but look for trends anyway. You might learn that there was relatively little change from the prior race (which would reinforce Rag's view), learn a general trend (which would help you decide how to handle a race like the Distaff), or learn that the impact varied wildly (which would support your view that one can't generalize so you should continue to let horses' past figs be the determining factor in the variant). Maybe you could never get such a sample, but from your posts, it seems like you keep notes on track maintenance the way some guys keep trip notes.

I understand that what I'm advocating is imprecise, but in a race like the Distaff, even the most precise fig makers are left weighing which scenario is most improbable. Races like that, where you could have ended up with a different figure for 12 other horses if PH didn't run, point up the limitations of current methodology and the need to test and refine one's assumptions with more data.

Yes, of course Rags' variants are also subjective because they too make projections. They're not making fewer assumptions, just different ones -- assuming that changing track conditions and the part of the track that is used in the race have less impact than you think they do. My guess (feel free to call it a cop-out) is that their assumptions are generally correct on some days, and yours are generally correct on others. The "tightness" you see as verification of your approach isn't persuasive unless one shares your views of the consistency of equine performance. I can't think of a comparison test with any validity other than my own ROI with the product.

I appreciate any time you decide to spend on a response. I'm going to read it & think, and leave it at that, so we can both get back to the rest of what we do. Thanks for an interesting discussion.

Best,
SP

marcus

SP must be one of the good ones - he cares about his ROI (how many Beyer + Rags users can say that) and his post makes a nice read. Thanks to Jerry's post + others like yours, SP, someday I'll actually understand and be able to assimilate all this variant stuff w/ expert ability - thanks !!! But seriously, even if SP's proposed theoretical study did show only a small difference in variants, it still wouldn't hold that those "might be" differences are negligible or irrelevant. On the contrary, it would be pertinent information to know, because most races are decided by a small margin and the difference between winning and losing on one's ROI is usually influenced by many subtle nuances. The thing about averages for me is to try to use them in a meaningful way, steering me towards winners, not away from them, as seems to be the case with how RAGS applies their track variant data to the final figures they assign. RAGS distorts the variant sample - it's way too convoluted. In simple hypothetical terms, if a horse always runs an 8 or a 12 every time, his average would be 10, a theoretical number and one that the horse never runs, so common sense and deference are required protocols.
marcus
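
Editor's illustration of the 8-or-12 example in the post above, as a minimal sketch. The figure list is hypothetical (a horse that alternates between the two numbers) and the variable names are invented for illustration.

```python
from statistics import mean

# Hypothetical figure history: the horse only ever runs an 8 or a 12.
figures = [8, 12, 8, 12, 8, 12]

average = mean(figures)                    # the "theoretical" number
times_run = figures.count(average)         # how often the horse actually ran it
typical_miss = mean(abs(f - average) for f in figures)

print(average)        # 10
print(times_run)      # 0
print(typical_miss)   # 2
```

The average sits 2 points away from every figure the horse has actually run, which is the sense in which it describes the population and not any single performance.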

TGJB

Steve--

1-- I'm glad you got something out of "Changing Track Speeds". For more on this subject, specific to the NYRA tracks, check out "Are Racehorses Getting Faster", parts 1 and 1a, also in the archives. They contain comments by Jerry Porcelli about the NYRA surfaces, and what he was doing to them when he was in charge (he now assists Passero). Among the revelations-- he manually adjusted track speed himself, intentionally, race to race, by adding-- or not adding-- water.

2-- Yeah, pars and averages.

When you start your data base, you have no choice but to use large population studies, and averages, to get your speed chart, and winning figure pars. The whole point of the projection method is that we realize that doesn't work for knowing how fast THIS 10 claimer went-- just 10 claimers in general. So as soon as you have a decent data base as a starting point, you STOP using averages.

THE WHOLE REASON PEOPLE USE AVERAGES IS BECAUSE THERE IS VARIABILITY IN THE UNDERLYING DATA. What possible sense can there be in tying a specific result to an average when we know the average is made up of different values? Isn't it obvious you are better off looking at the data for the specific thing in question? If one actually could do the study you want, with every single variable the same (and there may well be some variables I can't think of, as well), if the results were variable, that by definition means that there are several possible outcomes with the same circumstances. Why would it then be correct to tie it to the average? Or even to the one with the most identical results, unless they were ALL the same? And again, those previous results were the result of judgment themselves. How can it be correct to rely on previous judgment OVER judgment about the exact case in question?

3-- No, I do NOT make as many assumptions as Ragozin. I make as few as possible, as I said to you in the "Figure Making Methodology" post years ago ("The two sides of the house I can see are white"). For example, I assume no fixed relationship between one and two turn races. I don't even assume that I have all the relevant data to work with.

Two examples-- the first is the one from that post of years ago. When it became obvious the two Belmont turf courses were independent of each other, one faster one day, the other on another, I didn't know why, but it was obvious from the horses that it was happening. Ragozin could not come up with a reason why it was happening (neither could I), so he refused to believe it, and did them at the same speed (he used an average). I split them. It was only a month later we learned that they had been watering the courses on different days.

The other example involves slow-pace races, which as you probably know we both cut loose and do strictly off the horses, because they simply can't run fast enough late to make up the time they lost early. Well, that works fine AS LONG AS YOU KNOW THE PACE WAS SLOW. But what if you don't have fractions, as used to be the case with some turf races at Calder at about distances? If there are two turf races on a day, Ragozin will do them with each other, independent of KNOWING why they should be split, so if one has a very slow pace-- and he doesn't know it-- he's screwed. I do NOT make that assumption, and if it's obvious something happened I'll split them, or if they are very lightly raced horses, not do a figure for the race.

The assumption I make, as I said to you years ago, is that the past histories of the horses can be used as a guide for how fast they will run in the future. It's really a premise-- without it you couldn't use these things to bet with, either.

Ragozin makes tons of assumptions-- I've shown lots of them. Many I proved outright were false, in "Changing Track Speeds". And if Friedman ever posts the BC figures (any comment about that, by the way?), I'll show you more evidence.

Which is why they won't.
TGJB

TGJB

CH's inane comment notwithstanding (and you would be well advised not to make another on the subject), this is where we are at:

Friedman has posted the BC and Triple Crown figures on his website every year. AFTER they ran the races this year, he said he would be doing so again. It has now been over a week, and they have not done so.

This is a deeply cynical move, the latest of many (the worst was not correcting their beaten lengths error in the 04 Derby-- they decided to leave an error, in the data their PAYING customers use, for the biggest race of the year, rather than admit an error). They are willing to come across as having something to hide rather than do what they have done in the past and said they would do again-- let the public look at their work product for racing's biggest day. They do this because they believe their paying customers are too brainwashed to hold it against them. And because they know what even those customers would think if they DID see the figures.

You might recall that I made a big deal about how ridiculously slow Ragozin had the Jockey Club Gold Cup going (he had Borrego going BACK 3 points to a 5, the rest running much worse than Borrego, and much, much worse than they had been recently). Well, 4 of those horses-- Flower Alley, Suave, Sun King, and Borrego-- came back in the Classic. On Ragozin, the first 3 will all "go forward" a huge amount-- FA about 15 points, S and SK about 7 points each.

Borrego, however, is the one that is keeping them from going public. Keep in mind that he won the Gold Cup by a block, with another big gap to third, and finished TENTH BEATEN TEN LENGTHS IN THE CLASSIC, as the second favorite, off his Gold Cup win. Ragozin will have his Classic figure at least as good as, and probably BETTER than his Gold Cup race. The only way he could not would be:

a) They give the Distaff really slow figures. I mean, AWFUL numbers-- much worse than we did, and you might recall I had only the winner running back to (in her case better than) her top. I had most running really bad-- DESPITE TAKING OFF ALMOST 4 POINTS COMPARED TO THE SURROUNDING RACES. Since the Distaff and Classic are back to back races, and they insist you must do them at the same variant, to get the Classic slow enough for Borrego just to pair up (instead of go forward) they would have to not only not TAKE OFF the 4 points, but need to ADD a couple of points to the race. Anything less gives Borrego a "forward move".

You will have no idea how silly those figures on the Distaff would look unless you see them all together-- 12 of 13 top fillies, on the biggest day of the year, running 5, 10, or more points worse than their tops, all at the same time. If you want to get some idea of how that would look, go look at our figures for the Distaff-- still available in ROTW-- and add 5 or 6 points to each horse.


b) Or-- they split those two races, despite having said in no uncertain terms-- in both Ragozin/Friedman's book and many times on their website-- that they never do that. They TAKE OFF from the Distaff, as I did, and ADD to the Classic. If they do that, they can get Borrego and the others running slower.

As you can imagine, if they do either of these things, we're going to be having a conversation. And that's why they are not making their work public.
TGJB

BitPlayer

TGJB -

I don't know about Hamlet or the ghost, but if you asked the scientists you quoted in Changing Track Speeds whether a study of the type SP has suggested could be done, I suspect you'd get a different answer.

As I understand SP's proposal, the objective is to measure the effect of a single event (track watering) on track speed. I acknowledge that the effect will differ depending on the condition of the track before watering, but you have your variants to estimate that. You measure the variant for the race preceding watering and the variant for the race after watering. The results of your study would not be a single number, but a graph, with the change in track speed after watering on one axis and initial track speed on the other. To minimize the effect of some of the variables you list, you would do the study at a single track with relatively consistent dry weather. Southern California leaps to mind. You would also want the races to be at the same distance. All the points won't be on the curve, and there would be some of Bobphilo's outliers, but the points should cluster close to the curve. The curve would then serve as a guide to making future variants at that track.

It may be more work than you care to undertake, but it doesn't seem impossible. Studies with more variables are undertaken all the time.

BitPlayer
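
Editor's illustration of the study described in the post above, as a rough sketch only. It assumes each observation is a pair (variant before watering, change in variant after watering), and it uses a straight-line least-squares fit as the simplest stand-in for the "curve"; the post does not specify a functional form. The function names and every data point are hypothetical, not real track data.

```python
# Hypothetical sketch: the observations below are invented, not real variants.

def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def residuals(points, slope, intercept):
    """Distance of each observation from the fitted line."""
    return [y - (slope * x + intercept) for x, y in points]

# (variant before watering, change in variant in the race after watering)
observations = [(-2.0, 0.5), (-1.0, 1.0), (0.0, 1.2), (1.0, 2.1), (2.0, 2.4), (3.0, 6.0)]

slope, intercept = fit_line(observations)
misses = residuals(observations, slope, intercept)

# Flag points that sit unusually far from the line (threshold is arbitrary).
outliers = [pt for pt, miss in zip(observations, misses) if abs(miss) > 1.5]
print(slope, intercept)
print(outliers)
```

How tightly the residuals cluster is the thing the thread is arguing about: a tight cluster would support using the fitted relationship as a guide, while a wide scatter would support judging each race on its own horses.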

davidrex

 

 Alan....what do you think of this latest request?!

TGJB

Bit-- I'll give you a lengthy reply to this later today when I have more time, to put this to bed once and for all.
TGJB

TGJB,

"The "tweaking" has to be done to all the horses in a race-- I can't just give them "what I expect them to run", to quote that nonsense from the other board."

If an entire race (meaning most/all of the horses and the relationships between them) comes up faster or slower than expected relative to the other races run that day, then you tweak the whole race and "give them what you expected them to run" based on their prior figures. (all else being equal regarding track maintenance, wind, etc...)

Correct?  

Yes or No would be fine.

(That is not a criticism. It's a description of my understanding.)

TGJB

CH-- Gotta love those compound questions that include characterizations.

     a) I never give them what I expect them to run. There is no correlation at all between what I expect them to run and the figures I assign.

     b) Any corrections I make to a race are made to all the horses within the race.

     c) I use the previous figure histories of the horses to make the figures, as do all figure makers. What I (and Beyer) do not do is make the assumption that Ragozin does that the track is staying the same speed, or that one/two turn relationships are constant. So we adjust the variants from race to race.

I suggest that you wait for my reply to BitPlayer before following up on this. Later tonight, I hope.
TGJB

OK. I think the issue may be my use of the term "expect them to run" vs. what I mean by it. I mean "think they ran based on their prior figures and the result evidence". I'm pretty sure I understand what you are doing and when. If we communicated verbally it would be clearer. I'll wait on the later post for clarification.

miff

JB said:

"a) I never give them what I expect them to run. There is no correlation at all between what I expect them to run and the figures I assign."


Jerry,

That is the most frequent criticism I hear from the RAG users that I know. I believe it comes from your pronouncements that:

1. Many horses run in tight ranges (they feel you assume pairs going in, which is why TG and RAGS look different too often).

2. It is unlikely that a high percentage of a certain group of runners will all X in a given race (they feel you somewhat award figs based on this theory instead of what's happening on the track).


From all that I have read here since the BC, I have concluded that TG and RAGS can no longer be compared for confirmation of a horse's fig. You are using far more variables and input than Rags, and that may be why the scale of comparison has gone out of whack.

It would seem reasonable to think that the more relevant things that go into creating a fig, the more accurate the fig is, but I do understand that Rags believes "it ain't broke so they ain't fixin it".

Mike
miff

TGJB

Miff-- I don't assume horses run in tight ranges, and I can't make them do it if they don't. If Ragozin and I are both looking at a race where the winner runs 15 points better than the last horse, we both have no choice but to give them 15 points better. That limits my options-- as I have said many times, I can't pair up the winner (or have him run in his usual range) AND the last finisher (likewise) and/or any in between, unless the relationships justify it. And that's even assuming I just want to break races out for no reason. As my treatment of the BC Juvenile Fillies shows, I don't.

What every serious figure maker knows-- including Ragozin (but probably not Friedman)-- is that if you are not screwing around WITHIN the races, the tighter the ranges you have the horses running in, the more evidence there is that your data base is right. That's because the premise of the whole enterprise is that horses' previous histories can be used as a guide to what figures they will run in the future, both in figure making and betting terms.

Because Ragozin assumes the track stays the same speed-- in situations where that assumption defies all logic (sealed, for example), let alone the many subtler ones I have described before-- there will absolutely be situations where there is no correlation whatsoever between their figures and mine. And their figures and reality, for that matter.
TGJB

TGJB

Miff-- more on this.

We put BC day up, the whole day. This gives you a chance to see exactly what I'm talking about, because it shows not only how we put the DAY together, but how it works out for the horses within each race. As you can see, I gave plenty of horses figures that I would not have "wanted" to give them-- ones that were outside their usual range. Buzzards Bay, Stevie Wonderboy, Silver Train, and Pleasant Home are examples of ones that got big new tops. I give horses big new tops all the time-- Borrego and Taste of Paradise got them the previous time. Lots of horses got figures much worse than their usual range-- Shakespeare and lots of others in the grass races (grass horses are usually VERY consistent), Yolanda B Too and several others in the Distaff, Gygastar, etc. Take a good look-- you CAN'T put them all in their usual ranges.
TGJB