What I Don't Understand About Last-Second Program Betting

Started by hellersorr, June 12, 2017, 06:08:22 PM

Previous topic - Next topic

hellersorr

If more than one person/partnership is using a SUCCESSFUL last-second betting program, why wouldn't they land on the same horses often enough to kill their own odds - and thereby decimate their own bankrolls?

(Even with rebates.)

Clearly I'm missing something.

TempletonPeck

A few thoughts:

1) If this were happening, how would you know? The effect could be to bankrupt one or more of the syndicates to the benefit of one or more of the others. IOW, the bigger sharks would eat up the little sharks, but the net effect to you could be the same: the big shark now has more money with which to move the market.

2) It wouldn't be too hard for them to learn to avoid each other. They probably don't even have to discuss it, but why couldn't or wouldn't they?

3) At big tracks, they may just not have enough money to bet, or may simply prefer not to bet so much money that they move the odds ("You can shear a sheep many times, but skin him only once!"). HANA says the average mutuel pool at, for instance, Aqueduct was $225k/race, so you could bet a decent amount without moving the odds appreciably - see the sketch after this list.

4) Don't underestimate the value of rebates - if these guys can get a 5% rebate, they're *thrilled* to make zero-EV bets. They don't care where the $ comes from, as long as it comes.

5) Don't underestimate the amount of money that they're funneling into the exotic pools, which is much tougher to see/feel/detect.
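To put rough numbers on point 3, here's a minimal sketch of the pari-mutuel arithmetic. The $225k pool and 20% blended takeout are just the illustrative figures from above (and I'm treating the whole $225k as a win pool for simplicity):

def payoff_per_dollar(pool, on_horse, my_bet, takeout=0.20):
    """Post-takeout win payoff per $1 after my_bet joins the pool."""
    return (1 - takeout) * (pool + my_bet) / (on_horse + my_bet)

# A 5-1 shot (~$30k of a $225k win pool), before and after a $2,000 bet:
print(payoff_per_dollar(225_000, 30_000, 0))      # 6.00 total, i.e., 5-1
print(payoff_per_dollar(225_000, 30_000, 2_000))  # ~5.68, i.e., about 4.7-1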

Mathcapper

I can only speak for what I know about the most well-known and successful of the computer teams (B. Benter/A. Woods). But I do know, from what I've read and heard from the practitioners themselves, that they bet as much as they can at each track, according to the formula for maximum expected value (whatever produces the maximum profit, given that larger bets drive down the odds), and they do indeed funnel a lot into the exotic pools (the more exotic the better, since their edge is multiplicative).
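For a toy version of that maximum-expected-value sizing - all numbers hypothetical, win pool only, ignoring the exotics entirely - the calculation looks something like this:

def expected_profit(bet, p_win, pool, on_horse, takeout=0.20):
    """E[profit] of a win bet whose own money drives down the payoff."""
    payoff = (1 - takeout) * (pool + bet) / (on_horse + bet)
    return p_win * payoff * bet - bet

# Hypothetical spot: a horse we make 20% to win, sitting at ~5-1
# in a $225k win pool with ~$30k already bet on it.
best = max(range(0, 10_001, 100),
           key=lambda b: expected_profit(b, 0.20, 225_000, 30_000))
print(best)  # ~3,400: beyond that, the odds drop faster than the edge pays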

Here's what Bill Benter said about the subject back in May of 2003, at which point they were about 10 years into the Hong Kong market, and computer teams had started to penetrate the U.S. market:

>
“A report from the trenches...10 years ago we were in the golden age of horse racing. Well, the age is still golden...This year, the aggregate win of all of the horseplayers using these systems will probably be the largest it’s ever been, and I would say that it’s more or less gone up every year.

"In particular markets - Hong Kong is kind of not as good as it used to be - pool sizes have fallen there and there’s a lot of competition - but looked at worldwide, as these teams expand into new markets, this just gets better and better as time goes on, so we still haven’t probably hit the peak yet of how practical these systems are. Teams are playing around the world in most of the big racing markets very successfully.


“There’s very little sign - we have talked before - ...will the public get better, and start producing better estimates? We haven’t seen much sign of it. It seems that basically, the probability estimates set by the public are not much better than they were, and computer models still seem to be able to outperform the public by the same large margin. Competing teams - that is, people using computer systems in the same markets - definitely hurt, but life goes on. Everybody still seems to win. Even when you have two or three teams going head to head, betting on the same horses, driving down the odds on the same horses, everybody still keeps winning somehow."

>

He also said at another conference that he doesn't bet everything at the last minute. He often disguises his bets by betting at different times throughout the betting period leading up to post time, and often even on other horses, to throw the other computer teams off his trail.


Rocky R

hellersorr

Thanks to both TempletonPeck and Mathcapper.

I guess if Benter doesn't understand it - "everybody keeps winning somehow" - there's no way in hell I'M going to understand it.

Of course, the really brutal Benter comment is: " . . . the aggregate win of all of the horseplayers using these systems will probably be the largest it's ever been . . . ". The corollary, of course, is that the aggregate LOSS of all other horseplayers will be the largest it's ever been.

Mathcapper

hellersorr Wrote:
-------------------------------------------------------
> The corollary, of course,
> is that the aggregate LOSS of all other horseplayers
> will be the largest it's ever been.


With regard to that: he also said at the time that he estimated the total impact of the winnings of all these computer teams was to raise the effective takeout on the general public by approximately 2 percentage points.
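In back-of-envelope terms (my illustrative numbers, not Benter's), two extra points of effective takeout look like this for the non-computer public:

# 20% is a typical blended posted takeout; the 2 points are Benter's
# estimate of the computer teams' skim. Figures are illustrative.
for t in (0.20, 0.22):
    single = 1 - t           # expected return per $1 bet
    churned = (1 - t) ** 10  # after recycling the bankroll 10 times
    print(f"takeout {t:.0%}: {single:.2f} back per $1, {churned:.3f} after 10 cycles")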

Furious Pete

With regard to "good spots and bad spots": if one could find the situations where the formulas are underperforming - and I'm sure there are flaws - one could even find situations better than before, no?

What do we actually know about what goes into these formulas? Some insight, I guess, is to be found here: https://www.wired.com/2002/03/betting/ (both Benter and The Efficiency of Racetrack Betting Markets are mentioned - thanks again for that recommendation, Rocky; a copy is on its way).


An extract:

The bedrock of a predictive betting system resides in a massive collection of data on each horse - including details about the tracks and jockeys. "You massage all of that information into a mathematical equation that can be used for predicting probabilities," he explains. "If you wanted to get started in this, you would spend a year building the probabilities system, and it could cost $1 million to put together." And that data bank needs constant updating.

Benter, for example, has employees whose sole job it is to review race tapes after every meet. They judge each horse on 130 characteristics - attributes like speed during the first third of the race, whether it got bumped coming out of a turn, the quality of its recovery from the bump, and, of course, how it finished - and assign numerical grades. This information goes into the database, where it can be cross-referenced and called up to help predict the outcome of any impending race that particular horse runs in.

The computer essentially simulates the race before it happens, based on what has transpired in the past and any anticipated conditions in the future. The software then determines each horse's likelihood of winning a race. When a horse's computer-generated odds are better than the public's odds, the team slams in its wagers. "You create a model that can analyze each type of bet, judge the conditions [in terms of money in the pool and the associated odds], and tell you when it will be most favorable to bet," explains Ziemba. "You do not necessarily want to bet a ton every time - you only do it when you can find advantages."

One top bettor explains it like this: "Our computer program churns through the history of the horses and adjusts all the probability in a very sophisticated way. Having established the probability of the horses, we feed that into our betting program, which looks at all the odds for the various outcomes. It looks at your true chances of winning with the latest payoff odds and calculates what the best potential bets are, based on the chances of winning and the odds. Then it runs through all the probabilities.

"The mathematical aspect involves [following] a basic formulation that all successful gamblers use - whether they know it or not," the bettor says. "It's having what mathematicians call a positive expectation on the bet. You multiply the probability of winning times the payoff odds of one bet. Let's say the horse is 20-1. If it has a .05 probability of winning, you multiply that by 20-1. You get 1.0 - or 1-1 - and that is a fair payoff bet.

"But if that same horse is paying 25-1, then it has a positive expectation. Now it is 1.25 [or 1.25-1]. It gives you a 25 percent edge. Given that you know the true probability of winning, the amount to bet is a closed-form problem based on how much you can lay down without hurting your odds."

Designing the software to do all this is a delicate operation with seemingly endless pitfalls that can disastrously skew the results. "You have to understand," says Ziemba, "that building this system, maintaining it every week, and updating the model once a year is a lot of work." And doing the work does not necessarily guarantee success. Benter went broke at least once before his system was efficient enough to turn a steady profit. "Every year, more and more people come here and leave with their tail between their legs," says Dufficy.

Whoever writes the team's software needs to decide early on which aspects of a horse's performance to take most seriously. For instance, if a debuting horse's odds of winning are 50-1 and it wins its first race, the software will note that - and might be inclined to view untried horses with long odds as good bets. So the system must be tweaked to give little weight to those outcomes.

Other, more ambiguous factors - turf firmness, recent time trials, second-place finishes, and the jockeys' racing styles, to name a few - must also be taken into account. "Memory is another thing," suggests Kelly Busche, an economist who has taught at Hong Kong University and consulted for one of the major teams in town. "How quickly do you discount information? And to what degree? What happened two seasons ago should carry less weight than what happened last season. You need a model and a database that are both agile and robust enough to handle a variety of ever-changing situations."

-------------------------------------------------------
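To make the extract's arithmetic concrete - using its "for-1" (total-return) odds convention - and to spell out the closed-form sizing it alludes to (the Kelly criterion is one standard closed form; whether the teams use exactly that isn't stated):

def expected_value(p_win, total_return):
    """EV per $1 staked with 'for-1' odds; EV > 1.0 is an overlay."""
    return p_win * total_return

print(expected_value(0.05, 20))  # 1.00 -> the fair bet in the extract
print(expected_value(0.05, 25))  # 1.25 -> the 25% edge in the extract

def kelly_fraction(p_win, net_odds):
    """Kelly stake as a fraction of bankroll: f* = (p*b - q)/b."""
    q = 1 - p_win
    return max(0.0, (p_win * net_odds - q) / net_odds)

print(kelly_fraction(0.05, 24))  # 25-for-1 = 24-1 net -> bet ~1% of bankroll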

...And then they "calibrate"... And calibrate. And calibrate, and calibrate. And as soon as - or actually, probably before - you find a flaw, that door will be shut with their latest adjustment.

And this was, what, 15 years ago?

Paradoxically - at least paradoxically for some of us - using Thoro-Graph and a "tunnel view reliance" on Kool-Aid theories against these guys doesn't seem like the worst idea.

All you need is a Practical Joke to come in once in a while.

They weren't exactly all over him!

Mathcapper

Furious Pete Wrote:
-------------------------------------------------------
> With regard to "good spots and bad spots": if one
> could find the situations where the formulas are
> underperforming - and I'm sure there are flaws -
> one could even find situations better than before, no?

> What do we actually know about what goes into
> these formulas?

Pete --

I've talked a little bit about their flaws (a big one in Benter's original model that cost them their entire $100K+ initial bankroll) and the factors they use in some previous posts like this one:

Finley Article - TDN 030416


You'll read more about it in Benter's paper in EofRBM (pp. 183-198). There was also a good article called "Horse Sense" in an actuarial trade magazine called Contingencies I've posted about:

Re: Pairs


But as Benter himself has said, the practitioners, including himself, are understandably reluctant to share much information about the composition of the actual individual factors.

There is one source where he did go into some detail about them, though - the talk he gave called "Advances in Mathematical Modeling of Horse Race Outcome Probabilities" at the 12th International Conference on Gambling & Risk-Taking, which is where most of my quotes from Benter come from. Audio tapes of the conference sessions used to be available for a small fee on the conference website. Not sure if they still are, but if you're interested and can't find them, you can try contacting the Institute for the Study of Gambling & Commercial Gaming at UNR.

In any event, as I've discussed in another earlier post,

Re: ROW Pick 4

there's no magic to the probability estimates generated by the computer teams. As Benter notes, there is no single probability line (unless you are omniscient) that is the "true" probability line.

The objective is to create a probability line that is different from the public's, but also unbiased. When that line differs from the public's in your favor, and such horses are shown to reliably outperform the public's line, you've got a winning approach.

And that includes creating your own fair odds line by using things like Thoro-Graph performance figures, which to my knowledge are not part of the 80+ factors used in Benter's model or any others.

So one can certainly create a fair odds line that will be different not only from the public's, but also from the computer teams', and that will outperform either or both of those lines on occasion.

It's just a matter of reliably identifying when those situations arise. The probability lines of the computer teams are by no means flawless, nor are they the be-all and end-all of fair odds lines. They just do a very good job of identifying situations where their line is better than the public's. But there's more than one way to skin a cat (like using Thoro-Graph figures, Maggie's physicality analysis, etc.) - Ernie Dahlman made a pretty good career out of looking at shoes, with guys on the ground at each track, at a time when no one else was paying any attention to such a thing.
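For the mechanics of "different from the public's, but unbiased": Benter's paper describes blending the model's line with the public's in log space. A minimal sketch, with made-up weights (he fit his by maximum likelihood on past races):

import math

def blended_line(p_model, p_public, alpha=0.7, beta=0.3):
    """Log-linear blend: p_i proportional to exp(alpha*ln f_i + beta*ln q_i)."""
    scores = [math.exp(alpha * math.log(f) + beta * math.log(q))
              for f, q in zip(p_model, p_public)]
    total = sum(scores)
    return [s / total for s in scores]

# Hypothetical 4-horse race: our line vs. the toteboard's implied line.
ours, public = [0.40, 0.30, 0.20, 0.10], [0.30, 0.35, 0.20, 0.15]
for p, q in zip(blended_line(ours, public), public):
    print(f"blend {p:.3f} vs public {q:.3f} -> {'overlay' if p > q else 'pass'}")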



Rocky R

hellersorr

On the other hand, if you were one of the few able to grind out a 10% profit on a blended takeout of 20%, adding two points to that 20% takeout reduces your 10% profit to approximately . . . 0.00%.

Tough way to make an easy living.

Bet Twice

I was wondering if any of these operations use TG or other figs as one of their inputs.
JB - has anyone ever approached you about providing a feed of your data? If that's confidential, I get it.

hellersorr

If Benter doesn't use the TG recipe, surely he uses the same ingredients (plus many more, of course).

Or so I would think.

Mathcapper

Here's a little more flavor on what goes into his model, from the academic paper and the conference talk I referenced earlier in this thread.

The thing that's always struck me most is how little discussion there is of a horse's actual finishing time. I didn't come across any mention of daily variants, let alone inter-track variants, in either the paper or in any talks he's given that I've been privy to - unless he is in fact using speed/performance figures and is intentionally avoiding the topic.


Excerpt from Benter’s paper entitled “Computer Based Horse Race Handicapping
and Wagering Systems: A Report”:

>>
The overall goal is to estimate each horse's current performance potential. "Current performance
potential" being a single overall summary index of a horse's expected performance in a particular
race. To construct a model to estimate current performance potential, one must investigate the
available data to find those variables or factors which have predictive significance. The profitability of
the resulting betting system will be largely determined by the predictive power of the factors chosen.
The odds set by the public betting yield a sophisticated estimate of the horses' win probabilities. In
order for a fundamental statistical model to be able to compete effectively, it must rival the public in
sophistication and comprehensiveness. Various types of factors can be classified into groups:

Current condition:
- performance in recent races
- time since last race
- recent workout data
- age of horse

Past performance:
- finishing position in past races
- lengths behind winner in past races
- normalized times of past races

Adjustments to past performance:
- strength of competition in past races
- weight carried in past races
- jockey's contribution to past performances
- compensation for bad luck in past races
- compensation for advantageous or disadvantageous post position in past races

Present race situational factors:
- weight to be carried
- today's jockey's ability
- advantages or disadvantages of the assigned post position

Preferences which could influence the horse's performance in today's race:
- distance preference
- surface preference (turf vs dirt)
- condition of surface preference (wet vs dry)
- specific track preference

>>
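For what it's worth, the model family Benter's paper builds on top of those factor groups is a multinomial (conditional) logit over the starters in a race. A minimal sketch - the factors, values, and weights are toy numbers, not his:

import numpy as np

def win_probabilities(factors, weights):
    """Conditional logit over one race's starters:
    score_i = w . x_i, then P(i wins) = exp(score_i) / sum_j exp(score_j)."""
    scores = factors @ weights
    scores = scores - scores.max()  # numerical stability
    e = np.exp(scores)
    return e / e.sum()

# 4 horses x 3 toy factors (recent form, class, jockey), pre-normalized.
x = np.array([[0.9, 0.7, 0.8],
              [0.6, 0.8, 0.5],
              [0.4, 0.3, 0.9],
              [0.2, 0.5, 0.3]])
w = np.array([2.0, 1.0, 0.5])   # in practice fit by maximum likelihood
print(win_probabilities(x, w))  # sums to 1.0 across the field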


Excerpt from Benter’s talk given at the 12th International Conference on Gambling & Risk-Taking:

>>
An excellent estimator of present ability is a recency-weighted average of past demonstrated ability. An exponentially recency-weighted average with a 120-day half-life is very close to optimal. So an exponential recency weight, that would be if we did a weighted average of a horse’s past performances, a race occurring yesterday would have a weight of 1, a race occurring 120 days ago would have a weight of .5, 240 days ago would be .25, 360 days ago would be .125, decreasing by half every 120 days. In maybe 18 years of research, we’ve hardly been able to make any improvements on this basic 120-day exponential recency weighting…That’s an excellent estimator.

So, what would a typical measurement-of-ability factor look like? Well, recency-weighted past normalized finishing position is an extremely - if there’s one single variable that you should start your model with, if you’re building a model, I would start with this one…That’s a factor which can stand up alone in almost any jurisdiction around the world as a very good beginning estimate of a horse’s past performance.

Further estimating that, and this is the way we do our model, is that starting off with the recency-weighted past normalized finishing position, you can include a number of other recency-weighted past averages as separate factors. These are recency-weighted past averages of other measures of horse performance, along with recency-weighted averages of the influence on horse performance.

What would some of those be? Well, recency-weighted past normalized finishing position for a start. A second factor in your model would be recency-weighted past race competitive level. This would be sort of like recency-weighted past class of the race. There’s various metrics that can be used to measure the sort of race level - in the U.S. it could be the claiming price of the race or the amount of prize money involved or the class or different things with conditions.

Another factor you could throw in is recency-weighted past post position advantage…And the value of including this factor is that it acts as kind of a corrector to the normalized finishing position…Similar with the jockey advantage. You look at a recency-weighted past average of all of the jockey advantages, the relative jockey skill level that that horse enjoyed in the particular races. So if a horse has achieved a certain average past finishing position and he’s done it all with bad jockeys, that horse is better or probably has a higher ability level than one who’d achieved the same finishing position with relatively good jockeys. Also the recency-weighted past preferences enjoyed. If a horse has run most of its past races at its preferred distance, that means that his ability is probably not as good as what you saw because he was benefited by gee, every race was at his ideal distance. Similarly, a horse that had had the same average performance but had done so at distances that didn’t favor it would be relatively better.

So all of the above factors, along with any others that you can think of add up to form a good recency-weighted average of past demonstrated ability, which in turn becomes a good estimator of today’s ability.

The second term of the equation beyond ability is the various preferences. Well, horses can possess certain preferences. We try to quantify preferences by their effect on performance, and the value of each of these preferences should be applied to today’s races as separate factors. Distance preference for today’s race becomes a second factor. Surface preference, be it turf or dirt. Condition preference - what is the exact expected going of the grounds, is it going to be wet or dry? - that becomes a factor. So all of these go in as factors. This is sort of how you get up to 80 variables, by including all these little adjustments. As I was saying…the recency-weighted average of past preferences experienced should also be included as a separate factor in the model.

The incidental factors - these also need to be quantitatively measured in some way…and then thrown in as a separate variable in the model. So you’d have some variable for the jockey’s ability. This could be - a good one for jockeys is actually recency-weighted past jockey performance. Recency-weighted past jockey normalized finish. Look at all of the jockey’s past starts, calculate this recency-weighted average - and for jockeys we tend to use a half-life of about 1 year. Jockeys don’t change as quickly as horses do. Horses, a 120-day half-life for recency weighting, for jockeys around a year seems to be best.

Incidental factors - also trainers could be considered an incidental factor. The trainer effect, if the horse comes from a good stable it probably helps its performance. So some measure of the overall trainer win percentage can be thrown in. Weight to be carried today is also a factor that you’d use.

>>
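Since the 120-day half-life is about the most concrete number he has ever given out, here is a minimal sketch of that recency weighting (the past lines are toy data):

def recency_weighted_average(perfs, half_life_days=120.0):
    """Exponential recency-weighted average, per the talk above: a race
    yesterday weighs ~1.0, one 120 days ago 0.5, 240 days ago 0.25, etc.
    perfs is a list of (days_ago, normalized_value) pairs."""
    num = den = 0.0
    for days_ago, value in perfs:
        w = 0.5 ** (days_ago / half_life_days)
        num += w * value
        den += w
    return num / den

# Toy past lines: (days since the race, normalized finishing position).
print(recency_weighted_average([(7, 0.85), (45, 0.60), (200, 0.40)]))
# Per the talk, for jockeys swap in half_life_days=365.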

TGJB

"Normalized times of past races".

TGJB

Mathcapper

TGJB Wrote:
-------------------------------------------------------
> "Normalized times of past races".

To me that implies raw times, not necessarily speed/performance figures. But it's hard to fathom they wouldn't be adjusting for track surfaces.

The term "normalized" doesn't necessarily mean they're making those adjustments. They normalize all of their factors. It just means they put them on a scale from 0.0 to 1.0 so that they can add up all the factors on an apples-to-apples basis.

And it's still only one term in their 80+ factor model, unless they're weighting that factor to a much greater extent than the others, which is possible.
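A minimal sketch of that kind of 0.0-to-1.0 normalization (min-max scaling across the field is one common recipe; the teams' exact transform isn't public):

def min_max(values):
    """Rescale one factor to [0.0, 1.0] across the field so different
    factors can be combined on an apples-to-apples basis."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

print(min_max([92.4, 88.0, 95.1, 90.2]))  # e.g., raw final times in seconds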

TempletonPeck

I would be absolutely shocked if Benter/his ilk were not creating their own figures, whether actually or effectively (in other words, they may be combining a number of attributes into their formula that would serve as the basis for performance figures without actually producing a figure).

Any information that is publicly available often isn't very valuable - everyone has it, after all.

Take the example of NFL injury reports: unless you can get them before they're publicly available, you just know what everyone else knows, and the line moves accordingly. There's no value in having that information when everyone else has it too; you have to either get it earlier than others do, or do something with it that others can't. (In this case, Benter takes the publicly available information regarding exotic will-pays, and his software tells him where the inefficiencies are - so he's doing something with publicly available information that others aren't doing, thereby creating some value.)

(This is why TGs are valuable, IMO - we get to have some information that most people don't have!)

What he describes in terms of reviewing the previous performances of each horse sounds to me like making performance figures - reviewing times, weight carried, distance traveled, etc. Then they go further and include things like class/competition/jockey performance, and apply a regression to weight more recent performances more heavily (something the Sartin school believed in, in a sort of rough-and-ready way).

TGJB

Rocky-- The definition of speed figures is normalized times. How much they weight them is a different question.
TGJB