Who you callin' overrated?

Started by pck, July 08, 2012, 11:05:27 PM

pck

Is it possible for a player X with a 28% win rate in 13 point matches against the bots to be rated 1850? Surely there has to be something wrong with that. X cheats or is coming down from an even higher rating which was entirely due to luck. In the long run he will go down.

Let's see.

The fibs rating formula (typing "help formula" at the command line displays it) computes rating changes using the number

(I)  Pupset = 1 / (10^(D*sqrt(n)/2000) + 1)

where n is the match length, sqrt stands for "square root", and D is the absolute value of the players' rating difference.
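As a minimal sketch, (I) can be written in Python like this. The exponent is read as 10 raised to the power D*sqrt(n)/2000, and the function name p_upset is only an illustrative choice, not something fibs defines.

```python
import math

def p_upset(d, n):
    """Formula (I): d is the absolute rating difference, n the match length."""
    return 1.0 / (10.0 ** (d * math.sqrt(n) / 2000.0) + 1.0)

# Equal ratings, then a 227.5 point gap in a 13 point match.
print(p_upset(0, 13))
print(p_upset(227.5, 13))
```

With equal ratings the value is exactly one half, and it shrinks as either D or n grows.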

(II)  Rating change is computed in the following way: If the underdog (which here means the lesser rated player, not the player who plays worse) wins, he gains 4*sqrt(n)*(1-Pupset) rating points. If he loses, 4*sqrt(n)*Pupset points are deducted from his rating.
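A hedged sketch of (II) from the underdog's point of view. The names p_upset and underdog_change are hypothetical helpers, and the sketch covers only the basic change described here, nothing else fibs may apply.

```python
import math

def p_upset(d, n):
    """Formula (I) for rating difference d and match length n."""
    return 1.0 / (10.0 ** (d * math.sqrt(n) / 2000.0) + 1.0)

def underdog_change(d, n, underdog_wins):
    """Formula (II): points gained (positive) or lost (negative) by the underdog."""
    k = 4.0 * math.sqrt(n)
    pu = p_upset(d, n)
    return k * (1.0 - pu) if underdog_wins else -k * pu

# A 250 point underdog in a 13 point match.
print(underdog_change(250, 13, True))
print(underdog_change(250, 13, False))
```

Whatever D is, the gain for a win and the loss for a defeat always add up to 4*sqrt(n) in size, so Pupset only decides how that total is split.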

Let's call Pt < 50% the probability of winning for player "Underdog" in a match of length n against an opponent rated D points above him. Underdog's average rating change R per match will then be

(III)  R = Pt * 4 * sqrt(n) * (1 - Pupset)  -  (1 - Pt) * 4 * sqrt(n) * Pupset

We see that if Pt equals Pupset, then, by design of the rating changes for wins and losses (see II), his average rating gain will be exactly zero. If he plays better (worse) than Pupset, R will be >0 (<0), and the rating formula will push him towards a higher (lower) rating which will make his new Pupset closer or equal to his actual performance Pt.
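To check the sign claim numerically, here is a small sketch of (III). The function name expected_change is made up for illustration, and Pupset is passed in directly rather than computed from D.

```python
import math

def expected_change(pt, pu, n):
    """Formula (III): average rating change per match for the underdog."""
    k = 4.0 * math.sqrt(n)
    return pt * k * (1.0 - pu) - (1.0 - pt) * k * pu

# Pupset fixed at 0.28 in 13 point matches, with three different values of Pt.
for pt in (0.25, 0.28, 0.30):
    print(pt, expected_change(pt, 0.28, 13))
```

The middle case comes out at zero (up to floating point rounding), the first is negative and the last is positive.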

Pupset in fibs's explanation of the rating formula is misleadingly called the "probability that underdog wins". Obviously, fibs cannot know Underdog's actual chance of winning a match, since it depends on how well he and his opponent play. But if Underdog is not currently overrated, then playing at the skill level Pupset will maintain his rating (on average).

The fact that Pupset describes a level of skill at which one's rating remains constant (for fixed n and D) can be used to calculate whether X's rating is "justified". First we solve (I) for D. We get

(IV)  D = 2000 * log10(1/Pupset - 1) / sqrt(n)

We now put n=13 and Pupset=0.28 to see what rating difference D corresponds to X's performance. We find D=227.5. Since X has been playing bots, which are rated at around 2100, we conclude that 2100 - 227.5 = 1872.5 is X's expected rating.

Another example: Playing 1 pointers against 2100-rated bots with a win rate of 41% has an expected rating of 1784.
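Both examples can be reproduced with a few lines of Python. This is a sketch only: the log in (IV) is taken to be base 10, the bots are assumed to sit at exactly 2100, and the function names are illustrative.

```python
import math

def rating_gap(win_rate, n):
    """Formula (IV): rating difference D at which win_rate holds the rating steady."""
    return 2000.0 * math.log10(1.0 / win_rate - 1.0) / math.sqrt(n)

def expected_rating(opp_rating, win_rate, n):
    """Rating a player settles at against opponents rated opp_rating."""
    return opp_rating - rating_gap(win_rate, n)

print(expected_rating(2100, 0.28, 13))  # player X, 13 point matches
print(expected_rating(2100, 0.41, 1))   # 1 pointers
```

The results agree with the 1872.5 and 1784 above after rounding.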

jackdaddy



Quote from: pck on July 08, 2012, 11:05:27 PM
Another example: Playing 1 pointers against 2100-rated bots with a win rate of 41% has an expected rating of 1784.

I think this example is misleading. After all, who would base their rating on playing bots 1 ptrs the majority of the time?

pck

Quote from: jackdaddy on July 09, 2012, 01:52:28 AM
I think this example is misleading. After all, who would base their rating on playing bots 1 ptrs the majority of the time?

If the above numbers are correct, then it would certainly be misleading, even embarrassing, to state a winning rate of .41 in bot 1-pointers as an indication of outstanding backgammon skill. (Especially if it happened in public and was part of a long-term strategy of self-aggrandizement coupled with a compulsive need to portray oneself as possessing deep mathematical insights into the game.)

pck

Quote from: pck on July 08, 2012, 11:05:27 PM
(III)  R = Pt * 4 * sqrt(n) * (1 - Pupset)  -  (1 - Pt) * 4 * sqrt(n) * Pupset

We see that if Pt equals Pupset, then, by design of the rating changes for wins and losses (see II), his average rating gain will be exactly zero. If he plays better (worse) than Pupset, R will be >0 (<0)

Which is easier to see by noting that the right side of (III) reduces to 4*sqrt(n)*(Pt - Pupset).
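Spelled out, with P_t for Pt and P_u for Pupset, the reduction is just a matter of factoring out 4*sqrt(n) and letting the two mixed terms cancel:

```latex
\begin{align*}
R &= 4\sqrt{n}\,\bigl[P_t(1 - P_u) - (1 - P_t)P_u\bigr] \\
  &= 4\sqrt{n}\,\bigl[P_t - P_t P_u - P_u + P_t P_u\bigr] \\
  &= 4\sqrt{n}\,(P_t - P_u)
\end{align*}
```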

Quote from: pck on July 08, 2012, 11:05:27 PM
But if Underdog is not currently overrated, then playing at the skill level Pupset will maintain his rating (on average).

This should be "if Underdog is not currently over- or underrated".

pck

Quote from: pck on July 08, 2012, 11:05:27 PM
(III)  R = Pt * 4 * sqrt(n) * (1 - Pupset)  -  (1 - Pt) * 4 * sqrt(n) * Pupset

We see that if Pt equals Pupset, then, by design of the rating changes for wins and losses (see II), his average rating gain will be exactly zero. If he plays better (worse) than Pupset, R will be >0 (<0), and the rating formula will push him towards a higher (lower) rating which will make his new Pupset closer or equal to his actual performance Pt.

"New Pupset" needs clarification. It is assumed here that Underdog's rating ru can rise or fall, but that he only plays opponents who all have the same rating ropp, so that

D = | ru - ropp |

can actually change the value of Pupset in (I). This keeps happening until Pupset = Pt. As with all stochastic processes, the conceptual complication is that everything happens "in mean". There are no guarantees, only expectations of what will happen "most likely". So Underdog's rating "converges in mean" towards a value which produces a D that via (I) makes Pupset equal to Pt.
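The convergence can be illustrated by following only the average drift. The sketch below makes several assumptions: the opponent rating stays fixed at 2100, Pt stays fixed at 0.28, and a player who happens to be rated above his opponent is treated as the mirror image of (II), so that his break-even win rate is 1 - Pupset. It tracks the mean only, so none of the random swings of real matches show up, and the function names are illustrative.

```python
import math

def win_threshold(own, opp, n):
    """Break-even win rate for a player rated `own` against `opp`.

    Signed form of (I): Pupset when own < opp, 1 - Pupset when own > opp.
    """
    return 1.0 / (10.0 ** ((opp - own) * math.sqrt(n) / 2000.0) + 1.0)

def drift(start, opp, pt, n, matches):
    """Apply the average change 4*sqrt(n)*(Pt - threshold) once per match."""
    rating = start
    for _ in range(matches):
        rating += 4.0 * math.sqrt(n) * (pt - win_threshold(rating, opp, n))
    return rating

# Start well below and well above the rating that matches a 28% win rate.
for start in (1500.0, 2100.0):
    print(start, drift(start, 2100.0, 0.28, 13, 2000))
```

Both starting points end up at roughly 1872.5, the rating at which Pupset equals Pt.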


Dungeoneer

Quote from: pck on July 09, 2012, 11:17:45 AM
If the above numbers are correct, then it would certainly be misleading, even embarrassing, to state a winning rate of .41 in bot 1-pointers as an indication of outstanding backgammon skill. (Especially if it happened in public and was part of a long-term strategy of self-aggrandizement coupled with a compulsive need to portray oneself as possessing deep mathematical insights into the game.)

It would also be totally absurd to leave cube decisions, of all things, out of such demonstrations of math skill.

Luckily, nobody would ever try to do that, or would he?  B)
δS = 0

vegasvic

We all know jack is an overrated cube Ho :))

Someone pass the word to inim.