Who you callin' overrated?

pck · July 08, 2012, 11:05:27 PM

Is it possible for a player X with a 28% win rate in 13 point matches against the bots to be rated 1850? Surely there has to be something wrong with that. X cheats or is coming down from an even higher rating which was entirely due to luck. In the long run he will go down.

Let's see.

The fibs rating formula (typing "help formula" on the command line displays it) computes rating changes using the number

(I) P_upset = 1 / (10^{D*sqrt(n)/2000} + 1)

where n is the match length, sqrt stands for "square root", and D is the absolute value of the players' rating difference.

(II) Rating change is computed in the following way: If the underdog (which here means the lower-rated player, not the player who plays worse) wins, he gains 4*sqrt(n)*(1-P_upset) rating points. If he loses, 4*sqrt(n)*P_upset points are deducted from his rating.
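
As a rough sketch, (I) and (II) can be written out in Python. The function names p_upset and rating_change are made up for illustration and are not anything fibs itself exposes.

```python
import math

def p_upset(d, n):
    """Formula (I): d is the absolute rating difference, n the match length."""
    return 1.0 / (10.0 ** (d * math.sqrt(n) / 2000.0) + 1.0)

def rating_change(d, n, underdog_wins):
    """Formula (II): the underdog's rating change after a single match."""
    p = p_upset(d, n)
    if underdog_wins:
        return 4.0 * math.sqrt(n) * (1.0 - p)   # points gained on a win
    return -4.0 * math.sqrt(n) * p              # points deducted on a loss

# Example: a 13-point match between players 200 rating points apart.
print(p_upset(200, 13))
print(rating_change(200, 13, True))
print(rating_change(200, 13, False))
```

Note that the gain on a win and the loss on a defeat always differ by exactly 4*sqrt(n), whatever D is.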

Let's call P_t < 50% the probability of winning for player "Underdog" in a match of length n against an opponent rated D points above him. Underdog's average rating change R per match will then be

(III) R = P_t * 4 * sqrt(n) * (1 - P_upset) - (1 - P_t) * 4 * sqrt(n) * P_upset

We see that if P_t equals P_upset, then, by design of the rating changes for wins and losses (see II), his average rating gain will be exactly zero. If he plays better (worse) than P_upset, R will be >0 (<0), and the rating formula will push him towards a higher (lower) rating which will make his new P_upset closer or equal to his actual performance P_t.
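
A small sketch of (III), to check the sign of R numerically. The helper names and the sample values (D = 200, n = 13) are only illustrative assumptions.

```python
import math

def p_upset(d, n):
    """Formula (I)."""
    return 1.0 / (10.0 ** (d * math.sqrt(n) / 2000.0) + 1.0)

def expected_change(p_t, d, n):
    """Formula (III): Underdog's average rating change per match."""
    p = p_upset(d, n)
    gain = 4.0 * math.sqrt(n) * (1.0 - p)   # points won if he wins
    loss = 4.0 * math.sqrt(n) * p           # points lost if he loses
    return p_t * gain - (1.0 - p_t) * loss

p = p_upset(200, 13)
for p_t in (p - 0.05, p, p + 0.05):
    # R is negative below P_upset, zero at P_upset, positive above it
    # (up to floating-point rounding).
    print(round(p_t, 3), round(expected_change(p_t, 200, 13), 3))
```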

P_upset in fibs's explanation of the rating formula is misleadingly called the "probability that underdog wins". Obviously, fibs cannot know Underdog's actual chance of winning a match, since it depends on how well he and his opponent play. But if Underdog is not currently overrated, then playing at the skill level P_upset will maintain his rating (on average).

The fact that P_upset describes a level of skill at which one's rating remains constant (for fixed n and D) can be used to calculate whether X's rating is "justified". First we solve (I) for D. With log denoting the base-10 logarithm, we get

(IV) D = 2000 * log(1/P_upset - 1) / sqrt(n)

We now put n=13 and P_upset=0.28 to see what rating difference D corresponds to X's performance. We find D=227.5. Since X has been playing bots, which are rated at around 2100, we conclude that 2100 - 227.5 = 1872.5 is X's expected rating.

Another example: Playing 1 pointers against 2100-rated bots with a win rate of 41% has an expected rating of 1784.
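
For anyone who wants to check the arithmetic, here is a short sketch of (IV) that reproduces both examples. The function names are invented for illustration, and the bot rating of 2100 is the round figure assumed above.

```python
import math

def rating_difference(p, n):
    """Formula (IV): the rating gap D at which win rate p holds the rating steady."""
    return 2000.0 * math.log10(1.0 / p - 1.0) / math.sqrt(n)

def expected_rating(p, n, opponent_rating):
    """Rating of an underdog who wins a fraction p of n-point matches."""
    return opponent_rating - rating_difference(p, n)

print(round(rating_difference(0.28, 13), 1))      # 227.5
print(round(expected_rating(0.28, 13, 2100), 1))  # 1872.5
print(round(expected_rating(0.41, 1, 2100)))      # 1784
```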

jackdaddy · July 09, 2012, 01:52:28 AM

Quote from: pck on July 08, 2012, 11:05:27 PM
Another example: Playing 1 pointers against 2100-rated bots with a win rate of 41% has an expected rating of 1784.

I think this example is misleading. After all, who would base their rating on playing bots 1 ptrs the majority of the time?

pck · July 09, 2012, 11:17:45 AM

Quote from: jackdaddy on July 09, 2012, 01:52:28 AM
I think this example is misleading. After all, who would base their rating on playing bots 1 ptrs the majority of the time?

If the above numbers are correct, then it would certainly be misleading, even embarrassing, to state a winning rate of .41 in bot 1-pointers as an indication of outstanding backgammon skill. (Especially if it happened in public and was part of a long-term strategy of self-aggrandizement coupled with a compulsive need to portray oneself as possessing deep mathematical insights into the game.)

pck · July 09, 2012, 12:09:44 PM

Quote from: pck on July 08, 2012, 11:05:27 PM
(III) R = P_t * 4 * sqrt(n) * (1 - P_upset) - (1 - P_t) * 4 * sqrt(n) * P_upset

We see that if P_t equals P_upset, then, by design of the rating changes for wins and losses (see II), his average rating gain will be exactly zero. If he plays better (worse) than P_upset, R will be >0 (<0)

Which is easier to see by noting that the right side of (III) reduces to 4*sqrt(n)*(P_t - P_upset).

Quote from: pck on July 08, 2012, 11:05:27 PM
But if Underdog is not currently overrated, then playing at the skill level P_upset will maintain his rating (on average).

This should be "if Underdog is not currently over- or underrated".

pck · July 09, 2012, 02:27:34 PM

Quote from: pck on July 08, 2012, 11:05:27 PM
(III) R = P_t * 4 * sqrt(n) * (1 - P_upset) - (1 - P_t) * 4 * sqrt(n) * P_upset

We see that if P_t equals P_upset, then, by design of the rating changes for wins and losses (see II), his average rating gain will be exactly zero. If he plays better (worse) than P_upset, R will be >0 (<0), and the rating formula will push him towards a higher (lower) rating which will make his new P_upset closer or equal to his actual performance P_t.

"New P_upset" needs clarification. It is assumed here that Underdog's rating r_u can rise or fall, but that he only plays opponents who all have the same rating r_opp, so that

D = | r_u - r_opp |

can actually change the value of P_upset in (I). This keeps happening until P_upset = P_t. As with all stochastic processes, the conceptual complication is that everything happens "in mean". There are no guarantees, only expectations of what will happen "most likely". So Underdog's rating "converges in mean" towards a value which produces a D that via (I) makes P_upset equal to P_t.
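
To illustrate the "in mean" part, here is a sketch that applies only the expected change per match, using the reduced form 4*sqrt(n)*(P_t - P_upset) of (III), instead of simulating random wins and losses. It assumes a fixed opponent rating r_opp, a fixed true win rate P_t, and a starting rating below r_opp; the starting value of 1500 and the function names are arbitrary choices for illustration.

```python
import math

def p_upset(d, n):
    """Formula (I)."""
    return 1.0 / (10.0 ** (d * math.sqrt(n) / 2000.0) + 1.0)

def mean_rating_path(r_u, r_opp, p_t, n, matches):
    """Apply the expected rating change once per match.

    Assumes r_u stays below r_opp, so that Underdog really is the underdog.
    """
    for _ in range(matches):
        d = abs(r_u - r_opp)
        r_u += 4.0 * math.sqrt(n) * (p_t - p_upset(d, n))
    return r_u

# Underdog wins 28% of his 13-pointers against 2100-rated opponents.
final = mean_rating_path(1500.0, 2100.0, 0.28, 13, 3000)
fixed_point = 2100.0 - 2000.0 * math.log10(1.0 / 0.28 - 1.0) / math.sqrt(13)
print(final, fixed_point)
```

A real rating history would scatter around this path rather than follow it, since each individual match moves the rating by a full win or loss amount.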

Dungeoneer · July 09, 2012, 02:37:48 PM

Quote from: pck on July 09, 2012, 11:17:45 AM
If the above numbers are correct, then it would certainly be misleading, even embarrassing, to state a winning rate of .41 in bot 1-pointers as an indication of outstanding backgammon skill. (Especially if it happened in public and was part of a long-term strategy of self-aggrandizement coupled with a compulsive need to portray oneself as possessing deep mathematical insights into the game.)

It would also be totally absurd to keep cube decisions, of all things, out of such demonstrations of maths skill.

Luckily, nobody would ever try to do that, or would he?

vegasvic · July 09, 2012, 08:03:33 PM

We all know jack is an overrated cube Ho


Someone pass the word to inim.

FIBS Board backgammon forum
