A lot of people prefer goal differential to points as a method of evaluating the overall quality of a team. I do too, but there are a couple of other ways of looking at it. I've gotten a little carried away with some new statistical analysis of the season recently, but I'd like to share some of the basis behind it.
| Team | GP | W | D | L | GS | GA | GD | P | MG | Real Win % | Pythagorean Win % | Marginal Win % |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Man Utd | 38 | 27 | 6 | 5 | 80 | 22 | 58 | 87 | 108 | 0.789 | 0.930 | 1.079 |
| Chelsea | 38 | 25 | 10 | 3 | 65 | 26 | 39 | 85 | 89 | 0.789 | 0.862 | 0.889 |
| Arsenal | 38 | 24 | 11 | 3 | 74 | 31 | 43 | 83 | 93 | 0.776 | 0.851 | 0.929 |
| Liverpool | 38 | 21 | 13 | 4 | 67 | 28 | 39 | 76 | 89 | 0.724 | 0.851 | 0.889 |
| Everton | 38 | 19 | 8 | 11 | 55 | 33 | 22 | 65 | 72 | 0.605 | 0.735 | 0.720 |
| Aston Villa | 38 | 16 | 12 | 10 | 71 | 51 | 20 | 60 | 70 | 0.579 | 0.660 | 0.700 |
| Blackburn | 38 | 15 | 13 | 10 | 50 | 48 | 2 | 58 | 52 | 0.566 | 0.520 | 0.520 |
| Portsmouth | 38 | 16 | 9 | 13 | 48 | 40 | 8 | 57 | 58 | 0.539 | 0.590 | 0.580 |
| Man City | 38 | 15 | 10 | 13 | 45 | 53 | -8 | 55 | 42 | 0.526 | 0.419 | 0.420 |
| West Ham | 38 | 13 | 10 | 15 | 42 | 50 | -8 | 49 | 42 | 0.474 | 0.414 | 0.420 |
| Tottenham | 38 | 11 | 13 | 14 | 66 | 61 | 5 | 46 | 55 | 0.461 | 0.539 | 0.550 |
| Newcastle | 38 | 11 | 10 | 17 | 45 | 65 | -20 | 43 | 30 | 0.421 | 0.324 | 0.300 |
| Middlesbrough | 38 | 10 | 12 | 16 | 43 | 53 | -10 | 42 | 40 | 0.421 | 0.397 | 0.400 |
| Wigan | 38 | 10 | 10 | 18 | 34 | 51 | -17 | 40 | 33 | 0.395 | 0.308 | 0.330 |
| Sunderland | 38 | 11 | 6 | 21 | 36 | 59 | -23 | 39 | 27 | 0.368 | 0.271 | 0.270 |
| Bolton | 38 | 9 | 10 | 19 | 36 | 54 | -18 | 37 | 32 | 0.368 | 0.308 | 0.320 |
| Fulham | 38 | 8 | 12 | 18 | 38 | 60 | -22 | 36 | 28 | 0.368 | 0.286 | 0.280 |
| Reading | 38 | 10 | 6 | 22 | 41 | 66 | -25 | 36 | 25 | 0.342 | 0.278 | 0.250 |
| Birmingham | 38 | 8 | 11 | 19 | 46 | 62 | -16 | 35 | 34 | 0.355 | 0.355 | 0.340 |
| Derby | 38 | 1 | 8 | 29 | 20 | 89 | -69 | 11 | -19 | 0.132 | 0.048 | -0.189 |
The first few columns show the final league table as we all know it. MG represents "Marginal Goals" - more on this a bit later. Next is Real Win %. This is each team's W-D-L record expressed as a percentage, with a draw counting as half of a win. Pythagorean Win % is the result of a simple formula used to predict how many games a team should have won:
Pythagorean Win % = GS^2 / (GS^2 + GA^2)
Pythagoras didn't really have anything to do with this, but the name comes from the formula's resemblance to his theorem.
Bill James (who else?) applied this to baseball originally, and later took it a step further while developing his Win Shares statistic, which quantifies each player's contribution to his team's record. This is where MG comes into play. Marginal Goals is the sum of a team's goal differential and the league average goals scored per team. In 2007/08, this was just over 50 goals. If we divide MG by 2 times the league average goals, we get another predicted winning percentage. This one usually isn't quite as accurate as the Pythagorean method, but it does a good job for teams who win between 30% and 70% of their games. More importantly, it gives us something to break down further and evaluate individual performance.
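For anyone who wants to reproduce the three percentage columns, here is a minimal Python sketch. The function names are made up for illustration, the exponent of 2 is assumed because it reproduces the Pythagorean column in the table, and the league average is taken as the 1002 goals in the GS column divided by 20 teams.

```python
# League average goals scored per team, from the GS column of the table:
# 1002 total goals / 20 teams (the "just over 50" mentioned in the post).
LEAGUE_AVG_GOALS = 1002 / 20


def real_win_pct(wins, draws, games):
    """W-D-L record as a percentage, counting a draw as half a win."""
    return (wins + 0.5 * draws) / games


def pythagorean_win_pct(goals_for, goals_against, exponent=2):
    """Pythagorean expectation. Exponent 2 is an assumption that matches the table."""
    gf = goals_for ** exponent
    ga = goals_against ** exponent
    return gf / (gf + ga)


def marginal_goals(goals_for, goals_against, league_avg=LEAGUE_AVG_GOALS):
    """Goal differential plus the league average goals scored."""
    return (goals_for - goals_against) + league_avg


def marginal_win_pct(goals_for, goals_against, league_avg=LEAGUE_AVG_GOALS):
    """Marginal Goals divided by twice the league average."""
    return marginal_goals(goals_for, goals_against, league_avg) / (2 * league_avg)


# (W, D, L, GS, GA) for three rows of the table
teams = {
    "Man Utd": (27, 6, 5, 80, 22),
    "Fulham": (8, 12, 18, 38, 60),
    "Derby": (1, 8, 29, 20, 89),
}

for name, (w, d, l, gs, ga) in teams.items():
    games = w + d + l
    print(
        f"{name:8s} "
        f"real={real_win_pct(w, d, games):.3f} "
        f"pythagorean={pythagorean_win_pct(gs, ga):.3f} "
        f"marginal={marginal_win_pct(gs, ga):.3f}"
    )
```

Derby's marginal figure comes out negative because their goal differential of -69 is larger in size than the league average of about 50 goals.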
A "marginal" team can be described as a team of players at the lowest level of quality in the league. Think of players who would do well in the Championship, but would be lucky to make it off the bench for an average Premiership team. Sunderland of 05/06, who finished that season only 4 goals better than a marginal team, is the best example in recent memory. For most teams, 2-3 goals scored or prevented will earn another win. This season, Fulham averaged 2.007 MG per win, so every two goals we scored/prevented above that line in the sand we call "marginal" would have earned us an additional win. Manchester United, who had plenty of wins already, would need 3.6 more goals to win one more game. Derby County's negative total shows that they really didn't belong at this level, something most of us knew. Fulham, Newcastle, and all other teams who gave up points to them at any point in the season must have really been off on those days.
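The MG-per-win figures can be sketched the same way. One assumption here: "wins" is taken to mean wins plus half of the draws, the same convention as Real Win %, since that is what reproduces the 2.007 and 3.6 figures from the table. The function name is hypothetical.

```python
# League average goals scored per team: 1002 goals in the table / 20 teams.
LEAGUE_AVG_GOALS = 1002 / 20


def marginal_goals_per_win(wins, draws, goal_diff, league_avg=LEAGUE_AVG_GOALS):
    """Marginal Goals divided by effective wins.

    Assumption: a draw counts as half a win, as in the Real Win % column.
    """
    mg = goal_diff + league_avg          # Marginal Goals
    effective_wins = wins + 0.5 * draws  # W + D/2
    return mg / effective_wins


# Fulham: 8 wins, 12 draws, goal difference -22
print(f"Fulham:  {marginal_goals_per_win(8, 12, -22):.3f}")   # 2.007
# Man Utd: 27 wins, 6 draws, goal difference +58
print(f"Man Utd: {marginal_goals_per_win(27, 6, 58):.3f}")    # 3.603
```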
Right now, none of this (at least on its own) means much. The Pythagorean formula is nice to tell us where we should have been on the table, and it works for predicting future performance. Marginal Goals tell us how much better we are than a very bad, non-existent team. The real good stuff is when we use this to determine individual player value.
Ah! Tom Tango did something on this for ice hockey. It involved a Poisson distribution, I think. Also, the exponent is different from the one that works for baseball.
I was doing a top-down player rating system a while back and settled on marginal goals as a way of establishing strength. It appears to be one omission from the Fink Tank's analysis - they use average but have found that for a lot of teams, there is serious value in being average!
Yep, a few more average players would probably do us a lot of good! Haven't seen Tom Tango's hockey data, but I've read a couple other papers on the topic.
Ever look at an attack/defence ratio to break down marginal goals? Bill James (somewhat arbitrarily) picked .48/.52 for baseball. My best guess for football is around .4/.6, loosely based on 4 defenders and 1 keeper playing defence most of the time, 4 midfielders playing defence half of the time, and 2 forwards playing not much defence. I'm trying to do the whole individual player ratings thing, but defence is proving to be tricky. The minuses of the +/- ratings are a good start, but they don't seem fair to everyone, so I started looking at the effect of things like clearances, tackles, etc., on marginal goals prevented. This all may end up being useful, but I'm wary of the sample size.
I need a lie down after all those stats!
You MUST be the football version of a sabermetrician. Exhausting stats, but as long as we're not in the bottom three, I can live with it.
Love your site, man; never got around to telling you that before.
Does this mean that Fulham overachieved to stay up?
whoops
Or rather, that Sunderland should have gone down instead of Birmingham?
Yep, bottom 3 probably should have been Derby, Reading & Sunderland.
Quick question
I was looking back over this. Quick question. In your winning percentage you count a draw as half of a win, but shouldn't it be a third of a win, what with 3 points for a win and 1 point for a draw?
If you add up the winning percentages, Pythagorean winning percentages, and marginal winning percentages, they all add up to 10 (which is 20 teams * .500). So it pretty much has to be this way in order for the predicted values to be useful.
Now, if I were to predict each team's expected point totals, then yeah, I would definitely need to count a draw as a third of a win. The tricky part there is deciding how likely each team is to earn a draw (or a win). I haven't really thought about this much, but I suspect that you could look at the actual number of draws the team ended up with, or even how many close games vs how many games won by 2 or more goals, and get something useful. I might have to play around with that, and see what we get.
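The half-a-win versus third-of-a-win question is easy to check against the W-D-L columns of the table. The sketch below is only an illustration (the variable names are my own): it sums each team's percentage under both conventions. With a draw worth half a win the league total is exactly 10, while with a draw worth a third of a win it comes up short, because a drawn game hands out 2 points in total instead of 3.

```python
# (W, D, L) for all 20 teams, copied from the table in the post.
records = {
    "Man Utd": (27, 6, 5), "Chelsea": (25, 10, 3), "Arsenal": (24, 11, 3),
    "Liverpool": (21, 13, 4), "Everton": (19, 8, 11), "Aston Villa": (16, 12, 10),
    "Blackburn": (15, 13, 10), "Portsmouth": (16, 9, 13), "Man City": (15, 10, 13),
    "West Ham": (13, 10, 15), "Tottenham": (11, 13, 14), "Newcastle": (11, 10, 17),
    "Middlesbrough": (10, 12, 16), "Wigan": (10, 10, 18), "Sunderland": (11, 6, 21),
    "Bolton": (9, 10, 19), "Fulham": (8, 12, 18), "Reading": (10, 6, 22),
    "Birmingham": (8, 11, 19), "Derby": (1, 8, 29),
}


def win_pct(wins, draws, losses, draw_value):
    """Winning percentage with a draw worth `draw_value` of a win."""
    return (wins + draw_value * draws) / (wins + draws + losses)


# Draw counted as half a win (the Real Win % column).
half_total = sum(win_pct(w, d, l, 1 / 2) for w, d, l in records.values())
# Draw counted as a third of a win (equivalent to points / maximum points).
third_total = sum(win_pct(w, d, l, 1 / 3) for w, d, l in records.values())

print(f"draw = 1/2 win: league total {half_total:.3f}")   # 10.000
print(f"draw = 1/3 win: league total {third_total:.3f}")  # 9.123
```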