# The secrets of SuperCoach AFL scoring

I’ve always been intrigued by the **AFL SuperCoach**
game. Players’ scores are determined by a variety of playing statistics
for each game, but we don’t really have any idea what is included.
*Effective disposals? Uncontested intercept marks? Gathers from
hitouts?* There are so many statistical categories that some coaches
think it is magic! There is also match scaling to ensure each match
totals to approximately 3300 points.

Unlocking the secrets behind how SuperCoach is scored will give better insights to the community and help with player selection. Explaining the scoring will help answer how certain players achieve high averages, what stats are important to be good at, and who gets the biggest bonuses.

This is my first blog post so make sure you share if you enjoy! Let’s try and reverse engineer SuperCoach scores!

## What we know about scoring

From the SuperCoach T&Cs in 2023 (which match previous years), we have the following scoring breakdown:

Stat | Awarded/Deducted |
---|---|
Effective kick | 4 Points |
Ineffective kick | 0 Points |
Clanger kick | -4 Points |
Effective handball | 1.5 Points |
Ineffective handball | 0 Points |
Clanger handball | -4 Points |
Handball receive | 1.5 Points |
Hardball get | 4.5 Points |
Loose-ball get | 4.5 Points |
Goal | 8 Points |
Behind | 1 Point |
Mark uncontested (maintaining possession) | 2 Points |
Mark contested (maintaining possession) | 6 Points |
Mark uncontested (from opposition) | 4 Points |
Mark contested (from opposition) | 8 Points |
Tackle | 4 Points |
Free kick for | 4 Points |
Free kick against | -4 Points |
Hitout to advantage | 5 Points |
Gather from hitout | 2 Points |

This is a good start, but there are three main problems where the analysis of scoring breaks down:

- These stats are too complex and can’t be viewed on the AFL website.
- This isn’t a comprehensive list of all stats used.
- We haven’t accounted for scaling.

I’ll attempt to address these points while tracking how close we are to solving the formula.

## 1. Using the AFL website stats

Let’s start with the default stats available on the AFL website. Of the SuperCoach score categories we can see:

- Effective kicks
- Effective handballs (Effective Disposals - Effective Kicks)
- Clangers
- Hard/loose ball gets (Contested Possessions - Contested Marks - Frees For)
- Goals
- Behinds
- Contested marks
- Uncontested marks (Marks - Contested marks)
- Tackles
- Free Kicks

Here is an example of these stats exclusively, via the My Stats option on the AFL website.

Calculating scores using these stats and their respective point values we achieve this:

```
library(dplyr)

# model_plot() is a plotting helper whose definition is not shown in this post.
sc_stats %>%
  rowwise() %>%
  mutate(SC_Prediction = sum(
    4 * effective_kicks,
    -4 * clangers,
    1.5 * effective_handballs,
    4.5 * (contested_possessions - contested_marks - frees_for), # loose/hard ball gets
    2 * (marks - contested_marks), # uncontested marks
    6 * contested_marks,
    4 * tackles,
    4 * frees_for,
    -4 * frees_against,
    5 * hitouts_adv # hitouts to advantage
  )) %>%
  ungroup() %>%
  model_plot(SC_Prediction, Score, "SuperCoach prediction", "SuperCoach score estimation using AFL website stats")
```

I’ve introduced two metrics here to measure how accurate our predictions are.

- R-squared, which measures how much variation in the actual scores can be explained by our predictions.
- MAE, mean absolute error, which is the average absolute difference between our predictions and the actual scores.
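As a rough sketch of how these two metrics could be computed in base R: the helper names `mae()` and `r_squared()` and the toy numbers are my own illustration, and R-squared is taken here as the squared correlation between actual and predicted scores, which may not be exactly the definition behind the figures quoted in this post.

```r
# Toy values for illustration only; these are not real SuperCoach scores.
actual    <- c(100, 80, 120, 60)
predicted <- c(110, 70, 115, 65)

# Mean absolute error: the average size of the miss, ignoring direction.
mae <- function(actual, predicted) mean(abs(actual - predicted))

# R-squared as the squared Pearson correlation of actual vs predicted.
# Assumption: an alternative definition, 1 - SS_res / SS_tot, can give a
# different number when predictions are biased.
r_squared <- function(actual, predicted) cor(actual, predicted)^2

mae(actual, predicted)        # 7.5
r_squared(actual, predicted)
```

A lower MAE and a higher R-squared both indicate predictions that sit closer to the real scores.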

While an R-squared value of 83.1% might seem really good, it tells us that only 83% of the variation in SC scores can be explained by the initial model. The MAE tells us that our predictions are 11.61 points off on average. Not very useful yet! It appears that middle scores are being underestimated and most huge scores are being overestimated. We need some more stat categories.

## 2. Using advanced stats

We are missing quite a few stats that aren’t directly accessible on the AFL website. I’ve found some stats via a variety of sources, while other stats can be directly calculated using others. Most importantly we can fill out the full list of stats available in the T&Cs! Let’s take a look at what stats we have now:

```
## tibble [18,216 × 25] (S3: tbl_df/tbl/data.frame)
## $ season : int [1:18216] 2022 2021 2022 2022 2021 2021 2021 2021 2022 2022 ...
## $ round : int [1:18216] 6 10 2 12 16 18 16 20 2 16 ...
## $ player : chr [1:18216] "Callum Mills" "Clayton Oliver" "Lachie Neale" "Max Gawn" ...
## $ team : chr [1:18216] "Sydney Swans" "Melbourne" "Brisbane Lions" "Melbourne" ...
## $ effective_kicks : int [1:18216] 21 11 16 12 4 19 14 12 18 11 ...
## $ ineffective_kicks : int [1:18216] 1 7 6 4 6 4 5 4 7 5 ...
## $ clanger_kicks : int [1:18216] 1 2 2 2 1 0 4 1 1 0 ...
## $ effective_handballs : int [1:18216] 13 17 15 9 7 12 18 18 11 13 ...
## $ ineffective_handballs : int [1:18216] 0 1 1 1 2 4 2 3 3 3 ...
## $ clanger_handballs : int [1:18216] 1 0 1 0 0 0 0 1 0 1 ...
## $ handball_receives : int [1:18216] 12 9 12 3 1 13 9 15 14 8 ...
## $ hard_ball_gets : int [1:18216] 2 7 0 5 8 5 7 2 4 13 ...
## $ loose_ball_gets : int [1:18216] 6 12 18 7 2 7 15 7 4 4 ...
## $ goals : int [1:18216] 1 3 2 3 1 1 1 0 1 3 ...
## $ behinds : int [1:18216] 0 1 2 1 1 1 0 0 2 1 ...
## $ uncontested_marks : int [1:18216] 9 0 3 3 5 7 4 5 10 4 ...
## $ uncontested_intercept_marks: int [1:18216] 0 0 0 1 1 1 1 1 1 0 ...
## $ contested_marks : int [1:18216] 2 1 3 6 3 2 1 0 0 1 ...
## $ contested_intercept_marks : int [1:18216] 2 0 3 0 1 0 1 0 0 0 ...
## $ tackles : int [1:18216] 5 9 2 0 0 5 3 12 8 6 ...
## $ frees_for : int [1:18216] 6 4 2 5 0 0 1 1 5 1 ...
## $ frees_against : int [1:18216] 0 1 1 2 2 0 0 1 0 0 ...
## $ hitouts_to_advantage : int [1:18216] 0 0 0 11 22 0 0 0 0 0 ...
## $ gathers_from_hitout : int [1:18216] 1 5 0 0 0 1 4 5 0 1 ...
## $ sc_score : int [1:18216] 214 204 198 198 193 193 190 190 189 189 ...
```

The most exciting thing about this dataset is that we can see where players score their points and how well they do before their score is scaled. If we calculate all the scores based on these new stats we achieve a better fit:
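A minimal sketch of that calculation, assuming the T&Cs point values map onto the dataset columns as named below. The mapping of "from opposition" marks to the two intercept-mark columns, and the assumption that the four mark columns do not overlap, are mine; if `contested_marks` already includes intercepts, the weights would need adjusting. `sc_weights`, `raw_sc_score()` and the toy row are hypothetical names and values.

```r
# Point values from the T&Cs table, keyed by the dataset's column names.
# Assumption: the four mark columns are mutually exclusive counts.
sc_weights <- c(
  effective_kicks = 4, ineffective_kicks = 0, clanger_kicks = -4,
  effective_handballs = 1.5, ineffective_handballs = 0, clanger_handballs = -4,
  handball_receives = 1.5, hard_ball_gets = 4.5, loose_ball_gets = 4.5,
  goals = 8, behinds = 1,
  uncontested_marks = 2, uncontested_intercept_marks = 4,
  contested_marks = 6, contested_intercept_marks = 8,
  tackles = 4, frees_for = 4, frees_against = -4,
  hitouts_to_advantage = 5, gathers_from_hitout = 2
)

# Raw (unscaled) score: each stat count times its point value, summed per row.
raw_sc_score <- function(stats) {
  as.vector(as.matrix(stats[, names(sc_weights), drop = FALSE]) %*% sc_weights)
}

# Made-up single-game stat line, for illustration only.
toy <- as.data.frame(as.list(sc_weights * 0))
toy$effective_kicks <- 10
toy$goals <- 2
toy$tackles <- 3
toy$clanger_kicks <- 1
raw_sc_score(toy)  # 10*4 + 2*8 + 3*4 - 1*4 = 64
```

Keeping the weights in one named vector makes it easy to tweak a single point value and re-check the fit.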

We are almost at 90% of variation explained and the MAE dropped slightly. Still not perfect, but we are making progress! We still haven’t addressed the higher scores being overestimated. Scaling should hopefully help with this by bringing them back to standard.

## 3. Scaling the scores via the 3300 rule

If you weren’t aware, SuperCoach AFL has a set value for total points in a game. Based on the predictions from before, the estimated proportion of games that hit this value is 93.7%, with the game average at 3582. If this is close to correct, then scaling our predictions will help a lot.

I’ll be scaling each score linearly to 3300, as it’s the best way I could do it. I’ve heard that stats are scaled quarterly, but that also goes against the whole ‘every game is equal’ argument. A team could win only one quarter and still get 4 premiership points.
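Linear scaling of this kind can be sketched in base R as below. This is only an illustration under assumptions: `match_id` is a hypothetical key identifying each match (the dataset shown earlier has season, round and team, but no explicit match column), and `scale_to_match_total()` and the toy numbers are made up.

```r
# Scale predictions so that every match totals `target` points.
# Assumption: `match_id` is a hypothetical grouping key, one value per match.
scale_to_match_total <- function(pred, match_id, target = 3300) {
  match_total <- ave(pred, match_id, FUN = sum)  # per-match sum, repeated per row
  pred * target / match_total
}

# Toy example: two "matches" of three players each (made-up numbers).
pred     <- c(1200, 1300, 1100, 1000, 1100, 900)
match_id <- c("A", "A", "A", "B", "B", "B")
scaled   <- scale_to_match_total(pred, match_id)

tapply(scaled, match_id, sum)  # both matches now total 3300
```

Every player in a match is multiplied by the same factor, so the ranking of players within a match is unchanged; only the overall level moves.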

Let’s see how the scaling affects our scores:

Awesome! The MAE is now single digits and R-squared went above 90%, which isn’t bad at all. So on average, the estimations are 8.27 points different to the actual scores. Let’s see who has the biggest differences:

Most overestimated players, Home & Away matches from 2021-2022

Player | Games | 2YR SC Average | 2YR Estimation | Diff |
---|---|---|---|---|
Jarrod Witts | 25 | 107.2 | 122.0 | 14.8 |