Friday, October 16, 2015

Heart of a Lion Joe and Lazy Bob

One of the biggest critiques of analytics, and of statistics in general, is that they don’t take into account intangibles like confidence, “wanting it more”, desire and heart. On the surface this seems like a fair argument: we don’t have a statistic for leadership, heart and character per 90. But to suggest analytics is ignorant of these ideas ignores how these factors actually affect the game.

I’m not a sports psychologist, so I’m not going to say which of these intangible traits matter in a game like football and which don’t, but let’s assume for the sake of argument that some of them do matter. I don’t think this is a ridiculous assumption; personally, I’m much better at a lot of things when I’m confident or motivated.

Assuming these things do matter, there are a few ways they could present themselves on a football pitch. Let’s unpack them one by one.

First, imagine two players; let’s call them Heart of a Lion Joe and Lazy Bob. On a purely technical level Heart of a Lion Joe and Lazy Bob are equivalent: they are both strikers and have the exact same skill set. What is different about them comes down entirely to attitude. Heart of a Lion Joe loves football, has lots of heart and always wants to win football matches above everything else. Lazy Bob isn’t that bothered about winning and probably won’t commit to a 50-50 ball with the same vigour that Heart of a Lion Joe will.

Individual Effects

Now assume that intangibles affect the game through individual effects. That is to say, because Heart of a Lion Joe has better intangibles than Lazy Bob, he will be a better player. Well, if he is a better player because of these intangibles, then these effects will show up in the data. Sure, in a comparison of individual skills outside of a game context Heart of a Lion Joe and Lazy Bob are identical, but if these intangibles really do make you a better player then we will see Heart of a Lion Joe putting up better numbers than Lazy Bob.

Since they are strikers, we will probably see Heart of a Lion Joe score more and take more shots than Lazy Bob, because he “wants it more” and will be more aggressive in going for the ball or pushing himself that little bit harder. So sure, in the statistics we won’t see that Heart of a Lion Joe has more ‘heart’ than Lazy Bob, but we will see that Heart of a Lion Joe is a better player who puts up better numbers than Lazy Bob. And if we are capturing that, does it really matter that we aren’t capturing the exact intangible that makes him better? Probably not.
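To make "better numbers" concrete, here is a minimal sketch of the per 90 comparison. The season totals are invented purely for illustration, and the per_90 helper is my own, not taken from any particular stats package.

```python
def per_90(total, minutes):
    """Scale a season total to a rate per 90 minutes played."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total * 90 / minutes

# Hypothetical season totals, made up for this example.
joe = {"goals": 18, "shots": 95, "minutes": 2700}
bob = {"goals": 11, "shots": 60, "minutes": 2250}

for name, stats in (("Heart of a Lion Joe", joe), ("Lazy Bob", bob)):
    goals_90 = per_90(stats["goals"], stats["minutes"])
    shots_90 = per_90(stats["shots"], stats["minutes"])
    print(f"{name}: {goals_90:.2f} goals per 90, {shots_90:.2f} shots per 90")
```

Nothing in that calculation knows anything about heart. If the intangibles are doing their job, the gap simply appears in the rates.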

Team Effects

The second argument that anti-stats people will make about intangibles is that you can’t measure them because they don’t just affect one individual player, they affect the whole team. When a team plays with Heart of a Lion Joe up front it gives the whole team a boost, and they all play better because they are inspired by Heart of a Lion Joe’s incredible leadership. When Lazy Bob starts up front, though, he just mopes around and that dampens everyone’s mood.

Well, luckily we can measure this as well! Using Shapley Values or GoalImpact we can compare how players affect their teammates. These are completely agnostic measures, which means they don’t try to pin down the mechanisms through which players make their team play better or worse. They just compare how the team plays with and without them in the team (adjusting for other factors like who they are playing with, strength of opponent, etc.). So even though there isn’t a way to directly measure how one player’s intangibles affect another’s, if these mechanisms exist they will be picked up by Shapley Values or GoalImpact. Good news for Heart of a Lion Joe: we can even figure out if he makes his teammates better through his inspiring leadership.
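For readers who haven’t met Shapley Values, here is a toy sketch of the idea. It is not how GoalImpact or any real football model is computed. The three-player squad, the team_value function and all its numbers are assumptions invented for the example: every player is worth 1.0 on his own, Joe adds 0.5 for each teammate alongside him, and Bob subtracts 0.5 for each.

```python
from itertools import permutations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values: each player's marginal contribution,
    averaged over every order in which the lineup could be assembled."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        lineup = frozenset()
        for p in order:
            bigger = lineup | {p}
            totals[p] += value(bigger) - value(lineup)
            lineup = bigger
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}

def team_value(lineup):
    """Hypothetical team output for a lineup (made-up numbers)."""
    if not lineup:
        return 0.0
    teammates = len(lineup) - 1
    total = float(len(lineup))      # identical individual skill
    if "Joe" in lineup:
        total += 0.5 * teammates    # Joe lifts everyone around him
    if "Bob" in lineup:
        total -= 0.5 * teammates    # Bob drags everyone down
    return total

values = shapley_values(["Joe", "Bob", "Midfielder"], team_value)
for player, v in values.items():
    print(player, round(v, 2))
```

Even though Joe and Bob are identical on their own, Joe comes out ahead of Bob because the calculation only looks at what the team produces with and without each of them. Enumerating every ordering only works for tiny squads like this one; anything on real match data would have to estimate these values rather than enumerate them.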

One argument against this method, which is probably the only one I don’t have a good answer for, is that maybe a player’s magical leadership qualities are so powerful that he doesn’t even need to be on the pitch to transfer them to his teammates. His presence in the dressing room alone is enough to make his team better. If that’s the case then maybe there is room to hire players purely for their dressing room abilities, but we probably shouldn’t pay them players’ wages. Teams should hire these people as inspirational speakers or as people who just “hang out” with the team.

Jekyll and Hyde

After all this time, maybe we find out that Heart of a Lion Joe and Lazy Bob were actually the same person. When he’s playing well he has confidence and takes on the persona of Heart of a Lion Joe, and when he’s playing poorly he becomes Lazy Bob. This is probably the narrative we hear most often in the media: the confidence storyline. Analytics people respond to this narrative by saying it’s just random variation.

If you take off your analytics-tinted glasses for a second, you have to admit the mainstream media story of ‘confidence’ is a lot more satisfying than ‘randomness’. Randomness is an ugly word in sport. We like to think that everything happens for a reason, so attributing these changes in performance level to confidence just feels better than saying it’s random variation.

The correct response here is: who cares? Really, maybe the media are right and all random variation in scoring or performance level comes down to confidence, but it looks random from a data perspective. All that matters in the end is output, so if a player can’t control when he is Heart of a Lion Joe or Lazy Bob, then why should teams care?

If it looks like random variation that the player can’t control, and it acts like random variation the player can’t control, then for all intents and purposes we should treat it like random variation the player can’t control. Sure, maybe it’s confidence, but if confidence is completely exogenous to what a footballer has control over, why should teams care about it or evaluate based on it?
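To see why hot and cold spells on their own don’t tell us much, here is a small simulation sketch. It assumes a striker whose chance of scoring in any given match never changes. The 40% figure, the 38-match season and the five-match threshold are arbitrary choices for illustration, not estimates from real data.

```python
import random

def longest_drought(n_matches, p_score, rng):
    """Longest run of consecutive matches without scoring in one season."""
    longest = current = 0
    for _ in range(n_matches):
        if rng.random() < p_score:
            current = 0                  # scored, drought over
        else:
            current += 1
            longest = max(longest, current)
    return longest

def share_with_drought(min_length=5, n_seasons=10_000, n_matches=38,
                       p_score=0.4, seed=0):
    """Share of simulated seasons with a drought of at least min_length matches."""
    rng = random.Random(seed)            # fixed seed so runs are repeatable
    hits = sum(
        longest_drought(n_matches, p_score, rng) >= min_length
        for _ in range(n_seasons)
    )
    return hits / n_seasons

share = share_with_drought()
print(f"Seasons with a 5+ match goalless run: {share:.0%}")
```

This striker is exactly the same player in every match, yet long goalless runs turn up routinely in the simulated seasons. From the outside, a run like that is indistinguishable from Heart of a Lion Joe turning into Lazy Bob for a month.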

The mainstream media is probably right that the footballing world is full of a mixture of Heart of a Lion Joes and Lazy Bobs, but they are wrong to suggest that analytics ignores the differences between these two players.