Taken from his 2015 OptaPro Analytics Forum proposal, Sam Gregory explores whether teams in the Premier League should look to vary their attacking style.
Sam’s abstract was originally selected by the judging panel for presentation, but Sam was unable to attend the event in London back in February.
Discussion of David Moyes’s tenure at Manchester United will often come around to the now infamous 2-2 draw against Fulham at Old Trafford when United played 81 crosses and failed to score on a single one.
Manchester United’s performance that day was considered memorable because United were seen to be too one-dimensional.
Adding to the narrative, The Telegraph’s post-game recap stated Moyes’s “tactics were too one-dimensional.” The Daily Mail suggested that at Old Trafford, “the crowd are unaccustomed to one dimensional football.” Yahoo Sports claimed, “what really stood out was their staggeringly one-dimensional approach to attack.”
All of these critiques made me wonder: what’s so bad about a one-dimensional attack? If a team excels at crossing and has forwards who are good in the air, why try to beat the opponents with slick passing in the centre of the pitch just for the sake of having a more varied attack? Unless, of course, there is some benefit to varying between types of attack beyond the value of the attack itself.
So I set out to answer the following question:
Do teams that have varied attacks actually score more (or take more shots) than teams with a singular attack?
To answer this question I’ve used two full seasons of Premier League data (2013/14 and 2014/15), looking at attacks on a per-game basis. With 380 matches per season and two teams in each match, that provides 1,520 team-game observations in total.
For the purpose of this analysis an attack is defined as any attacking passage of play that is ended by a shot or a defensive action by the opposition inside the eighteen-yard box. As a reference point, throughout 2014/15 there were 26,222 events classified as attacks for an average of 69 attacks per game.
2014-15 Premier League attack types
Since we are looking at team attacks per game, the variables being used are the number of times a team uses each attack type in a game. I’ve also defined a coefficient of variation to measure how much a team varies its attack within a single game. The calculation of this coefficient is in the appendix.
Now we have the explanatory variables of interest: the number of cross attacks, build-up attacks, through ball attacks, long ball attacks, and other attacks each team makes per game and the coefficient of variation.
When a team attacks there are two basic objectives: get a shot off and score from that shot. In the data about 37% of attacks end in shots and about 3% of attacks end in goals. We can use these two ‘success’ variables, shots and goals, as the dependent variables.
To test this theory I ran two linear regressions with shots and goals as the outcome variables. I expand more on the methodology in the appendix.
In both regressions, with shots and with goals as the outcome variable, varying attack types had no significant effect, either positive or negative.
Some attacks were more likely to result in shots than others. Build-up attacks and through ball attacks were much more potent than cross or long ball attacks, but the actual variation in attacks was not significant.
It’s important to note that the results presented are over a two-season period, but the same results hold taking into account only the 2013-14 Premier League season or 2014-15 Premier League season in isolation.
What do these results tell us? Firstly, that some attack types are better than others, but this isn’t really a surprise.
Percentage of attacks scored
What is more surprising is that varying between attacks gives no discernible advantage on a game-by-game basis. The effects of a high or low coefficient of variation are not strong enough, either positively or negatively, to suggest that variation itself matters. In other words, the coefficient of variation is not statistically significant (see regression tables for results).
Some of the stronger teams do have more varied attacks, but the benefits from these varied attacks are all captured by the increased probability of scoring from a through ball or build-up attack and have nothing to do with the variation itself. So if a team were to only use through ball and build-up attacks they would theoretically have a better attack than they would with more variation.
There are a couple of caveats to these results. The first is that this analysis doesn’t rule out an effect from varying attacks back-to-back. For example, if a team tries a through ball attack after three straight cross attacks, it may be more effective because it was unexpected. The second is that stronger teams may be more capable of varying attacks than weaker ones, so although there is no direct effect of varying attacks, there is a correlation between varying attacks and team strength.
Taking a look at some specific teams and games further illuminates the idea of attack variation. The least balanced attack in a single game over this two-year span was Aston Villa’s in their 1-0 win over Liverpool last season. Villa had 18 cross attacks and one long ball attack.
Liverpool, on the other hand, averaged the most balanced attack of any team over the two-season sample. They favoured cross attacks much less than other teams, which is unsurprising given the playing styles of Sturridge, Suarez and Sterling.
Top four most varied attacks
The two best teams with very unbalanced attacks were Manchester United and Swansea City. Both heavily favoured crosses with Swansea averaging 15.25 cross attacks per game and Manchester United averaging 17.98 cross attacks per game (17.11 per game under Louis van Gaal and 18.87 per game under David Moyes).
Looking back at the Manchester United example, putting 81 crosses into the box in a single game may not have been a great strategy for David Moyes’s team, but only because crosses aren’t that great an attack type, not because the team had an unbalanced attack.
The lesson clubs should take from this? Don’t vary attacks for the sake of variation. If you are good at something, then getting even better at it isn’t such a bad idea.
Coefficient of Variation Calculation:
There are a few ways to define variance; I’ve decided to do it in terms of absolute inequality. I’ve created a coefficient of variation metric with a maximum of 1 and a minimum of 0 (similar to a Gini coefficient). If the coefficient of variation for a team in a certain game is 1, it means they only used one attack type all game; if it is 0, it means they used a perfectly balanced attack.
Since there are five types of attack a perfectly balanced attack in a game would mean each attack is used 20% of the time. So to find the variation coefficient I’ve taken the absolute difference between the percent each attack is used in the game and 0.2 and added all of these differences together. The maximum this number can be is 8/5 so I’ve weighted the coefficient by 5/8 just to give the coefficient a pleasing range between 0 and 1.
Coefficient of Variation = 5/8*[|cross%-0.2|+|buildup%-0.2|+|throughball%-0.2|+|longball%-0.2|+|other%-0.2|]
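As a rough illustration, the formula above can be sketched in a few lines of Python. This is not the code used for the analysis; the function name, the dictionary keys and the input format are assumptions made for the example.

```python
# Hypothetical labels for the five attack types described in the article.
ATTACK_TYPES = ("cross", "buildup", "throughball", "longball", "other")


def variation_coefficient(counts):
    """Return the 0-to-1 coefficient for one team in one game.

    counts maps an attack type to the number of attacks of that type;
    missing types are treated as zero.
    """
    total = sum(counts.get(kind, 0) for kind in ATTACK_TYPES)
    if total == 0:
        raise ValueError("no attacks recorded for this game")
    balanced_share = 1 / len(ATTACK_TYPES)  # 0.2 with five attack types
    deviation = sum(
        abs(counts.get(kind, 0) / total - balanced_share)
        for kind in ATTACK_TYPES
    )
    # The deviation can be at most 8/5, so scale by 5/8 to cap it at 1.
    return 5 / 8 * deviation


# Aston Villa against Liverpool: 18 cross attacks and one long ball attack.
villa = variation_coefficient({"cross": 18, "longball": 1})
```

For the Villa game the result works out to 71/76, or about 0.93, which is close to the maximum of 1 that a team using a single attack type would score.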
Both regressions were Ordinary Least Squares linear regressions with heteroskedasticity-robust standard errors (note: y is shots in the first regression and goals in the second).
There is a much higher r-squared value on the first regression with shots as the outcome variable than the one with goals. This could be for several reasons, but the most likely is simply the fact goals are much more random and intermittent than shots.
Note that the p-values for the coefficient of variation in both cases are over 0.5 and thus not even close to significant.
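For readers who want to try something similar, here is a minimal sketch of an OLS fit with heteroskedasticity-robust standard errors using NumPy. It is not the code behind the results reported here. The choice of the HC1 variant, the function name `ols_robust` and the exact set of regressors are assumptions, and the data in the demo is randomly generated placeholder data, not Premier League data.

```python
import numpy as np


def ols_robust(X, y):
    """OLS estimates with HC1 robust standard errors (an assumed variant).

    X is an (n, k) design matrix that already includes a constant column;
    y is the (n,) outcome, e.g. shots or goals per team-game.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    # Sandwich estimator: (X'X)^-1 X' diag(e^2) X (X'X)^-1, scaled by n/(n-k).
    meat = X.T @ (X * resid[:, None] ** 2)
    cov = xtx_inv @ meat @ xtx_inv * n / (n - k)
    se = np.sqrt(np.diag(cov))
    r_squared = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return beta, se, r_squared


# Placeholder data only: made-up attack counts and made-up shot rates.
rng = np.random.default_rng(0)
counts = rng.poisson([17, 20, 3, 8, 6], size=(200, 5))
shares = counts / counts.sum(axis=1, keepdims=True)
cv = 5 / 8 * np.abs(shares - 0.2).sum(axis=1)
shots = rng.poisson(counts @ np.array([0.2, 0.3, 0.4, 0.1, 0.1]))

# Regressors: constant, the five attack counts, coefficient of variation.
X = np.column_stack([np.ones(len(counts)), counts, cv])
beta, se, r_squared = ols_robust(X, shots)
```

Dividing each entry of `beta` by the matching entry of `se` gives the robust t-statistics from which p-values like those quoted above would be derived.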
Attack Type Effect on Shots (R^2=0.7348)
Attack Type Effect on Goals (R^2=0.1047)