The OptaPro Blog

Welcome to the OptaPro blog, featuring news and analysis from OptaPro's cutting-edge research team.

BLOG: Assessing the performance of Premier League goalscorers

NB: all data correct up to and including April 9th, 2012 

Introduction

There is an obvious correlation between the number of shots a player takes and the number of goals they will score. Commentators will often espouse a player's willingness to go for goal, many subscribing to the 'If you don't buy a ticket, you won't win the lottery' theory: that by peppering the goalkeeper with as many shots as possible, eventually one will go in.

However, from a tactical perspective this can often be a particularly inefficient way of winning games. Shooting from a great distance or from a narrow angle may occasionally result in a goal, but missing the target or putting the ball into the opposition keeper's arms will hand possession back to the opposition team and all of the hard work done in creating that chance will be for naught. The more often this happens, the more possession is wasted.

So how do we quantify which areas of the pitch are the most likely to result in a goal and therefore, which shots have the highest probability of resulting in a goal? If we can establish this metric, we can then accurately and effectively increase our chances of scoring and therefore winning matches. Similarly, we can use this data from a defensive perspective to limit the better chances by defending key areas of the pitch.

It may seem an obvious factor to analyse: after all, anybody who has watched a football match knows that shots from the centre of the penalty area are more likely to result in a goal, whereas shots from the halfway line aren't. However, the Opta data collected allows for a far more in-depth analysis. It also allows us to establish which players are most effective when shooting and which are the least.

This report aims to show how OptaPro can define what makes a good chance as opposed to an average one, which players are more effective than the quality of their chances would suggest and how goalkeepers' performance can be measured effectively.   

What makes a goal?

As noted above, the most basic requirement to score goals is to take shots. Indeed, the Premier League's top scorer Robin van Persie - the focal point of the Arsenal attack - has attempted more shots this season than any other player, which has resulted in 26 goals to his name. So far, so expected. However, a glance at the next most frequent shot-takers reveals a fairly large disparity in efficiency.

Table 1

Why exactly has Luis Suárez required well over twice as many shots as Van Persie to score each of his goals? He, like Van Persie, has often acted as a lone striker for his club. Although it could be argued that the support Suárez has received this season hasn't been of as high a quality as Van Persie's, this doesn't alter our analysis, as he has still managed to fire off 110 attempted goal-bound efforts. Similarly, even among those players who haven't played as often as Van Persie but still like to shoot on sight, the goals per shot ratio is lower than that of the Arsenal forward.


Table 2

Only Wayne Rooney has matched the Dutchman for accuracy. The total number of shots alone doesn't explain such a large shortfall in goals between the top two scorers and the rest. Clearly we need to look a little deeper and compare the quality of each chance.

Defining Chance Quality

The most obvious factor in determining whether a shot is likely to end up in the crowd, the goalkeeper's hands or the back of the net is the location of the shot. Establishing where shots are taken from can start to explain the numbers displayed above.

The image below shows the average (mean) location of each of the players listed above in Table 1 (represented by their club crest). Although a simple average could be extremely misleading, for the purposes of this exercise it provides an initial demonstration.

Average shooting positions
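To see why a simple mean can mislead, here is a minimal sketch using made-up coordinates (not Opta data). The grid and the `mean_location` helper are assumptions for illustration: a player who only ever shoots from the two flanks averages out to a central position he never actually shot from.

```python
from statistics import mean

def mean_location(shots):
    """Return the mean (x, y) of a list of (x, y) shot coordinates."""
    xs, ys = zip(*shots)
    return mean(xs), mean(ys)

# Hypothetical coordinates on a 0-100 grid (x = distance up the pitch,
# y = position across it). Illustrative values only, not Opta data.
wide_shooter = [(85, 20), (85, 80), (85, 20), (85, 80)]

# The mean sits in the middle of the pitch (y = 50), even though every
# individual shot was taken from out wide.
centre = mean_location(wide_shooter)
print(centre)
```

This is why the shot maps later in the piece, which plot every shot individually, tell a fuller story than a single average marker.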

If location is the all-important factor in establishing the likelihood of a shot resulting in a goal, Wayne Rooney is clearly getting something very right and/or Luis Suárez something very wrong (I imagine the majority would opt for 'and').

However, delving deeper into the numbers can show that while location is one of the most important factors in determining a shot's quality, it is by no means the only one.

If this statement is considered to any extent at all, it quickly makes sense. Clearly a header struck at the penalty spot from a corner will not have the same chance of being a goal as a shot from the same location on the counter attack, or indeed, a penalty.

To account for this disparity, we can construct a model that considers these and other factors to determine a shot's probability of being on target and/or scored.
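The article does not publish the form of the model, so the following is only a sketch of one common approach (a logistic function over shot features). Every coefficient, and the choice of features, is an assumption made up to show the mechanics, not something taken from OptaPro.

```python
import math

# Illustrative coefficients only: these weights are assumptions chosen so
# that the example behaves as the text describes, not fitted values.
COEFFS = {
    "intercept": 0.5,
    "distance_m": -0.12,  # assumed: the further out, the less likely a goal
    "is_header": -0.9,    # assumed: headers are harder to convert
    "is_counter": 0.4,    # assumed: counter attacks give cleaner shots
}

def goal_probability(distance_m, is_header=False, is_counter=False):
    """Map shot features to a probability between 0 and 1."""
    z = (COEFFS["intercept"]
         + COEFFS["distance_m"] * distance_m
         + COEFFS["is_header"] * is_header
         + COEFFS["is_counter"] * is_counter)
    return 1.0 / (1.0 + math.exp(-z))

# Same location (roughly the penalty spot, about 11 m out), different chance.
header_from_corner = goal_probability(11, is_header=True)
shot_on_counter = goal_probability(11, is_counter=True)
print(header_from_corner < shot_on_counter)
```

The point of the sketch is simply that two shots from an identical spot can carry different probabilities once the type of chance is included.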

Expected Goals (xG)

Utilising this model, we can look at each player's shots and tally up the probability of each of them being a goal to give an expected goal (xG) value.
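As a minimal sketch of that tally, assuming we already have a goal probability for each shot (the values here are hypothetical, not Opta data), a player's xG is just the sum of those probabilities:

```python
def expected_goals(shot_probabilities):
    """Sum per-shot goal probabilities into an expected goals (xG) total."""
    return sum(shot_probabilities)

# Hypothetical per-shot probabilities for one player (illustrative only).
shots = [0.45, 0.10, 0.05, 0.30, 0.10]

xg = expected_goals(shots)
print(round(xg, 2))  # 1.0
```

Five shots of mixed quality add up to one expected goal here, which is the sense in which xG counts chances by their worth and not their number.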

Firstly, we can use it to assess the average quality of chance created for each player - or to what extent a player exercises a degree of control over his shot selection and is prepared to either fashion a better chance for himself or spurn an average chance to set up a better-positioned teammate.

Table 3

Table 3 would appear to give an early boost to the model's accuracy. Three 18-yard vulpines (and penalty takers to boot) sit proudly at the top of the table, and three players who are noted for their willingness to gamble from distance (and, not incidentally, set-piece takers) bring up the rear.

An obvious comparison can be made between the shot maps of the leader (Bent) and one of the players bringing up the rear (Taarabt). These maps quickly highlight the difference in approach:

Darren Bent Shot Map

Bentshotmap 

Adel Taarabt Shot map

Taraabtshotmap

Here, circles represent goals, triangles saves, red crosses off-target shots, black crosses blocked shots and red diamonds shots that hit the woodwork. The size of each mark indicates its goal probability.

The pictures largely tell their own story, with Bent's ability to get himself into very dangerous positions - or his selectivity - particularly evident in his number of high-quality chances close in front of goal. By contrast, Taarabt's set-piece duties and willingness to shoot from long range have resulted in far more shots (73 to the Villa striker's 43) but a much lower xG total (3.0 to 9.9).
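Using only the shot counts and xG totals quoted above, the average chance quality per shot can be worked out directly. The helper name `average_chance_quality` is ours, for illustration:

```python
def average_chance_quality(xg_total, shots):
    """Average goal probability per shot: xG total divided by shots taken."""
    return xg_total / shots

# Figures quoted in the text: Bent 9.9 xG from 43 shots,
# Taarabt 3.0 xG from 73 shots.
bent = average_chance_quality(9.9, 43)
taarabt = average_chance_quality(3.0, 73)

print(f"Bent: {bent:.2f} xG per shot")        # Bent: 0.23 xG per shot
print(f"Taarabt: {taarabt:.2f} xG per shot")  # Taarabt: 0.04 xG per shot
```

On these numbers the typical Bent shot is worth more than five times the typical Taarabt shot.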

Those following closely at home will have noted that both players have fallen short of the model's expectations, by approximately one goal (Bent) or two (Taarabt). Another of the players with a very high average chance quality, Emmanuel Adebayor, has been amongst the league's biggest underperformers in front of goal relative to his xG total. Adebayor is one of only four players in the Premier League this season to have a difference (dG) of five goals or more below his expected value. The top and bottom three players are listed in the table below (Liverpool fans, look away now):

Table 5

There is some good news for the Reds, however, with Steven Gerrard leading the way in dG per shot:

Table 51

NB: dG/X - difference between actual and expected goals scored per 10 shots (i.e. Gerrard's shooting has resulted in a little over one goal more than predicted for every 10 shots taken). Minimum 30 shots.
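The two measures in the note above can be sketched as follows. The player figures are hypothetical placeholders (the tables' actual values are not reproduced here), and the function names are ours:

```python
def goal_difference(goals, xg):
    """dG: actual goals scored minus expected goals."""
    return goals - xg

def per_ten_shots(value, shots):
    """Scale a season total to a rate per 10 shots (the 'X' in dG/X)."""
    return 10 * value / shots

# Hypothetical player: 8 goals from 30 shots worth 5.0 xG in total.
dg = goal_difference(goals=8, xg=5.0)
dg_per_10 = per_ten_shots(dg, shots=30)

print(dg, dg_per_10)  # 3.0 1.0
```

A dG/X of 1.0 would mean one goal more than expected for every 10 shots taken, which is the scale on which Gerrard's figure is quoted.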

Apportioning Credit

Thanks to the above analysis, we can now describe which players have over- or under-performed in front of goal. However, we can't really say why. To attempt to answer this we can look at each shot in further detail.

Specifically, we can consider the goal probability of the shot before and after it is struck (i.e. incorporating the shot's trajectory) to separate the impact of:

- The chance quality (i.e. how likely the shot was to be a goal, irrespective of how the shot is hit) - already established

- The player's shooting (i.e. the difference between the goal probability before and after the shot is struck)

- And the goalkeeper (i.e. whether the shot is saved or not)

By considering the goal probability factoring in the nature of the shot itself, we can attempt to disentangle the effects of shooting and goalkeeping in our dG values derived earlier.
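One way to read the three-way split above is sketched here. The `Shot` record, its field names and the sample values are all assumptions for illustration. In particular, treating an off-target shot as having a post-shot probability of zero is our reading, not something the article spells out.

```python
from dataclasses import dataclass

@dataclass
class Shot:
    pre: float             # goal probability before the shot is struck
    post: float            # goal probability after it is struck (0.0 if off target)
    goal: bool             # whether it went in
    blocked: bool = False  # blocked shots are left out of both measures

def shooting_goals_added(shots):
    """Shooter's part: probability gained or lost by how the shot was hit."""
    return sum(s.post - s.pre for s in shots if not s.blocked)

def keeping_goals_added(shots):
    """Keeper's part: actual outcome minus the post-shot probability."""
    return sum(int(s.goal) - s.post for s in shots if not s.blocked)

# Hypothetical shots (illustrative values only).
shots = [
    Shot(pre=0.25, post=0.75, goal=False),                 # well struck, saved
    Shot(pre=0.25, post=0.0, goal=False),                  # off target
    Shot(pre=0.5, post=0.5, goal=True),                    # scored
    Shot(pre=0.125, post=0.0, goal=False, blocked=True),   # blocked, ignored
]

sga = shooting_goals_added(shots)
kga = keeping_goals_added(shots)
print(sga, kga)  # 0.25 -0.25
```

Under this reading the two parts always add up to goals minus pre-shot xG over the unblocked shots, so a player can shoot well (positive first part) and still break even if the keeping he faces is good enough (negative second part).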

A look at the model's 2011/12 top performers in terms of shooting features the league's top two scorers, although the list is led by Spurs' Rafael van der Vaart. Andy Carroll's miserable campaign is once again highlighted, as he trails the league in terms of goal probability added through his shooting:

Table 6

SGA - Shooting Goals Added (the difference between goal probability before and after an unblocked shot is struck)

xG(OT) - expected Goals (sum of goal probabilities factoring in the quality of shot)

Table 7

SGA/X - Shooting Goals Added per 10 shots on target

The second part of the shooting model considers the difference between the number of goals scored and the number expected based on both the chance quality and the shot quality, xG(OT). This should then highlight the players who have been rewarded, or penalised, the most by goalkeeping (bad or good, respectively), and could be described as showing which players have been 'luckier' in front of goal than others:

Table 8

OT - Shots on target

KGA - Keeping Goals Added

So, Mikel Arteta's appearance near the top of the earlier dG table (+4.2 goals above expectation) appears to be due more to the vagaries of the keeping he has faced than to particularly good shot striking. On the flip side, Adebayor and some of the Liverpool players also seen earlier appear to have suffered particularly badly from some excellent goalkeeping performances over the course of the season, with Luis Suárez arguably denied nearly 6 goals that would, had he faced the keeper on another day, have been scored.

Table 9

KGA/X - Keeping Goals Added per 10 shots on target

To illustrate the complete picture we will look at a player whose shooting figures are well above average but who has still failed to match or exceed his expected goals value (based on chance quality only), Wigan's Franco Di Santo.

Franco Di Santo Shot Maps

Di SantoDi Santo2

The lower image shows the location of each of his shots in the league as before, with the size determined by the goal probability based on chance quality only (xG). The upper image shows the end location of the shots on the goal mouth, with the size of each mark determined by the goal probability factoring in shot quality.

Not only has Di Santo got more than half of his (unblocked) shots on target but, as can be seen from the images, he has been particularly adept at finding the corners, adding 2.2 goals' worth of goal probability from his 20 shots on target, to bring his expected value (factoring in shot quality) to 6.2. He has managed to find the net only four times, however, suffering from a number of high-quality saves. The number of yellow triangles seen in the corners of the 'goal' above shows that poor Franco has been consistently hitting the corners without gaining anything like the reward he would normally expect.
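The figures quoted for Di Santo are enough to work through the keeping side of the ledger. This is a sketch using the definitions given earlier (KGA as goals minus xG(OT), rates per 10 shots on target). The variable names are ours.

```python
# Figures quoted in the text for Franco Di Santo.
goals = 4
xg_ot = 6.2            # expected goals factoring in shot quality
sga = 2.2              # Shooting Goals Added
shots_on_target = 20

# Keeping Goals Added: what he scored minus what his shots "deserved".
kga = goals - xg_ot

# Rates per 10 shots on target, as in the SGA/X and KGA/X tables.
sga_per_10 = 10 * sga / shots_on_target
kga_per_10 = 10 * kga / shots_on_target

print(f"KGA: {kga:+.1f}")           # KGA: -2.2
print(f"SGA/X: {sga_per_10:+.1f}")  # SGA/X: +1.1
print(f"KGA/X: {kga_per_10:+.1f}")  # KGA/X: -1.1
```

On these numbers the goalkeepers he faced took away almost exactly what his shot placement had added.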

Keeping

The same approach can naturally be applied to the other side. We can use the same system to see which 'keepers have prevented the most goals, relative to the quality of shots faced.

The keepers behind two of the more surprising defensive successes of the season top the table of keeping goals added (bear in mind a negative value is a good thing for defensive players), while the much put-upon Paul Robinson brings up the rear.

Table 10

Allowing for volume of shots faced drops Tim Krul out of the top three, and lifts David De Gea just above his Manchester counterpart:

Table 11

The image below shows the location - and goal probability - of the goals conceded by De Gea (red) and Robinson (blue). The tiny size of many of the circles/shots conceded by Robinson highlights an apparent weakness when facing efforts from distance. However, De Gea demonstrates his obvious quality and athleticism against long range attempts:

De Gea & Robinson Shot Maps (Goals)

De Gea Robbo 

Summary

The data explored above only touches on the ability we now possess to explore statistical trends in football. Clearly there are other factors in play here, such as shot power, curl or dip on the shot and whether the goalkeeper is unsighted or off balance. However, as an introductory analysis we can clearly see some results that can explain some of the 2011/2012 season's defining storylines.

Despite Suárez's obvious class, he hasn't scored enough goals for Liverpool. However, using the above analysis we can see that he has been especially unfortunate in front of goal. When his expected Goals per Game ratio from the statistics is highlighted, it is clear that luck has not been on his side, especially considering the number of times he has hit the post this season. Similarly, Tim Krul's excellent shot-stopping abilities have clearly contributed to Newcastle United's challenge for a Champions League spot, while Wojciech Szczesny has apparently contributed to an underperforming Arsenal squad.

By utilising OptaPro's data and analysis tools, we can effectively view trends that have a definite impact on the narrative of the Premier League season. The old cliche that luck evens itself out over the course of a season could certainly be seen as a red herring.

Posted by Sam Green at 14:16

5 Comments

Rich said...
Do shot probabilities consider where defenders are? Otherwise a fast break team that uses the flanks like United are likely to have more 'clean' shots (defenders stretched/retreating/out of position) while a slow buildup team like Fulham (who also funnel everything down the middle) will be shooting in or through crowds. Same locations, very different shots. cheers Rich
April 13, 2012 08:35
Simon Farrant said...
Hi Rich, No, the study doesn't consider defensive positioning. There are many many variables that haven't been considered in this article (to do so would require something akin to a thesis!). However, what Sam has tried to do is provide a basic overview of our ability to quantify this type of information. There was a really good article written by Zonal Marking's Michael Cox on areas of the pitch teams attack from. Again, it doesn't especially consider defensive influence, but it does back up your theory about Fulham: http://www.zonalmarking.net/2012/03/16/premier-league-sides-attack-lopsided-wing-play/. Simon
April 16, 2012 10:07
Rick said...
Interesting article. I'm amazed at how much data Opta must be capturing to be able to produce such detailed analysis. Looking forward to seeing this site grow!
April 27, 2012 02:34
kv said...
Hi! Great article, we tried to do something similar regarding Chance Quality (obviously not as good as yours!) and came out with a formula. Mind giving it a read? http://arsenalcolumn.co.uk/2012/03/29/exploring-the-chance-quality-index-why-more-chances-dont-necessarily-mean-more-goals/#comment-7514 Did you use similar parameters for your formula? Thanks.
May 16, 2012 03:24
So xG(OT) are expected goals from on-target shots as derived from league-wide goal conversion probability of a similarly struck (Left, Right, Head - Straight, Curled, Chipped - Placement across goal mouth - etc) on-target effort at about same position? High SGA means outperforming the league average in similar circumstances, while low SGA means underperforming there? It would be interesting to see a compilation of shots of either SGA performance extreme. Is there any way to get video material of exemplary SGA shots? For some reason I'd be hesitant to oppose SGA and KGA so symmetrically, based on xG(OT) as key indicator. I wonder whether there could be an as of yet unconceptualized or unquantified factor at work that separates the successful finishes from unsuccessful ones. Very interested in this. Great article!
December 17, 2012 10:43
