Ball possession is key in a soccer game. I would guess there is a positive correlation between ball possession and game result. Of course, as discussed in the following posts (among others), things are not as simple as that:
1. http://www.zonalmarking.net/2012/05/04/the-relationship-between-possession-and-shots/
2. http://www.mlssoccer.com/numerology/news/article/2012/06/21/central-winger-getting-ball-and-get-defensive
3. http://www.soccerbythenumbers.com/2012/07/pass-accuracy-and-possession-supremacy.html
But, beyond the overall percentage of possession, I am interested in how ball possession evolves during a game.
Specifically, based on the sample file provided by MCFCAnalytics and Opta Sports, I’m interested in the ball possession of our two teams. For example, I want to see whether a team’s possession increases or decreases as the game progresses. I also want to see whether the team protecting its 3-2 lead manages to increase its ball possession in order to take control of the game. Conversely, I want to see whether the team desperately trying to score the equalizer managed to put together good sequences of possession in order to build solid attacks. I’d also like to see whether fatigue, lack of timing and so on show up in the evolution of ball possession.
Finally, as ball possession more or less sets the rhythm of the game, I want to see how the two teams managed possession in the offensive zones throughout the game. This last point will help us go from quantitative to spatial observations, eventually leading us to patterns of play.
And of course, the goal is to build something that can lead us to real-time analytics.
So, our first step is to represent possession of the ball by the two teams. To do that, I’ve written a little Python script (see previous post for the basics) that does this:
– distribute each team’s events into its own array
– associate each event with a timestamp (min * 60 + sec)
– sort the arrays
– write the output to two files (one for team 30 and one for team 43)
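The steps above could be sketched roughly as follows. This is not the original script: the event layout (dicts with `team_id`, `min` and `sec` keys), the function names and the one-timestamp-per-line file format are all assumptions made for illustration.

```python
from collections import defaultdict

def split_events_by_team(events):
    """Group event timestamps (in seconds) by team and sort them."""
    by_team = defaultdict(list)
    for ev in events:
        # Timestamp as described in the list above: min * 60 + sec
        by_team[ev["team_id"]].append(ev["min"] * 60 + ev["sec"])
    return {team: sorted(stamps) for team, stamps in by_team.items()}

def write_team_files(by_team, prefix="team_"):
    """Write one file per team, one timestamp per line (hypothetical layout)."""
    for team, stamps in by_team.items():
        with open(f"{prefix}{team}.txt", "w") as fh:
            fh.write("\n".join(str(s) for s in stamps))

# Made-up sample events, only to show the shape of the data
sample = [
    {"team_id": 43, "min": 25, "sec": 14},
    {"team_id": 30, "min": 0, "sec": 5},
    {"team_id": 43, "min": 0, "sec": 12},
]
timelines = split_events_by_team(sample)
```

Calling `write_team_files(timelines)` would then produce one file per team id found in the data.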
To show the data, I used a really cool JavaScript jQuery plugin called Flot. Flot displays data nicely without too much effort.
First pass is rough but it looks promising:
Now if we focus on the approximate times when goals were scored (see previous post), we have this (ticks in seconds):
goal scored at   score   by
1514             1-0     43
2219             2-0     43
2334             2-1     43
2799             3-1     43
3726             3-2     43
So let’s add another data series for the goals (value 20 on the Y axis) so that we can see them. Then, zooming in on the time of each goal, here is what we have for goal number one:
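As a rough sketch, the goal series can be built as a list of [x, y] pairs, which is the point format Flot expects for a data series. The goal times come from the table above; the helper names and the size of the zoom window are assumptions made for illustration.

```python
import json

# Goal times in seconds, taken from the table above
goal_times = [1514, 2219, 2334, 2799, 3726]
GOAL_Y = 20  # fixed height on the Y axis, as described above

def goal_series(times, y=GOAL_Y):
    """Return [x, y] points, one per goal."""
    return [[t, y] for t in times]

def window(points, center, before=300, after=60):
    """Keep only the points whose x value falls around `center`.

    The `before` and `after` defaults (in seconds) are arbitrary choices.
    """
    return [p for p in points if center - before <= p[0] <= center + after]

goals = goal_series(goal_times)
first_goal_view = window(goals, goal_times[0])
print(json.dumps(goals))  # JSON text that can be pasted into the page as a series
```

The same `window` helper can be applied to each team's series to zoom in on the minutes around a given goal.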
In this case, it is easy to notice a clear domination of ball possession by club 43 in the moments preceding the goal.
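One way to put a number on this visual impression is to count each team's events in a short window before the goal. The sketch below assumes each team's timestamps are already sorted; the 120-second lookback, the function names and the sample timestamps are made up for illustration. Note that a share of events is only a proxy for possession, not a measure of possession time.

```python
from bisect import bisect_left, bisect_right

def events_in_window(sorted_stamps, start, end):
    """Count timestamps t with start <= t <= end in a sorted list."""
    return bisect_right(sorted_stamps, end) - bisect_left(sorted_stamps, start)

def share_before(goal_time, stamps_a, stamps_b, lookback=120):
    """Share of events belonging to team A in the `lookback` seconds before a goal."""
    start = goal_time - lookback
    a = events_in_window(stamps_a, start, goal_time)
    b = events_in_window(stamps_b, start, goal_time)
    total = a + b
    # No events at all in the window: the share is undefined
    return a / total if total else None

# Made-up timestamps, not taken from the match data
team_43 = [1400, 1420, 1450, 1480, 1500, 1510]
team_30 = [1300, 1410, 1470]
share = share_before(1514, team_43, team_30)
```

A share well above 0.5 for one team would match the kind of domination visible on the chart.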
Good, so that is a first pass.
Next, we will have to focus more precisely on ball possession itself (rather than team-related events over time) and on measuring possession time in order to study its evolution over the course of the game. We will also introduce spatial criteria.