The World Cup is fast approaching, and with it comes no end of projections, estimates and forecasts about who is going to do what.

Using a Bayesian analysis, once you reach the semifinals you have about a 50 percent chance of being correct, just like flipping a coin. But can it work before then? Yes, if you run enough simulations. But before you fire up your copy of Championship Manager and try to play 100 times, there is an easier solution.

Computer scientists David Dormagen and Raul Rojas from the Intelligent Systems and Robotics Group at Freie Universität Berlin have developed software that lets you set or modify the simulation parameters. They also provide their own estimate of which indicators best measure the skill level of the different national teams.

You won't be alone, of course. Teams understand big data. Manchester City has 11 people analysing players' data, and they claim credit for a large part of that team's recent success.

The virtual World Cup games are simulated based on the rankings of the 32 participating teams. Both the FIFA and Elo rankings assign a certain number of points to the national teams corresponding to their estimated strength, based on games held in the past. If there is a large difference in the respective playing strength of two teams, the team with the higher rating has a higher probability of winning. The games are played on the basis of such probabilities computed from historical data.
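As a rough illustration of how a rating gap can be turned into a win probability, here is a minimal sketch using the standard Elo expected-score formula. The article does not say which formula the Berlin simulator actually uses, so treat this as an assumption; the ratings are made up, and the sketch ignores draws by reading the expected score as a win probability.

```python
def win_probability(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Standard Elo expected score for team A against team B.

    Assumption: the expected score is read as a plain win probability,
    which ignores draws. The simulator described in the article may
    compute its probabilities differently.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


# Hypothetical ratings, 200 points apart.
print(round(win_probability(2100, 1900), 2))  # 0.76
```

Two evenly rated teams get 0.5 each, and the two probabilities for any pairing always sum to 1.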


Image credit: Freie Universitaet Berlin

The better team does not always win; even good teams can have a bad day. The computer models this by selecting the winner stochastically, with probabilities based on the difference in playing strength.
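Stochastic selection of a winner can be sketched in a few lines: draw a random number and compare it with the favorite's win probability. The function name `play_match`, the team labels and the 0.76 probability are all hypothetical, chosen only for illustration.

```python
import random


def play_match(team_a: str, team_b: str, p_a_wins: float, rng: random.Random) -> str:
    """Return the winner of one simulated game.

    rng.random() is uniform on [0, 1), so team_a wins with probability p_a_wins.
    """
    return team_a if rng.random() < p_a_wins else team_b


rng = random.Random(42)  # fixed seed so the run is repeatable
results = [play_match("Favorite", "Underdog", 0.76, rng) for _ in range(1000)]
upsets = results.count("Underdog")
print(f"Underdog won {upsets} of 1000 simulated games")
```

Even with a clear favorite, the underdog wins a sizeable share of the games, which is the "bad day" effect the simulator is trying to capture.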

To find out which team will become world champion, every game in a virtual World Cup is played. But in soccer, randomness and current fitness play a big role, so a single simulation is not reliable enough. The “virtual World Cup” is therefore run up to 10,000 times. Aggregating these runs lets the progress of each team be measured statistically all the way to the final. At the end, the system calculates the relative frequency with which a team like Brazil or Spain reaches, for example, the quarterfinals, the semifinals, or the final. These frequencies serve as estimates of each team's odds of getting that far, including its odds of winning the championship.
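The repeated-tournament idea can be sketched as a small Monte Carlo loop. This is a simplified stand-in, not the researchers' software: it assumes an eight-team single-elimination bracket with no group stage, made-up ratings, and an Elo-style win probability that ignores draws.

```python
import random
from collections import Counter

# Hypothetical ratings for eight placeholder teams (not real FIFA or Elo values).
RATINGS = {"A": 2100, "B": 2000, "C": 1950, "D": 1900,
           "E": 1850, "F": 1800, "G": 1750, "H": 1700}


def win_probability(rating_a, rating_b):
    """Elo-style win probability (assumption; draws are ignored)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def simulate_bracket(teams, rng):
    """Play one knockout bracket; map each round size to the teams that reached it."""
    reached = {}
    alive = list(teams)
    while len(alive) > 1:
        reached[len(alive)] = list(alive)
        winners = []
        # Neighbors in the list play each other: (0 vs 1), (2 vs 3), ...
        for a, b in zip(alive[0::2], alive[1::2]):
            p = win_probability(RATINGS[a], RATINGS[b])
            winners.append(a if rng.random() < p else b)
        alive = winners
    reached[1] = alive  # the champion
    return reached


def estimate_frequencies(n_runs=10_000, seed=0):
    """Relative frequency of reaching the semifinal (4), final (2) and title (1)."""
    rng = random.Random(seed)
    counts = {size: Counter() for size in (4, 2, 1)}
    for _ in range(n_runs):
        reached = simulate_bracket(list(RATINGS), rng)
        for size in counts:
            counts[size].update(reached[size])
    return {size: {team: n / n_runs for team, n in counter.items()}
            for size, counter in counts.items()}


freqs = estimate_frequencies()
for team, share in sorted(freqs[1].items(), key=lambda kv: -kv[1]):
    print(f"{team}: wins the title in {share:.1%} of runs")
```

Because every run produces exactly one champion, the title frequencies sum to 1, and the more runs you aggregate, the less any single lucky bracket distorts the picture.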

In the simulator, home advantage and even the market value of the players can be used as factors influencing victory or defeat. Users can assign weights to the various rankings and factors. For example, if users think that the FIFA ranking is more significant than the market value, they can assign the former a higher weighting factor.
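One way such user-assigned weights could work is a weighted average of indicators that have first been rescaled to a common range. The article does not describe how the simulator combines its factors, so the min-max scaling, the function name `combined_strength` and all numbers below are assumptions for illustration.

```python
def combined_strength(indicators, weights):
    """Weighted average of per-team indicators, each min-max scaled to [0, 1].

    indicators: {indicator_name: {team: raw_value}}
    weights:    {indicator_name: user-chosen weight}
    """
    teams = list(next(iter(indicators.values())))
    total_weight = sum(weights.values())
    strength = {team: 0.0 for team in teams}
    for name, values in indicators.items():
        lo, hi = min(values.values()), max(values.values())
        span = (hi - lo) or 1.0  # avoid dividing by zero if all values are equal
        for team in teams:
            scaled = (values[team] - lo) / span
            strength[team] += weights[name] * scaled / total_weight
    return strength


# Made-up data for three placeholder teams.
indicators = {
    "fifa_points": {"A": 1200, "B": 1100, "C": 900},
    "market_value_m_eur": {"A": 300, "B": 600, "C": 150},
}

# A user who trusts the FIFA ranking three times as much as market value.
weights = {"fifa_points": 3.0, "market_value_m_eur": 1.0}
print(combined_strength(indicators, weights))
```

With these weights team A comes out ahead of team B; swap the weights so that market value dominates and B overtakes A, which is the kind of experiment the simulator invites.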

According to the simulator's calculations, the current favorites are Brazil, Argentina, Germany, and Spain. There is a roughly 66 to 70 percent probability that one of these four teams will win the next World Cup. It would be a big surprise if none of them reached the final.

So who will be the big winner? Run your simulation and tell us. This isn't Popular Science; you are still allowed to comment here.

Source: Freie Universitaet Berlin