Is Federer Stronger in a Tournament Without Nadal? An Evaluation of Odds and Seedings for Wimbledon 2009

Wimbledon is one of the most popular annual sports tournaments. In the Gentlemen's Singles 2009, the top-seeded defending champion Rafael Nadal withdrew due to injury days before the tournament began. Here, we analyze the effects of Nadal's withdrawal, especially on the ability/strength of the main competitor Roger Federer, by using bookmakers' expectancies to estimate the unknown abilities of the players and comparing them for two different odds sets. The comparison shows that the bookmakers did not incorporate Nadal's withdrawal adequately, assigning too high expected winning probabilities to Federer and Murray.


Introduction
The Championships, more commonly known as Wimbledon, is the oldest tennis tournament and has been held at the All England Club in the London suburb of Wimbledon since 1877. It is the most popular tournament played on grass in the world and is one of the four annual major tennis tournaments, the Grand Slams, along with the Australian Open, the French Open, and the US Open (Wimbledon, 2009).
In the Gentlemen's Singles of Wimbledon 2009, the top-seeded defending champion Rafael Nadal withdrew due to injury days before the tournament began. Here, we analyze the effects of this withdrawal, especially on the expected ability of the bookmakers' favorite, Federer. To this end, we compare different measures of performance: the official rankings of the Association of Tennis Professionals (ATP), the seeding, and the bookmakers' expectancies measured in odds. We first show that the bookmakers' odds, which are prospective ratings of the participating players' performance, forecast the tournament outcome better than the Wimbledon seeding and the ATP ranking. We then estimate the abilities of each participating player using two different odds sets, each containing the expectancies of a variety of bookmakers: one including winning expectancies for Nadal, and one obtained after his withdrawal. The comparison of the estimated abilities shows that the bookmakers overestimated Federer's and Murray's chances of winning Wimbledon 2009 after Nadal's withdrawal. Furthermore, we use all estimated abilities to simulate the outcome of three different tournament designs, showing that in the long run the seeding has little influence and that a round-robin tournament would be more favorable to top players than the original single elimination tournament.
In recent literature, ATP rankings, as well as seedings based on ATP rankings, have been used to predict the winner of a tennis match (e.g., Barnett and Clarke, 2005; Klaassen and Magnus, 2003) or of a major tennis tournament (e.g., Clarke and Dyte, 2000; Boulier and Stekler, 1999). Bookmakers' odds have been used successfully to predict the outcome of single games (e.g., Spann and Skiera, 2009) or of European football tournaments (see Leitner et al., 2009a,b).
The remainder of this paper is organized as follows: Section 2 provides a tournament and data description of Wimbledon 2009 for which the players' abilities are modeled and analyzed in Section 3. Section 4 concludes the paper.

Tournament
In the Gentlemen's Singles of Wimbledon 2009, a total of 128 international tennis players compete in a single elimination tournament (knockout system) to determine the "best" tennis player on grass. Players wishing to enter Wimbledon are required to submit their entry on a special form. The organizing committee evaluates all applications for entry and uses ATP rankings to determine which players are admitted directly into the tournament, which have to qualify, and which are rejected. A player without a high enough ATP ranking can be admitted as a "wild card" by the committee. Wild cards are usually offered on the basis of past performance at Wimbledon or to increase British interest. A player who neither has a high enough ranking nor receives a wild card can participate in a qualifying tournament (a three-round event) held one week before Wimbledon. The players who win all three rounds progress to the tournament. "Lucky losers" are losers from the final round of the qualifying competition, chosen in order of ATP ranking, who fill any vacancy that occurs in the draw before the first round has been completed. The committee seeds the top 32 players based on their ATP rankings to ensure that they do not meet each other before the third round. The committee can also adjust the seedings on the basis of players' previous grass court performance (see Wimbledon, 2009).

Data
Bookmakers Odds. Long-term odds of winning Wimbledon 2009 (Gentlemen's Singles) were obtained from the website http://odds.bestbetting.com, which compares the odds of a variety of international bookmakers. We obtained all available odds on two dates: 2009-06-16 (before the tournament draw and before Nadal's withdrawal; henceforth called W1) and 2009-06-22 (before the tournament started, but after the draw; henceforth called W2). The first dataset contains odds of 17 international bookmakers for 96 players who were expected to participate in Wimbledon 2009. The second dataset contains odds of 15 international bookmakers for 105 participating players.
The quoted odds of the bookmakers can easily be transformed into winning probabilities, but these do not represent the true chances that a player will win the tournament, because the odds include the stake and a profit margin, better known as the "overround" on the "book" (for further details see, e.g., Henery, 1999). To recover the underlying beliefs of the bookmakers, we adjust the quoted odds by subtracting one (the stake, i.e., the payment for placing the bet) and by correcting for the bookmaker's profit, the overround (for more details see Leitner et al., 2009a). This adjustment is done separately for each bookmaker, yielding bookmaker-specific overrounds and, derived from the adjusted odds, expected winning probabilities $p_{i,b}$ for each player $i$ and bookmaker $b$.
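The adjustment can be illustrated with a minimal sketch. It assumes, purely for illustration, that quoted decimal odds relate to the underlying odds via quoted = delta * odds + 1, with odds = (1 - p) / p and a bookmaker-specific payout share delta chosen so that the probabilities sum to one; the exact formulation is given in Leitner et al. (2009a) and may differ. The function name, the bisection solver, and the example odds are hypothetical.

```python
def adjusted_probabilities(quoted_odds, iterations=200):
    """Convert one bookmaker's quoted decimal odds into probabilities summing to 1.

    Assumed model (hypothetical, for illustration): quoted = delta * odds + 1,
    where odds = (1 - p) / p and delta in (0, 1) is the share paid out.
    """
    if any(q <= 1.0 for q in quoted_odds):
        raise ValueError("decimal odds must exceed 1")
    if sum(1.0 / q for q in quoted_odds) <= 1.0:
        raise ValueError("book has no overround; nothing to adjust")

    def probs(delta):
        # odds = (quoted - 1) / delta, hence p = 1 / (1 + odds)
        return [delta / (delta + q - 1.0) for q in quoted_odds]

    lo, hi = 0.0, 1.0  # sum(probs) rises from 0 near delta=0 to above 1 at delta=1
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if sum(probs(mid)) > 1.0:
            hi = mid
        else:
            lo = mid
    delta = (lo + hi) / 2.0
    return probs(delta), delta


# Hypothetical quoted odds for a four-player field (not taken from the data).
p, delta = adjusted_probabilities([2.2, 3.5, 5.0, 9.0])
print([round(v, 4) for v in p], round(delta, 4))
```

Solving for delta per bookmaker mirrors the statement that the adjustment yields bookmaker-specific overrounds; a simple renormalization of the inverse odds would be a cruder alternative.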
ATP Rankings (Singles). The South African Airways ATP rankings (singles) are based on the players' results (measured in points) at the four Grand Slams, the eight mandatory ATP World Tour Masters 1000 tournaments, and the Barclays ATP World Tour Finals of the ranking period, plus the best four results from all ATP World Tour 500 tournaments played in the calendar year. We obtained the points underlying the rankings (henceforth called ATP ratings) as of 2009-06-22 from the ATP's website for all 128 participating players and for the injured player Rafael Nadal (Association of Tennis Professionals, 2009).
Seeding and Draw for Wimbledon 2009. As described above, the Wimbledon organizing committee seeds the top 32 players of the tournament based on their ATP rankings and their previous grass court performance. We obtained the seeding for Wimbledon 2009 from 2009-06-17 and from 2009-06-19 (after Nadal's withdrawal) from the Wimbledon webpage (Wimbledon, 2009). Additionally, we obtained the draw from 2009-06-19. According to the Wimbledon seeding from 2009-06-17, Nadal was the top-seeded player, followed by Federer, Murray, Djokovic, and Del Potro. Because Nadal withdrew after the draw, the committee left the top position blank, seeded the previously unseeded player Kiefer as number 33, and added Thiago Alves to the draw as a lucky loser. The draw changed as follows: Del Potro (seeded 5) took Nadal's place, Blake (seeded 17) took Del Potro's place, and Kiefer took Blake's place.

Modeling Players' Abilities
The focus of our paper is to analyze the effect of Nadal's withdrawal from Wimbledon 2009, especially on the expected abilities of the main competitor Federer. It is obvious that Nadal's withdrawal increases, on average, the chance of winning the tournament for all other players. However, the ability/strength of each player should not change. Thus, the winning probability for a specific match, e.g., Federer beating Murray in a potential Wimbledon 2009 final, should not be affected by Nadal's withdrawal. The "true" abilities of the players are unknown, but an approximation can be derived from performance measures or winning expectancies, such as the ATP rating, the seedings, or the bookmakers' odds. Here, we compare all three rating strategies in a forecasting study for Wimbledon 2009. As in previous studies (e.g., Leitner et al., 2009a,b), we find that a consensus derived from the (prospective) bookmakers' odds has higher predictive power than retrospective ratings based on historical results (in this study, the Wimbledon seeding and the ATP rankings; see Table 2). Subsequently, we estimate players' abilities based on bookmakers' odds using two different odds sets: one including winning expectancies for Nadal and one obtained after his withdrawal. The resulting expected abilities are compared to assess the effect of Nadal's withdrawal. Furthermore, we use the players' abilities to compare different tournament designs in a simulation study.

Consensus Information
Since the bookmakers' expectations about Wimbledon 2009 are rather homogeneous, we use a very straightforward aggregation strategy and compute the means of the winning logits (i.e., winning log-odds) as consensus measures across all bookmakers:

\[ \mathrm{logit}(p_i) = \frac{1}{B} \sum_{b=1}^{B} \mathrm{logit}(p_{i,b}), \tag{1} \]

where $B$ is the number of bookmakers. We call this strategy the bookmaker consensus model (BCM). See Leitner et al. (2009b) for an exploration of several other aggregation strategies including different variance specifications. Transforming these consensus winning logits back to the probability scale yields the bookmakers' consensus winning probabilities $p_i$ for each player $i$ for whom odds are available.
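The aggregation step can be sketched in a few lines. The data layout (one dictionary of adjusted probabilities per bookmaker) and the choice to average over only those bookmakers that quote a given player are assumptions made for this illustration; the probabilities shown are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def bookmaker_consensus(probs_by_bookmaker):
    """Average each player's winning logits over the bookmakers that quote him.

    probs_by_bookmaker: list of dicts mapping player name -> adjusted probability.
    Returns a dict mapping player name -> (consensus logit, consensus probability).
    """
    logits = {}
    for book in probs_by_bookmaker:
        for player, p in book.items():
            logits.setdefault(player, []).append(logit(p))
    consensus = {}
    for player, values in logits.items():
        mean_logit = sum(values) / len(values)
        consensus[player] = (mean_logit, inv_logit(mean_logit))
    return consensus


# Hypothetical adjusted probabilities from B = 3 bookmakers.
books = [
    {"A": 0.40, "B": 0.25, "C": 0.35},
    {"A": 0.44, "B": 0.22, "C": 0.34},
    {"A": 0.42, "B": 0.24, "C": 0.34},
]
for player, (lg, p) in bookmaker_consensus(books).items():
    print(player, round(lg, 3), round(p, 3))
```

Because the mean is taken on the logit scale, the back-transformed consensus probabilities need not sum exactly to one.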
Table 1 shows the estimated winning probabilities $p_i$ and their associated winning logits $\mathrm{logit}(p_i)$ of the top ten participating players of Wimbledon 2009 using the winning odds W1 and W2.
According to the BCM for W1 and W2, Federer has the highest chance of winning the tournament, followed by Murray (see Table 1). Nadal's withdrawal strongly increases the winning probabilities of both players, whereas the winning probabilities of all other players do not change as clearly.
To test the predictive power of the bookmaker consensus, we compare the consensus winning logits based on the last available information (W2) with the actual tournament outcome, the Wimbledon seeding, and the ATP ranking of the top ten players using Spearman's rank correlation (Table 2). Although the correlation between the bookmaker consensus winning probabilities and the actual tournament outcome is rather low (0.109), the BCM still performs better than the Wimbledon seeding (−0.156) and the ATP ranking (−0.185). Both the seeding and the ATP ranking have a negative Spearman's rank correlation with the actual tournament outcome, as they assign rather high ranks to two players who reached the quarter-finals (Hewitt) or the semi-finals (Haas).
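For readers who want to reproduce this kind of comparison, the following sketch computes Spearman's rank correlation as the Pearson correlation of average ranks. Average ranks are used because tournament outcomes contain ties (all players eliminated in the same round share a result); how ties were treated in the original analysis is not stated, so this is an assumption. The example values are hypothetical and are not the data behind Table 2.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)


# Hypothetical example: consensus logits versus the round each player reached.
consensus_logits = [-0.3, -1.4, -2.2, -2.9, -3.1, -3.5]
round_reached = [7, 5, 6, 4, 4, 3]
print(round(spearman(consensus_logits, round_reached), 3))
```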
In addition to the correlation, we analyze how many participants of each round (third round to winner) are predicted correctly. Table 3 shows that the BCM correctly predicts nine players of the last 16, whereas the Wimbledon seeding predicts only seven and the ATP ranking only eight players correctly. Furthermore, the BCM correctly predicts five of the last eight and three of the last four, in each case one more than the Wimbledon seeding and the ATP ranking. All three approaches forecast the actual winner, Federer, correctly, but expected Murray, who was beaten by Roddick in the semi-finals, as the runner-up.
In summary, the ex post analysis shows that the correlation between the bookmakers' expectancies for Wimbledon 2009 and the actual tournament outcome is not high, but that the bookmakers still perform better than the Wimbledon seeding and the ATP ranking. The reasons for the difficulties in forecasting tennis are twofold. First, tennis is an individual sport, and the outcome of a match or tournament depends on a single individual, who can easily have an off day or an injury, rather than on a whole team. Second, in the tennis tournament design (single elimination tournament) every single match is important: a player who loses one match is eliminated from the tournament.

Estimation of Abilities
With the winning logits and associated winning probabilities, we have computed measures that are specific to the tournament, Wimbledon 2009: they include information about the tournament design (in W1 and W2) and about the original draw (in W2). To obtain measures of the unknown "true" abilities of the players, we have to adjust the winning logits for these tournament effects (tournament schedule and draw); that is, we try to estimate the abilities that correspond to the winning logits. For this we employ the well-known Bradley and Terry (1952) model, which measures abilities on a ratio scale and in which the probability $\pi_{i,j}$ of competitor $i$ beating competitor $j$ is given by

\[ \pi_{i,j} = \frac{\mathit{ability}_i}{\mathit{ability}_i + \mathit{ability}_j}, \tag{2} \]

where $\mathit{ability}_i$ is the ability of competitor $i$.
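As a small illustration of the Bradley-Terry model, the pairwise probability is the ratio of one ability to the sum of both, which is equivalent to a logistic function of the difference in log-abilities. The sketch below evaluates both forms for the W1 log-abilities of Federer and Murray reported later in the text; the function names are illustrative.

```python
import math


def bt_prob(ability_i, ability_j):
    """Bradley-Terry probability that competitor i beats competitor j."""
    return ability_i / (ability_i + ability_j)


def bt_prob_from_logs(log_ability_i, log_ability_j):
    """Same probability, written as a logistic function of the log-ability gap."""
    return 1.0 / (1.0 + math.exp(-(log_ability_i - log_ability_j)))


# Log-abilities reported in the text for W1: Federer -3.627, Murray -4.409.
federer, murray = -3.627, -4.409
print(round(bt_prob_from_logs(federer, murray), 3))
print(round(bt_prob(math.exp(federer), math.exp(murray)), 3))
```

Under these figures the model gives Federer a probability of roughly 0.69 of beating Murray. Multiplying all abilities by a common factor leaves every pairwise probability unchanged, so only ability ratios (differences of log-abilities) are meaningful.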
Given the abilities of all players and the tournament schedule, we can compute the associated winning probabilities based on the pairwise probabilities from Equation 2. Alternatively, we can simulate a large number of tournament runs (100,000, say) and then assess the empirical winning proportions $\tilde{p}$ for each competitor:

abilities of all competitors ($\mathit{ability}$) → pairwise winning probabilities for all matches ($\pi_{i,j}$) → tournament simulations (100,000 runs) → simulated winning probabilities for the tournament ($\tilde{p}$)

That is, for given abilities $\mathit{ability}_i$ ($i = 1, \dots, 128$) of all competitors we obtain the simulated winning probability $\tilde{p}(\mathit{ability})_i$ for competitor $i$. We can estimate the unknown "true" abilities by choosing them such that the $\tilde{p}(\mathit{ability})_i$ match the bookmaker consensus model winning probabilities $p_i$ as closely as possible. In our case, we minimize the total absolute deviation between $p$ and $\tilde{p}$, i.e., we solve the optimization problem

\[ \min_{\mathit{ability}} \sum_{i=1}^{128} \left| p_i - \tilde{p}(\mathit{ability})_i \right| \tag{3} \]

for $\mathit{ability}$ using a local search strategy.
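The pipeline from abilities to tournament winning probabilities, and the fit back from probabilities to abilities, might be sketched as follows. Several choices here are assumptions rather than the authors' implementation: the exact bracket recursion is used inside the objective instead of Monte Carlo simulation to keep the fit deterministic, the local search is a simple coordinate-wise step-halving scheme, the first player's log-ability is pinned at zero to fix the scale, and the eight-player draw is hypothetical.

```python
import math
import random


def bt_prob(a_i, a_j):
    return a_i / (a_i + a_j)


def knockout_win_probs(abilities):
    """Exact title probabilities; players listed in draw order, count a power of 2."""
    n = len(abilities)
    reach = [1.0] * n  # probability of still being in the tournament
    size = 1           # size of the bracket block each player has come through
    while size < n:
        nxt = [0.0] * n
        for i in range(n):
            start = ((i // size) ^ 1) * size  # neighbouring block holds the next opponent
            nxt[i] = reach[i] * sum(
                reach[j] * bt_prob(abilities[i], abilities[j])
                for j in range(start, start + size)
            )
        reach, size = nxt, size * 2
    return reach


def simulated_win_probs(abilities, runs, seed=0):
    """Monte Carlo counterpart: empirical winning proportions over many runs."""
    rng = random.Random(seed)
    wins = [0] * len(abilities)
    for _ in range(runs):
        alive = list(range(len(abilities)))
        while len(alive) > 1:
            alive = [
                i if rng.random() < bt_prob(abilities[i], abilities[j]) else j
                for i, j in zip(alive[0::2], alive[1::2])
            ]
        wins[alive[0]] += 1
    return [w / runs for w in wins]


def total_abs_dev(log_abilities, target):
    probs = knockout_win_probs([math.exp(v) for v in log_abilities])
    return sum(abs(p - t) for p, t in zip(probs, target))


def fit_log_abilities(target, step=1.0, min_step=1e-4, max_sweeps=2000):
    """Coordinate-wise local search; player 0 is pinned at log-ability 0
    because Bradley-Terry abilities are only identified up to a common factor."""
    log_a = [0.0] * len(target)
    best = total_abs_dev(log_a, target)
    for _ in range(max_sweeps):
        if step < min_step:
            break
        improved = False
        for i in range(1, len(target)):
            for delta in (step, -step):
                trial = log_a[:]
                trial[i] += delta
                value = total_abs_dev(trial, target)
                if value < best:
                    log_a, best, improved = trial, value, True
        if not improved:
            step /= 2.0
    return log_a, best


# Hypothetical eight-player draw; the target plays the role of the BCM probabilities.
true_abilities = [8.0, 1.0, 2.0, 1.0, 4.0, 1.0, 2.0, 1.0]
target = knockout_win_probs(true_abilities)
fitted, deviation = fit_log_abilities(target)
print([round(p, 3) for p in target])
print([round(v, 2) for v in fitted], round(deviation, 4))
```

With simulated rather than exact probabilities in the objective, the search would have to cope with Monte Carlo noise, which is one reason a large number of runs per evaluation is needed.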
To estimate the ability of each player, we need winning logits for all players. Because odds are not quoted for all players, the BCM does not yield winning logits for all of them. Therefore, we use a simple linear model for the relationship between the ATP ratings on the log-scale and the consensus winning logits,

\[ \mathrm{logit}(p_i) = \beta_0 + \beta_1 \log(\mathit{rating}_i) + \varepsilon_i, \tag{4} \]

where $\mathit{rating}_i$ is the ATP rating of player $i$, and predict the consensus winning logits of the "unrated" players from it. The two quantities are highly correlated for both W1 and W2 (W1: 0.828, W2: 0.836), and the estimated slope $\beta_1$ and intercept $\beta_0$ are 1.73 and −18.79 for W1, and 1.71 and −18.66 for W2. For ease of comparison, Table 1 shows the estimated abilities on the log-scale and their associated simulated winning probabilities $\tilde{p}_i$ (which match the winning probabilities $p_i$ derived from the BCM) for the top players of Wimbledon 2009 for W1 and W2. According to the estimated log-abilities, Federer is still the best player of Wimbledon 2009 (W1: −3.627, W2: −3.315), followed again by Murray (W1: −4.409, W2: −4.030). If Nadal had played Wimbledon 2009, he would have been expected to be the third strongest player of the tournament (W1: −4.492, with an associated simulated winning probability of 14.40%).
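The imputation step amounts to an ordinary least-squares fit followed by a prediction. The sketch below assumes the natural logarithm for the log-scale (the text does not state the base) and uses hypothetical rating and logit pairs for the fit; the second part applies the W1 coefficients reported in the text to a hypothetical player with 1000 ATP points.

```python
import math


def ols_fit(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def predict_logit(atp_rating, b0, b1):
    """Predicted consensus winning logit from an ATP rating (natural log assumed)."""
    return b0 + b1 * math.log(atp_rating)


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical (rating, consensus logit) pairs for players with quoted odds.
ratings = [9000.0, 6000.0, 3000.0, 1500.0, 800.0]
logits = [-1.2, -2.6, -4.1, -5.9, -7.0]
b0, b1 = ols_fit([math.log(r) for r in ratings], logits)
print(round(b0, 2), round(b1, 2))

# Coefficients reported in the text for W1, applied to a hypothetical player
# without quoted odds who holds 1000 ATP points.
lg = predict_logit(1000.0, -18.79, 1.73)
print(round(lg, 2), round(inv_logit(lg), 5))
```

Under these assumptions the hypothetical player receives a predicted winning logit of about −6.84, i.e., a winning probability of roughly 0.1%.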
To assess whether the ability of a player was altered by Nadal's withdrawal, we compare the players' estimated log-abilities after subtracting the log-ability of a reference player. We choose Söderling, because his log-abilities for W1 and W2 are rather similar. Thus, Figure 1 shows for each top ten player whether the chance of beating Söderling increases or decreases after Nadal's withdrawal.
The comparison of the log-abilities shows that the abilities of almost all top ten players (except Djokovic) increase, most notably those of Federer, Murray, and Haas. For example, the probability that Federer beats Söderling increases from 80.84% to 85.25%.
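Because the Bradley-Terry probability depends only on the difference of the log-abilities, the two percentages can be translated back into log-ability gaps. The following worked calculation is a consistency check based solely on figures quoted in the text (Federer's log-abilities of −3.627 and −3.315 and the probabilities 80.84% and 85.25%); the implied values for Söderling are derived here and are not reported by the authors. $F$ and $S$ denote Federer and Söderling.

```latex
\begin{align*}
\log(\mathit{ability}_F) - \log(\mathit{ability}_S) &= \mathrm{logit}(\pi_{F,S}) \\
\text{W1:}\quad \log\frac{0.8084}{0.1916} &\approx 1.440
  \;\Rightarrow\; \log(\mathit{ability}_S) \approx -3.627 - 1.440 = -5.067 \\
\text{W2:}\quad \log\frac{0.8525}{0.1475} &\approx 1.754
  \;\Rightarrow\; \log(\mathit{ability}_S) \approx -3.315 - 1.754 = -5.069
\end{align*}
```

The implied values for Söderling are nearly identical for W1 and W2, which is in line with choosing him as the reference player, while the gap between Federer and Söderling widens by about 0.31 on the log scale.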
The changes in the (log-)abilities of the top two, Federer and Murray, show that the bookmakers did not react to Nadal's withdrawal and the consequential changes of the draw as expected. Apparently, they did not reconsider the whole tournament and instead just increased Federer's and Murray's winning probabilities, presumably because they expected many more punters to bet on a tournament win by one of the two players. In any case, this explanation for the increase in Federer's and Murray's expected abilities seems far more plausible than interpreting the results literally as an increase in their abilities. In the latter case, one would have to argue that Federer and Murray are so relieved by Nadal's drop-out that they even play stronger in matches against other players (such as Söderling). Furthermore, the changes in the abilities of Haas and Djokovic can be explained by a delayed reaction to the outcome of the Wimbledon warm-up tournament in Halle, where Haas beat Djokovic rather clearly (6-3 6-7(4) 6-1) in the final. Although this information had already been available at time W1, it appears to have entered the odds only at time W2, potentially due to a change in the punters' betting behaviour in the week between the tournaments of Halle and Wimbledon.

Effects of the Tournament Design
With the estimated abilities of the players, a measure adjusted for the tournament effects is now available, and we can determine the effects of different tournament designs by simulating the winning probabilities of all participants. A tennis tournament is typically a single elimination tournament, so each match plays an important role: a player with the ambition of winning the tournament cannot afford an off day. Furthermore, in a tennis tournament like Wimbledon a specific number of players are seeded.
To determine the effects of the tennis tournament design with its seeding, we compare three different designs: (1) a single elimination tournament with the original seeding and draw of Wimbledon 2009, (2) a single elimination tournament without seeding and with a random draw, and (3) a round-robin tournament, where each player plays every other player once. We use the estimated abilities of all 128 players of Wimbledon 2009 derived from the BCM (W2) and simulate their chances of winning the tournament according to the simulation approach described above (100,000 runs). For comparison, we transform the empirical probabilities into winning logits and compare them for the top ten players in Figure 2. The winning logits of the single elimination tournament with and without seeding do not differ much. Only some winning logits slightly increase (for a few of the weaker players) and some slightly decrease (e.g., Murray and Djokovic) if the single elimination tournament is played without seeding. Overall, however, these differences are minor, signalling that in the long run the seeding does not have a large effect on the tournament outcome. In contrast, if we consider a round-robin tournament, where 8128 matches instead of 127 have to be played, the winning probability of the player with the highest ability (here: Federer) increases strongly compared to the single elimination tournaments. The winning logits of all other players (except the second strongest player, Murray) decrease sharply. In general, we can conclude that a single elimination tournament is clearly more exciting than a round-robin tournament. Whereas in a round-robin tournament with 128 players each player has to play 127 matches, in a single elimination tournament even the finalists have to play only seven matches. Nevertheless, a round-robin tournament would be more favorable to top players.
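The match counts and the qualitative effect of the design can be checked with a small simulation. The sketch below compares an unseeded single elimination tournament with a random draw against a round-robin tournament; the eight-player field, the number of runs, and the random tie-break for first place in the round-robin are assumptions made for this illustration and are not taken from the study.

```python
import random
from itertools import combinations


def round_robin_matches(n):
    return n * (n - 1) // 2


def knockout_matches(n):
    return n - 1


def play(abilities, i, j, rng):
    """Winner of one Bradley-Terry match between players i and j."""
    return i if rng.random() < abilities[i] / (abilities[i] + abilities[j]) else j


def knockout_random_draw(abilities, rng):
    alive = list(range(len(abilities)))
    rng.shuffle(alive)  # unseeded: every draw is equally likely
    while len(alive) > 1:
        alive = [play(abilities, i, j, rng) for i, j in zip(alive[0::2], alive[1::2])]
    return alive[0]


def round_robin(abilities, rng):
    wins = [0] * len(abilities)
    for i, j in combinations(range(len(abilities)), 2):
        wins[play(abilities, i, j, rng)] += 1
    top = max(wins)
    # Assumption: ties for first place are broken at random.
    return rng.choice([k for k, w in enumerate(wins) if w == top])


def win_shares(abilities, design, runs, seed=0):
    rng = random.Random(seed)
    counts = [0] * len(abilities)
    for _ in range(runs):
        counts[design(abilities, rng)] += 1
    return [c / runs for c in counts]


print(knockout_matches(128), round_robin_matches(128))  # prints: 127 8128

# Hypothetical eight-player field with one clear favourite (player 0).
field = [4.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
ko = win_shares(field, knockout_random_draw, 5000)
rr = win_shares(field, round_robin, 5000)
print(round(ko[0], 3), round(rr[0], 3))
```

In this toy field the favourite wins the round-robin noticeably more often than the knockout, because many matches average out single upsets, which is the mechanism behind the pattern described for Figure 2.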

Conclusion
In this paper we investigate a strategy for estimating the expected abilities of the players in a tennis tournament (Wimbledon 2009) using bookmakers' expectancies for winning the tournament. A comparison of the estimated abilities for two datasets incorporating different information about the (expected) participants of the tournament shows that the bookmakers do not react appropriately to a rapid change in the tournament (here: Nadal's withdrawal): the estimated abilities of the main competitors (Federer and Murray) increase. We also investigate the effect of the tournament schedule on top players' chances of winning the tournament in a simulation study comparing three different tournament designs.

Computational Details
All computations were carried out in the R system (version 2.9.2) for statistical computing (R Development Core Team, 2009).

Figure 2: Winning probabilities of the top ten players simulated by three different tournament designs (single elimination tournament with seeding, single elimination tournament without seeding and a round-robin tournament) using the estimated abilities of all 128 participating players of Wimbledon 2009.

Table 1: Estimated winning probabilities $p_i$, their associated winning logits $\mathrm{logit}(p_i)$, estimated log-abilities $\log(\mathit{ability}_i)$, and associated simulated winning probabilities $\tilde{p}_i$ of the top ten participating players of Wimbledon 2009 and Nadal using their winning odds from 2009-06-16 (W1) and from 2009-06-22 (W2).

Table 2: Spearman's rank correlation between the actual tournament ranking and the rankings according to the estimated BCM winning probabilities, the seeding, and the ATP rating of the top ten participating players of Wimbledon 2009.

Table 3: Prediction of the last 16, 8, 4, 2, and the winner using the (log-)abilities, the seeding, and the ATP ranking of the 128 participating players of Wimbledon 2009.