The raw data behind the stories "Club Soccer Predictions" https://projects.fivethirtyeight.com/soccer-predictions/ and "Global Club Soccer Rankings" https://projects.fivethirtyeight.com/global-club-soccer-rankings/.

spi_matches

Format

The dataset is a data frame with 34,109 rows representing soccer matches and 22 variables:

date

The date on which the match took place.

league_id

A numerical identifier of the league within which the match was played.

league

League name.

team1

One team that participated in the match.

team2

The other team that participated in the match.

spi1

The Soccer Power Index (SPI) score of team1.

spi2

The SPI score of team2.

prob1

The forecast probability that team1 would win the match.

prob2

The forecast probability that team2 would win the match.

probtie

The forecast probability that the match would end in a tie.

proj_score1

The predicted number of goals that team1 would score.

proj_score2

The predicted number of goals that team2 would score.

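importance1

A measure of how much the outcome of the match would change team1's statistical outlook on the season.

importance2

A measure of how much the outcome of the match would change team2's statistical outlook on the season.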

score1

The number of goals that team1 scored.

score2

The number of goals that team2 scored.

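xg1

The shot-based expected goals for team1.

xg2

The shot-based expected goals for team2.

nsxg1

The non-shot expected goals for team1.

nsxg2

The non-shot expected goals for team2.

adj_score1

The adjusted number of goals that team1 scored.

adj_score2

The adjusted number of goals that team2 scored.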
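
As a rough illustration of how the forecast and result columns relate, the Python sketch below picks the most likely outcome from prob1, prob2 and probtie and compares it with the outcome implied by score1 and score2. The rows are made-up values, not taken from the dataset, and the helper names (favored_outcome, actual_outcome, probs_sum_to_one) are hypothetical. It assumes the three probabilities sum to roughly 1 and that both scores are present; rows without scores are not handled.

```python
import math

# Hypothetical rows using the documented column names; the values are made up.
matches = [
    {"team1": "Team A", "team2": "Team B",
     "prob1": 0.55, "prob2": 0.20, "probtie": 0.25,
     "score1": 2, "score2": 1},
    {"team1": "Team C", "team2": "Team D",
     "prob1": 0.30, "prob2": 0.42, "probtie": 0.28,
     "score1": 1, "score2": 1},
]


def favored_outcome(row):
    """Return the outcome label with the highest forecast probability."""
    probs = {"team1": row["prob1"], "team2": row["prob2"], "tie": row["probtie"]}
    return max(probs, key=probs.get)


def actual_outcome(row):
    """Return the observed outcome label implied by the final scores."""
    if row["score1"] > row["score2"]:
        return "team1"
    if row["score1"] < row["score2"]:
        return "team2"
    return "tie"


def probs_sum_to_one(row, tol=0.01):
    """Check that the three outcome probabilities sum to about 1 (assumed)."""
    total = row["prob1"] + row["prob2"] + row["probtie"]
    return math.isclose(total, 1.0, abs_tol=tol)


for row in matches:
    print(row["team1"], "vs", row["team2"],
          "| favored:", favored_outcome(row),
          "| actual:", actual_outcome(row),
          "| probabilities consistent:", probs_sum_to_one(row))
```

The tolerance in probs_sum_to_one is an arbitrary choice meant to absorb rounding in the published probabilities.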

Source

See https://github.com/fivethirtyeight/data/blob/master/soccer-spi/README.md

See also