Understat
- class soccerdata.Understat(leagues=None, seasons=None, proxy=None, no_cache=False, no_store=False, data_dir=PosixPath('/home/docs/soccerdata/data/Understat'))
Provides pd.DataFrames from data at https://understat.com.
Data will be downloaded as necessary and cached locally in ~/soccerdata/data/Understat.
- Parameters:
proxy ('tor' or dict or list(dict) or callable, optional) – Use a proxy to hide your IP address. Valid options are:
"tor": Uses the Tor network. Tor should be running in the background on port 9050.
dict: A dictionary with the proxy to use. The dict should be a mapping of supported protocols to proxy addresses. For example:
{'http': 'http://10.10.1.10:3128', 'https': 'http://10.10.1.10:1080'}
list(dict): A list of proxies to choose from. A different proxy will be selected from this list after failed requests, allowing rotating proxies.
callable: A function that returns a valid proxy. This function will be called after failed requests, allowing rotating proxies.
no_cache (bool) – If True, will not use cached data.
no_store (bool) – If True, will not store downloaded data.
data_dir (Path) – Path to directory where data will be cached.
leagues (str | List[str] | None)
seasons (str | int | Iterable[str | int] | None)
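The callable form of proxy described above can be sketched as follows. This is a minimal illustration, not the library's own code: the proxy addresses are placeholders, the names PROXY_POOL, next_proxy and make_reader are hypothetical, and the league and season strings are example values that may need adjusting to what available_leagues() reports.

```python
from itertools import cycle

# Placeholder proxy addresses (assumptions for illustration only).
PROXY_POOL = [
    {"http": "http://10.10.1.10:3128", "https": "http://10.10.1.10:1080"},
    {"http": "http://10.10.1.11:3128", "https": "http://10.10.1.11:1080"},
]
_rotation = cycle(PROXY_POOL)


def next_proxy():
    """Return the next proxy mapping, wrapping around at the end of the pool."""
    return next(_rotation)


def make_reader():
    """Build an Understat reader that obtains its proxies from next_proxy."""
    import soccerdata as sd  # imported lazily; requires soccerdata to be installed

    return sd.Understat(
        leagues="ENG-Premier League",  # example league ID (assumption)
        seasons="2023/2024",           # example season (assumption)
        proxy=next_proxy,
    )
```

Passing PROXY_POOL directly as proxy should give similar rotation through the list(dict) option; the callable is mainly useful when the proxies come from an external source.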
- property seasons: List[str]
Return a list of selected seasons.
- read_leagues()
Retrieve the selected leagues from the datasource.
- Return type:
pd.DataFrame
- read_seasons()
Retrieve the selected seasons from the datasource.
- Return type:
pd.DataFrame
- read_schedule(include_matches_without_data=True, force_cache=False)
Retrieve the matches for the selected leagues and seasons.
- Parameters:
include_matches_without_data (bool) – By default matches with and without data are returned. If False, will only return matches with data.
force_cache (bool) – By default no cached data is used for the current season. If True, will force the use of cached data anyway.
- Return type:
pd.DataFrame
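The force_cache rule can be read as a small decision function. The sketch below is only a model of the documented behaviour, not the library's implementation: the name use_cached_data is hypothetical, it assumes that completed seasons are served from the cache by default, and it ignores the class-level no_cache setting.

```python
def use_cached_data(is_current_season: bool, force_cache: bool = False) -> bool:
    """Model of the documented rule for whether cached data is used.

    Assumption: data for completed seasons is read from the cache, while the
    current season is re-downloaded unless force_cache is True.
    """
    return force_cache or not is_current_season
```

Under this reading, force_cache=True only changes the outcome for the season that is still in progress.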
- read_team_match_stats(force_cache=False)
Retrieve the team match stats for the selected leagues and seasons.
- Parameters:
force_cache (bool) – By default no cached data is used for the current season. If True, will force the use of cached data anyway.
- Return type:
pd.DataFrame
- read_player_season_stats(force_cache=False)
Retrieve the player season stats for the selected leagues and seasons.
- Parameters:
force_cache (bool) – By default no cached data is used for the current season. If True, will force the use of cached data anyway.
- Return type:
pd.DataFrame
- read_player_match_stats(match_id=None)
Retrieve the player match stats for the selected leagues and seasons.
- Parameters:
match_id (int or list of int, optional) – Retrieve the player match stats for one or more specific matches.
- Raises:
ValueError – If the given match_id could not be found in the selected seasons.
- Return type:
pd.DataFrame
- read_shot_events(match_id=None)
Retrieve the shot events for the selected matches or the selected leagues and seasons.
- Parameters:
match_id (int or list of int, optional) – Retrieve the shot events for one or more specific matches.
- Raises:
ValueError – If the given match_id could not be found in the selected seasons.
- Return type:
pd.DataFrame
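Because read_player_match_stats and read_shot_events raise ValueError for an unknown match_id, a caller looping over many IDs may want to skip the missing ones. The helper below is a hypothetical wrapper, not part of soccerdata; it relies only on the documented match_id keyword and the documented exception.

```python
from typing import Any, Callable, List, Union


def read_or_none(reader: Callable[..., Any], match_id: Union[int, List[int]]) -> Any:
    """Call a reader method with match_id; return None if the ID is not found.

    `reader` is expected to be a bound method such as read_shot_events.
    """
    try:
        return reader(match_id=match_id)
    except ValueError as err:
        # Documented failure mode: match_id is not in the selected seasons.
        print(f"Skipping match_id={match_id}: {err}")
        return None
```

Given an Understat instance named understat (an assumed variable name), a call might look like read_or_none(understat.read_shot_events, 12345), where 12345 is a placeholder ID.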
- classmethod available_leagues()
Return a list of league IDs available for this source.
- Return type:
List[str]
- get(url, filepath=None, max_age=None, no_cache=False, var=None)
Load data from url.
By default, the source of url is downloaded and saved to filepath. If filepath exists, the url is not visited and the cached data is returned.
- Parameters:
url (str) – URL to download.
filepath (Path, optional) – Path to save downloaded file. If None, downloaded data is not cached.
max_age (int for age in days, or timedelta object) – Maximum age of the locally cached file before it is downloaded again.
no_cache (bool) – If True, will not use cached data. Overrides the class property.
var (str or list of str, optional) – Return a JavaScript variable instead of the page source.
- Raises:
TypeError – If max_age is not an integer or timedelta object.
- Returns:
File-like object of downloaded data.
- Return type:
io.BufferedIOBase
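The max_age contract (an int meaning days, or a timedelta, with TypeError otherwise) can be illustrated with a small freshness check. This is a sketch under stated assumptions rather than the code behind get(): to_timedelta and is_stale are hypothetical names, file age is taken from the modification time, and rejecting bool values is a choice made here.

```python
from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional, Union


def to_timedelta(max_age: Union[int, timedelta, None]) -> Optional[timedelta]:
    """Normalize max_age: an int is interpreted as days, a timedelta is kept."""
    if max_age is None:
        return None
    if isinstance(max_age, timedelta):
        return max_age
    # bool is a subclass of int; it is rejected here by choice (assumption).
    if isinstance(max_age, int) and not isinstance(max_age, bool):
        return timedelta(days=max_age)
    raise TypeError("max_age must be an int (days) or a timedelta")


def is_stale(filepath: Path, max_age: Union[int, timedelta, None]) -> bool:
    """Return True if the cached file is missing or older than max_age."""
    limit = to_timedelta(max_age)
    if not filepath.exists():
        return True
    if limit is None:
        return False  # no age limit: an existing file is always fresh
    modified = datetime.fromtimestamp(filepath.stat().st_mtime)
    return datetime.now() - modified > limit
```

With this model, max_age=7 and max_age=timedelta(days=7) are treated identically.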
- property leagues: List[str]
Return a list of selected leagues.