ESPN

class soccerdata.ESPN(leagues=None, seasons=None, proxy=None, no_cache=False, no_store=False, data_dir=PosixPath('/home/docs/soccerdata/data/ESPN'))

Provides pd.DataFrames from the JSON API available at http://site.api.espn.com.

Data will be downloaded as necessary and cached locally in ~/soccerdata/data/ESPN.

Parameters:
  • leagues (string or iterable, optional) – IDs of leagues to include.

  • seasons (string, int or list, optional) – Seasons to include. Supports multiple formats. Examples: ‘16-17’; 2016; ‘2016-17’; [14, 15, 16]

  • proxy ('tor' or dict or list(dict) or callable, optional) –

    Use a proxy to hide your IP address. Valid options are:
    • "tor": Uses the Tor network. Tor should be running in the background on port 9050.

    • dict: A dictionary with the proxy to use. The dict should be a mapping of supported protocols to proxy addresses. For example:

      {
          'http': 'http://10.10.1.10:3128',
          'https': 'http://10.10.1.10:1080',
      }
      
    • list(dict): A list of proxies to choose from. A different proxy will be selected from this list after failed requests, allowing rotating proxies.

    • callable: A function that returns a valid proxy. This function will be called after failed requests, allowing rotating proxies.

  • no_cache (bool) – If True, will not use cached data.

  • no_store (bool) – If True, will not store downloaded data.

  • data_dir (Path) – Path to directory where data will be cached.
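The callable form of proxy described above can be sketched as follows. This is a minimal illustration, not part of soccerdata: the proxy addresses, the PROXIES list, and the next_proxy name are all hypothetical. It only shows a function that hands out a different proxy mapping on each call.

```python
import itertools

# Hypothetical proxy addresses, used only for illustration.
PROXIES = [
    {"http": "http://10.10.1.10:3128", "https": "http://10.10.1.10:1080"},
    {"http": "http://10.10.1.11:3128", "https": "http://10.10.1.11:1080"},
]

# Endless iterator over the configured proxies.
_proxy_cycle = itertools.cycle(PROXIES)


def next_proxy():
    """Return the next proxy mapping, wrapping around at the end of the list."""
    return next(_proxy_cycle)
```

Such a function could then be passed as the proxy argument, for example soccerdata.ESPN(proxy=next_proxy), so that each failed request moves on to the next entry.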

property seasons: list[str]

Return a list of selected seasons.

read_schedule(force_cache=False)

Retrieve the game schedule for the selected leagues and seasons.

Parameters:

force_cache (bool) – By default, cached data is not used for the current season. If True, cached data is used anyway.

Return type:

pd.DataFrame

read_matchsheet(match_id=None)

Retrieve match sheets for the selected leagues and seasons.

Parameters:

match_id (int or list of int, optional) – Retrieve the match sheet for a specific game.

Raises:

ValueError – If no games with the given IDs were found for the selected seasons and leagues.

Return type:

pd.DataFrame
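Because read_matchsheet raises ValueError when an ID is not found, a caller working through many IDs may want to record the misses instead of stopping at the first one. The helper below is a sketch under that assumption; collect_matchsheets is a hypothetical name, and reader stands for any object exposing read_matchsheet(match_id=...), such as an ESPN instance.

```python
def collect_matchsheets(reader, match_ids):
    """Fetch match sheets one ID at a time, skipping IDs that raise ValueError.

    Returns a (found, missing) pair: a dict mapping each match ID to its
    result, and a list of the IDs for which no game was found.
    """
    found, missing = {}, []
    for match_id in match_ids:
        try:
            found[match_id] = reader.read_matchsheet(match_id=match_id)
        except ValueError:
            # Documented failure mode: no game with this ID in the selection.
            missing.append(match_id)
    return found, missing
```

The same pattern applies to any reader method that documents a ValueError for unknown IDs.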

read_lineup(match_id=None)

Retrieve lineups for the selected leagues and seasons.

Parameters:

match_id (int or list of int, optional) – Retrieve the lineup for a specific game.

Raises:

ValueError – If no games with the given IDs were found for the selected seasons and leagues.

Return type:

pd.DataFrame

classmethod available_leagues()

Return a list of league IDs available for this source.

Return type:

list[str]
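Since available_leagues is a classmethod, it can be called before any instance is created, for example to check requested league IDs up front. The sketch below assumes nothing beyond that documented method; unknown_leagues is a hypothetical helper, and source_cls is any class with an available_leagues() classmethod.

```python
def unknown_leagues(source_cls, wanted):
    """Return the requested league IDs that source_cls does not list.

    The order of `wanted` is preserved in the result.
    """
    available = set(source_cls.available_leagues())
    return [league for league in wanted if league not in available]
```

An empty result means every requested ID is offered by the source.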

get(url, filepath=None, max_age=None, no_cache=False, var=None)

Load data from url.

By default, the content at url is downloaded and saved to filepath. If filepath already exists, url is not requested and the cached data is returned instead.

Parameters:
  • url (str) – URL to download.

  • filepath (Path, optional) – Path to save downloaded file. If None, downloaded data is not cached.

  • max_age (int for age in days, or timedelta object) – Maximum age of the locally cached file before it is downloaded again.

  • no_cache (bool) – If True, will not use cached data. Overrides the class property.

  • var (str or list of str, optional) – Return a JavaScript variable instead of the page source.

Raises:

TypeError – If max_age is not an integer or timedelta object.

Returns:

File-like object of downloaded data.

Return type:

io.BufferedIOBase
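The max_age rule (an int counted in days or a timedelta, with TypeError for anything else) can be illustrated with a small freshness check. This is a sketch of the documented behaviour, not soccerdata's implementation; as_timedelta and is_stale are hypothetical names, and rejecting bool values is an assumption made here.

```python
from datetime import datetime, timedelta
from pathlib import Path


def as_timedelta(max_age):
    """Normalize max_age to a timedelta, treating an int as a number of days."""
    if isinstance(max_age, timedelta):
        return max_age
    # bool is a subclass of int; this sketch chooses to reject it explicitly.
    if isinstance(max_age, int) and not isinstance(max_age, bool):
        return timedelta(days=max_age)
    raise TypeError("max_age must be an int (days) or a timedelta")


def is_stale(filepath, max_age, now=None):
    """Return True if filepath is missing or older than max_age."""
    path = Path(filepath)
    if not path.exists():
        return True
    if max_age is None:
        # No age limit: an existing file is always considered fresh.
        return False
    now = now or datetime.now()
    modified = datetime.fromtimestamp(path.stat().st_mtime)
    return now - modified > as_timedelta(max_age)
```

A caller would re-download only when is_stale returns True and otherwise read the cached file.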

property leagues: list[str]

Return a list of selected leagues.