mlair.helpers.data_sources.data_loader

Module Contents

Functions

download_data(file_name: str, meta_file: str, station, statistics_per_var, sampling, store_data_locally=True, data_origin: Dict = None, time_dim=DEFAULT_TIME_DIM, target_dim=DEFAULT_TARGET_DIM, iter_dim=DEFAULT_ITER_DIM, window_dim=DEFAULT_WINDOW_DIM, era5_data_path=None, era5_file_names=None, ifs_data_path=None, ifs_file_names=None) → [xarray.DataArray, pandas.DataFrame]

Download data from TOAR database using the JOIN interface or load local ERA5 data.

get_data_with_query(opts: Dict, headers: Dict, as_json: bool = True, max_retries=5, timeout_base=60) → bytes

Download data from the statistics REST API. This API is based on three steps: (1) post query and retrieve job id, (2) read status of id until finished, (3) download data with job id.

get_data(opts: Dict, headers: Dict, as_json: bool = True, max_retries=5, timeout_base=60) → Union[Dict, List, str]

Download JOIN data using the requests framework.

correct_stat_name(stat: str) → str

Map given statistic name to new namespace defined by mapping dict.

create_url(base: str, service: str, param_id: Union[str, int, None] = None, **kwargs: Union[str, int, float, None]) → str

Create a request url with given base url, service type and arbitrarily many additional keyword arguments.

retries_session(max_retries=5)

Attributes

__author__

__date__

DEFAULT_TIME_DIM

DEFAULT_TARGET_DIM

DEFAULT_ITER_DIM

DEFAULT_WINDOW_DIM

mlair.helpers.data_sources.data_loader.__author__ = Lukas Leufen
mlair.helpers.data_sources.data_loader.__date__ = 2023-06-01
mlair.helpers.data_sources.data_loader.DEFAULT_TIME_DIM = datetime
mlair.helpers.data_sources.data_loader.DEFAULT_TARGET_DIM = variables
mlair.helpers.data_sources.data_loader.DEFAULT_ITER_DIM = Stations
mlair.helpers.data_sources.data_loader.DEFAULT_WINDOW_DIM = window
mlair.helpers.data_sources.data_loader.download_data(file_name: str, meta_file: str, station, statistics_per_var, sampling, store_data_locally=True, data_origin: Dict = None, time_dim=DEFAULT_TIME_DIM, target_dim=DEFAULT_TARGET_DIM, iter_dim=DEFAULT_ITER_DIM, window_dim=DEFAULT_WINDOW_DIM, era5_data_path=None, era5_file_names=None, ifs_data_path=None, ifs_file_names=None) → [xarray.DataArray, pandas.DataFrame]

Download data from TOAR database using the JOIN interface or load local ERA5 data.

Data is transformed to an xarray dataset. If the parameter store_data_locally is true, data is additionally stored locally using the given names for file and meta file.

Parameters
  • file_name – name of file to save data to (containing full path)

  • meta_file – name of the meta data file (also containing full path)

Returns

downloaded data and its meta data

exception mlair.helpers.data_sources.data_loader.EmptyQueryResult

Bases: Exception

Exception that is raised if a query to JOIN returns empty results.

mlair.helpers.data_sources.data_loader.get_data_with_query(opts: Dict, headers: Dict, as_json: bool = True, max_retries=5, timeout_base=60) → bytes

Download data from the statistics REST API. This API is based on three steps: (1) post query and retrieve job id, (2) read status of id until finished, (3) download data with job id.
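The three steps above can be sketched as a generic polling loop. This is a minimal illustration, not the implementation used by get_data_with_query: the function name run_query_job, the three injected callables, the status string "finished", and the poll_interval and max_polls parameters are all assumptions made for this example. The network calls are replaced by stand-in lambdas so the block runs offline.

```python
import time
from typing import Callable


def run_query_job(post_query: Callable[[], str],
                  read_status: Callable[[str], str],
                  download: Callable[[str], bytes],
                  poll_interval: float = 1.0,
                  max_polls: int = 60) -> bytes:
    """Hypothetical helper showing the post / poll / download pattern."""
    # Step 1: post the query and keep the job id that comes back.
    job_id = post_query()
    # Step 2: read the job status until it reports completion.
    for _ in range(max_polls):
        # Assumption: the service reports the literal status "finished".
        if read_status(job_id) == "finished":
            # Step 3: download the result that belongs to this job id.
            return download(job_id)
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")


# Stand-ins for the real HTTP calls, so the sketch runs without a server.
statuses = iter(["pending", "running", "finished"])
result = run_query_job(
    post_query=lambda: "job-1",
    read_status=lambda job_id: next(statuses),
    download=lambda job_id: b"payload for " + job_id.encode(),
    poll_interval=0.0,
)
```

Passing the three steps in as callables keeps the control flow separate from the transport layer, which makes the loop easy to test in isolation.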

mlair.helpers.data_sources.data_loader.get_data(opts: Dict, headers: Dict, as_json: bool = True, max_retries=5, timeout_base=60) → Union[Dict, List, str]

Download JOIN data using the requests framework.

Data is returned as a JSON-like structure. Depending on the response structure, this can be a list or a dictionary.

Parameters
  • opts – options to create the request url

  • headers – additional header information such as authorization; can be empty

  • as_json – extract response as json if true (default True)

Returns

requested data (either as list or dictionary)

mlair.helpers.data_sources.data_loader.correct_stat_name(stat: str) → str

Map given statistic name to new namespace defined by mapping dict.

Return the given name stat unchanged if it is not an element of the mapping namespace.

Parameters

stat – statistic name in the namespace of the JOIN server

Returns

stat mapped to local namespace
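The behaviour described above (map known names, pass unknown names through unchanged) corresponds to a dictionary lookup with a fallback. The sketch below is illustrative only: the function name correct_stat_name_sketch and the entries of EXAMPLE_MAPPING are placeholders invented for this example, not the mapping that MLAir actually uses.

```python
# Placeholder mapping; the real mapping dict lives inside MLAir and is
# not reproduced here.
EXAMPLE_MAPPING = {
    "remote_name_a": "local_name_a",
    "remote_name_b": "local_name_b",
}


def correct_stat_name_sketch(stat: str) -> str:
    """Return the mapped name, or the input itself if it is not mapped."""
    # dict.get with a default implements the "return given name" fallback.
    return EXAMPLE_MAPPING.get(stat, stat)


mapped = correct_stat_name_sketch("remote_name_a")
unmapped = correct_stat_name_sketch("not_in_mapping")
```

Here mapped holds the translated name, while unmapped is returned exactly as it was passed in.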

mlair.helpers.data_sources.data_loader.create_url(base: str, service: str, param_id: Union[str, int, None] = None, **kwargs: Union[str, int, float, None]) → str

Create a request url with given base url, service type and arbitrarily many additional keyword arguments.

Parameters
  • base – base URL of the REST service

  • service – service type, e.g. series, stats

  • param_id – id for a distinct service, is added between ending / of service and ? of kwargs

  • kwargs – keyword pairs for optional request specifications, e.g. ‘statistics=maximum’

Returns

combined url as string
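To make the parameter description concrete, the following sketch rebuilds a URL of the described shape with the standard library. It is an assumption-laden illustration rather than the code of create_url itself: the name create_url_sketch, the handling of slashes, the decision to drop keyword arguments whose value is None, and the example.org address are all choices made for this example.

```python
from typing import Union
from urllib.parse import urlencode


def create_url_sketch(base: str, service: str,
                      param_id: Union[str, int, None] = None,
                      **kwargs: Union[str, int, float, None]) -> str:
    """Illustrative URL builder following the parameter description above."""
    # Join base and service with exactly one slash and end with a slash.
    url = f"{base.rstrip('/')}/{service.strip('/')}/"
    # The optional id goes between the trailing slash and the "?".
    if param_id is not None:
        url += str(param_id)
    # Assumption: keyword arguments set to None are left out of the query.
    query = {key: value for key, value in kwargs.items() if value is not None}
    if query:
        url += "?" + urlencode(query)
    return url


stats_url = create_url_sketch("https://example.org/api/", "stats",
                              statistics="maximum", sampling=None)
series_url = create_url_sketch("https://example.org/api", "series", param_id=42)
```

With these inputs, stats_url ends in stats/?statistics=maximum and series_url ends in series/42.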

mlair.helpers.data_sources.data_loader.retries_session(max_retries=5)