:py:mod:`mlair.helpers.join`
============================

.. py:module:: mlair.helpers.join

.. autoapi-nested-parse::

   Functions to access the JOIN database.


Module Contents
---------------

Functions
~~~~~~~~~

.. autoapisummary::

   mlair.helpers.join.download_join
   mlair.helpers.join.correct_data_format
   mlair.helpers.join.get_data
   mlair.helpers.join.retries_session
   mlair.helpers.join.load_series_information
   mlair.helpers.join._select_distinct_series
   mlair.helpers.join._save_to_pandas
   mlair.helpers.join._correct_stat_name
   mlair.helpers.join._lower_list
   mlair.helpers.join.create_url


Attributes
~~~~~~~~~~

.. autoapisummary::

   mlair.helpers.join.__author__
   mlair.helpers.join.__date__
   mlair.helpers.join.str_or_none
   mlair.helpers.join.var_all_dic


.. py:data:: __author__
   :annotation: = Felix Kleinert, Lukas Leufen

.. py:data:: __date__
   :annotation: = 2019-10-16

.. py:data:: str_or_none

.. py:exception:: EmptyQueryResult

   Bases: :py:obj:`Exception`

   Exception raised if a query to JOIN returns empty results.


.. py:function:: download_join(station_name: Union[str, List[str]], stat_var: dict, station_type: str = None, network_name: str = None, sampling: str = 'daily', data_origin: Dict = None) -> [pandas.DataFrame, pandas.DataFrame]

   Read data from JOIN/TOAR.

   :param station_name: station name, e.g. DEBY122
   :param stat_var: dictionary with variables as keys (e.g. 'O3') and the statistic to apply to each variable as values (e.g. 'mean')
   :param station_type: station type like "traffic" or "background", can be None
   :param network_name: measurement network like "UBA" or "AIRBASE", can be None
   :param sampling: sampling rate of the downloaded data, either daily or hourly (default daily)
   :param data_origin: additional dictionary to specify the data origin as key (variable) and value (origin) pairs.
       Valid origins are "REA" for reanalysis data and "" (empty string) for observational data.

   :returns: data frame with all variables and statistics, and meta data frame with all meta information


.. py:function:: correct_data_format(data)

   Transform to the standard data format.

   In some cases (e.g. hourly data), the data is returned as a list instead of a dictionary with the keys datetime,
   values and metadata. This function addresses this issue and transforms the data into the dictionary version.

   :param data: data in hourly format

   :return: the same data, formatted to fit the aggregated format


.. py:function:: get_data(opts: Dict, headers: Dict) -> Union[Dict, List]

   Download JOIN data using the requests framework.

   Data is returned as a JSON-like structure. Depending on the response structure, this can be a list or a
   dictionary.

   :param opts: options to create the request url
   :param headers: additional header information like authorization, can be empty

   :return: requested data (either as list or dictionary)


.. py:function:: retries_session(max_retries=3)


.. py:function:: load_series_information(station_name: List[str], station_type: str_or_none, network_name: str_or_none, join_url_base: str, headers: Dict, data_origin: Dict = None) -> [Dict, Dict]

   List all series ids that are available for the given station id and network name.

   :param station_name: station name, e.g. DEBW107
   :param station_type: station type like "traffic" or "background"
   :param network_name: measurement network of the station like "UBA" or "AIRBASE"
   :param join_url_base: base url to download data from
   :param headers: additional header information like authorization, can be empty
   :param data_origin: additional information to select a distinct series, e.g. from reanalysis (REA) or from
       observation ("", empty string). This dictionary should contain a key for each variable and the origin
       information as value

   :return: all available series for the requested station, stored in a dictionary with the parameter name (variable)
       as key and the series id as value.


.. py:function:: _select_distinct_series(vars: List[Dict], data_origin: Dict = None) -> [Dict, Dict]

   Select distinct series ids for all variables. Also check if a parameter is from REA or not.


.. py:function:: _save_to_pandas(df: Union[pandas.DataFrame, None], data: dict, stat: str, var: str) -> pandas.DataFrame

   Save given data in a data frame.

   If the given data frame is not empty, the data is appended as a new column.

   :param df: data frame to append the new data to, can be None
   :param data: new data to append or format as data frame containing the keys 'datetime' and ''
   :param stat: extracted statistic to get values from data (e.g. 'mean', 'dma8eu')
   :param var: variable the data is from (e.g. 'o3')

   :return: newly created or concatenated data frame


.. py:function:: _correct_stat_name(stat: str) -> str

   Map the given statistic name to the new namespace defined by the mapping dict.

   Return the given name stat unchanged if it is not an element of the mapping namespace.

   :param stat: namespace from JOIN server

   :return: stat mapped to local namespace


.. py:function:: _lower_list(args: List[str]) -> Iterator[str]

   Lower all elements of the given list.

   :param args: list with string entries to lower

   :return: iterator that lowers all list entries


.. py:function:: create_url(base: str, service: str, **kwargs: Union[str, int, float, None]) -> str

   Create a request url from the given base url, service type and arbitrarily many additional keyword arguments.

   :param base: base url of the REST service
   :param service: service type, e.g. series, stats
   :param kwargs: keyword pairs for optional request specifications, e.g. 'statistics=maximum'

   :return: combined url as string


.. py:data:: var_all_dic
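
The description of ``create_url`` above (base url, service type and optional keyword pairs combined into one
string) can be illustrated with a minimal sketch. This is not the code shipped in ``mlair.helpers.join``: the name
``create_url_sketch``, the ``base/service/?key=value&...`` layout, the skipping of ``None`` values and the
``https://example.org/api`` address are all assumptions made for illustration only.

.. code-block:: python

   from typing import Union


   def create_url_sketch(base: str, service: str, **kwargs: Union[str, int, float, None]) -> str:
       """Hypothetical stand-in for create_url, not the MLAir implementation."""
       # Assumption: base and service are separated by exactly one slash.
       if not base.endswith("/"):
           base += "/"
       # Assumption: keyword arguments whose value is None are left out of the query string.
       query = "&".join(f"{key}={value}" for key, value in kwargs.items() if value is not None)
       return f"{base}{service}/?{query}"


   # Placeholder base url; 'statistics=maximum' is the keyword pair named in the parameter description.
   print(create_url_sketch("https://example.org/api", "stats", statistics="maximum", id=None))
   # prints: https://example.org/api/stats/?statistics=maximum

Dropping ``None`` values in such a sketch lets optional filters like ``station_type`` or ``network_name`` be passed
through unconditionally; whether the real function behaves the same way should be checked against its source.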