:py:mod:`mlair.helpers.join`
============================

.. py:module:: mlair.helpers.join

.. autoapi-nested-parse::

   Functions to access the JOIN database.


Module Contents
---------------


Functions
~~~~~~~~~

.. autoapisummary::

   mlair.helpers.join.download_join
   mlair.helpers.join.correct_data_format
   mlair.helpers.join.get_data
   mlair.helpers.join.retries_session
   mlair.helpers.join.load_series_information
   mlair.helpers.join._select_distinct_series
   mlair.helpers.join._save_to_pandas
   mlair.helpers.join._correct_stat_name
   mlair.helpers.join._lower_list
   mlair.helpers.join.create_url


Attributes
~~~~~~~~~~

.. autoapisummary::

   mlair.helpers.join.__author__
   mlair.helpers.join.__date__
   mlair.helpers.join.str_or_none
   mlair.helpers.join.var_all_dic


.. py:data:: __author__
   :annotation: = Felix Kleinert, Lukas Leufen

   
.. py:data:: __date__
   :annotation: = 2019-10-16

   
.. py:data:: str_or_none
   

.. py:exception:: EmptyQueryResult

   Bases: :py:obj:`Exception`

   Exception raised if a query to JOIN returns an empty result.


.. py:function:: download_join(station_name: Union[str, List[str]], stat_var: dict, station_type: str = None, network_name: str = None, sampling: str = 'daily', data_origin: Dict = None) -> [pandas.DataFrame, pandas.DataFrame]

   Read data from JOIN/TOAR.

   :param station_name: station name, e.g. DEBY122
   :param stat_var: dictionary with variables as keys (e.g. 'O3') and the statistic to extract for each variable as values (e.g. 'mean')
   :param station_type: station type like "traffic" or "background", can be None
   :param network_name: measurement network like "UBA" or "AIRBASE", can be None
   :param sampling: sampling rate of the downloaded data, either "daily" or "hourly" (default "daily")
   :param data_origin: optional dictionary that specifies the data origin as key (variable) / value (origin) pairs. Valid
       origins are "REA" for reanalysis data and "" (empty string) for observational data.

   :returns: data frame with all variables and statistics and meta data frame with all meta information
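
As a rough illustration of the two dictionary arguments, the sketch below builds a ``stat_var`` mapping and a matching ``data_origin`` mapping and checks that every origin is one of the two values named above. The helper ``check_data_origin`` and the example variables and statistics are assumptions made for this sketch; they are not part of ``mlair.helpers.join``.

```python
from typing import Dict, Optional

# Origins named in the parameter description: "REA" (reanalysis) and "" (observation).
VALID_ORIGINS = {"REA", ""}


def check_data_origin(stat_var: Dict[str, str], data_origin: Optional[Dict[str, str]]) -> None:
    """Hypothetical helper: raise ValueError on unknown variables or origins."""
    if data_origin is None:
        return
    for var, origin in data_origin.items():
        if var not in stat_var:
            raise ValueError(f"origin given for unknown variable {var!r}")
        if origin not in VALID_ORIGINS:
            raise ValueError(f"invalid origin {origin!r} for variable {var!r}")


# Example values (illustrative only).
stat_var = {"o3": "dma8eu", "temp": "maximum"}
data_origin = {"o3": "", "temp": "REA"}
check_data_origin(stat_var, data_origin)  # passes silently
```

Dictionaries of this shape could then be passed as ``stat_var`` and ``data_origin`` to ``download_join``.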


.. py:function:: correct_data_format(data)

   Transform to the standard data format.

   In some cases (e.g. hourly data), the data is returned as a list instead of a dictionary with the keys datetime, values
   and metadata. This function addresses this issue and transforms the data into the dictionary version.

   :param data: data in hourly format

   :return: the same data but formatted to fit with aggregated format
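
The transformation can be sketched as follows. The list layout used here (a sequence of ``[datetime, value]`` pairs followed by a metadata dictionary as the last element) is an assumption for illustration, as is the function name ``to_dict_format``; the actual response layout is not specified in this reference.

```python
from typing import Any, Dict, List


def to_dict_format(data: List[Any]) -> Dict[str, Any]:
    """Sketch: convert [[datetime, value], ..., metadata] into the dictionary layout."""
    # Assumption: the last list element holds the metadata.
    formatted: Dict[str, Any] = {"datetime": [], "values": [], "metadata": data[-1]}
    for timestamp, value in data[:-1]:
        formatted["datetime"].append(timestamp)
        formatted["values"].append(value)
    return formatted


# Made-up hourly response in the assumed list layout.
hourly = [["2019-01-01 00:00", 41.2], ["2019-01-01 01:00", 39.8], {"station_id": "DEBW107"}]
result = to_dict_format(hourly)
```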


.. py:function:: get_data(opts: Dict, headers: Dict) -> Union[Dict, List]

   Download JOIN data using the requests framework.

   Data is returned as a JSON-like structure. Depending on the response, this is either a list or a dictionary.

   :param opts: options to create the request url
   :param headers: additional header information such as authorization, can be empty

   :return: requested data (either as list or dictionary)


.. py:function:: retries_session(max_retries=3)


.. py:function:: load_series_information(station_name: List[str], station_type: str_or_none, network_name: str_or_none, join_url_base: str, headers: Dict, data_origin: Dict = None) -> [Dict, Dict]

   List all series ids that are available for the given station id and network name.

   :param station_name: station name, e.g. DEBW107
   :param station_type: station type like "traffic" or "background"
   :param network_name: measurement network of the station like "UBA" or "AIRBASE"
   :param join_url_base: base url to download data from
   :param headers: additional header information such as authorization, can be empty
   :param data_origin: additional information to select a distinct series, e.g. from reanalysis (REA) or from observation
       ("", empty string). This dictionary should contain a key for each variable and the origin as value
   :return: all available series for the requested station, stored in a dictionary with parameter name (variable) as key
       and the series id as value.


.. py:function:: _select_distinct_series(vars: List[Dict], data_origin: Dict = None) -> [Dict, Dict]

   Select distinct series ids for all variables. Also check if a parameter is from REA or not.


.. py:function:: _save_to_pandas(df: Union[pandas.DataFrame, None], data: dict, stat: str, var: str) -> pandas.DataFrame

   Save the given data in a data frame.

   If the given data frame is not empty, the data is appended as a new column.

   :param df: data frame to append the new data, can be none
   :param data: new data to append or format as data frame containing the keys 'datetime' and '<stat>'
   :param stat: extracted statistic to get values from data (e.g. 'mean', 'dma8eu')
   :param var: variable the data is from (e.g. 'o3')

   :return: newly created or concatenated data frame
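
A minimal sketch of this create-or-append pattern with pandas is shown below. The function name ``append_column`` and the sample values are assumptions, and the sketch treats ``None`` as the "no data frame yet" case; the real helper may parse datetimes or handle empty frames differently.

```python
from typing import Optional

import pandas as pd


def append_column(df: Optional[pd.DataFrame], data: dict, stat: str, var: str) -> pd.DataFrame:
    """Sketch: store data[stat] as column `var`, indexed by the parsed datetimes."""
    index = pd.to_datetime(data["datetime"])
    new = pd.DataFrame(data[stat], index=index, columns=[var])
    if df is None:
        return new
    # Concatenate along the column axis; rows are aligned on the datetime index.
    return pd.concat([df, new], axis=1)


# Made-up sample data with the keys 'datetime' and '<stat>'.
o3 = {"datetime": ["2019-01-01 00:00", "2019-01-02 00:00"], "dma8eu": [31.0, 28.5]}
temp = {"datetime": ["2019-01-01 00:00", "2019-01-02 00:00"], "maximum": [3.4, 5.1]}
df = append_column(None, o3, "dma8eu", "o3")
df = append_column(df, temp, "maximum", "temp")
```

Each call adds one column named after the variable, so the resulting frame holds one column per variable on a shared datetime index.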


.. py:function:: _correct_stat_name(stat: str) -> str

   Map the given statistic name to the new namespace defined by the mapping dict.

   Return the given name unchanged if it is not an element of the mapping namespace.

   :param stat: statistic name in the namespace of the JOIN server

   :return: stat mapped to local namespace
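
The lookup-with-fallback behaviour described above can be sketched with ``dict.get``. The mapping entries and the name ``map_stat_name`` are illustrative assumptions, not the table used by MLAir.

```python
# Illustrative mapping; the entries are assumptions, not MLAir's actual table.
STAT_MAPPING = {"average_values": "mean", "maximum": "max", "minimum": "min"}


def map_stat_name(stat: str) -> str:
    """Sketch: translate a server-side statistic name, or return it unchanged."""
    return STAT_MAPPING.get(stat, stat)


mapped = map_stat_name("maximum")      # found in the mapping
unchanged = map_stat_name("dma8eu")    # not in the mapping, returned as given
```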


.. py:function:: _lower_list(args: List[str]) -> Iterator[str]

   Convert all elements of the given list to lower case.

   :param args: list with string entries to lower

   :return: iterator that lowers all list entries
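
One way to obtain such an iterator is ``map`` with ``str.lower``, sketched below under the hypothetical name ``lower_all``. The entries are lower-cased lazily, and the iterator can be consumed only once.

```python
from typing import Iterator, List


def lower_all(args: List[str]) -> Iterator[str]:
    """Sketch: lazily yield the lower-cased version of each entry."""
    return map(str.lower, args)


# Materialize the iterator to inspect the result.
lowered = list(lower_all(["O3", "Temp", "NO2"]))
```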


.. py:function:: create_url(base: str, service: str, **kwargs: Union[str, int, float, None]) -> str

   Create a request url with given base url, service type and arbitrarily many additional keyword arguments.

   :param base: basic url of the rest service
   :param service: service type, e.g. series, stats
   :param kwargs: keyword pairs for optional request specifications, e.g. 'statistics=maximum'

   :return: combined url as string
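
The url assembly can be sketched as below. The function name ``build_url``, the ``base/service/?key=value&...`` layout, the skipping of ``None`` values, and the example host are assumptions for illustration; values are not percent-encoded in this sketch.

```python
from typing import Union


def build_url(base: str, service: str, **kwargs: Union[str, int, float, None]) -> str:
    """Sketch: join base url, service and query parameters into one request url."""
    if not base.endswith("/"):
        base += "/"
    # Assumption: parameters set to None are left out of the query string.
    query = "&".join(f"{key}={value}" for key, value in kwargs.items() if value is not None)
    return f"{base}{service}/?{query}"


url = build_url("https://example.org/rest", "stats", id=123, statistics="maximum", sampling=None)
```

With these inputs the sketch yields ``https://example.org/rest/stats/?id=123&statistics=maximum``.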


.. py:data:: var_all_dic