:py:mod:`mlair.plotting.data_insight_plotting`
==============================================

.. py:module:: mlair.plotting.data_insight_plotting

.. autoapi-nested-parse::

   Collection of plots to get more insight into data.



Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   mlair.plotting.data_insight_plotting.PlotStationMap
   mlair.plotting.data_insight_plotting.PlotAvailability
   mlair.plotting.data_insight_plotting.PlotAvailabilityHistogram
   mlair.plotting.data_insight_plotting.PlotDataMonthlyDistribution
   mlair.plotting.data_insight_plotting.PlotDataHistogram
   mlair.plotting.data_insight_plotting.PlotPeriodogram
   mlair.plotting.data_insight_plotting.PlotClimateFirFilter
   mlair.plotting.data_insight_plotting.PlotFirFilter



Functions
~~~~~~~~~

.. autoapisummary::

   mlair.plotting.data_insight_plotting.f_proc
   mlair.plotting.data_insight_plotting.f_proc_2
   mlair.plotting.data_insight_plotting.f_proc_hist



Attributes
~~~~~~~~~~

.. autoapisummary::

   mlair.plotting.data_insight_plotting.__author__
   mlair.plotting.data_insight_plotting.__date__


.. py:data:: __author__
   :annotation: = Lukas Leufen, Felix Kleinert

.. py:data:: __date__
   :annotation: = 2021-04-13

.. py:class:: PlotStationMap(generators: List, plot_folder: str = '.', plot_name='station_map')

   Bases: :py:obj:`mlair.plotting.abstract_plot_class.AbstractPlotClass`

   Plot geographical overview of all used stations as squares.

   Each data set can be colorised by its key in the input dictionary generators. The key represents the color to
   plot on the map. Currently, there is only a white background, but this can be adjusted by loading locally stored
   topography data (not implemented yet). The plot is saved under plot_path with the name station_map.pdf.

   .. image:: ../../../../../_source/_plots/station_map.png
      :width: 400

   .. py:method:: _draw_background(self)

      Draw coastline, lakes, ocean, rivers and country borders as background on the map.


   .. py:method:: _plot_stations(self, generators)

      Loop over all keys in the generators dict and the stations they contain, and plot each station's position.

      Each position is marked by a square on the map in the given color.

      :param generators: dictionary with the plot color of each data set as key and the generator containing all
          stations as value.


   .. py:method:: _adjust_marker(marker)
      :staticmethod:


   .. py:method:: _get_collection_and_opts(element)
      :staticmethod:


   .. py:method:: _plot(self, generators: List)

      Create the station map plot.

      Set figure and call all required sub-methods.

      :param generators: dictionary with the plot color of each data set as key and the generator containing all
          stations as value.


   .. py:method:: _adjust_extent(self)



.. py:class:: PlotAvailability(generators: Dict[str, mlair.data_handler.DataCollection], plot_folder: str = '.', sampling='daily', summary_name='data availability', time_dimension='datetime', window_dimension='window')

   Bases: :py:obj:`mlair.plotting.abstract_plot_class.AbstractPlotClass`

   Create data availability plot similar to a Gantt plot.

   Each entry of a given generator results in a new line in the plot. Data is summarised for the given temporal
   resolution and checked for each time step whether data is available or not. The result is shown as a colored bar
   (data available) or a blank space (no data).

   You can set different colors to highlight subsets, for example by providing different generators for the same
   index using different keys in the input dictionary.

   Note: each bar is surrounded by a small white box to highlight gaps in between. As a result, a very short gap can
   appear longer in the display than it actually is. The same effect appears on a (fluent) transition from one subset
   to another.

   Calling this class will create three versions of the availability plot.

   1) Data availability for each element
   2) Data availability as summary over all elements (is there at least a single element for each time step)
   3) Combination of single and overall availability

   .. image:: ../../../../../_source/_plots/data_availability.png
      :width: 400

   .. image:: ../../../../../_source/_plots/data_availability_summary.png
      :width: 400

   .. image:: ../../../../../_source/_plots/data_availability_combined.png
      :width: 400

   .. py:method:: _prepare_data(self, generators: Dict[str, mlair.data_handler.DataCollection])


   .. py:method:: _summarise_data(self, generators: Dict[str, mlair.data_handler.DataCollection], summary_name: str)


   .. py:method:: _plot(self, plt_dict)

      Abstract plot class needs to be implemented in inheritance.



.. py:class:: PlotAvailabilityHistogram(generators: Dict[str, mlair.data_handler.DataCollection], plot_folder: str = '.', subset_dim: str = 'DataSet', history_dim: str = 'window', station_dim: str = 'Stations')

   Bases: :py:obj:`mlair.plotting.abstract_plot_class.AbstractPlotClass`

   Create data availability plots as histogram.

   Each entry of each generator is checked for `notnull()` values along the entire datetime axis (boolean).
   Calling this class creates two different types of histograms:

   1) data_availability_histogram: datetime (xaxis) vs. number of stations with available data (yaxis)
   2) data_availability_histogram_cumulative: number of samples (xaxis) vs. number of stations having at least
      that number of samples (yaxis)

   .. image:: ../../../../../_source/_plots/data_availability_histogram_hist.png
      :width: 400

   .. image:: ../../../../../_source/_plots/data_availability_histogram_hist_cum.png
      :width: 400

   .. py:method:: _set_dims_from_datahandler(self, data_handler)


   .. py:method:: allowed_plot_types(self)
      :property:


   .. py:method:: _prepare_data(self, generators: Dict[str, mlair.data_handler.DataCollection])

      Prepare data to be used by plot methods.

      Creates xarrays which are sums of valid data (boolean sums) across i) station_dim and ii) temporal_dim.


   .. py:method:: _reduce_dims(self, dataset)


   .. py:method:: _get_first_and_last_indexelement_from_xarray(xarray, dim_name, return_type='as_tuple')
      :staticmethod:


   .. py:method:: _make_full_time_index(irregular_time_index, freq)
      :staticmethod:


   .. py:method:: _plot(self, plt_type='hist', *args)

      Abstract plot class needs to be implemented in inheritance.


   .. py:method:: _plot_hist(self, *args)


   .. py:method:: _plot_hist_cum(self, *args)



.. py:class:: PlotDataMonthlyDistribution(generators: Dict[str, mlair.data_handler.DataCollection], plot_folder: str = '.', variables_dim='variables', time_dim='datetime', window_dim='window', target_var: str = '', target_var_unit: str = 'ppb')

   Bases: :py:obj:`mlair.plotting.abstract_plot_class.AbstractPlotClass`

   Abstract class for all plotting routines to unify plot workflow.

   Each inheritance requires a _plot method. Create a plot class like:

   .. code-block:: python

       from mlair.plotting.abstract_plot_class import AbstractPlotClass

       class MyCustomPlot(AbstractPlotClass):

           def __init__(self, plot_folder, *args, **kwargs):
               super().__init__(plot_folder, "custom_plot_name")
               self._data = self._prepare_data(*args, **kwargs)
               self._plot(*args, **kwargs)
               self._save()

           def _prepare_data(self, *args, **kwargs):
               data = ...  # placeholder: your custom data preparation
               return data

           def _plot(self, *args, **kwargs):
               ...  # placeholder: your custom plotting without saving

   The save method is already implemented in the AbstractPlotClass. If special saving is required (e.g. if you are
   using pdfpages), you need to overwrite it. Plots are saved as .pdf with a resolution of 500dpi by default (can be
   set in super class initialisation).

   Methods like the shown _prepare_data() are optional. The only method required to implement is _plot.

   If you want to add a time tracking module, just add the TimeTrackingWrapper as decorator around your custom plot
   class. It will log the spent time if you call your plotting without saving the returned object.

   .. code-block:: python

       @TimeTrackingWrapper
       class MyCustomPlot(AbstractPlotClass):
           pass

   Let's assume it takes a while to create this very special plot.

   >>> MyCustomPlot()
   INFO: MyCustomPlot finished after 00:00:11 (hh:mm:ss)

   .. py:method:: _prepare_data(self, generators) -> List[xarray.DataArray]

      Preprocess data required to plot.

      :param generators: data
      :return: The entire data set, flagged with the corresponding month.


   .. py:method:: _spell_out_chemical_concentrations(short_name: str, add_concentration: bool = False)
      :staticmethod:


   .. py:method:: _plot(self, target_var: str, target_var_unit: str)

      Create a monthly grouped box plot over all stations but with separate boxes for each lead time step.

      :param target_var: display name of the target variable on plot's axis



.. py:class:: PlotDataHistogram(generators: Dict[str, mlair.data_handler.DataCollection], plot_folder: str = '.', plot_name='histogram', variables_dim='variables', time_dim='datetime', window_dim='window', upsampling=False)

   Bases: :py:obj:`mlair.plotting.abstract_plot_class.AbstractPlotClass`

   Plot histogram on transformed input and target data.

   This data is the same that the model sees during training. No plots are created for the original value space
   (raw / unformatted data). This plot method creates a histogram for input and target each, comparing the subsets
   train, val and test, as well as a distinct histogram for each of the three subsets.

   .. image:: ../../../../../_source/_plots/datahistogram.png
      :width: 400

   .. py:method:: _handle_upsampling(generators)
      :staticmethod:


   .. py:method:: _get_inputs_targets(gens, dim)
      :staticmethod:


   .. py:method:: _calculate_hist(self, generators, variables, input_data=True, branch_pos=0)


   .. py:method:: _plot(self, add_name, subset)

      Abstract plot class needs to be implemented in inheritance.


   .. py:method:: _plot_combined(self, add_name)



.. py:class:: PlotPeriodogram(generator: Dict[str, mlair.data_handler.DataCollection], plot_folder: str = '.', plot_name='periodogram', variables_dim='variables', time_dim='datetime', sampling='daily', use_multiprocessing=False)

   Bases: :py:obj:`mlair.plotting.abstract_plot_class.AbstractPlotClass`

   Create Lomb-Scargle periodogram of raw input and target data. The Lomb-Scargle version can deal with missing
   values.
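
   To illustrate why a Lomb-Scargle periodogram tolerates gaps, the following minimal sketch estimates the dominant
   period of a series after dropping its missing samples. It uses ``scipy.signal.lombscargle``, which is an
   assumption made for this example and not necessarily the routine used by ``PlotPeriodogram``. The synthetic
   30-day signal, the number of removed samples and the period grid are hypothetical as well.

   .. code-block:: python

       import numpy as np
       from scipy.signal import lombscargle

       # Hypothetical daily series with a 30-day cycle (not MLAir data).
       rng = np.random.default_rng(0)
       t = np.arange(0.0, 365.0)
       y = np.sin(2 * np.pi * t / 30.0)

       # Remove 100 random samples to mimic missing values.
       y[rng.choice(t.size, size=100, replace=False)] = np.nan

       # Lomb-Scargle works on unevenly spaced samples, so gaps are simply dropped.
       valid = ~np.isnan(y)
       periods = np.linspace(5.0, 100.0, 500)   # candidate periods in days
       ang_freqs = 2 * np.pi / periods          # lombscargle expects angular frequencies
       power = lombscargle(t[valid], y[valid] - y[valid].mean(), ang_freqs)

       # Period with the highest spectral power.
       best_period = periods[np.argmax(power)]

   No interpolation or gap filling is needed here, because only the time stamps of the available samples enter the
   computation.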

   This plot routine creates the following plots:

   * "raw": data is not aggregated, 1 graph per variable
   * "": single data lines are aggregated, 1 graph per variable
   * "total": data is aggregated over all variables, single graph

   If the data consists of different sampling rates, a separate plot is created for each sampling.

   .. image:: ../../../../../_source/_plots/periodogram.png
      :width: 400

   .. note::
       This plot is not included in the default plot list. To use this plot, add "PlotPeriodogram" to the `plot_list`.

   .. warning::
       This plot is highly sensitive to the data handler structure. Therefore, it is highly likely that this method is
       not compatible with any custom data handler. Data handlers known to work are `DefaultDataHandler`,
       `DataHandlerMixedSampling`, `DataHandlerMixedSamplingWithFilter`. To work properly, the data handler must have
       the attribute `.id_class._data`.

   .. py:method:: _has_filter_dimension(g, pos)
      :staticmethod:

      Inspect if filtered data is provided and return number and labels of filtered components.


   .. py:method:: _prepare_pgram(self, generator, pos, multiple=1, use_multiprocessing=False, use_last_input_value=True)

      Create periodogram data.


   .. py:method:: _prepare_pgram_parallel_var(self, generator, m, pos, use_multiprocessing)

      Implementation of data preprocessing using parallel variables element processing.


   .. py:method:: _prepare_pgram_parallel_gen(self, generator, m, pos, use_multiprocessing, use_last_input_value=True)

      Implementation of data preprocessing using parallel generator element processing.


   .. py:method:: _add_annotation_line(ax, pos, div, lims, unit)
      :staticmethod:


   .. py:method:: _format_figure(self, ax, var_name='total')

      Set log scale on both axes, add labels and annotation lines, and set title.

      :param ax: current ax object
      :param var_name: name of variable that will be included in the title


   .. py:method:: _plot(self, raw=True)

      Abstract plot class needs to be implemented in inheritance.


   .. py:method:: _plot_total(self, raw=True)


   .. py:method:: _plot_difference(self, label_names, plot_name_add='')



.. py:function:: f_proc(var, d_var, f_index, time_dim='datetime', use_last_value=True)


.. py:function:: f_proc_2(g, m, pos, variables_dim, time_dim, f_index, use_last_value)


.. py:function:: f_proc_hist(data, variables, n_bins, variables_dim)


.. py:class:: PlotClimateFirFilter(plot_folder, plot_data, sampling, name)

   Bases: :py:obj:`mlair.plotting.abstract_plot_class.AbstractPlotClass`

   Plot climate FIR filter components.

   * Creates a separate folder climFIR inside the given plot directory.
   * For each station up to 4 examples are shown (1 for each season).
   * Each filtered component and its residuum is drawn in a separate plot.
   * A filter component plot includes the climate FIR input, the filter response, the true non-causal (ideal) filter
     input, and the corresponding ideal response (containing information about the future).
   * A filter residuum plot includes the climate FIR residuum and the ideal filter residuum.

   .. py:method:: _prepare_data(self, data)

      Restructure plot data.


   .. py:method:: _plot(self, plot_dict, sampling, new_dim='window')

      Abstract plot class needs to be implemented in inheritance.


   .. py:method:: _set_ylim_by_valid_range(ax, a, b, dim, valid_range)
      :staticmethod:


   .. py:method:: _set_xlim(self, ax, t0, order, valid_range, td_type, time_axis)

      Set xlims.

      Use order and valid_range to find a good zoom that hides the edges of the filter values that are affected by
      a reduced filter order. Limits are returned to be usable for other plots.


   .. py:method:: _plot_valid_area(self, ax, t0, valid_range, td_type)


   .. py:method:: _plot_t0(self, ax, t0)


   .. py:method:: _plot_series(self, ax, time_axis, data, style)


   .. py:method:: _plot_original_data(self, ax, time_axis, data)


   .. py:method:: _plot_apriori(self, ax, time_axis, data, new_dim, ifilter, offset)


   .. py:method:: _plot_clim_filter(self, ax, time_axis, data, new_dim, h, output_dtypes)


   .. py:method:: _plot_ideal_filter(self, ax, time_axis, data, new_dim, h, output_dtypes)


   .. py:method:: _store_plot_data(self, data)

      Store plot data. Could be loaded in a notebook to redraw.



.. py:class:: PlotFirFilter(plot_folder, plot_data, name)

   Bases: :py:obj:`mlair.plotting.abstract_plot_class.AbstractPlotClass`

   Plot FIR filter components.

   * Creates a separate folder FIR inside the given plot directory.
   * For each station up to 4 examples are shown (1 for each season).
   * Each filtered component and its residuum is drawn in a separate plot.
   * A filter component plot includes the FIR input and the filter response.
   * A filter residuum plot includes the FIR residuum.

   .. py:method:: _prepare_data(self, data)

      Restructure plot data.


   .. py:method:: _plot(self, plot_dict)

      Abstract plot class needs to be implemented in inheritance.


   .. py:method:: _plot_t0(self, ax, t0)


   .. py:method:: _plot_series(self, ax, time_axis, data, style)


   .. py:method:: _plot_data(self, ax, time_axis, data, style='original')


   .. py:method:: _store_plot_data(self, data)

      Store plot data. Could be loaded in a notebook to redraw.
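
The terms filter input, filter response and residuum used by ``PlotFirFilter`` and ``PlotClimateFirFilter`` can be
illustrated with a generic low-pass FIR filter. The following is a minimal sketch under stated assumptions, not
MLAir's filter implementation: it relies on ``scipy.signal.firwin``, assumes that the residuum is the input minus the
filter response, and uses a hypothetical synthetic signal, filter order and cutoff.

.. code-block:: python

    import numpy as np
    from scipy.signal import firwin

    # Hypothetical input: a slow oscillation plus a fast one (not MLAir data).
    t = np.arange(400)
    slow = np.sin(2 * np.pi * t / 200.0)
    fast = 0.5 * np.sin(2 * np.pi * t / 4.0)
    x = slow + fast

    # Symmetric low-pass FIR kernel; cutoff is given relative to the Nyquist frequency.
    h = firwin(numtaps=31, cutoff=0.1)

    # Centered convolution: each output value uses past and future input values.
    response = np.convolve(x, h, mode="same")

    # Assumed definition of the residuum: whatever the filter did not capture.
    residuum = x - response

Because the kernel is applied centered, the first and last 15 values of ``response`` are computed from an incomplete
window and are less reliable than the interior.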