va_am package

Submodules

va_am.va_am module

square_dims(size: Union[int, list[int], np.ndarray[int]], ratio_w_h: Union[int, float] = 1)

Function that returns the dimensions needed to plot the encoded data, given the latent dimension size and the ratio between width and height.

Parameters:
  • size (int, list[int] or np.ndarray[int]) – The latent dimension size.

  • ratio_w_h (int or float) – Desired ratio between width and height.

Returns:

An ndarray with the new square dimensions.

Return type:

ndarray
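
The relationship between the latent size and the width-to-height ratio can be illustrated with a minimal sketch. The helper name square_dims_sketch, the [width, height] ordering and the way a list of sizes is collapsed into a single cell count are assumptions made for illustration; the library's actual computation is not shown in this reference.

```python
import math

import numpy as np


def square_dims_sketch(size, ratio_w_h=1):
    """Hypothetical helper: a (width, height) grid holding at least `size` cells."""
    total = int(np.prod(size))  # assumption: a list of sizes is collapsed by product
    # Pick the height so that width / height is close to ratio_w_h.
    height = max(1, math.ceil(math.sqrt(total / ratio_w_h)))
    width = math.ceil(total / height)
    return np.array([width, height])


dims = square_dims_sketch(16)               # 16 latent values on a square grid
wide = square_dims_sketch(12, ratio_w_h=3)  # 12 latent values, three times wider than tall
```

Under these assumptions the grid always has at least as many cells as latent values, so every value can be drawn.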

runAE(input_dim: Union[int, list[int]], latent_dim: int, arch: int, use_VAE: bool, with_cpu: bool, n_epochs: int, data_prs: Union[np.ndarray, list, xr.DataArray], file_save: str, verbose: bool, compile_params: dict = {}, fit_params: dict() = {})

Function that performs the autoencoder (AE) training.

Parameters:
  • input_dim (int or list of int) – Contains the shape of the input data to the keras.model.

  • latent_dim (int) – Represents the shape of the latent (code) space.

  • arch (int) – Value that determines which model architecture should be used to build the model.

  • use_VAE (bool) – Value that determines if the model should be a Variational Autoencoder or not.

  • with_cpu (bool) – Value that determines if the CPU should be used instead of the (default) GPU.

  • n_epochs (int) – The number of epochs for the keras.model.

  • data_prs (np.ndarray, list or xr.DataArray) – Driver/predictor data (usually) used to train the model.

  • file_save (str) – Where to save the .h5 model.

  • verbose (bool) – Value that determines if the execution information should be displayed.

  • compile_params (dict) – Dictionary that contains all the parameters (available depending on the tensorflow/keras version) to use for the .compile() function.

  • fit_params (dict) – Dictionary that contains all the parameters (available depending on the tensorflow/keras version) to use for the .fit() function, except for epochs and verbose.

Returns:

AE.encoder – Keras object that corresponds to the fitted encoder model.

Return type:

keras.model.
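
As an illustration of the compile_params and fit_params arguments, the sketch below builds two dictionaries using standard Keras keyword arguments (optimizer, loss, batch_size, validation_split, shuffle). The guard function check_fit_params is hypothetical, and the idea that runAE unpacks both dictionaries as keyword arguments is an assumption; this reference only states that epochs and verbose must not appear in fit_params.

```python
# Keyword arguments intended for the .compile() call.
compile_params = {"optimizer": "adam", "loss": "mse"}

# Keyword arguments intended for the .fit() call. epochs and verbose are left
# out because runAE receives them separately as n_epochs and verbose.
fit_params = {"batch_size": 32, "validation_split": 0.1, "shuffle": True}

RESERVED_FIT_KEYS = {"epochs", "verbose"}


def check_fit_params(params):
    """Hypothetical guard: reject keys that runAE is documented to set itself."""
    clash = RESERVED_FIT_KEYS & set(params)
    if clash:
        raise ValueError(f"fit_params must not contain: {sorted(clash)}")
    return dict(params)


safe_fit_params = check_fit_params(fit_params)
```

The checked dictionaries could then be passed as compile_params=compile_params and fit_params=safe_fit_params.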

get_AE_stats(with_cpu: bool, use_VAE: bool, AE_pre=None, AE_ind=None, pre_indust_prs: Optional[Union[list, ndarray]] = None, indust_prs: Optional[Union[list, ndarray]] = None, data_of_interest_prs: Optional[Union[list, ndarray]] = None, period: str = 'both') -> Union[ndarray, list]

Function used to obtain statistical information about the data encoded by the Autoencoder. It encodes the training data according to the period and the specific details of the architecture.

Parameters:
  • use_VAE (bool) – Determines whether the model is a Variational Autoencoder.

  • with_cpu (bool) – Determines whether the CPU should be used instead of the GPU.

  • AE_pre (keras.model) – Encoder keras.model for the pre-industrial period.

  • AE_ind (keras.model) – Encoder keras.model for the post-industrial period.

  • pre_indust_prs (list or np.ndarray) – Driver/predictor data of the pre-industrial period.

  • indust_prs (list or np.ndarray) – Driver/predictor data of the post-industrial period.

  • data_of_interest_prs (list or np.ndarray) – Driver/predictor data of interest.

  • period (str) – Value that controls which part of the data is used.

Returns:

An ndarray containing the data.

Return type:

list or ndarray.

analogSearch(p: int, k: int, data_prs: Union[list, ndarray], data_of_interest_prs: Union[list, ndarray], time_prs: DataArray, data_temp: Dataset, enhanced_distance: bool, threshold: Union[int, float], img_size: Union[list, ndarray], iter: int, threshold_offset_counter: int = 20, replace_choice: bool = True, temp_var_name: str = 'air') -> tuple

Function that performs the Analog Search Method for a given driver/predictor and temperature (target) variable.

Parameters:
  • p (int) – The order p of the Minkowski distance to use.

  • k (int) – Number of nearest neighbours to search for.

  • data_prs (list or ndarray) – Driver/predictor data where to search.

  • data_of_interest_prs (list or ndarray) – Driver/predictor data to be searched.

  • time_prs (DataArray) – Time DataArray corresponding to the driver/predictor data being searched.

  • data_temp (Dataset) – Temperature Dataset used to check the target value.

  • enhanced_distance (bool) – Flag that decides whether local proximity has to be computed or not.

  • threshold (int or float) – Threshold used in analogSearch to compute local proximity.

  • img_size (list or ndarray) – List that determines the size of the driver/predictor and target images.

  • iter (int) – How many random neighbours to select.

  • threshold_offset_counter (int) – Number used to perform the local proximity. Default 20.

  • replace_choice (bool) – Flag that indicates whether the iter random neighbours are selected with replacement.

  • temp_var_name (str) – The name of the temperature variable, for use when working with a different Dataset.

Returns:

A tuple containing selected driver/predictor and target.

Return type:

tuple
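
The core of the search described above (a Minkowski distance of order p, the k nearest neighbours, and a random draw among them) can be sketched with NumPy. The helper k_nearest_minkowski is hypothetical and leaves out the local proximity step, the threshold handling and the temperature lookup; it is not the library's implementation.

```python
import numpy as np


def k_nearest_minkowski(data, target, p=2, k=3):
    """Hypothetical helper: indices and distances of the k samples closest to `target`."""
    data = np.asarray(data, dtype=float)
    data = data.reshape(len(data), -1)              # flatten each sample (e.g. an image)
    target = np.asarray(target, dtype=float).ravel()
    # Minkowski distance of order p between every sample and the target.
    dist = np.sum(np.abs(data - target) ** p, axis=1) ** (1.0 / p)
    order = np.argsort(dist)[:k]                    # nearest first
    return order, dist[order]


data = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [10.0, 10.0]])
idx, dist = k_nearest_minkowski(data, [0.0, 0.0], p=2, k=3)

# Draw a few analogues among the k nearest; replace=False forbids repeats,
# which is one plausible reading of the replace_choice flag.
rng = np.random.default_rng(0)
chosen = rng.choice(idx, size=2, replace=False)
```

With p=1 the same helper gives the Manhattan distance and with p=2 the Euclidean distance.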

calculate_interest_region(interest_region: Union[list, ndarray], latitude_min: int, latitude_max: int, longitude_min: int, longitude_max: int, resolution: Union[int, float] = 2, is_teleg: bool = False, secret_file: str = './secret.txt') -> list

Method that transforms latitude/longitude degrees into indices. It is used to increase the speed of the methods by using NumPy arrays instead of Dataset or DataArray objects.

Parameters:
  • interest_region (list or ndarray) – List that contains the latitude and longitude degrees to be converted to indices.

  • latitude_min (int) – The latitude minimum limit.

  • latitude_max (int) – The latitude maximum limit.

  • longitude_min (int) – The longitude minimum limit.

  • longitude_max (int) – The longitude maximum limit.

  • resolution (int or float) – Resolution employed, in degrees. Default value is 2°.

  • is_teleg (bool) – Flag that indicates if the warnings have to be sent to Telegram or not.

  • secret_file (str) – Auxiliary variable, only needed if is_teleg is True, used to read the token and chat_id values.

Returns:

new_interest_region – A list that contains the equivalent index values.

Return type:

list
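
The degree-to-index conversion can be sketched as follows. The helper names, the [lat, lat, lon, lon] ordering of the region and the assumption that both axes increase from their minimum in steps of resolution are illustrative only; real datasets may store latitude in descending order, and the library's own handling is not shown here.

```python
def degrees_to_index(value, minimum, resolution=2):
    """Hypothetical helper: grid index of `value` on an axis that starts at `minimum`."""
    return int(round((value - minimum) / resolution))


def region_to_indices(region, latitude_min, longitude_min, resolution=2):
    """Convert [lat_a, lat_b, lon_a, lon_b] in degrees to grid indices."""
    lat_a, lat_b, lon_a, lon_b = region
    return [
        degrees_to_index(lat_a, latitude_min, resolution),
        degrees_to_index(lat_b, latitude_min, resolution),
        degrees_to_index(lon_a, longitude_min, resolution),
        degrees_to_index(lon_b, longitude_min, resolution),
    ]


# A 40N-50N, 10W-4E box on a 2-degree grid that starts at 30N, 20W.
indices = region_to_indices([40, 50, -10, 4], latitude_min=30, longitude_min=-20)
```

Once the region is expressed as indices, it can be used to slice plain NumPy arrays directly.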

save_reconstruction(params: dict, reconstructions_Pre_Analog: list, reconstructions_Post_Analog: list, reconstructions_Pre_AE: list, reconstructions_Post_AE: list)

Method that saves the target reconstructions produced by the runs. It does not return anything; it only saves the xarray Datasets to the corresponding files in the data folder. Each file name has the format [name]-[period]-[method]-[time].nc.

Parameters:
  • params (dict) – A dictionary which contains all the needed parameters and configuration. Mainly loaded from the configuration file, with some auxiliary parameters added by other functions.

  • reconstructions_Pre_Analog (list) – A list with the multiple pre-industrial data reconstructed by the Analog Method, for each day (or week).

  • reconstructions_Post_Analog (list) – A list with the multiple post-industrial data reconstructed by the Analog Method, for each day (or week).

  • reconstructions_Pre_AE (list) – A list with the multiple pre-industrial data reconstructed by the AutoEncoder, for each day (or week).

  • reconstructions_Post_AE (list) – A list with the multiple post-industrial data reconstructed by the AutoEncoder, for each day (or week).
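
The [name]-[period]-[method]-[time].nc naming pattern can be illustrated with a short sketch. The helper reconstruction_filename, the example values for name, period and method, and the timestamp format are all assumptions; the exact strings the library writes are not given in this reference.

```python
from datetime import datetime


def reconstruction_filename(name, period, method, when):
    """Hypothetical helper following the [name]-[period]-[method]-[time].nc pattern."""
    stamp = when.strftime("%Y%m%d%H%M%S")  # assumed time format
    return f"{name}-{period}-{method}-{stamp}.nc"


fname = reconstruction_filename("heatwave", "pre", "AE", datetime(2024, 1, 2, 3, 4, 5))
```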

perform_preprocess(params: dict) -> tuple

Method that performs the preprocessing stage.

Parameters:

params (dict) – A dictionary with the needed parameters and configuration. Mainly loaded from the configuration file, with some auxiliary parameters added by other functions.

Returns:

A tuple of all needed data.

Return type:

tuple

runComparison(params: dict) -> tuple

Method that performs the preprocessing, calls the previous methods, and compares analogSearch against AE + analogSearch.

Parameters:

params (dict) – A dictionary with the needed parameters and configuration. Mainly loaded from the configuration file, with some auxiliary parameters added by other functions.

Returns:

A tuple of 4 elements, each containing the corresponding list of reconstruction data.

Return type:

tuple

identify_heatwave_days(params: dict) -> Union[list, ndarray]

Method that identifies the heat wave period, following the definition from http://doi.org/10.1088/1748-9326/10/12/124003.

Parameters:

params (dict) – A dictionary with the needed parameters and configuration. Mainly loaded from the configuration file, with some auxiliary parameters added by other functions.

Returns:

heatwave_period – A list of datetimes that contains the heat wave period.

Return type:

list or ndarray
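
Identifying a heat wave period generally comes down to finding runs of consecutive hot days. The sketch below shows only that step, assuming a fixed temperature threshold and a minimum run length of three days; both values and the helper consecutive_runs are illustrative, and the cited definition is not reproduced here.

```python
def consecutive_runs(flags, min_days=3):
    """Return (start, end) index pairs of runs of True lasting at least min_days."""
    runs, start = [], None
    for i, hot in enumerate(flags):
        if hot and start is None:
            start = i                          # a run begins
        elif not hot and start is not None:
            if i - start >= min_days:
                runs.append((start, i - 1))    # keep runs that are long enough
            start = None                       # the run ended
    if start is not None and len(flags) - start >= min_days:
        runs.append((start, len(flags) - 1))   # run that reaches the last day
    return runs


temps = [30, 36, 37, 38, 31, 36, 36, 29, 35.5, 36, 37]  # made-up daily maxima
threshold = 35                                           # assumed fixed threshold
hot_days = [t > threshold for t in temps]
periods = consecutive_runs(hot_days)
```

Each returned pair gives the first and last day index of a candidate heat wave, which can then be mapped back to dates.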

va_am(ident: bool = False, method: str = 'day', config_file: str = 'params.json', secret_file: str = 'secrets.txt', verbose: bool = False, teleg: bool = False, period: str = 'both', save_recons: bool = False)

Equivalent to the main function. Its purpose is to provide a way to perform the same procedures as the main function by importing it from other Python code.

Parameters:
  • ident (bool) – Flag that indicates whether the period identification task is performed.

  • method (str) – Method to execute, one of: day (default), days, seasons, execs, latents, seasons-execs, latents-execs or latents-seasons-execs.

  • config_file (str) – Name of the params/configuration file. Default 'params.json'.

  • secret_file (str) – Name of the Telegram bot information file. Default 'secrets.txt'.

  • verbose (bool) – Flag that indicates whether verbose information should be shown.

  • teleg (bool) – Flag for sending exceptions to the Telegram bot.

  • period (str) – Period over which to perform the operation, one of: both (default), pre or post.

  • save_recons (bool) – Flag for saving the reconstruction information in an .nc file.
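
A minimal sketch of preparing a call to va_am from other Python code is given below. The helper build_va_am_kwargs is hypothetical; it only checks the method and period values against the choices listed above. The import path in the trailing comment follows the module name on this page and assumes the package is installed and a valid configuration file exists.

```python
METHODS = (
    "day", "days", "seasons", "execs", "latents",
    "seasons-execs", "latents-execs", "latents-seasons-execs",
)
PERIODS = ("both", "pre", "post")


def build_va_am_kwargs(method="day", period="both", **flags):
    """Hypothetical helper: validate the documented choices before calling va_am."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    if period not in PERIODS:
        raise ValueError(f"unknown period: {period!r}")
    return {"method": method, "period": period, **flags}


kwargs = build_va_am_kwargs(method="seasons", period="pre", verbose=True, save_recons=True)

# With the package installed and a configuration file in place:
# from va_am.va_am import va_am
# va_am(config_file="params.json", **kwargs)
```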

main()

Main function, prepared for running and managing the program functionality. It uses the argparse module so that va_am.py can be executed from the command line. To see the help, use:

python va_am.py -h

Module contents