How to…?#
Configuration file#
Regardless of how you use VA-AM (inside or outside Python code), you will need a configuration file. It has to be a JSON file with a structure like:
{
"season": "all",
"name": "-city_athens1987-",
"latitude_min": 28,
"latitude_max": 66,
"longitude_min": -8,
"longitude_max": 50,
"pre_init": "1851-01-06",
"pre_end": "1950-12-31",
"post_init": "1951-01-01",
"post_end": "2014-12-28",
"data_of_interest_init": "1987-06-01",
"data_of_interest_end": "1987-08-31",
"load_AE": false,
"load_AE_pre": true,
"file_AE_pre": "./models/AE_pre.h5",
"file_AE_post": "./models/AE_post.h5",
"latent_dim": 600,
"use_VAE": true,
"with_cpu": false,
"n_epochs": 5,
"k": 20,
"iter": 1000,
"interest_region": [38,38,24,24],
"resolution": 2,
"interest_region_type": "coord",
"per_what": "per_day",
"remove_year": false,
"replace_choice": true,
"arch": 5,
"verbose": true,
"temp_dataset": "~/path/to/data/data_dailyMax_t2m_1940-2022.nc",
"prs_dataset": "~/path/to/data/prmsl.nc",
"ident_dataset": "~/path/to/data_dailyMax_t2m_1940-2022.nc",
"temp_var_name": "t2m_dailyMax",
"p": 2,
"enhanced_distance": true,
"compile_params": {
"optimizer":"adamax",
"loss":"mape",
"metrics":["mae", "mse"]
},
"fit_params": {
"batch_size":64,
"shuffle":true,
"validation_split":0.15
}
}
With the -f | --configfile flag or the config_file parameter you can provide the path to the .json file containing the configuration parameters. If it is not provided, the program will search for a default params.json file in the directory.
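As a rough illustration of how such a file can be read, the sketch below loads a JSON configuration into a dictionary and falls back to params.json when no path is given, mirroring the behaviour described above. The helper name load_config is hypothetical and is not part of VA-AM.

```python
import json
from pathlib import Path

# Default file name mentioned in the text above.
DEFAULT_CONFIG = "params.json"


def load_config(config_file=None):
    """Hypothetical helper: read a VA-AM style JSON configuration into a dict."""
    # Fall back to the default file name when no path is provided.
    chosen = config_file if config_file is not None else DEFAULT_CONFIG
    path = Path(chosen).expanduser()
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

JSON booleans such as false and true become Python False and True, so flags like load_AE can be tested directly after loading.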
Below is a list of all possible parameters, with the type and a brief description of each one:
Parameter | Type | Description |
---|---|---|
season | str or list of str | String (or list of strings) that specifies in which season to perform the method, between: |
name | str | Arbitrary name identifying the execution/simulation and the result file. |
latitude/longitude | int | The search region, defined in terms of minimal and maximal latitude and longitude. |
interest_region | list of int | Interest region where the reconstruction has to be made, in terms of initial and end latitude and longitude. It should be a subregion of the defined search region. It can also be the entire search region, but not bigger than it. See |
interest_region_type | str | Define if the |
resolution | int | Coordinate resolution of the dataset (Default value |
pre/post | str | String with datetime of start (_init) and end (_end) of what we consider |
period | str | String that indicates in which period the analysis will be performed. It could be |
data_of_interest | str | Same as previous, but to specify your datetime of interest. (See Identify) |
load_AE | bool | Flag that specifies if the VA should be loaded from the |
load_AE_pre | bool | Same as previous flag, but only for VA in |
file_AE | str | Path to where to save the trained models of VA for |
latent_dim | int | Latent (or code) dimension to which the predictor/driver should be reduced (or encoded). |
use_VAE | bool | Flag. If |
with_cpu | bool | Flag that indicates whether the CPU or GPU version of tensorflow should be used, depending on whether a GPU is available. |
n_epochs | int | Maximum number of training epochs. |
n_execs | int | If method is one of |
k | int | How many analogue situations to select from the nearest ones. If |
iter | int | Number of random extractions to perform from the |
per_what | str | String to specify if the analysis should be daily ( |
remove_year | bool | Flag that indicates if the year of the interest period should be removed entirely or not. If false, only the period between |
replace_choice | bool | Flag that determines if the |
arch | int | Which of the available architectures has to be used. See section for the available architectures. |
verbose | bool | If |
temp/prs_dataset | str | Path to target (temp) and predictor/driver (prs) datasets ( |
ident_dataset | str | Path to the dataset where the identification will be performed. It may or may not be the same as the target dataset. |
temp_var_name | str | Name of the target variable in the dataset (if not specified, the default value is inferred from the dataset). |
prs_var_name | str | Name of the predictor/driver variable in the dataset. If you don’t specify it, the name will be inferred automatically. In a future multi-variate VA-AM version, this parameter will change, probably to a list of strings or something similar. |
p | int | Which p-Minkowski distance to use during the analog search, where taxicab distance is |
enhanced_distance | bool | Flag that indicates if the enhanced local proximity criterion should be used along with the p-Minkowski distance. |
save_recons | bool | Flag that indicates if the reconstruction of the target event should be saved (default value |
percentile | int | Which percentile should be used during the identification step (default value |
out_preprocess | str or list[str] | What to return from |
compile_params | dict | Dictionary which contains the configuration input arguments for the model.compile() method, depending on the tensorflow/keras version. |
fit_params | dict | Dictionary which contains the configuration input arguments for the model.fit() method, except for epochs and verbose, depending on the tensorflow/keras version. |
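Some of the constraints described in the table can be checked before launching a long run. The following is a minimal sketch, not part of VA-AM: the helper name check_config is hypothetical, and it assumes that interest_region is ordered as [lat_init, lat_end, lon_init, lon_end] and that dates use the ISO format shown in the example configuration.

```python
from datetime import date


def check_config(cfg):
    """Hypothetical helper: return a list of problems found in a config dict."""
    problems = []

    # The search region must be a valid box.
    if cfg["latitude_min"] > cfg["latitude_max"]:
        problems.append("latitude_min is greater than latitude_max")
    if cfg["longitude_min"] > cfg["longitude_max"]:
        problems.append("longitude_min is greater than longitude_max")

    # Assumption: interest_region is [lat_init, lat_end, lon_init, lon_end].
    # The containment check is only applied when the region is given as coordinates.
    if cfg.get("interest_region_type") == "coord":
        lat_a, lat_b, lon_a, lon_b = cfg["interest_region"]
        lat_ok = (cfg["latitude_min"] <= min(lat_a, lat_b)
                  and max(lat_a, lat_b) <= cfg["latitude_max"])
        lon_ok = (cfg["longitude_min"] <= min(lon_a, lon_b)
                  and max(lon_a, lon_b) <= cfg["longitude_max"])
        if not lat_ok:
            problems.append("interest_region latitudes fall outside the search region")
        if not lon_ok:
            problems.append("interest_region longitudes fall outside the search region")

    # Every period must start no later than it ends.
    for prefix in ("pre", "post", "data_of_interest"):
        start = date.fromisoformat(cfg[f"{prefix}_init"])
        end = date.fromisoformat(cfg[f"{prefix}_end"])
        if start > end:
            problems.append(f"{prefix}_init is later than {prefix}_end")

    return problems


# Subset of the example configuration shown at the top of this section.
example = {
    "latitude_min": 28, "latitude_max": 66,
    "longitude_min": -8, "longitude_max": 50,
    "interest_region": [38, 38, 24, 24],
    "interest_region_type": "coord",
    "pre_init": "1851-01-06", "pre_end": "1950-12-31",
    "post_init": "1951-01-01", "post_end": "2014-12-28",
    "data_of_interest_init": "1987-06-01", "data_of_interest_end": "1987-08-31",
}
print(check_config(example))
```

An empty list means none of these particular checks failed; it does not guarantee that VA-AM will accept the file.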
Functionality#
This package provides, for now, the functionality below. More is expected in future versions. The GitHub repository has some example configuration files for several well-known heat waves, but you should first check the Configuration file section.
Identify heat waves#
We can perform the identification of the heat wave period, following the definition from the Russo paper. You will need a dataset of, ideally, maximum daily (or weekly) temperature as ident_dataset.
From that you can perform the identification with the -i | --identifyhw flag or the ident parameter as shown below, with the corresponding Configuration file.
# Outside of the python code
$ python -m va_am -i -f "path/to/config-file" ...
# Inside of the python code
from va_am import va_am
va_am(ident=True, config_file="path/to/config-file", ...)
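To give an intuition of what a threshold-based identification looks like, the sketch below finds runs of consecutive days whose temperature exceeds a threshold. It is a deliberately simplified illustration and not VA-AM's implementation nor the full definition from the Russo paper: the function name find_hot_spells, the fixed threshold and the minimum run length of 3 days are all assumptions made for the example.

```python
def find_hot_spells(temps, threshold, min_days=3):
    """Return inclusive (start, end) index pairs of runs above the threshold.

    Only runs lasting at least `min_days` consecutive days are kept.
    """
    spells = []
    start = None  # index where the current run began, if any
    for i, value in enumerate(temps):
        if value > threshold:
            if start is None:
                start = i
        else:
            # The run (if any) ended on the previous day.
            if start is not None and i - start >= min_days:
                spells.append((start, i - 1))
            start = None
    # Close a run that reaches the end of the series.
    if start is not None and len(temps) - start >= min_days:
        spells.append((start, len(temps) - 1))
    return spells


# Made-up daily maximum temperatures, for illustration only.
daily_max = [30, 36, 37, 38, 31, 36, 36, 30, 39, 40, 41, 42]
print(find_hot_spells(daily_max, threshold=35))
```

In practice the threshold would come from a percentile of a reference period (see the percentile parameter) rather than from a fixed number.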
The default methods of the package are for Analog search or VA-AM, so you can face 2 different scenarios: you want to perform the identification as a first step of the other methods, or you want to perform only the identification.
If you use the identification as a first step of other methods, it is compatible with all methods except day. E.g., for method execs:
# Outside of the python code
$ python -m va_am -i -m execs -f "path/to/config-file" ...
# Inside of the python code
from va_am import va_am
va_am(ident=True, method="execs", config_file="path/to/config-file", ...)
If you only want the identification, you do not need to specify any method. If the -i | --identifyhw flag is used, it will return a warning like Indentify Heat wave period (flag -i --identifyhw) for {params['name'][1:-1]} is not compatible with default 'method' ('day') and this will not be executed, indicating that only the identification is going to be performed (instead of the default day method).
# Outside of the python code
$ python -m va_am -i -f "path/to/config-file" ...
# Inside of the python code
from va_am import va_am
va_am(ident=True, config_file="path/to/config-file", ...)
Note
If the Telegram bot is used you will also receive this warning. See the Telegram bot section for more details.
Analog search#
The Analog method is a classic statistical search method based on a KNN search with a defined metric (see Zorita for a more detailed definition).
For now, analog search is an auxiliary method that is not available when running outside Python code. In the next version of VA-AM, the preprocessing stage is expected to become more generic, which will allow an analog-search-only option for execution outside Python code. For now, you can use it with:
from va_am import analogSearch
analogSearch(...)
See the API reference for details about the analogSearch arguments.
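The idea behind a KNN search with a p-Minkowski metric can be sketched in a few lines. This is only a conceptual illustration under stated assumptions: it does not reproduce the analogSearch signature or its internals, the function names minkowski and k_nearest_analogues are hypothetical, and each field is represented as a flat list of numbers.

```python
def minkowski(a, b, p=2):
    """p-Minkowski distance between two equally sized sequences.

    p=1 gives the taxicab distance and p=2 the Euclidean distance.
    """
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def k_nearest_analogues(target, candidates, k=2, p=2):
    """Return the indices of the k candidates closest to the target."""
    distances = [(minkowski(target, cand, p), idx)
                 for idx, cand in enumerate(candidates)]
    # Sort by distance (ties broken by index) and keep the k nearest.
    return [idx for _, idx in sorted(distances)[:k]]


# Toy data: one target field and four candidate fields.
target_field = [0.0, 0.0]
candidate_fields = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0], [10.0, 10.0]]
print(k_nearest_analogues(target_field, candidate_fields, k=2, p=2))
```

Here k and p play the same role as the k and p entries of the configuration file.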
VA-AM methods#
The usual functionality of VA-AM is to use deep learning methods (mainly Autoencoder-based) to enhance the performance of the classic analog method. We provide several ready-made architectures, such as Variational-Autoencoder, Autoencoder, Deep-Autoencoder and Symmetric-Autoencoder, among others (see API reference).
Note
The order of the architectures in the documentation corresponds to their arch value in the Configuration file. For the heat wave case a specific architecture is recommended (arch=5).
Future versions are expected to include a framework or method for using user-defined architectures in VA-AM.
Telegram bot#
VA-AM includes compatibility with a Telegram bot as a warning and alert mechanism. It can be useful when you are running several long tasks and want to be notified about possible errors, exceptions and warnings.
It is quite easy to use with the -t | --teleg flag or the teleg parameter as shown below, but first you will need to complete some preliminary steps:
# Outside of the python code
$ python -m va_am -t ...
# Inside of the python code
from va_am import va_am
va_am(..., teleg=True)
Step 1. Create your own Telegram bot#
For the -t | --teleg option to work, you will need to create your own Telegram bot, which is what will notify you. BotFather is a built-in Telegram bot that allows you to create other bots. We recommend following this Tutorial in order to create the bot.
Note
It is very important to save the token that BotFather provides for your Telegram bot.
Step 2. Create a channel or group#
The next step is to create a Telegram channel or group where you will get the alerts. We recommend using a channel, but a group is also possible. You will need to add the bot you created to this channel (or group) and allow it to send messages (check the permissions you give to other users/bots as admin of the channel).
When everything is ready, you can follow the next step of the Tutorial to get the chat id. A snippet like the following can give you the chat id:
import requests
TOKEN = "YOUR TELEGRAM BOT TOKEN"
url = f"https://api.telegram.org/bot{TOKEN}/getUpdates"
print(requests.get(url).json())
Note
The chat id is an integer number that represents the channel (or group) of which the bot is a member. It is important to note that it can be a positive or negative integer number, so pay attention to the - sign.
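The response printed by the snippet above is a nested dictionary, so the chat id has to be picked out of it. The following sketch shows one way to do that on an already downloaded response; the helper name extract_chat_ids is hypothetical and the sample dictionary is hand-written for illustration, not real output.

```python
def extract_chat_ids(updates):
    """Collect the distinct chat ids found in a getUpdates response dict."""
    chat_ids = []
    for update in updates.get("result", []):
        # Depending on the update type, the chat is nested under a different key.
        for key in ("message", "channel_post", "my_chat_member"):
            chat = update.get(key, {}).get("chat")
            if chat is not None and chat["id"] not in chat_ids:
                chat_ids.append(chat["id"])
    return chat_ids


# Hand-written sample with the general shape of a getUpdates response.
sample = {
    "ok": True,
    "result": [
        {"update_id": 1,
         "channel_post": {"message_id": 5,
                          "chat": {"id": -1001234567890, "type": "channel"}}},
        {"update_id": 2,
         "message": {"message_id": 6,
                     "chat": {"id": 42, "type": "private"}}},
    ],
}
print(extract_chat_ids(sample))
```

If the result list is empty, sending any message in the channel (or group) and calling getUpdates again usually produces an update to read the id from.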
Step 3. Telegram secrets configuration file#
The last step is to provide a secret file to the program to be able to use your Telegram bot.
With the -sf | --secretfile flag or the secret_file parameter you can provide the path to the .txt (or similar) file containing the secrets.
# Outside of the python code
$ python -m va_am -sf path/to/secret-file ...
# Inside of the python code
from va_am import va_am
va_am(..., secret_file="path/to/secret-file")
If the secret file path is not specified, the program will look for the default secret.txt file.
The structure of the secret file needs to be:
[TOKEN]
[chat-id]
@[user-name]
Important
VA-AM will send exceptions and warnings to the Telegram bot. To better distinguish exceptions from warnings, it uses your [user-name] to notify you. If you do not want this functionality, you can omit it and replace @[user-name] with an empty space.
In any case, a third row is needed in the file, whether it is empty, a white/blank space, or your @[user-name].
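The three-row layout can be checked with a few lines of code before running a long job. This is a minimal sketch of a reader for that layout, not the parser VA-AM uses: the function name parse_secrets is hypothetical, and the token and chat id in the usage line are made up.

```python
def parse_secrets(text):
    """Hypothetical helper: split secret file contents into its three rows.

    Returns (token, chat_id, user_name), where user_name is None when the
    third row is empty or blank.
    """
    rows = text.split("\n")
    if len(rows) < 3:
        raise ValueError("the secret file needs three rows, even if the third is empty")
    token = rows[0].strip()
    chat_id = int(rows[1].strip())  # may be negative, keep the sign
    user_name = rows[2].strip() or None
    return token, chat_id, user_name


# Made-up values, for illustration only.
print(parse_secrets("123456:FAKE-TOKEN\n-1001234567890\n@my_user\n"))
```

Converting the second row with int() preserves a leading - sign, which matters for channel ids.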
Caution
DON’T SHARE YOUR SECRET FILE WITH ANYONE!!!!
The [TOKEN] provides absolute access and admin permissions over your bot. In the wrong hands, it could end in a mess (probably your bot will become a spam bot, at best). If you are going to use VA-AM in a repository (especially a public one), we recommend adding your secret file name to the .gitignore file.