basic

class aitoolbox.torchtrain.callbacks.basic.ListRegisteredCallbacks[source]

Bases: AbstractCallback

List all the callbacks which are used in the current TrainLoop

on_train_begin()[source]

Logic executed at the beginning of the overall training

Returns: None

class aitoolbox.torchtrain.callbacks.basic.EarlyStopping(monitor='val_loss', min_delta=0.0, patience=0)[source]

Bases: AbstractCallback

Early stopping of the training if the performance stops improving

Parameters:

monitor (str) – performance measure that is tracked to decide if performance is improving during training
min_delta (float) – minimum improvement in the monitored measure required to keep training the model
patience (int) – number of epochs the early stopper waits after the performance stops improving before it stops the training

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns: None
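
The interaction between min_delta and patience can be sketched in plain Python. The class below is a hypothetical stand-in, not the aitoolbox implementation: it assumes a loss-like measure where lower values are better, and it receives the monitored value directly instead of reading it from the TrainLoop.

```python
class EarlyStoppingSketch:
    """Hypothetical illustration of min_delta / patience (not the library code)."""

    def __init__(self, min_delta=0.0, patience=0):
        self.min_delta = min_delta
        self.patience = patience
        self.best = float("inf")  # assumption: lower is better (loss-like measure)
        self.epochs_without_improvement = 0

    def should_stop(self, current):
        """Return True once the measure failed to improve for more than `patience` epochs."""
        if current < self.best - self.min_delta:
            # Improvement of more than min_delta: remember it and reset the counter
            self.best = current
            self.epochs_without_improvement = 0
            return False
        self.epochs_without_improvement += 1
        return self.epochs_without_improvement > self.patience


stopper = EarlyStoppingSketch(min_delta=0.1, patience=1)
# One decision per epoch; the last two losses improve by less than min_delta
decisions = [stopper.should_stop(loss) for loss in [1.0, 0.8, 0.75, 0.74]]
```

With patience=1 the sketch tolerates one epoch without sufficient improvement and signals a stop on the second one.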

class aitoolbox.torchtrain.callbacks.basic.ThresholdEarlyStopping(monitor, threshold, patience=0)[source]

Bases: AbstractCallback

Early stopping of the training if the performance doesn’t reach the specified threshold

Parameters:

monitor (str) – performance measure that is tracked to decide if performance reached the desired threshold
threshold (float) – performance threshold that needs to be exceeded in order to continue training
patience (int) – how many epochs the early stopper waits for the tracked performance to reach the desired threshold

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns: None
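
A minimal sketch of the threshold check, under stated assumptions: the class name is hypothetical, higher values of the monitored measure are taken to be better, and the counter of epochs spent at or below the threshold is never reset. The actual callback may differ on each of these points.

```python
class ThresholdStoppingSketch:
    """Hypothetical illustration of threshold / patience (not the library code)."""

    def __init__(self, threshold, patience=0):
        self.threshold = threshold
        self.patience = patience
        self.epochs_below = 0

    def should_stop(self, current):
        """Return True when the threshold was not exceeded for more than `patience` epochs."""
        if current > self.threshold:  # assumption: higher is better
            return False
        self.epochs_below += 1
        return self.epochs_below > self.patience


stopper = ThresholdStoppingSketch(threshold=0.6, patience=2)
decisions = [stopper.should_stop(accuracy) for accuracy in [0.3, 0.4, 0.5]]
```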

class aitoolbox.torchtrain.callbacks.basic.TerminateOnNaN(monitor='loss')[source]

Bases: AbstractCallback

Terminate training if NaNs are predicted and the tracked metrics are therefore NaN

Parameters: monitor (str) – performance measure that is checked for NaN values during training

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns: None
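
The NaN check itself can be illustrated with the standard library. The function name and the dict-of-lists history layout below are assumptions made for this sketch, not part of the aitoolbox API.

```python
import math


def last_value_is_nan(history, monitor="loss"):
    """Return True if the most recent value recorded for `monitor` is NaN.

    `history` is a hypothetical dict mapping measure names to per-epoch value lists.
    """
    values = history.get(monitor, [])
    return bool(values) and math.isnan(values[-1])


diverged = last_value_is_nan({"loss": [0.7, 0.5, float("nan")]})
healthy = last_value_is_nan({"loss": [0.7, 0.5]})
```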

class aitoolbox.torchtrain.callbacks.basic.AllPredictionsSame(value=0.0, stop_training=False, verbose=True)[source]

Bases: AbstractCallback

Checks if all the predicted values are the same

Useful for example when dealing with extremely unbalanced classes.

Parameters:

value (float) – value that all the predictions are compared against
stop_training (bool) – whether the training should be stopped early if all predictions match the specified value
verbose (bool) – whether to output messages

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns: None
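
As a rough sketch of the check described above, assuming the predictions are available as a flat sequence of numbers (the function name is hypothetical and the real callback obtains predictions from the TrainLoop):

```python
def all_predictions_same(predictions, value=0.0):
    """Return True if there is at least one prediction and every prediction equals `value`."""
    predictions = list(predictions)
    return len(predictions) > 0 and all(p == value for p in predictions)


collapsed = all_predictions_same([0.0, 0.0, 0.0], value=0.0)
mixed = all_predictions_same([0.0, 1.0, 0.0], value=0.0)
```

A model trained on heavily unbalanced classes can collapse to predicting only the majority class, which is the situation such a check is meant to flag.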

class aitoolbox.torchtrain.callbacks.basic.EmailNotification(sender_name, sender_email, recipient_email, project_name=None, experiment_name=None, aws_region='eu-west-1')[source]

Bases: AbstractCallback

Notify user via email about the training progression

Parameters:

sender_name (str) – Name of the email sender
sender_email (str) – Email of the email sender
recipient_email (str) – email address the notification is sent to
project_name (str or None) – root name of the project
experiment_name (str or None) – name of the particular experiment
aws_region (str) – AWS SES region

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns: None

on_train_end()[source]

Logic executed at the end of the overall training

Returns: None

get_metric_list_html()[source]

Generate performance metrics list HTML

Returns: HTML doc
Return type: str
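
One way such an HTML list could be built is sketched below. The function name, the dict input, and the exact markup are assumptions for illustration; the markup produced by the actual method is not documented here.

```python
from html import escape


def metric_list_html(metrics):
    """Render a dict of metric names and values as an HTML unordered list."""
    items = "".join(
        f"<li>{escape(str(name))}: {escape(str(value))}</li>"
        for name, value in metrics.items()
    )
    return f"<ul>{items}</ul>"


html_doc = metric_list_html({"loss": 0.25, "accuracy": 0.9})
```

Escaping the names and values keeps characters such as < or & in a metric name from breaking the email body.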

get_hyperparams_html()[source]

Generate hyperparameters list HTML

Returns: HTML doc
Return type: str

on_train_loop_registration()[source]

Execute callback initialization / preparation after the train_loop_object becomes available

Returns: None

class aitoolbox.torchtrain.callbacks.basic.LogUpload(log_file_path='~/project/training.log', fail_if_cloud_missing=True, project_name=None, experiment_name=None, local_model_result_folder_path=None, cloud_save_mode=None, bucket_name=None, cloud_dir_prefix=None)[source]

Bases: AbstractExperimentCallback

Upload logging file to the cloud storage

Uploading happens after each epoch and at the end of the training process.

Parameters:

log_file_path (str) – path to the local logging file
fail_if_cloud_missing (bool) – whether to raise an exception if cloud saving is not available
project_name (str or None) – root name of the project
experiment_name (str or None) – name of the particular experiment
local_model_result_folder_path (str or None) – root local path where project folder will be created
cloud_save_mode (str or None) – storage destination selector. For AWS S3: ‘s3’ / ‘aws_s3’ / ‘aws’. For Google Cloud Storage: ‘gcs’ / ‘google_storage’ / ‘google storage’. Any other value results in local storage to disk only.
bucket_name (str) – name of the bucket in the cloud storage
cloud_dir_prefix (str) – path to the folder inside the bucket where the experiments are going to be saved
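
The cloud_save_mode values listed above amount to a three-way selection. The helper below is a hypothetical sketch of that selection; the function name and the returned labels are invented for illustration and are not library identifiers.

```python
def resolve_storage_backend(cloud_save_mode):
    """Map a cloud_save_mode value to a backend label (labels are hypothetical)."""
    if cloud_save_mode in ("s3", "aws_s3", "aws"):
        return "aws_s3"
    if cloud_save_mode in ("gcs", "google_storage", "google storage"):
        return "google_storage"
    # Any other value, including None, falls back to local disk only
    return "local"


backends = [resolve_storage_backend(mode) for mode in ("aws", "gcs", None)]
```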

on_train_loop_registration()[source]

Execute callback initialization / preparation after the train_loop_object becomes available

Returns: None

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns: None

on_train_end()[source]

Logic executed at the end of the overall training

Returns: None

upload_log_file()[source]

class aitoolbox.torchtrain.callbacks.basic.DataSubsetTestRun(num_train_batches=1, num_val_batches=0, num_test_batches=0)[source]

Bases: AbstractCallback

Subset the provided data loaders so that the neural net is executed only on a small subset of the dataset

This is especially useful when first developing the neural architectures and debugging them. Subsetting the full dataset helps with fast development iterations.

Parameters:

num_train_batches (int) – number of the training data batches that are kept in the training dataset
num_val_batches (int) – number of the validation data batches that are kept in the validation dataset
num_test_batches (int) – number of the test data batches that are kept in the test dataset

on_train_begin()[source]

Logic executed at the beginning of the overall training

Returns: None

static subset_data_loader(data_loader, num_batches)[source]

on_train_loop_registration()[source]

Execute callback initialization / preparation after the train_loop_object becomes available

Returns: None
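
The idea of keeping only the first few batches can be illustrated on any iterable of batches. The helper below is a hypothetical stand-in that uses itertools.islice; it is not the subset_data_loader implementation and does not depend on PyTorch.

```python
from itertools import islice


def take_batches(batches, num_batches):
    """Return the first `num_batches` items of an iterable of batches as a list."""
    return list(islice(batches, num_batches))


# Ten fake "batches" of two samples each
all_batches = [[i, i + 1] for i in range(0, 20, 2)]
train_subset = take_batches(all_batches, 1)
val_subset = take_batches(all_batches, 0)
```

With the defaults shown in the class signature (num_train_batches=1, num_val_batches=0, num_test_batches=0), the same idea would keep a single training batch and no validation or test batches.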

class aitoolbox.torchtrain.callbacks.basic.FunctionOnTrainLoop(fn_to_execute, tl_registration=False, epoch_begin=False, epoch_end=False, train_begin=False, train_end=False, batch_begin=False, batch_end=False, after_gradient_update=False, after_optimizer_step=False, execution_order=0, device_idx_execution=None)[source]

Bases: AbstractCallback

Execute given function as a callback in the TrainLoop

Parameters:

fn_to_execute (function) – function logic to be executed at the desired point of the TrainLoop. The function should take a single input as an argument which is the reference to the encapsulating TrainLoop object (self.train_loop_obj).
tl_registration (bool) – should execute on TrainLoop registration
epoch_begin (bool) – should execute at the beginning of the epoch
epoch_end (bool) – should execute at the end of the epoch
train_begin (bool) – should execute at the beginning of the training
train_end (bool) – should execute at the end of the training
batch_begin (bool) – should execute at the beginning of the batch
batch_end (bool) – should execute at the end of the batch
after_gradient_update (bool) – should execute after the gradient update
after_optimizer_step (bool) – should execute after the optimizer step
execution_order (int) – order of the callback execution. If all the used callbacks have the orders set to 0, then the callbacks are executed in the order they were registered.
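
The flag-based dispatch described by these parameters can be sketched without the library. FakeTrainLoop and FunctionHookSketch below are hypothetical stand-ins that model only two of the flags; the point is that the function receives the TrainLoop reference as its single argument and runs only at the enabled points.

```python
class FakeTrainLoop:
    """Hypothetical stand-in for the TrainLoop object handed to the function."""

    def __init__(self):
        self.epoch = 0
        self.messages = []


def record_epoch(train_loop):
    # The function takes the TrainLoop reference as its only argument
    train_loop.messages.append(f"finished epoch {train_loop.epoch}")


class FunctionHookSketch:
    """Hypothetical illustration of the dispatch (not the library code)."""

    def __init__(self, fn_to_execute, epoch_begin=False, epoch_end=False):
        self.fn_to_execute = fn_to_execute
        self.epoch_begin = epoch_begin
        self.epoch_end = epoch_end
        self.train_loop_obj = None  # set when the callback is registered

    def execute_callback(self):
        self.fn_to_execute(self.train_loop_obj)

    def on_epoch_begin(self):
        if self.epoch_begin:
            self.execute_callback()

    def on_epoch_end(self):
        if self.epoch_end:
            self.execute_callback()


train_loop = FakeTrainLoop()
hook = FunctionHookSketch(record_epoch, epoch_end=True)
hook.train_loop_obj = train_loop
hook.on_epoch_begin()  # flag disabled, nothing happens
hook.on_epoch_end()    # flag enabled, record_epoch runs
```

Going by the class signature above, the corresponding call on the real class would presumably be FunctionOnTrainLoop(record_epoch, epoch_end=True).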

execute_callback()[source]

on_train_loop_registration()[source]

Execute callback initialization / preparation after the train_loop_object becomes available

Returns: None

on_epoch_begin()[source]

Logic executed at the beginning of the epoch

Returns: None

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns: None

on_train_begin()[source]

Logic executed at the beginning of the overall training

Returns: None

on_train_end()[source]

Logic executed at the end of the overall training

Returns: None

on_batch_begin()[source]

Logic executed before the batch is inserted into the model

Returns: None

on_batch_end()[source]

Logic executed after the batch is inserted into the model

Returns: None

on_after_gradient_update(optimizer_idx)[source]

Logic executed after the model gradients are updated

To ensure that this callback is executed, set self.train_loop_obj.grad_cb_used = True in on_train_loop_registration(). Otherwise, the logic implemented here will not be executed by the TrainLoop.

Parameters: optimizer_idx (int) – index of the current optimizer. Mostly useful when using multiple optimizers. When only a single optimizer is used, this parameter can be ignored.
Returns: None

on_after_optimizer_step()[source]

Logic executed after the optimizer does a new step and updates the model weights

To ensure that this callback is executed, set self.train_loop_obj.grad_cb_used = True in on_train_loop_registration(). Otherwise, the logic implemented here will not be executed by the TrainLoop.

Returns: None
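
The role of the grad_cb_used flag can be sketched as follows. Both classes are hypothetical stand-ins: TrainLoopStandIn models only the flag and the two hook points, and how the real TrainLoop consults the flag internally is an assumption here.

```python
class TrainLoopStandIn:
    """Hypothetical stand-in; only the grad_cb_used flag is modelled."""

    def __init__(self, callback):
        self.grad_cb_used = False
        self.callback = callback
        callback.train_loop_obj = self
        callback.on_train_loop_registration()

    def training_step(self, optimizer_idx=0):
        # ... the backward pass would happen here ...
        if self.grad_cb_used:
            self.callback.on_after_gradient_update(optimizer_idx)
        # ... the optimizer step would happen here ...
        if self.grad_cb_used:
            self.callback.on_after_optimizer_step()


class GradientHooks:
    """Callback sketch that opts in to the gradient hooks during registration."""

    def __init__(self, enable):
        self.enable = enable
        self.train_loop_obj = None
        self.events = []

    def on_train_loop_registration(self):
        if self.enable:
            self.train_loop_obj.grad_cb_used = True

    def on_after_gradient_update(self, optimizer_idx):
        self.events.append(("gradient_update", optimizer_idx))

    def on_after_optimizer_step(self):
        self.events.append(("optimizer_step",))


enabled = GradientHooks(enable=True)
TrainLoopStandIn(enabled).training_step()
disabled = GradientHooks(enable=False)
TrainLoopStandIn(disabled).training_step()
```

In this sketch the callback that never sets the flag records no events, which mirrors the warning above that the hooks are skipped unless grad_cb_used is enabled.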