basic

class aitoolbox.torchtrain.callbacks.basic.ListRegisteredCallbacks[source]

Bases: aitoolbox.torchtrain.callbacks.abstract.AbstractCallback

List all the callbacks which are used in the current TrainLoop

on_train_begin()[source]

Logic executed at the beginning of the overall training

Returns

None

class aitoolbox.torchtrain.callbacks.basic.EarlyStopping(monitor='val_loss', min_delta=0.0, patience=0)[source]

Bases: aitoolbox.torchtrain.callbacks.abstract.AbstractCallback

Early stopping of the training if the performance stops improving

Parameters
  • monitor (str) – performance measure that is tracked to decide if performance is improving during training

  • min_delta (float) – minimum amount by which the performance has to improve for the training to continue

  • patience (int) – how many epochs the early stopper waits after the performance stopped improving
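The way min_delta and patience interact can be sketched in plain Python. This is a minimal illustration and not the library's implementation: the PatienceTracker class, its lower_is_better flag and the exact comparison rule are assumptions made for the example.

```python
class PatienceTracker:
    """Hypothetical stand-in for early-stopping bookkeeping (not aitoolbox code)."""

    def __init__(self, min_delta=0.0, patience=0, lower_is_better=True):
        self.min_delta = min_delta
        self.patience = patience
        self.lower_is_better = lower_is_better  # assumption: loss-like metric by default
        self.best = None
        self.epochs_without_improvement = 0

    def should_stop(self, value):
        """Record one epoch's metric value; return True once patience is exhausted."""
        if self.best is None:
            self.best = value
            return False
        if self.lower_is_better:
            improved = value < self.best - self.min_delta
        else:
            improved = value > self.best + self.min_delta
        if improved:
            self.best = value
            self.epochs_without_improvement = 0
            return False
        self.epochs_without_improvement += 1
        return self.epochs_without_improvement > self.patience


tracker = PatienceTracker(min_delta=0.01, patience=1)
val_losses = [0.90, 0.80, 0.795, 0.80]
decisions = [tracker.should_stop(v) for v in val_losses]
print(decisions)  # [False, False, False, True]
```

In this sketch the drop from 0.80 to 0.795 is smaller than min_delta, so it does not count as an improvement. With patience=1 the stop signal is raised on the second consecutive epoch without improvement.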

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns

None

class aitoolbox.torchtrain.callbacks.basic.ThresholdEarlyStopping(monitor, threshold, patience=0)[source]

Bases: aitoolbox.torchtrain.callbacks.abstract.AbstractCallback

Early stopping of the training if the performance doesn’t reach the specified threshold

Parameters
  • monitor (str) – performance measure that is tracked to decide if performance reached the desired threshold

  • threshold (float) – performance threshold that needs to be exceeded in order to continue training

  • patience (int) – how many epochs the early stopper waits for the tracked performance to reach the desired threshold
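A threshold-based stopping rule could look roughly like the sketch below. The helper threshold_stop_epoch, the higher_is_better flag and the choice to reset the counter once the threshold is met are all assumptions for illustration; the library may implement the rule differently.

```python
def threshold_stop_epoch(history, threshold, patience=0, higher_is_better=True):
    """Return the index of the epoch at which training would stop, or None.

    Hypothetical helper, not part of aitoolbox. `history` holds one value of
    the monitored measure per epoch.
    """
    remaining = patience
    for epoch, value in enumerate(history):
        reached = value > threshold if higher_is_better else value < threshold
        if reached:
            remaining = patience  # assumption: the counter resets once the threshold is met
        elif remaining > 0:
            remaining -= 1  # still waiting for the threshold to be reached
        else:
            return epoch
    return None


stalled = threshold_stop_epoch([0.40, 0.45, 0.48, 0.49], threshold=0.5, patience=2)
healthy = threshold_stop_epoch([0.40, 0.55, 0.60], threshold=0.5, patience=2)
print(stalled, healthy)  # 2 None
```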

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns

None

class aitoolbox.torchtrain.callbacks.basic.TerminateOnNaN(monitor='loss')[source]

Bases: aitoolbox.torchtrain.callbacks.abstract.AbstractCallback

Terminate training if the model predicts NaNs, which causes the metrics to be NaN as well

Parameters

monitor (str) – performance measure that is checked for NaN values
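The NaN check itself can be illustrated with the standard library. The sketch assumes that the training history is a dict mapping a metric name to a list of per-epoch values; that layout and the helper name last_value_is_nan are hypothetical.

```python
import math


def last_value_is_nan(train_history, monitor="loss"):
    """Hypothetical check: is the most recent value of `monitor` NaN?"""
    values = train_history.get(monitor, [])
    return bool(values) and math.isnan(values[-1])


diverged = last_value_is_nan({"loss": [0.7, 0.5, float("nan")]})
stable = last_value_is_nan({"loss": [0.7, 0.5, 0.4]})
print(diverged, stable)  # True False
```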

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns

None

class aitoolbox.torchtrain.callbacks.basic.AllPredictionsSame(value=0.0, stop_training=False, verbose=True)[source]

Bases: aitoolbox.torchtrain.callbacks.abstract.AbstractCallback

Checks if all the predicted values are the same

Useful for example when dealing with extremely unbalanced classes.

Parameters
  • value (float) – the value that all predictions are checked against

  • stop_training (bool) – whether the training should be stopped (early) if all predictions match the specified value

  • verbose (bool) – output messages
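As a rough illustration of the check, the sketch below tests whether every prediction equals a given value. The helper all_predictions_equal is hypothetical and works on a plain list, whereas the library presumably inspects the predictions produced by the TrainLoop.

```python
def all_predictions_equal(predictions, value=0.0):
    """Hypothetical helper: True when every prediction equals `value`."""
    predictions = list(predictions)
    # An empty prediction list is treated as "not all the same" in this sketch.
    return len(predictions) > 0 and all(p == value for p in predictions)


collapsed = all_predictions_equal([0.0, 0.0, 0.0], value=0.0)
varied = all_predictions_equal([0.0, 1.0, 0.0], value=0.0)
print(collapsed, varied)  # True False
```

A model that always predicts the majority class of a heavily unbalanced dataset would trigger the first case.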

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns

None

class aitoolbox.torchtrain.callbacks.basic.EmailNotification(sender_name, sender_email, recipient_email, project_name=None, experiment_name=None, aws_region='eu-west-1')[source]

Bases: aitoolbox.torchtrain.callbacks.abstract.AbstractCallback

Notify user via email about the training progression

Parameters
  • sender_name (str) – Name of the email sender

  • sender_email (str) – Email of the email sender

  • recipient_email (str) – Email where the email will be sent

  • project_name (str or None) – root name of the project

  • experiment_name (str or None) – name of the particular experiment

  • aws_region (str) – AWS SES region

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns

None

on_train_end()[source]

Logic executed at the end of the overall training

Returns

None

get_metric_list_html()[source]

Generate performance metrics list HTML

Returns

HTML doc

Return type

str

get_hyperparams_html()[source]

Generate hyperparameters list HTML

Returns

HTML doc

Return type

str

on_train_loop_registration()[source]

Execute callback initialization / preparation after the train_loop_object becomes available

Returns

None

class aitoolbox.torchtrain.callbacks.basic.LogUpload(log_file_path='~/project/training.log', fail_if_cloud_missing=True, project_name=None, experiment_name=None, local_model_result_folder_path=None, cloud_save_mode=None, bucket_name=None, cloud_dir_prefix=None)[source]

Bases: aitoolbox.torchtrain.callbacks.abstract.AbstractExperimentCallback

Upload logging file to the cloud storage

Uploading happens after each epoch and at the end of the training process.

Parameters
  • log_file_path (str) – path to the local logging file

  • fail_if_cloud_missing (bool) – whether to raise an exception if cloud saving is not available

  • project_name (str or None) – root name of the project

  • experiment_name (str or None) – name of the particular experiment

  • local_model_result_folder_path (str or None) – root local path where project folder will be created

  • cloud_save_mode (str or None) – Storage destination selector. For AWS S3: ‘s3’ / ‘aws_s3’ / ‘aws’. For Google Cloud Storage: ‘gcs’ / ‘google_storage’ / ‘google storage’. Any other value results in local storage to disk only.

  • bucket_name (str) – name of the bucket in the cloud storage

  • cloud_dir_prefix (str) – path to the folder inside the bucket where the experiments are going to be saved
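The cloud_save_mode values listed above can be read as a simple lookup. The following sketch only mirrors the documented strings; the function resolve_storage_backend and the labels it returns are hypothetical and not part of the library.

```python
def resolve_storage_backend(cloud_save_mode):
    """Hypothetical helper mapping the documented mode strings to a backend label."""
    if cloud_save_mode in ("s3", "aws_s3", "aws"):
        return "aws_s3"
    if cloud_save_mode in ("gcs", "google_storage", "google storage"):
        return "google_cloud_storage"
    # Everything else, including None, falls back to local disk only.
    return "local"


backends = [resolve_storage_backend(m) for m in ("s3", "gcs", None, "dropbox")]
print(backends)  # ['aws_s3', 'google_cloud_storage', 'local', 'local']
```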

on_train_loop_registration()[source]

Execute callback initialization / preparation after the train_loop_object becomes available

Returns

None

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns

None

on_train_end()[source]

Logic executed at the end of the overall training

Returns

None

upload_log_file()[source]

class aitoolbox.torchtrain.callbacks.basic.DataSubsetTestRun(num_train_batches=1, num_val_batches=0, num_test_batches=0)[source]

Bases: aitoolbox.torchtrain.callbacks.abstract.AbstractCallback

Subset the provided data loaders to execute neural net only on a small dataset subset

This is especially useful when first developing the neural architectures and debugging them. Subsetting the full dataset helps with fast development iterations.

Parameters
  • num_train_batches (int) – number of the training data batches that are kept in the training dataset

  • num_val_batches (int) – number of the validation data batches that are kept in the validation dataset

  • num_test_batches (int) – number of the test data batches that are kept in the test dataset
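The idea of keeping only the first few batches can be sketched with itertools.islice. The example uses a plain list of batches instead of a PyTorch data loader, and the helper take_batches is hypothetical. How the library treats a batch count of 0 is not modeled here.

```python
from itertools import islice


def take_batches(batches, num_batches):
    """Hypothetical helper: keep only the first `num_batches` batches of an iterable."""
    return list(islice(batches, num_batches))


train_batches = [[1, 2], [3, 4], [5, 6]]
subset = take_batches(train_batches, 1)
print(subset)  # [[1, 2]]
```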

on_train_begin()[source]

Logic executed at the beginning of the overall training

Returns

None

static subset_data_loader(data_loader, num_batches)[source]

on_train_loop_registration()[source]

Execute callback initialization / preparation after the train_loop_object becomes available

Returns

None

class aitoolbox.torchtrain.callbacks.basic.FunctionOnTrainLoop(fn_to_execute, tl_registration=False, epoch_begin=False, epoch_end=False, train_begin=False, train_end=False, batch_begin=False, batch_end=False, after_gradient_update=False, after_optimizer_step=False, execution_order=0, device_idx_execution=None)[source]

Bases: aitoolbox.torchtrain.callbacks.abstract.AbstractCallback

Execute given function as a callback in the TrainLoop

Parameters
  • fn_to_execute (function) – function logic to be executed at the desired point of the TrainLoop. The function should take a single input as an argument which is the reference to the encapsulating TrainLoop object (self.train_loop_obj).

  • tl_registration (bool) – should execute on TrainLoop registration

  • epoch_begin (bool) – should execute at the beginning of the epoch

  • epoch_end (bool) – should execute at the end of the epoch

  • train_begin (bool) – should execute at the beginning of the training

  • train_end (bool) – should execute at the end of the training

  • batch_begin (bool) – should execute at the beginning of the batch

  • batch_end (bool) – should execute at the end of the batch

  • after_gradient_update (bool) – should execute after the gradient update

  • after_optimizer_step (bool) – should execute after the optimizer step

  • execution_order (int) – order of the callback execution. If all the used callbacks have their order set to 0, then the callbacks are executed in the order they were registered.
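The flag-based dispatch and the ordering rule can be illustrated with a toy loop. Nothing below is aitoolbox code: FunctionHook and ToyTrainLoop are hypothetical stand-ins, only two of the hook points are modeled, and the sketch assumes that a lower execution_order runs first. The wrapped function receives the train loop object as its single argument, as described for fn_to_execute.

```python
class FunctionHook:
    """Hypothetical stand-in for a function-wrapping callback (not aitoolbox code)."""

    def __init__(self, fn_to_execute, epoch_end=False, train_end=False, execution_order=0):
        self.fn_to_execute = fn_to_execute
        self.epoch_end = epoch_end
        self.train_end = train_end
        self.execution_order = execution_order

    def on_epoch_end(self, train_loop):
        if self.epoch_end:
            self.fn_to_execute(train_loop)

    def on_train_end(self, train_loop):
        if self.train_end:
            self.fn_to_execute(train_loop)


class ToyTrainLoop:
    """Hypothetical train loop that only records what the hooks do."""

    def __init__(self, callbacks):
        # sorted() is stable, so callbacks with equal order keep registration order.
        self.callbacks = sorted(callbacks, key=lambda cb: cb.execution_order)
        self.epoch = 0
        self.log = []

    def fit(self, num_epochs):
        for epoch in range(num_epochs):
            self.epoch = epoch
            for cb in self.callbacks:
                cb.on_epoch_end(self)
        for cb in self.callbacks:
            cb.on_train_end(self)


loop = ToyTrainLoop([
    FunctionHook(lambda tl: tl.log.append(f"b@{tl.epoch}"), epoch_end=True, execution_order=1),
    FunctionHook(lambda tl: tl.log.append(f"a@{tl.epoch}"), epoch_end=True, execution_order=0),
    FunctionHook(lambda tl: tl.log.append("done"), train_end=True),
])
loop.fit(num_epochs=2)
print(loop.log)  # ['a@0', 'b@0', 'a@1', 'b@1', 'done']
```

Although the "b" hook is registered first, its higher execution_order places it after the "a" hook in every epoch.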

execute_callback()[source]

on_train_loop_registration()[source]

Execute callback initialization / preparation after the train_loop_object becomes available

Returns

None

on_epoch_begin()[source]

Logic executed at the beginning of the epoch

Returns

None

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns

None

on_train_begin()[source]

Logic executed at the beginning of the overall training

Returns

None

on_train_end()[source]

Logic executed at the end of the overall training

Returns

None

on_batch_begin()[source]

Logic executed before the batch is inserted into the model

Returns

None

on_batch_end()[source]

Logic executed after the batch is inserted into the model

Returns

None

on_after_gradient_update(optimizer_idx)[source]

Logic executed after the model gradients are updated

To ensure that this callback is executed, set self.train_loop_obj.grad_cb_used = True in on_train_loop_registration(). Otherwise the logic implemented here will not be executed by the TrainLoop.

Parameters

optimizer_idx (int) – index of the current optimizer. Mostly useful when using multiple optimizers. When only a single optimizer is used this parameter can be ignored.

Returns

None

on_after_optimizer_step()[source]

Logic executed after the optimizer does a new step and updates the model weights

To ensure that this callback is executed, set self.train_loop_obj.grad_cb_used = True in on_train_loop_registration(). Otherwise the logic implemented here will not be executed by the TrainLoop.

Returns

None