basic
- class aitoolbox.torchtrain.callbacks.basic.ListRegisteredCallbacks[source]
Bases:
AbstractCallback
List all the callbacks which are used in the current TrainLoop
- class aitoolbox.torchtrain.callbacks.basic.EarlyStopping(monitor='val_loss', min_delta=0.0, patience=0)[source]
Bases:
AbstractCallback
Early stopping of the training if the performance stops improving
- Parameters:
monitor (str) – performance measure that is tracked to decide if performance is improving during training
min_delta (float) – by how much the tracked performance measure needs to change to count as an improvement
patience (int) – how many epochs the early stopper waits after the last improvement before stopping the training
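The stopping rule can be sketched in plain Python (a minimal sketch of the usual early-stopping logic over a history of monitored values, not the callback's actual internals; the names `best` and `wait` are illustrative):

```python
def early_stopping_run(val_losses, min_delta=0.0, patience=0):
    """Return the epoch index at which training would stop, or None.

    A lower monitored value counts as an improvement, mirroring the
    'val_loss' default of the EarlyStopping callback.
    """
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if best - loss > min_delta:  # improved by more than min_delta
            best = loss
            wait = 0
        else:
            wait += 1
            if wait > patience:  # exhausted patience, stop here
                return epoch
    return None
```

With the default `patience=0`, training stops at the first epoch that brings no improvement larger than `min_delta`.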
- class aitoolbox.torchtrain.callbacks.basic.ThresholdEarlyStopping(monitor, threshold, patience=0)[source]
Bases:
AbstractCallback
Early stopping of the training if the performance doesn’t reach the specified threshold
- Parameters:
monitor (str) – performance measure that is tracked to decide if performance reached the desired threshold
threshold (float) – performance threshold that needs to be exceeded in order to continue training
patience (int) – how many epochs the early stopper waits for the tracked performance to reach the desired threshold
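The threshold rule described above can be sketched as follows (a self-contained toy that mirrors the documented semantics, not the callback's actual implementation):

```python
def threshold_early_stopping_run(metric_history, threshold, patience=0):
    """Return the epoch at which training stops because the monitored
    metric never exceeded `threshold` within `patience` epochs,
    or None if the threshold was reached in time.
    """
    for epoch, value in enumerate(metric_history):
        if value > threshold:
            return None          # threshold exceeded, keep training
        if epoch >= patience:
            return epoch         # waited long enough, stop
    return None
```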
- class aitoolbox.torchtrain.callbacks.basic.TerminateOnNaN(monitor='loss')[source]
Bases:
AbstractCallback
Terminate the training if NaN values are predicted and the tracked metrics consequently become NaN
- Parameters:
monitor (str) – performance measure that is tracked to decide if performance is improving during training
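The underlying check amounts to testing the monitored metric for NaN each epoch; a minimal stand-alone sketch (not the callback's actual code):

```python
import math


def should_terminate_on_nan(metric_value):
    """Mirror the TerminateOnNaN idea: signal a stop as soon as the
    monitored metric (the loss by default) turns into NaN."""
    return isinstance(metric_value, float) and math.isnan(metric_value)
```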
- class aitoolbox.torchtrain.callbacks.basic.AllPredictionsSame(value=0.0, stop_training=False, verbose=True)[source]
Bases:
AbstractCallback
Checks if all the predicted values are the same
Useful, for example, when dealing with extremely imbalanced classes where the model can collapse into predicting a single constant value.
- Parameters:
value (float) – the constant value the predictions are checked against
stop_training (bool) – should the training be stopped if all the predictions turn out to be the same
verbose (bool) – should a warning message be printed when all the predictions are found to be the same
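The check itself is simple; a stand-alone sketch of the idea (illustrative, not the callback's actual code):

```python
def all_predictions_same(predictions, value=0.0):
    """Flag a batch of predictions that all collapsed onto the given
    constant value — a common failure mode on extremely imbalanced
    datasets where the model only ever predicts the majority class."""
    return len(predictions) > 0 and all(p == value for p in predictions)
```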
- class aitoolbox.torchtrain.callbacks.basic.EmailNotification(sender_name, sender_email, recipient_email, project_name=None, experiment_name=None, aws_region='eu-west-1')[source]
Bases:
AbstractCallback
Notify user via email about the training progression
- Parameters:
sender_name (str) – Name of the email sender
sender_email (str) – Email of the email sender
recipient_email (str) – Email where the email will be sent
project_name (str or None) – root name of the project
experiment_name (str or None) – name of the particular experiment
aws_region (str) – AWS SES region
- get_metric_list_html()[source]
Generate performance metrics list HTML
- Returns:
performance metrics list as an HTML document
- Return type:
str
- class aitoolbox.torchtrain.callbacks.basic.LogUpload(log_file_path='~/project/training.log', fail_if_cloud_missing=True, project_name=None, experiment_name=None, local_model_result_folder_path=None, cloud_save_mode=None, bucket_name=None, cloud_dir_prefix=None)[source]
Bases:
AbstractExperimentCallback
Upload logging file to the cloud storage
Uploading happens after each epoch and at the end of the training process.
- Parameters:
log_file_path (str) – path to the local logging file
fail_if_cloud_missing (bool) – should throw the exception if cloud saving is not available
project_name (str or None) – root name of the project
experiment_name (str or None) – name of the particular experiment
local_model_result_folder_path (str or None) – root local path where project folder will be created
cloud_save_mode (str or None) – storage destination selector. For AWS S3: ‘s3’ / ‘aws_s3’ / ‘aws’. For Google Cloud Storage: ‘gcs’ / ‘google_storage’ / ‘google storage’. Everything else results in local saving to disk only.
bucket_name (str) – name of the bucket in the cloud storage
cloud_dir_prefix (str) – path to the folder inside the bucket where the experiments are going to be saved
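The `cloud_save_mode` selection documented above can be expressed as a small dispatch (a sketch of the documented string matching, with illustrative return labels):

```python
def resolve_cloud_save_mode(cloud_save_mode):
    """Map the cloud_save_mode string onto a storage destination,
    following the documented selector values."""
    if cloud_save_mode in ('s3', 'aws_s3', 'aws'):
        return 'aws_s3'
    if cloud_save_mode in ('gcs', 'google_storage', 'google storage'):
        return 'gcs'
    return 'local'  # everything else falls back to local disk
```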
- class aitoolbox.torchtrain.callbacks.basic.DataSubsetTestRun(num_train_batches=1, num_val_batches=0, num_test_batches=0)[source]
Bases:
AbstractCallback
Subset the provided data loaders to execute neural net only on a small dataset subset
This is especially useful when first developing the neural architectures and debugging them. Subsetting the full dataset helps with fast development iterations.
- Parameters:
num_train_batches (int) – number of batches to keep in the train data loader subset
num_val_batches (int) – number of batches to keep in the validation data loader subset
num_test_batches (int) – number of batches to keep in the test data loader subset
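Conceptually the subsetting just truncates each iterable data loader to its first few batches; a minimal stand-alone sketch (not the callback's actual internals):

```python
from itertools import islice


def subset_batches(data_loader, num_batches=1):
    """Keep only the first `num_batches` batches from an (iterable)
    data loader — a quick way to smoke-test a training loop on a
    tiny fraction of the data during architecture development."""
    return list(islice(data_loader, num_batches))
```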
- class aitoolbox.torchtrain.callbacks.basic.FunctionOnTrainLoop(fn_to_execute, tl_registration=False, epoch_begin=False, epoch_end=False, train_begin=False, train_end=False, batch_begin=False, batch_end=False, after_gradient_update=False, after_optimizer_step=False, execution_order=0, device_idx_execution=None)[source]
Bases:
AbstractCallback
Execute given function as a callback in the TrainLoop
- Parameters:
fn_to_execute (function) – function logic to be executed at the desired point of the TrainLoop. The function should take a single input as an argument which is the reference to the encapsulating TrainLoop object (self.train_loop_obj).
tl_registration (bool) – should execute on TrainLoop registration
epoch_begin (bool) – should execute at the beginning of the epoch
epoch_end (bool) – should execute at the end of the epoch
train_begin (bool) – should execute at the beginning of the training
train_end (bool) – should execute at the end of the training
batch_begin (bool) – should execute at the beginning of the batch
batch_end (bool) – should execute at the end of the batch
after_gradient_update (bool) – should execute after the gradient update
after_optimizer_step (bool) – should execute after the optimizer step
execution_order (int) – order of the callback execution. If all the used callbacks have the orders set to 0, then the callbacks are executed in the order they were registered.
device_idx_execution (int or None) – index of the (GPU) device on which the callback should be executed
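The flag-based dispatch can be sketched with a self-contained toy (the `FunctionCallback` class, the stand-in loop object, and the hook names below are illustrative, not the actual TrainLoop machinery):

```python
class FunctionCallback:
    """Toy version of FunctionOnTrainLoop's flag-based dispatch: the
    wrapped function runs only at the hook points whose flag was set
    to True, and always receives the train loop object as its single
    argument."""

    def __init__(self, fn_to_execute, epoch_end=False, train_end=False):
        self.fn_to_execute = fn_to_execute
        self.epoch_end = epoch_end
        self.train_end = train_end

    def on_epoch_end(self, train_loop_obj):
        if self.epoch_end:
            self.fn_to_execute(train_loop_obj)

    def on_train_end(self, train_loop_obj):
        if self.train_end:
            self.fn_to_execute(train_loop_obj)


# Usage: record the epochs at which the function fired
events = []
loop = type('Loop', (), {'epoch': 0})()   # stand-in for the TrainLoop
cb = FunctionCallback(lambda tl: events.append(tl.epoch), epoch_end=True)
for loop.epoch in range(3):
    cb.on_epoch_end(loop)
cb.on_train_end(loop)   # train_end flag is False, so nothing happens
```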
- on_train_loop_registration()[source]
Execute callback initialization / preparation after the train_loop_object becomes available
- Returns:
None
- on_after_gradient_update(optimizer_idx)[source]
Logic executed after the model gradients are updated
To ensure the execution of this callback enable the self.train_loop_obj.grad_cb_used = True option in the on_train_loop_registration(). Otherwise, logic implemented here will not be executed by the TrainLoop.
- Parameters:
optimizer_idx (int) – index of the current optimizer. Mostly useful when using multiple optimizers. When only a single optimizer is used this parameter can be ignored.
- Returns:
None
- on_after_optimizer_step()[source]
Logic executed after the optimizer does a new step and updates the model weights
To ensure the execution of this callback enable the self.train_loop_obj.grad_cb_used = True option in the on_train_loop_registration(). Otherwise, logic implemented here will not be executed by the TrainLoop.
- Returns:
None