gradient

class aitoolbox.torchtrain.callbacks.gradient.GradientCallbackBase(callback_name, execution_order=0)[source]

Bases: aitoolbox.torchtrain.callbacks.abstract.AbstractCallback

Base abstract class for gradient related callbacks

It implements no logic except for enabling the grad_cb_used option inside the TrainLoop as part of on_train_loop_registration(). Consequently, this task, which would otherwise have to be repeated in every gradient calculation callback, does not need to be re-implemented in each derived callback.

Parameters
  • callback_name (str) – name of the callback

  • execution_order (int) – order of the callback execution. If all the used callbacks have their orders set to 0, then the callbacks are executed in the order they were registered.

on_train_loop_registration()[source]

Execute callback initialization / preparation after the train_loop_object becomes available

Returns

None
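A minimal pure-Python sketch of what this base class automates (the class names below are hypothetical stand-ins, not the actual aitoolbox classes): during callback registration, the base class flips the TrainLoop's grad_cb_used flag so that gradient callbacks get executed.

```python
class TrainLoopStub:
    """Stand-in for aitoolbox's TrainLoop; models only the grad_cb_used flag."""
    def __init__(self):
        self.grad_cb_used = False


class GradientCallbackBaseSketch:
    """Hypothetical mimic of GradientCallbackBase's registration behavior."""
    def __init__(self, callback_name, execution_order=0):
        self.callback_name = callback_name
        self.execution_order = execution_order
        self.train_loop_obj = None

    def register_train_loop_object(self, train_loop_obj):
        # In aitoolbox the TrainLoop hands itself to the callback; once the
        # train loop object is available, on_train_loop_registration() runs.
        self.train_loop_obj = train_loop_obj
        self.on_train_loop_registration()

    def on_train_loop_registration(self):
        # The one piece of logic the base class provides: enable gradient
        # callbacks so on_after_gradient_update() will be called.
        self.train_loop_obj.grad_cb_used = True


loop = TrainLoopStub()
cb = GradientCallbackBaseSketch('my_grad_cb')
cb.register_train_loop_object(loop)
print(loop.grad_cb_used)  # True
```

Subclasses therefore only need to implement their gradient logic; the flag handling comes for free.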

class aitoolbox.torchtrain.callbacks.gradient.GradValueClip(max_grad_value)[source]

Bases: aitoolbox.torchtrain.callbacks.gradient.GradientCallbackBase

Gradient value clipping

Parameters

max_grad_value (int or float) – maximum allowed value of the gradients

on_after_gradient_update(optimizer_idx)[source]

Logic executed after the model gradients are updated

To ensure the execution of this callback, enable the self.train_loop_obj.grad_cb_used = True option in on_train_loop_registration(). Otherwise, the logic implemented here will not be executed by the TrainLoop.

Parameters

optimizer_idx (int) – index of the current optimizer. Mostly useful when using multiple optimizers. When only a single optimizer is used this parameter can be ignored.

Returns

None
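To illustrate what gradient value clipping does (in the spirit of PyTorch's torch.nn.utils.clip_grad_value_, which a callback like this would plausibly wrap; the helper below is a hypothetical sketch, not the callback's actual implementation): every gradient element is clamped into the range [-max_grad_value, max_grad_value].

```python
def clip_grad_values(grads, max_grad_value):
    """Clamp each gradient value into [-max_grad_value, max_grad_value]."""
    return [max(-max_grad_value, min(max_grad_value, g)) for g in grads]


grads = [0.5, -3.2, 1.7, -0.1]
print(clip_grad_values(grads, 1.0))  # [0.5, -1.0, 1.0, -0.1]
```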

class aitoolbox.torchtrain.callbacks.gradient.GradNormClip(max_grad_norm, **kwargs)[source]

Bases: aitoolbox.torchtrain.callbacks.gradient.GradientCallbackBase

Gradient norm clipping

Parameters
  • max_grad_norm (int or float) – maximum allowed norm of the gradients

  • **kwargs – additional keyword arguments for the gradient norm clipping procedure

on_after_gradient_update(optimizer_idx)[source]

Logic executed after the model gradients are updated

To ensure the execution of this callback, enable the self.train_loop_obj.grad_cb_used = True option in on_train_loop_registration(). Otherwise, the logic implemented here will not be executed by the TrainLoop.

Parameters

optimizer_idx (int) – index of the current optimizer. Mostly useful when using multiple optimizers. When only a single optimizer is used this parameter can be ignored.

Returns

None
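Gradient norm clipping differs from value clipping: instead of clamping each element independently, the whole gradient vector is rescaled when its overall L2 norm exceeds max_grad_norm (in the spirit of torch.nn.utils.clip_grad_norm_; the function below is a hypothetical self-contained sketch).

```python
import math


def clip_grad_norm(grads, max_grad_norm):
    """Rescale the gradient vector so its L2 norm does not exceed max_grad_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_grad_norm:
        scale = max_grad_norm / total_norm
        grads = [g * scale for g in grads]
    return grads


grads = [3.0, 4.0]                    # L2 norm = 5.0
clipped = clip_grad_norm(grads, 1.0)  # rescaled to norm 1.0, approximately [0.6, 0.8]
print(clipped)
```

Because the direction of the gradient is preserved and only its magnitude is capped, norm clipping is often preferred for stabilizing training, e.g. in recurrent networks.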

class aitoolbox.torchtrain.callbacks.gradient.GradientStatsPrint(model_layers_extract_def, on_every_grad_update=False)[source]

Bases: aitoolbox.torchtrain.callbacks.abstract.AbstractCallback

Model gradients statistics reporting

Parameters
  • model_layers_extract_def (lambda or function) – lambda/function accepting model as the input and returning a list of all the layers in the model for which the gradient stats should be calculated

  • on_every_grad_update (bool) – whether the gradient stats should be calculated on every gradient update (i.e. after every batch) or only at the end of the epoch

on_train_loop_registration()[source]

Execute callback initialization / preparation after the train_loop_object becomes available

Returns

None

on_after_gradient_update(optimizer_idx)[source]

Logic executed after the model gradients are updated

To ensure the execution of this callback, enable the self.train_loop_obj.grad_cb_used = True option in on_train_loop_registration(). Otherwise, the logic implemented here will not be executed by the TrainLoop.

Parameters

optimizer_idx (int) – index of the current optimizer. Mostly useful when using multiple optimizers. When only a single optimizer is used this parameter can be ignored.

Returns

None

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns

None

gradients_report()[source]
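As a rough sketch of the kind of per-layer summary such a reporting callback could produce (the exact metrics reported by GradientStatsPrint are an assumption here; this is an illustrative stand-in, not the callback's actual implementation):

```python
import statistics


def gradient_stats(layer_grads):
    """Summary statistics over a flat list of one layer's gradient values."""
    return {
        'mean': statistics.fmean(layer_grads),
        'std': statistics.pstdev(layer_grads),
        'min': min(layer_grads),
        'max': max(layer_grads),
    }


stats = gradient_stats([0.1, -0.2, 0.4, 0.3])
print(stats['min'], stats['max'])  # -0.2 0.4
```

The model_layers_extract_def parameter selects which layers such statistics are computed for.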
class aitoolbox.torchtrain.callbacks.gradient.GradDistributionPlot(model_layers_extract_def, grad_plots_dir_name='grad_distribution', file_format='png', project_name=None, experiment_name=None, local_model_result_folder_path=None, cloud_save_mode=None, bucket_name=None, cloud_dir_prefix=None)[source]

Bases: aitoolbox.torchtrain.callbacks.abstract.AbstractExperimentCallback

Plot layers’ gradient distributions after every epoch

Parameters
  • model_layers_extract_def (lambda or function) – lambda/function accepting model as the input and returning a list of all the layers in the model for which the gradient stats should be calculated

  • grad_plots_dir_name (str) – name of the folder where gradient distribution plots are saved after every epoch

  • file_format (str) – output file format. Can be either ‘png’ for saving separate images or ‘pdf’ for combining all the plots into a single pdf file.

  • project_name (str or None) – root name of the project

  • experiment_name (str or None) – name of the particular experiment

  • local_model_result_folder_path (str or None) – root local path where project folder will be created

  • cloud_save_mode (str or None) – storage destination selector. For AWS S3: ‘s3’ / ‘aws_s3’ / ‘aws’. For Google Cloud Storage: ‘gcs’ / ‘google_storage’ / ‘google storage’. Any other value results in local storage to disk only.

  • bucket_name (str) – name of the bucket in the cloud storage

  • cloud_dir_prefix (str) – path to the folder inside the bucket where the experiments are going to be saved

on_train_loop_registration()[source]

Execute callback initialization / preparation after the train_loop_object becomes available

Returns

None

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns

None

gradient_plot()[source]
save_to_cloud(saved_plot_paths)[source]
create_plot_dirs()[source]
prepare_results_saver()[source]
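A hypothetical illustration of the model_layers_extract_def parameter shared by GradientStatsPrint and GradDistributionPlot: given the model, return the list of layers whose gradients should be inspected. With a real PyTorch model this might look like `lambda model: [model.encoder, model.decoder]`; a stub model is used below so the sketch stays self-contained.

```python
class ModelStub:
    """Stand-in for a PyTorch model with two named layers."""
    def __init__(self):
        self.encoder = 'encoder-layer'
        self.decoder = 'decoder-layer'


# Extraction function handed to the callback: model in, list of layers out.
model_layers_extract_def = lambda model: [model.encoder, model.decoder]

print(model_layers_extract_def(ModelStub()))  # ['encoder-layer', 'decoder-layer']
```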