multi_loss_optim

class aitoolbox.torchtrain.multi_loss_optim.MultiLoss(loss_dict, loss_optimizer_map=None, retain_graph_until_last=True)[source]

Bases: MutableMapping

Multiple loss wrapper for TrainLoop based training

Internally this class is based on a dict. On the outside it behaves like a Python dict with several multi-loss specific extensions.

Parameters:
  • loss_dict (dict) – dict of loss objects which are used to calculate losses in the TrainLoop

  • loss_optimizer_map (dict or None) – dict mapping the loss name to the corresponding optimizer’s index in the MultiOptimizer. If this parameter is left as None, the mapping is automatically created by assigning values from range(len(loss_dict)) as the corresponding optimizer indices.

  • retain_graph_until_last (bool) – whether the retain_graph option should be enabled for all but the last loss tensor when calling backward()
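
A minimal construction sketch, assuming the dict values are the already computed loss tensors for a batch (the tensor-style methods listed further below suggest this usage); all variable names here are illustrative:

    import torch
    import torch.nn.functional as F
    from aitoolbox.torchtrain.multi_loss_optim import MultiLoss

    # Illustrative stand-ins for real model outputs and targets
    logits = torch.randn(4, 3, requires_grad=True)
    targets = torch.tensor([0, 2, 1, 0])
    reconstruction = torch.randn(4, 8, requires_grad=True)
    inputs = torch.randn(4, 8)

    # Keys are arbitrary loss names; with loss_optimizer_map=None they are mapped
    # to optimizer indices 0, 1, ... following range(len(loss_dict)).
    multi_loss = MultiLoss({
        'classification_loss': F.cross_entropy(logits, targets),
        'reconstruction_loss': F.mse_loss(reconstruction, inputs)
    })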

backward(optimizer_idx, iteration, amp_grad_scaler)[source]

Executes backward() for the specific loss based on the provided optimizer_idx

Parameters:
  • optimizer_idx (int) – index of the current optimizer. Mostly useful when using multiple optimizers. When only a single optimizer is used this parameter can be ignored.

  • iteration (int) – Current iteration index. Not used in the simplest setup, but provided in case more elaborate loss backward logic is devised.

  • amp_grad_scaler (torch.cuda.amp.GradScaler) – AMP GradScaler. If the scaler’s enabled parameter is set to False, the loss is still passed to it but is returned unscaled, so the behaviour matches that of non-AMP training.

Returns:

None
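
A hedged sketch of driving backward() directly, with a disabled GradScaler so the losses pass through it unscaled as described above; the two toy losses and the per-index loop are illustrative only:

    import torch
    from torch.cuda.amp import GradScaler
    from aitoolbox.torchtrain.multi_loss_optim import MultiLoss

    x = torch.randn(4, 2, requires_grad=True)
    multi_loss = MultiLoss({'loss_a': x.pow(2).mean(), 'loss_b': x.abs().mean()})

    scaler = GradScaler(enabled=False)  # disabled scaler: losses are returned unscaled

    # One backward() call per optimizer index; with retain_graph_until_last=True
    # every loss except the last keeps its graph so the later calls still work.
    for optimizer_idx in range(2):
        multi_loss.backward(optimizer_idx, iteration=0, amp_grad_scaler=scaler)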

item()[source]
numpy()[source]
detach()[source]
cpu(*args, **kwargs)[source]
cuda(*args, **kwargs)[source]
to(*args, **kwargs)[source]
property device
get_loss_dict()[source]
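
Since MultiLoss is a MutableMapping, plain dict-style access works alongside the tensor-style helpers above; a small sketch (the exact return types of item(), numpy(), etc. are not asserted here):

    import torch
    from aitoolbox.torchtrain.multi_loss_optim import MultiLoss

    x = torch.randn(3, requires_grad=True)
    multi_loss = MultiLoss({'loss_a': x.pow(2).mean(), 'loss_b': x.abs().mean()})

    # Dict-style behaviour inherited from MutableMapping
    print(len(multi_loss))            # 2
    print(list(multi_loss.keys()))    # ['loss_a', 'loss_b']
    print(multi_loss['loss_a'])       # the wrapped loss tensor

    # Underlying dict of wrapped losses
    print(multi_loss.get_loss_dict())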
class aitoolbox.torchtrain.multi_loss_optim.MultiOptimizer(optimizer_list)[source]

Bases: object

Multiple optimizer wrapper for TrainLoop based training

Parameters:

optimizer_list (list) – list of optimizer objects which are used in the TrainLoop
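
A minimal construction sketch, assuming one optimizer per wrapped loss; the two sub-networks and hyperparameters are illustrative:

    import torch
    from aitoolbox.torchtrain.multi_loss_optim import MultiOptimizer

    # Two illustrative parameter groups, e.g. belonging to two sub-networks
    net_a = torch.nn.Linear(8, 4)
    net_b = torch.nn.Linear(8, 2)

    # The list order defines the optimizer indices used by step() and zero_grad()
    multi_optimizer = MultiOptimizer([
        torch.optim.Adam(net_a.parameters(), lr=1e-3),
        torch.optim.SGD(net_b.parameters(), lr=1e-2),
    ])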

step(optimizer_idx, iteration, amp_grad_scaler)[source]

Execute step() for the optimizer at the specified index

Parameters:
  • optimizer_idx (int) – index of the current optimizer. Mostly useful when using multiple optimizers. When only a single optimizer is used this parameter can be ignored.

  • iteration (int) – Current iteration index. Not used in the simplest setup, but provided in case more elaborate optimizer step logic is devised.

  • amp_grad_scaler (torch.cuda.amp.GradScaler) – AMP GradScaler. If the scaler’s enabled parameter is set to False, the optimizer has its normal step() method called without the AMP-mandated unscaling being applied beforehand. In this respect the behaviour is the same as in non-AMP training.

Returns:

None
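
A short sketch of stepping a single wrapped optimizer with a disabled GradScaler, so the plain step() path described above is taken; the model and loss are placeholders:

    import torch
    from torch.cuda.amp import GradScaler
    from aitoolbox.torchtrain.multi_loss_optim import MultiOptimizer

    model = torch.nn.Linear(4, 2)
    multi_optimizer = MultiOptimizer([torch.optim.Adam(model.parameters(), lr=1e-3)])
    scaler = GradScaler(enabled=False)  # disabled: normal step() without AMP unscaling

    loss = model(torch.randn(8, 4)).pow(2).mean()
    loss.backward()
    multi_optimizer.step(optimizer_idx=0, iteration=0, amp_grad_scaler=scaler)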

zero_grad(optimizer_idx, iteration)[source]

Execute zero_grad() for the optimizer at the specified index

Parameters:
  • optimizer_idx (int) – index of the current optimizer. Mostly useful when using multiple optimizers. When only a single optimizer is used this parameter can be ignored.

  • iteration (int) – Current iteration index. Not used in the simplest setup, but provided in case more elaborate gradient zeroing logic is devised.

Returns:

None
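
Putting the two wrappers together, a hedged sketch of a single manual training iteration using the default loss-to-optimizer index mapping (loss insertion order matches optimizer list order); in normal usage the TrainLoop drives these calls itself:

    import torch
    from torch.cuda.amp import GradScaler
    from aitoolbox.torchtrain.multi_loss_optim import MultiLoss, MultiOptimizer

    # Two independent sub-networks, each trained with its own loss and optimizer
    net_a = torch.nn.Linear(8, 4)
    net_b = torch.nn.Linear(8, 2)
    multi_optimizer = MultiOptimizer([
        torch.optim.Adam(net_a.parameters(), lr=1e-3),
        torch.optim.Adam(net_b.parameters(), lr=1e-3),
    ])
    scaler = GradScaler(enabled=False)

    batch = torch.randn(16, 8)
    multi_loss = MultiLoss({
        'loss_a': net_a(batch).pow(2).mean(),   # mapped to optimizer index 0
        'loss_b': net_b(batch).pow(2).mean(),   # mapped to optimizer index 1
    })

    iteration = 0
    for optimizer_idx in range(2):
        multi_optimizer.zero_grad(optimizer_idx, iteration)
        multi_loss.backward(optimizer_idx, iteration, amp_grad_scaler=scaler)
        multi_optimizer.step(optimizer_idx, iteration, amp_grad_scaler=scaler)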

state_dict()[source]
load_state_dict(state_dict_list)[source]
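
state_dict() appears to gather the state of every wrapped optimizer and load_state_dict(state_dict_list) to restore it; a hedged checkpointing sketch (the exact structure of the returned state is an assumption, not confirmed here):

    import torch
    from aitoolbox.torchtrain.multi_loss_optim import MultiOptimizer

    net_a = torch.nn.Linear(8, 4)
    net_b = torch.nn.Linear(8, 2)
    multi_optimizer = MultiOptimizer([
        torch.optim.Adam(net_a.parameters(), lr=1e-3),
        torch.optim.Adam(net_b.parameters(), lr=1e-3),
    ])

    # Save the combined optimizer state (assumed: a list of per-optimizer state dicts)
    torch.save(multi_optimizer.state_dict(), 'multi_optimizer_checkpoint.pth')

    # Restore into a MultiOptimizer built with the same optimizer layout
    multi_optimizer.load_state_dict(torch.load('multi_optimizer_checkpoint.pth'))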