multi_loss_optim
- class aitoolbox.torchtrain.multi_loss_optim.MultiLoss(loss_dict, loss_optimizer_map=None, retain_graph_until_last=True)[source]
Bases: MutableMapping
Multiple loss wrapper for TrainLoop-based training
Internally this class is based on a dict. On the outside it behaves like a standard Python dict, with several multi-loss specific extensions.
- Parameters:
loss_dict (dict) – dict of loss objects which are used to calculate losses in the TrainLoop
loss_optimizer_map (dict or None) – dict mapping each loss name to the index of its corresponding optimizer in the MultiOptimizer. If this parameter is left as None, the mapping is created automatically by assigning values from range(len(loss_dict)) as the corresponding optimizer indices (see the construction sketch after this parameter list).
retain_graph_until_last (bool) – whether the retain_graph option should be enabled for all but the last loss tensor when calling backward()
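For illustration, a minimal construction sketch (not taken from the library docs): the two-task setup, the criteria and the tensors below are hypothetical, only MultiLoss itself comes from aitoolbox.

```python
import torch
import torch.nn as nn

from aitoolbox.torchtrain.multi_loss_optim import MultiLoss

# Hypothetical two-task setup: tensors and criteria are illustrative only
logits = torch.randn(8, 10, requires_grad=True)
targets = torch.randint(0, 10, (8,))
reconstruction = torch.randn(8, 32, requires_grad=True)
inputs = torch.randn(8, 32)

loss_cls = nn.CrossEntropyLoss()(logits, targets)
loss_rec = nn.MSELoss()(reconstruction, inputs)

multi_loss = MultiLoss(
    loss_dict={'cls_loss': loss_cls, 'rec_loss': loss_rec},
    # each loss name -> index of its optimizer inside the MultiOptimizer;
    # leaving this as None assigns indices 0..len(loss_dict)-1 automatically
    loss_optimizer_map={'cls_loss': 0, 'rec_loss': 1}
)
```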
- backward(optimizer_idx, iteration, amp_grad_scaler)[source]
Executes backward() for the specific loss based on the provided optimizer_idx
- Parameters:
optimizer_idx (int) – index of the current optimizer. Mostly useful when using multiple optimizers; when only a single optimizer is used, this parameter can be ignored.
iteration (int) – current iteration index. Not used in the simplest setup, but provided in case more elaborate loss backward logic is devised.
amp_grad_scaler (torch.cuda.amp.GradScaler) – AMP GradScaler. If the scaler's enabled parameter is set to False, the loss is still passed to it but is returned unscaled, so the behaviour is the same as in non-AMP training (see the sketch below).
- Returns:
None
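A hedged sketch of calling backward() by hand, continuing the multi_loss object from the sketch above; in normal use the TrainLoop makes these calls internally. A GradScaler constructed with enabled=False passes the losses through unscaled, mirroring non-AMP training as described above.

```python
from torch.cuda.amp import GradScaler

# With enabled=False the scaler is a pass-through, so this also runs on CPU
scaler = GradScaler(enabled=False)

# Back-propagate the loss mapped to optimizer index 0; with
# retain_graph_until_last=True its graph is kept alive for the later call
multi_loss.backward(optimizer_idx=0, iteration=0, amp_grad_scaler=scaler)
# Back-propagate the loss mapped to optimizer index 1 (the last loss)
multi_loss.backward(optimizer_idx=1, iteration=0, amp_grad_scaler=scaler)
```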
- property device
- class aitoolbox.torchtrain.multi_loss_optim.MultiOptimizer(optimizer_list)[source]
Bases: object
Multiple optimizer wrapper for TrainLoop-based training
- Parameters:
optimizer_list (list) – list of optimizer objects which are used in the TrainLoop
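A construction sketch, assuming a hypothetical two-component model with one optimizer per component; the position in the list defines the optimizer index that loss_optimizer_map in MultiLoss refers to.

```python
import torch.nn as nn
import torch.optim as optim

from aitoolbox.torchtrain.multi_loss_optim import MultiOptimizer

# Hypothetical two-component model, one optimizer per component
encoder = nn.Linear(32, 16)
decoder = nn.Linear(16, 32)

multi_optimizer = MultiOptimizer([
    optim.Adam(encoder.parameters(), lr=1e-3),  # optimizer index 0
    optim.SGD(decoder.parameters(), lr=1e-2)    # optimizer index 1
])
```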
- step(optimizer_idx, iteration, amp_grad_scaler)[source]
Execute step() for the optimizer at the specified index
- Parameters:
optimizer_idx (int) – index of the current optimizer. Mostly useful when using multiple optimizers; when only a single optimizer is used, this parameter can be ignored.
iteration (int) – current iteration index. Not used in the simplest setup, but provided in case more elaborate optimizer stepping logic is devised.
amp_grad_scaler (torch.cuda.amp.GradScaler) – AMP GradScaler. If the scaler's enabled parameter is set to False, the optimizer has its normal step() method called without the AMP-mandated gradient unscaling applied beforehand. In this respect the behaviour is the same as in non-AMP training.
- Returns:
None
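Stepping a single wrapped optimizer by hand might look as follows; a sketch continuing the multi_optimizer object above, normally driven by the TrainLoop.

```python
from torch.cuda.amp import GradScaler

scaler = GradScaler(enabled=False)

# Step only the optimizer at index 1 (the decoder's SGD in the sketch above),
# leaving the other wrapped optimizer untouched
multi_optimizer.step(optimizer_idx=1, iteration=0, amp_grad_scaler=scaler)
```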
- zero_grad(optimizer_idx, iteration)[source]
Execute zero_grad() for the optimizer at the specified index
- Parameters:
optimizer_idx (int) – index of the current optimizer. Mostly useful when using multiple optimizers; when only a single optimizer is used, this parameter can be ignored.
iteration (int) – current iteration index. Not used in the simplest setup, but provided in case more elaborate gradient zeroing logic is devised.
- Returns:
None
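Putting the two wrappers together, a hedged sketch of one possible manual training iteration, reusing the hypothetical encoder, decoder, inputs and multi_optimizer objects from the sketches above. The zero_grad → backward → step ordering per optimizer index is an assumption about typical usage, not a description of TrainLoop internals.

```python
import torch.nn as nn
from torch.cuda.amp import GradScaler

from aitoolbox.torchtrain.multi_loss_optim import MultiLoss

scaler = GradScaler(enabled=False)

for iteration in range(3):  # illustrative iteration count
    # Rebuild the loss tensors each iteration; hid_loss is an illustrative
    # auxiliary loss so that each optimizer index has a loss mapped to it
    hidden = encoder(inputs)
    loss_hid = hidden.pow(2).mean()
    loss_rec = nn.MSELoss()(decoder(hidden), inputs)

    # Automatic mapping: 'hid_loss' -> optimizer 0, 'rec_loss' -> optimizer 1
    multi_loss = MultiLoss({'hid_loss': loss_hid, 'rec_loss': loss_rec})

    for opt_idx in range(2):
        multi_optimizer.zero_grad(optimizer_idx=opt_idx, iteration=iteration)
    for opt_idx in range(2):
        multi_loss.backward(optimizer_idx=opt_idx, iteration=iteration,
                            amp_grad_scaler=scaler)
    for opt_idx in range(2):
        multi_optimizer.step(optimizer_idx=opt_idx, iteration=iteration,
                             amp_grad_scaler=scaler)
```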