warmup

class aitoolbox.torchtrain.schedulers.warmup.ConstantWithWarmupScheduler(num_warmup_steps, last_epoch=-1, **kwargs)[source]

Bases: LambdaLRScheduler

Constant scheduler with an initial warmup

Schedule with a constant learning rate, preceded by a warmup period during which the learning rate increases linearly from 0 to the initial lr set in the optimizer.

https://huggingface.co/transformers/main_classes/optimizer_schedules.html#transformers.get_constant_schedule_with_warmup

Parameters:
  • num_warmup_steps (int) – The number of steps for the warmup phase

  • last_epoch (int) – The index of the last epoch when resuming training

  • **kwargs – learning rate scheduler additional parameters
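
Since the class is based on LambdaLRScheduler, the underlying curve can be sketched directly with PyTorch's LambdaLR. Below is a minimal sketch of the constant-with-warmup schedule following the linked HuggingFace formula; the toy model, optimizer, and step counts are illustrative assumptions, not part of the aitoolbox API:

    import torch
    from torch.optim.lr_scheduler import LambdaLR

    model = torch.nn.Linear(10, 2)  # toy model, for illustration only
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

    num_warmup_steps = 100  # assumed value for the sketch

    def lr_lambda(current_step):
        # Linear warmup from 0 up to the optimizer's initial lr ...
        if current_step < num_warmup_steps:
            return float(current_step) / float(max(1, num_warmup_steps))
        # ... then hold the lr constant at its initial value
        return 1.0

    scheduler = LambdaLR(optimizer, lr_lambda)

    for step in range(200):
        optimizer.step()
        scheduler.step()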

class aitoolbox.torchtrain.schedulers.warmup.CosineWithWarmupScheduler(num_warmup_steps, num_training_steps, num_cycles=0.5, last_epoch=-1, **kwargs)[source]

Bases: LambdaLRScheduler

Cosine decreasing scheduler with an initial warmup

Schedule with a learning rate that decreases following the values of the cosine function from the initial lr set in the optimizer down to 0, after a warmup period during which it increases linearly from 0 to the initial lr set in the optimizer.

https://huggingface.co/transformers/main_classes/optimizer_schedules.html#transformers.get_cosine_schedule_with_warmup

Parameters:
  • num_warmup_steps (int) – The number of steps for the warmup phase

  • num_training_steps (int) – The total number of training steps

  • num_cycles (float) – The number of waves in the cosine schedule (the default is to decrease from the max value to 0 following a half-cosine).

  • last_epoch (int) – The index of the last epoch when resuming training

  • **kwargs – learning rate scheduler additional parameters
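
A minimal LambdaLR sketch of the same curve, following the linked HuggingFace formula (the model, optimizer, and step counts below are illustrative assumptions; num_cycles=0.5 mirrors the class default):

    import math
    import torch
    from torch.optim.lr_scheduler import LambdaLR

    model = torch.nn.Linear(10, 2)  # toy model, for illustration only
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

    num_warmup_steps, num_training_steps, num_cycles = 100, 1000, 0.5  # assumed

    def lr_lambda(current_step):
        # Linear warmup from 0 up to the optimizer's initial lr
        if current_step < num_warmup_steps:
            return float(current_step) / float(max(1, num_warmup_steps))
        # Progress through the post-warmup part of training, in [0, 1]
        progress = float(current_step - num_warmup_steps) / float(
            max(1, num_training_steps - num_warmup_steps))
        # Half-cosine decay from 1 down to 0 when num_cycles == 0.5
        return max(0.0, 0.5 * (1.0 + math.cos(math.pi * num_cycles * 2.0 * progress)))

    scheduler = LambdaLR(optimizer, lr_lambda)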

class aitoolbox.torchtrain.schedulers.warmup.HardRestartsCosineWithWarmupScheduler(num_warmup_steps, num_training_steps, num_cycles=0.5, last_epoch=-1, **kwargs)[source]

Bases: LambdaLRScheduler

Cosine scheduler with hard restarts and an initial warmup

Schedule with a learning rate that decreases following the values of the cosine function from the initial lr set in the optimizer down to 0, with several hard restarts, after a warmup period during which it increases linearly from 0 to the initial lr set in the optimizer.

https://huggingface.co/transformers/main_classes/optimizer_schedules.html#transformers.get_cosine_with_hard_restarts_schedule_with_warmup

Parameters:
  • num_warmup_steps (int) – The number of steps for the warmup phase

  • num_training_steps (int) – The total number of training steps

  • num_cycles (float) – The number of waves in the cosine schedule (the default is to decrease from the max value to 0 following a half-cosine).

  • last_epoch (int) – The index of the last epoch when resuming training

  • **kwargs – learning rate scheduler additional parameters
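
For comparison with the smooth cosine variant, a hard restart resets the lr back to its full initial value at each cycle boundary instead of oscillating through it. A LambdaLR sketch along the lines of the linked HuggingFace formula (all concrete values below, including num_cycles=3, are assumptions for illustration):

    import math
    import torch
    from torch.optim.lr_scheduler import LambdaLR

    model = torch.nn.Linear(10, 2)  # toy model, for illustration only
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

    num_warmup_steps, num_training_steps, num_cycles = 100, 1000, 3  # assumed

    def lr_lambda(current_step):
        # Linear warmup from 0 up to the optimizer's initial lr
        if current_step < num_warmup_steps:
            return float(current_step) / float(max(1, num_warmup_steps))
        progress = float(current_step - num_warmup_steps) / float(
            max(1, num_training_steps - num_warmup_steps))
        if progress >= 1.0:
            return 0.0
        # Each cycle runs a fresh half-cosine from 1 down to 0 (hard restart)
        return max(0.0, 0.5 * (1.0 + math.cos(math.pi * ((num_cycles * progress) % 1.0))))

    scheduler = LambdaLR(optimizer, lr_lambda)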

class aitoolbox.torchtrain.schedulers.warmup.LinearWithWarmupScheduler(num_warmup_steps, num_training_steps, last_epoch=-1, **kwargs)[source]

Bases: LambdaLRScheduler

Linearly decreasing scheduler with an initial warmup

Schedule with a learning rate that decreases linearly from the initial lr set in the optimizer to 0, after a warmup period during which it increases linearly from 0 to the initial lr set in the optimizer.

Especially useful in the context of BERT-like models. The implementation is based on the HuggingFace Transformers library's get_linear_schedule_with_warmup() method.

https://huggingface.co/transformers/main_classes/optimizer_schedules.html#transformers.get_linear_schedule_with_warmup

Parameters:
  • num_warmup_steps (int) – The number of steps for the warmup phase

  • num_training_steps (int) – The total number of training steps

  • last_epoch (int) – The index of the last epoch when resuming training

  • **kwargs – learning rate scheduler additional parameters
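
A minimal LambdaLR sketch of the linear-decay-with-warmup curve, after the linked HuggingFace implementation (the model, optimizer, and step counts below are illustrative assumptions, not aitoolbox defaults):

    import torch
    from torch.optim.lr_scheduler import LambdaLR

    model = torch.nn.Linear(10, 2)  # toy model, for illustration only
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

    num_warmup_steps, num_training_steps = 100, 1000  # assumed

    def lr_lambda(current_step):
        # Linear warmup from 0 up to the optimizer's initial lr
        if current_step < num_warmup_steps:
            return float(current_step) / float(max(1, num_warmup_steps))
        # Linear decay from the initial lr down to 0 at num_training_steps
        return max(0.0, float(num_training_steps - current_step) /
                   float(max(1, num_training_steps - num_warmup_steps)))

    scheduler = LambdaLR(optimizer, lr_lambda)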