model_save

class aitoolbox.torchtrain.callbacks.model_save.ModelCheckpoint(project_name, experiment_name, local_model_result_folder_path, hyperparams, cloud_save_mode='s3', bucket_name='model-result', cloud_dir_prefix='', rm_subopt_local_models=False, num_best_checkpoints_kept=2)[source]

Bases: AbstractCallback

Checkpoint the model during training, saving it to local disk and optionally also to S3 / GCS cloud storage

Parameters:
  • project_name (str) – root name of the project

  • experiment_name (str) – name of the particular experiment

  • local_model_result_folder_path (str) – root local path where project folder will be created

  • hyperparams (dict) – hyper-parameters used in the experiment. When running the TrainLoop from a Jupyter notebook, the user must manually set the python experiment file path as the value of the experiment_file_path key so that the file can be copied into the experiment folder. When training is started directly from the terminal, the path is deduced automatically.

  • cloud_save_mode (str or None) – storage destination selector. For AWS S3: ‘s3’ / ‘aws_s3’ / ‘aws’. For Google Cloud Storage: ‘gcs’ / ‘google_storage’ / ‘google storage’. Any other value results in local-only storage to disk.

  • bucket_name (str) – name of the bucket in the cloud storage

  • cloud_dir_prefix (str) – path to the folder inside the bucket where the experiments are going to be saved

  • rm_subopt_local_models (bool or str) – if True, the deciding metric is set to ‘loss’. Pass a string metric name to use that metric as the deciding metric for suboptimal model removal instead. If the metric name contains the substring ‘loss’, the metric is minimized; otherwise it is maximized.

  • num_best_checkpoints_kept (int) – number of best-performing model checkpoints kept when suboptimal checkpoints are removed
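The rm_subopt_local_models rule described above can be sketched in plain Python. The helper below is hypothetical and not part of the aitoolbox API; it only illustrates the documented metric-selection behaviour:

```python
def resolve_subopt_metric(rm_subopt_local_models):
    """Illustrative helper (not part of aitoolbox): map the
    rm_subopt_local_models argument to the deciding metric name and
    the optimization direction used for suboptimal model removal."""
    if rm_subopt_local_models is False:
        return None  # suboptimal checkpoint removal is disabled
    # True falls back to 'loss'; a string names the deciding metric
    metric = 'loss' if rm_subopt_local_models is True else rm_subopt_local_models
    # metric names containing 'loss' are minimized, all others maximized
    direction = 'min' if 'loss' in metric else 'max'
    return metric, direction
```

For example, resolve_subopt_metric('val_loss') yields ('val_loss', 'min'), while resolve_subopt_metric('accuracy') yields ('accuracy', 'max').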

on_epoch_end()[source]

Logic executed at the end of the epoch

Returns:

None

on_train_loop_registration()[source]

Execute callback initialization / preparation after the train_loop_object becomes available

Returns:

None

save_hyperparams()[source]

class aitoolbox.torchtrain.callbacks.model_save.ModelIterationCheckpoint(save_frequency, project_name, experiment_name, local_model_result_folder_path, hyperparams, cloud_save_mode='s3', bucket_name='model-result', cloud_dir_prefix='', rm_subopt_local_models=False, num_best_checkpoints_kept=2)[source]

Bases: ModelCheckpoint

Checkpoint the model during training, saving it to local disk and optionally also to S3 / GCS cloud storage

Parameters:
  • save_frequency (int) – save a model checkpoint every save_frequency training iterations

  • project_name (str) – root name of the project

  • experiment_name (str) – name of the particular experiment

  • local_model_result_folder_path (str) – root local path where project folder will be created

  • hyperparams (dict) – hyper-parameters used in the experiment. When running the TrainLoop from a Jupyter notebook, the user must manually set the python experiment file path as the value of the experiment_file_path key so that the file can be copied into the experiment folder. When training is started directly from the terminal, the path is deduced automatically.

  • cloud_save_mode (str or None) – storage destination selector. For AWS S3: ‘s3’ / ‘aws_s3’ / ‘aws’. For Google Cloud Storage: ‘gcs’ / ‘google_storage’ / ‘google storage’. Any other value results in local-only storage to disk.

  • bucket_name (str) – name of the bucket in the cloud storage

  • cloud_dir_prefix (str) – path to the folder inside the bucket where the experiments are going to be saved

  • rm_subopt_local_models (bool or str) – if True, the deciding metric is set to ‘loss’. Pass a string metric name to use that metric as the deciding metric for suboptimal model removal instead. If the metric name contains the substring ‘loss’, the metric is minimized; otherwise it is maximized.

  • num_best_checkpoints_kept (int) – number of best-performing model checkpoints kept when suboptimal checkpoints are removed
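The save_frequency behaviour can be sketched as a simple predicate. This is a hypothetical illustration, not the actual aitoolbox implementation, and it assumes zero-based iteration counting:

```python
def should_save_checkpoint(iteration_idx, save_frequency):
    """Illustrative helper (not part of aitoolbox): decide whether
    ModelIterationCheckpoint would save at this training iteration,
    assuming iterations are counted from zero."""
    if save_frequency < 1:
        raise ValueError('save_frequency must be a positive integer')
    # save on every save_frequency-th completed iteration
    return (iteration_idx + 1) % save_frequency == 0
```

With save_frequency=100, a save would occur on iterations 99, 199, 299, and so on, in addition to the per-epoch saves inherited from ModelCheckpoint.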

on_batch_end()[source]

Logic executed after the batch is inserted into the model

Returns:

None

class aitoolbox.torchtrain.callbacks.model_save.ModelTrainEndSave(project_name, experiment_name, local_model_result_folder_path, hyperparams, val_result_package=None, test_result_package=None, cloud_save_mode='s3', bucket_name='model-result', cloud_dir_prefix='')[source]

Bases: AbstractCallback

At the end of training, run the model performance evaluation, build the result package report, and save it together with the final model to local disk and optionally also to S3 / GCS cloud storage

Parameters:
  • project_name (str) – root name of the project

  • experiment_name (str) – name of the particular experiment

  • local_model_result_folder_path (str) – root local path where project folder will be created

  • hyperparams (dict) – hyper-parameters used in the experiment. When running the TrainLoop from a Jupyter notebook, the user must manually set the python experiment file path as the value of the experiment_file_path key so that the file can be copied into the experiment folder. When training is started directly from the terminal, the path is deduced automatically.

  • val_result_package (AbstractResultPackage) – result package to be evaluated on the validation dataset

  • test_result_package (AbstractResultPackage) – result package to be evaluated on the test dataset

  • cloud_save_mode (str or None) – storage destination selector. For AWS S3: ‘s3’ / ‘aws_s3’ / ‘aws’. For Google Cloud Storage: ‘gcs’ / ‘google_storage’ / ‘google storage’. Any other value results in local-only storage to disk.

  • bucket_name (str) – name of the bucket in the cloud storage

  • cloud_dir_prefix (str) – path to the folder inside the bucket where the experiments are going to be saved
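The experiment_file_path requirement for notebook runs, mentioned in the hyperparams description above, can be illustrated with a hypothetical hyper-parameter dict. Only the experiment_file_path key is specifically documented; the other keys are made-up examples:

```python
import os

# Illustrative hyper-parameter dict. 'experiment_file_path' is the key the
# callback looks for; the remaining keys are arbitrary example entries.
hyperparams = {
    'batch_size': 64,
    'lr': 1e-3,
    'num_epochs': 10,
    # required only when running from a Jupyter notebook; when training is
    # started from the terminal the path is deduced automatically
    'experiment_file_path': os.path.abspath('train_experiment.py'),
}
```

Such a dict would then be passed as the hyperparams argument of ModelTrainEndSave (or of the checkpoint callbacks above).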

on_train_end()[source]

Logic executed at the end of the overall training

Returns:

None

on_train_loop_registration()[source]

Execute callback initialization / preparation after the train_loop_object becomes available

Returns:

None

save_hyperparams()[source]

check_result_packages()[source]