model_save
- class aitoolbox.torchtrain.callbacks.model_save.ModelCheckpoint(project_name, experiment_name, local_model_result_folder_path, hyperparams, cloud_save_mode='s3', bucket_name='model-result', cloud_dir_prefix='', rm_subopt_local_models=False, num_best_checkpoints_kept=2)[source]
Bases:
AbstractCallback
Checkpoint the model during training, saving it to local disk and optionally also to S3 / GCS cloud storage
- Parameters:
project_name (str) – root name of the project
experiment_name (str) – name of the particular experiment
local_model_result_folder_path (str) – root local path where project folder will be created
hyperparams (dict) – hyper-parameters used in the experiment. When running the TrainLoop from a Jupyter notebook, the user needs to manually set the python experiment file path as the value of the experiment_file_path key so that the file gets copied into the experiment folder. When running the training directly from the terminal, the path is deduced automatically.
cloud_save_mode (str or None) – storage destination selector. For AWS S3 use ‘s3’, ‘aws_s3’ or ‘aws’; for Google Cloud Storage use ‘gcs’, ‘google_storage’ or ‘google storage’. Any other value results in local storage to disk only.
bucket_name (str) – name of the bucket in the cloud storage
cloud_dir_prefix (str) – path to the folder inside the bucket where the experiments are going to be saved
rm_subopt_local_models (bool or str) – if True, the deciding metric is set to ‘loss’. Provide a string metric name to use that metric as the deciding metric for suboptimal model removal. If the metric name contains the substring ‘loss’, the metric is minimized; otherwise it is maximized.
num_best_checkpoints_kept (int) – number of best-performing model checkpoints kept when suboptimal checkpoints are removed
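The cloud_save_mode and rm_subopt_local_models selectors described above can be sketched in plain Python. This is an illustrative approximation of the documented rules, not the library's actual implementation:

```python
def resolve_storage_backend(cloud_save_mode):
    """Illustrative sketch of the cloud_save_mode selector; not the
    library's actual code."""
    if cloud_save_mode in ('s3', 'aws_s3', 'aws'):
        return 'aws_s3'
    if cloud_save_mode in ('gcs', 'google_storage', 'google storage'):
        return 'gcs'
    return 'local'  # everything else, including None: local disk only


def deciding_metric(rm_subopt_local_models):
    """Illustrative sketch: resolve the deciding metric name and whether
    it should be minimized when removing suboptimal checkpoints."""
    name = 'loss' if rm_subopt_local_models is True else rm_subopt_local_models
    minimize = 'loss' in name  # 'loss' substring -> lower is better
    return name, minimize


print(resolve_storage_backend('aws'))   # 'aws_s3'
print(deciding_metric('val_accuracy'))  # ('val_accuracy', False) -> maximize
```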
- class aitoolbox.torchtrain.callbacks.model_save.ModelIterationCheckpoint(save_frequency, project_name, experiment_name, local_model_result_folder_path, hyperparams, cloud_save_mode='s3', bucket_name='model-result', cloud_dir_prefix='', rm_subopt_local_models=False, num_best_checkpoints_kept=2)[source]
Bases:
ModelCheckpoint
Checkpoint the model during training, saving it to local disk and optionally also to S3 / GCS cloud storage
- Parameters:
save_frequency (int) – number of training iterations between consecutive model checkpoint saves
project_name (str) – root name of the project
experiment_name (str) – name of the particular experiment
local_model_result_folder_path (str) – root local path where project folder will be created
hyperparams (dict) – hyper-parameters used in the experiment. When running the TrainLoop from a Jupyter notebook, the user needs to manually set the python experiment file path as the value of the experiment_file_path key so that the file gets copied into the experiment folder. When running the training directly from the terminal, the path is deduced automatically.
cloud_save_mode (str or None) – storage destination selector. For AWS S3 use ‘s3’, ‘aws_s3’ or ‘aws’; for Google Cloud Storage use ‘gcs’, ‘google_storage’ or ‘google storage’. Any other value results in local storage to disk only.
bucket_name (str) – name of the bucket in the cloud storage
cloud_dir_prefix (str) – path to the folder inside the bucket where the experiments are going to be saved
rm_subopt_local_models (bool or str) – if True, the deciding metric is set to ‘loss’. Provide a string metric name to use that metric as the deciding metric for suboptimal model removal. If the metric name contains the substring ‘loss’, the metric is minimized; otherwise it is maximized.
num_best_checkpoints_kept (int) – number of best-performing model checkpoints kept when suboptimal checkpoints are removed
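The interaction between rm_subopt_local_models and num_best_checkpoints_kept can be illustrated with a small sketch: rank local checkpoints by the deciding metric and remove everything outside the best N. This is an assumption about the selection logic, not the library's actual code:

```python
def checkpoints_to_remove(checkpoint_metrics, deciding_metric='loss',
                          num_best_checkpoints_kept=2):
    """Illustrative sketch: pick which local checkpoints would be deleted.

    checkpoint_metrics maps checkpoint file name -> deciding metric value;
    a metric name containing 'loss' is minimized, anything else maximized.
    """
    minimize = 'loss' in deciding_metric
    ranked = sorted(checkpoint_metrics,
                    key=checkpoint_metrics.get,
                    reverse=not minimize)  # best checkpoints first
    return ranked[num_best_checkpoints_kept:]  # everything past the best N


metrics = {'model_e1.pth': 0.91, 'model_e2.pth': 0.55, 'model_e3.pth': 0.42}
print(checkpoints_to_remove(metrics, 'val_loss', 2))  # ['model_e1.pth']
```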
- class aitoolbox.torchtrain.callbacks.model_save.ModelTrainEndSave(project_name, experiment_name, local_model_result_folder_path, hyperparams, val_result_package=None, test_result_package=None, cloud_save_mode='s3', bucket_name='model-result', cloud_dir_prefix='')[source]
Bases:
AbstractCallback
At the end of training, run model performance evaluation, build the result package report, and save it together with the final model to local disk and optionally also to S3 / GCS cloud storage
- Parameters:
project_name (str) – root name of the project
experiment_name (str) – name of the particular experiment
local_model_result_folder_path (str) – root local path where project folder will be created
hyperparams (dict) – hyper-parameters used in the experiment. When running the TrainLoop from a Jupyter notebook, the user needs to manually set the python experiment file path as the value of the experiment_file_path key so that the file gets copied into the experiment folder. When running the training directly from the terminal, the path is deduced automatically.
val_result_package (AbstractResultPackage) – result package to be evaluated on the validation dataset
test_result_package (AbstractResultPackage) – result package to be evaluated on the test dataset
cloud_save_mode (str or None) – storage destination selector. For AWS S3 use ‘s3’, ‘aws_s3’ or ‘aws’; for Google Cloud Storage use ‘gcs’, ‘google_storage’ or ‘google storage’. Any other value results in local storage to disk only.
bucket_name (str) – name of the bucket in the cloud storage
cloud_dir_prefix (str) – path to the folder inside the bucket where the experiments are going to be saved
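Unlike the checkpoint callbacks above, ModelTrainEndSave runs once, when training finishes. The timing can be illustrated with a minimal toy driver; this is a hypothetical simplified interface, not aitoolbox's actual TrainLoop:

```python
class RecordingEndCallback:
    """Hypothetical stand-in for ModelTrainEndSave: only records that the
    end-of-training hook fired (the real callback evaluates the result
    packages and saves the final model at that point)."""
    def __init__(self):
        self.train_end_calls = 0

    def on_train_end(self):
        self.train_end_calls += 1


class MiniTrainLoop:
    """Toy training driver: end-of-training callbacks fire exactly once,
    after the final epoch."""
    def __init__(self, callbacks):
        self.callbacks = callbacks

    def fit(self, num_epochs):
        for epoch in range(num_epochs):
            pass  # per-epoch training steps would run here
        for cb in self.callbacks:
            cb.on_train_end()


cb = RecordingEndCallback()
MiniTrainLoop([cb]).fit(num_epochs=3)
print(cb.train_end_calls)  # 1
```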