data_access

class aitoolbox.cloud.AWS.data_access.BaseDataSaver(bucket_name='model-result')[source]

Bases: object

Base class implementing S3 file saving logic

Parameters

bucket_name (str) – S3 bucket into which the files will be saved

save_file(local_file_path, cloud_file_path)[source]

Upload a file from the local drive to AWS S3

Parameters
  • local_file_path (str) – path to the file on the local drive

  • cloud_file_path (str) – destination where the file will be saved on S3 inside the specified bucket

Returns

None
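As a rough illustration of what such an upload involves, the sketch below wraps the `upload_file` call of a boto3-style S3 client. That this class relies on boto3 internally is an assumption, and the helper name `upload_to_s3` and the `RecordingClient` stand-in are hypothetical; the client is passed in so the snippet runs without AWS credentials.

```python
import os


def upload_to_s3(s3_client, bucket_name, local_file_path, cloud_file_path):
    """Upload one local file to S3 (hypothetical helper, not part of aitoolbox)."""
    # Expand '~' so paths such as '~/project/results.txt' resolve correctly.
    local_file_path = os.path.expanduser(local_file_path)
    # boto3's S3 client exposes upload_file(Filename, Bucket, Key).
    s3_client.upload_file(local_file_path, bucket_name, cloud_file_path)


class RecordingClient:
    """Stand-in for boto3.client('s3') that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def upload_file(self, filename, bucket, key):
        self.calls.append((filename, bucket, key))


client = RecordingClient()
upload_to_s3(client, 'model-result', '/tmp/metrics.json', 'experiment_1/metrics.json')
```

With a real client, the same call would place the file under the key `experiment_1/metrics.json` inside the `model-result` bucket.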

save_folder(local_folder_path, cloud_folder_path)[source]

Upload the contents of a local folder to AWS S3

This method uploads only the contents of the provided local folder, not the folder itself. To recreate the enclosing folder on S3, append its name to the end of cloud_folder_path.

For example, if

local_folder_path = '~/bla/my_folder'

and the contents of my_folder should end up in a folder named my_folder on S3, append my_folder to the end of cloud_folder_path:

cloud_folder_path = 'cloud_bla/my_folder'

Parameters
  • local_folder_path (str) – local path to the folder which should be uploaded

  • cloud_folder_path (str) – destination path on S3 where the contents of the folder should be uploaded

Returns

None
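The path convention described for save_folder can be sketched in plain Python. The helper below is hypothetical (it is not the library's implementation) and only computes which S3 key each local file would map to, assuming keys are formed by joining cloud_folder_path with each file's path relative to the local folder.

```python
import os
import tempfile


def plan_folder_upload(local_folder_path, cloud_folder_path):
    """Return (local_file, s3_key) pairs for every file under local_folder_path."""
    local_folder_path = os.path.expanduser(local_folder_path)
    pairs = []
    for root, _dirs, files in os.walk(local_folder_path):
        for file_name in files:
            local_file = os.path.join(root, file_name)
            # Path relative to the folder itself, so the folder name is NOT included.
            relative = os.path.relpath(local_file, local_folder_path)
            # S3 keys use forward slashes regardless of the local OS.
            key = '/'.join([cloud_folder_path.rstrip('/')] + relative.split(os.sep))
            pairs.append((local_file, key))
    return sorted(pairs, key=lambda pair: pair[1])


with tempfile.TemporaryDirectory() as tmp:
    # Build a small throwaway folder tree: my_folder/a.txt and my_folder/sub/b.txt
    my_folder = os.path.join(tmp, 'my_folder')
    os.makedirs(os.path.join(my_folder, 'sub'))
    for name in ('a.txt', os.path.join('sub', 'b.txt')):
        with open(os.path.join(my_folder, name), 'w') as f:
            f.write('x')

    keys_without_folder = [key for _, key in plan_folder_upload(my_folder, 'cloud_bla')]
    keys_with_folder = [key for _, key in plan_folder_upload(my_folder, 'cloud_bla/my_folder')]
```

Passing 'cloud_bla' places the files directly under cloud_bla, while passing 'cloud_bla/my_folder' keeps them grouped under my_folder on the S3 side.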

class aitoolbox.cloud.AWS.data_access.BaseDataLoader(bucket_name='dataset-store', local_base_data_folder_path='~/project/data')[source]

Bases: object

Base class implementing S3 file downloading logic

Parameters
  • bucket_name (str) – S3 bucket from which the files will be downloaded

  • local_base_data_folder_path (str) – local main experiment saving folder

load_file(cloud_file_path, local_file_path)[source]

Download the file from AWS S3 to the local drive

Parameters
  • cloud_file_path (str) – location where the file is saved on S3 inside the specified bucket

  • local_file_path (str) – destination path on the local drive where the file will be downloaded

Returns

None

exists_local_data_folder(data_folder_name, protect_local_folder=True)[source]

Check if a specific folder exists in the base data folder

For example, the Squad dataset folder inside the /data folder, or the pretrained_models folder inside the /model_results folder

Parameters
  • data_folder_name (str) –

  • protect_local_folder (bool) –

Returns

Return type

bool
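A minimal sketch of this kind of check, assuming it amounts to testing for a directory under the base data folder. The function name `local_data_folder_exists` is hypothetical, and the effect of protect_local_folder is not modeled because it is not described here.

```python
import os
import tempfile


def local_data_folder_exists(local_base_data_folder_path, data_folder_name):
    """Return True if data_folder_name is a directory inside the base data folder."""
    # Expand '~' so defaults such as '~/project/data' resolve to a real path.
    base = os.path.expanduser(local_base_data_folder_path)
    return os.path.isdir(os.path.join(base, data_folder_name))


with tempfile.TemporaryDirectory() as base_folder:
    # Only the SQuAD2 folder is created, so only that lookup should succeed.
    os.makedirs(os.path.join(base_folder, 'SQuAD2'))
    squad_present = local_data_folder_exists(base_folder, 'SQuAD2')
    hotpot_present = local_data_folder_exists(base_folder, 'HotpotQA')
```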

preproc_dataset_available(preproc_dataset_name)[source]

class aitoolbox.cloud.AWS.data_access.AbstractDatasetFetcher[source]

Bases: abc.ABC

abstract fetch_dataset(dataset_name=None, protect_local_folder=True)[source]
Parameters
  • dataset_name (str or None) –

  • protect_local_folder (bool) –

Returns

None
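The abstract interface above follows the standard abc pattern: the base class cannot be instantiated, and every concrete fetcher must implement fetch_dataset. The sketch below uses local stand-in classes (`AbstractFetcherSketch` and `ToyDatasetFetcher` are hypothetical names, not part of aitoolbox) so that it runs without the library or S3 access.

```python
from abc import ABC, abstractmethod


class AbstractFetcherSketch(ABC):
    """Local stand-in mirroring the documented abstract interface."""

    @abstractmethod
    def fetch_dataset(self, dataset_name=None, protect_local_folder=True):
        pass


class ToyDatasetFetcher(AbstractFetcherSketch):
    """Hypothetical concrete fetcher; it records the request instead of downloading."""

    def __init__(self):
        self.requests = []

    def fetch_dataset(self, dataset_name=None, protect_local_folder=True):
        self.requests.append((dataset_name, protect_local_folder))


# Instantiating the abstract base raises TypeError because fetch_dataset is abstract.
try:
    AbstractFetcherSketch()
    abstract_instantiable = True
except TypeError:
    abstract_instantiable = False

fetcher = ToyDatasetFetcher()
fetcher.fetch_dataset(dataset_name='wikihop')
```

The concrete fetchers documented on this page combine this interface with BaseDataLoader, which is presumably where the actual download logic comes from.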

class aitoolbox.cloud.AWS.data_access.SQuAD2DatasetFetcher(bucket_name='dataset-store', local_dataset_folder_path='~/project/data')[source]

Bases: aitoolbox.cloud.AWS.data_access.AbstractDatasetFetcher, aitoolbox.cloud.AWS.data_access.BaseDataLoader

Parameters
  • bucket_name (str) –

  • local_dataset_folder_path (str) –

fetch_dataset(dataset_name=None, protect_local_folder=True)[source]
Parameters
  • dataset_name (None) – no effect here

  • protect_local_folder (bool) –

Returns

None

class aitoolbox.cloud.AWS.data_access.QAngarooDatasetFetcher(bucket_name='dataset-store', local_dataset_folder_path='~/project/data')[source]

Bases: aitoolbox.cloud.AWS.data_access.AbstractDatasetFetcher, aitoolbox.cloud.AWS.data_access.BaseDataLoader

Parameters
  • bucket_name (str) –

  • local_dataset_folder_path (str) –

fetch_dataset(dataset_name=None, protect_local_folder=True)[source]
Parameters
  • dataset_name (str or None) – possible options: medhop, wikihop or None

  • protect_local_folder (bool) –

Returns

None

class aitoolbox.cloud.AWS.data_access.CNNDailyMailDatasetFetcher(bucket_name='dataset-store', local_dataset_folder_path='~/project/data')[source]

Bases: aitoolbox.cloud.AWS.data_access.AbstractDatasetFetcher, aitoolbox.cloud.AWS.data_access.BaseDataLoader

Parameters
  • bucket_name (str) –

  • local_dataset_folder_path (str) –

fetch_dataset(dataset_name=None, protect_local_folder=True)[source]
Parameters
  • dataset_name (None) – no effect here

  • protect_local_folder (bool) –

Returns

None

fetch_preprocessed_dataset(preprocess_name, protect_local_folder=True)[source]
Parameters
  • preprocess_name (str) –

  • protect_local_folder (bool) –

Returns

None

class aitoolbox.cloud.AWS.data_access.HotpotQADatasetFetcher(bucket_name='dataset-store', local_dataset_folder_path='~/project/data')[source]

Bases: aitoolbox.cloud.AWS.data_access.AbstractDatasetFetcher, aitoolbox.cloud.AWS.data_access.BaseDataLoader

https://hotpotqa.github.io/ https://arxiv.org/pdf/1809.09600.pdf

https://github.com/hotpotqa/hotpot

Parameters
  • bucket_name (str) –

  • local_dataset_folder_path (str) –

fetch_dataset(dataset_name=None, protect_local_folder=True)[source]
Parameters
  • dataset_name (None) – no effect here

  • protect_local_folder (bool) –

Returns

None

class aitoolbox.cloud.AWS.data_access.TriviaQADatasetFetcher(bucket_name='dataset-store', local_dataset_folder_path='~/project/data')[source]

Bases: aitoolbox.cloud.AWS.data_access.AbstractDatasetFetcher, aitoolbox.cloud.AWS.data_access.BaseDataLoader

Parameters
  • bucket_name (str) –

  • local_dataset_folder_path (str) –

fetch_dataset(dataset_name=None, protect_local_folder=True)[source]
Parameters
  • dataset_name (str or None) – possible options: rc, unfiltered or None

  • protect_local_folder (bool) –

Returns

None