manual_SQuAD2

aitoolbox.nlp.dataset.SQuAD2.deprecated.manual_SQuAD2.get_dataset_local_copy(local_dataset_folder_path, protect_local_folder=True)

Interface function for getting a local copy of the SQuAD2 dataset.

If a local copy is not found, the dataset is automatically downloaded from S3.

Parameters
  • local_dataset_folder_path (str) –

  • protect_local_folder (bool) –

Returns

None
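
The "use the local copy, otherwise download" behaviour described above can be illustrated with a minimal sketch. This is not the library's implementation: `ensure_local_copy`, `download_fn` and the `marker_file` check are hypothetical names introduced only for illustration, and the real function may decide what counts as "found" differently.

```python
import tempfile
from pathlib import Path
from typing import Callable


def ensure_local_copy(local_dataset_folder_path: str,
                      download_fn: Callable[[Path], None],
                      marker_file: str = "train-v2.0.json") -> Path:
    """Return the dataset folder, calling download_fn only if marker_file is missing.

    Hypothetical helper: download_fn stands in for whatever fetches the
    dataset (for example from S3) and marker_file is an assumed way of
    detecting an existing local copy.
    """
    folder = Path(local_dataset_folder_path).expanduser()
    if not (folder / marker_file).exists():
        # No local copy yet: create the folder and fetch the data into it.
        folder.mkdir(parents=True, exist_ok=True)
        download_fn(folder)
    return folder


# Demo with a fake downloader that only records calls and writes a stub file.
calls = []


def fake_download(folder: Path) -> None:
    calls.append(folder)
    (folder / "train-v2.0.json").write_text("{}")


with tempfile.TemporaryDirectory() as tmp:
    ensure_local_copy(tmp, fake_download)  # nothing local yet, so it downloads
    ensure_local_copy(tmp, fake_download)  # local copy found, download skipped
    print(len(calls))
```

Passing the downloader in as a callable keeps the sketch runnable without network access or cloud credentials.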

class aitoolbox.nlp.dataset.SQuAD2.deprecated.manual_SQuAD2.SQuAD2DatasetPrepareResult(dataset_name, dataset_type='train', save_vocab=True)

Bases: object

Parameters
  • dataset_name

  • dataset_type

  • save_vocab

store_data(context_text_list, question_text_list, answer_text_list, orig_answer_start_end_tuple_list, answer_start_end_tuple_list)
Parameters
  • context_text_list

  • question_text_list

  • answer_text_list

  • orig_answer_start_end_tuple_list

  • answer_start_end_tuple_list

Returns:

store_vocab(vocab)
Parameters
  • vocab

Returns:

store_max_context_questions_max_len(max_ctx_qs_len)
Parameters
  • max_ctx_qs_len

Returns:

class aitoolbox.nlp.dataset.SQuAD2.deprecated.manual_SQuAD2.SQuAD2DataPreparation(train_path, dev_path, skip_is_impossible=True, skip_examples_w_span=True)

Bases: object

Parameters
  • train_path

  • dev_path

  • skip_is_impossible

  • skip_examples_w_span

process_data(dump_folder_path=None)
Parameters
  • dump_folder_path

Returns:

vectorize_data(train_data=None, dev_data=None, vocab=None, dump_folder_path=None)
Parameters
  • train_data

  • dev_data

  • vocab

  • dump_folder_path

Returns:

get_vectorized_data_prep_result(data_prep_result, vocab)
Parameters
  • data_prep_result

  • vocab

Returns:

build_dataset(data_json, dataset_name)
Parameters
  • data_json

  • dataset_name

Returns:
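
For orientation, the sketch below shows how the publicly documented SQuAD 2.0 JSON layout (`data` → `paragraphs` → `qas`) can be flattened into (context, question, answer, span) tuples, with an option mirroring the `skip_is_impossible` constructor argument. It is an illustration under stated assumptions, not the body of `build_dataset`: the function name `iter_squad2_examples`, the returned tuple layout, and the `(-1, -1)` placeholder span for unanswerable questions are all hypothetical.

```python
from typing import Dict, List, Tuple

Example = Tuple[str, str, str, Tuple[int, int]]


def iter_squad2_examples(data_json: Dict,
                         skip_is_impossible: bool = True) -> List[Example]:
    """Flatten SQuAD 2.0 JSON into (context, question, answer, (start, end)) tuples.

    Hypothetical helper for illustration only; spans are character offsets
    into the context, with an exclusive end.
    """
    examples: List[Example] = []
    for article in data_json["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                if qa.get("is_impossible", False):
                    if not skip_is_impossible:
                        # Assumed placeholder for questions without an answer.
                        examples.append((context, qa["question"], "", (-1, -1)))
                    continue
                # Use the first annotated answer of an answerable question.
                answer = qa["answers"][0]
                start = answer["answer_start"]
                end = start + len(answer["text"])
                examples.append((context, qa["question"], answer["text"], (start, end)))
    return examples


# Tiny hand-written sample in the SQuAD 2.0 layout (not real dataset content).
sample = {
    "version": "v2.0",
    "data": [{
        "title": "Normandy",
        "paragraphs": [{
            "context": "Normandy is a region in France.",
            "qas": [
                {"id": "q1", "question": "Where is Normandy?",
                 "is_impossible": False,
                 "answers": [{"text": "France", "answer_start": 24}]},
                {"id": "q2", "question": "Who founded Normandy?",
                 "is_impossible": True, "answers": []},
            ],
        }],
    }],
}

for context, question, answer, (start, end) in iter_squad2_examples(sample):
    print(question, "->", answer, context[start:end])
```

With `skip_is_impossible=False` the unanswerable question is kept with an empty answer, so downstream code would need to handle the placeholder span explicitly.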

process_context_text(context_text, is_train)
Parameters
  • context_text

  • is_train

Returns:

process_question_text(question_text, is_train)
Parameters
  • question_text

  • is_train

Returns:

process_answer_text(answer_text, is_train)
Parameters
  • answer_text

  • is_train

Returns:

load_json_file(file_path)
Parameters
  • file_path

Returns:

load_prep_dumps(dump_folder_path)
Parameters
  • dump_folder_path

Returns:

load_vect_prep_dumps(dump_folder_path)
Parameters
  • dump_folder_path

Returns: