ducho.multimodal.textual package

ducho.multimodal.textual.TextualFeatureExtractor module

class ducho.multimodal.textual.TextualFeatureExtractor.TextualFeatureExtractor(gpu='-1')[source]

Bases: FeatureExtractorFather

This class implements the feature extractor for textual data.

extract_feature(sample_input)[source]

Extracts features from the input text. The framework, model, and layer must be configured with their respective set methods before this function is called.

Parameters:

sample_input – The preprocessed textual data.

Returns:

A NumPy array of the extracted features, which is later stored in a .npy file by the corresponding method of the dataset class.
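
The configure-before-extract contract described above can be illustrated with a minimal sketch. SketchExtractor is a stand-in written only for this example and is not Ducho's implementation; the setter names set_framework and set_layer, and the placeholder "feature" computation, are assumptions. Only set_model and extract_feature are named in this reference.

```python
class SketchExtractor:
    """Stand-in class for illustration only; NOT Ducho's TextualFeatureExtractor."""

    def __init__(self):
        self._framework = None
        self._model = None
        self._layer = None

    # Hypothetical setter names; only set_model is documented in this reference.
    def set_framework(self, framework):
        self._framework = framework

    def set_model(self, model):
        self._model = model

    def set_layer(self, layer):
        self._layer = layer

    def extract_feature(self, sample_input):
        # Refuse to run until framework, model, and layer are all configured.
        settings = (
            ("framework", self._framework),
            ("model", self._model),
            ("layer", self._layer),
        )
        missing = [name for name, value in settings if value is None]
        if missing:
            raise RuntimeError("configure before extracting: " + ", ".join(missing))
        # Placeholder "feature": the length of each whitespace-separated token.
        return [float(len(token)) for token in sample_input.split()]
```

Calling extract_feature on a fresh SketchExtractor raises RuntimeError; once all three setters have been called, it returns the placeholder feature list.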

set_model(model)[source]

Configures the model of the Textual Feature Extractor from the YAML specifications.

Parameters:

model – The row of the YAML file containing the user’s specifications.

Returns:

None

ducho.multimodal.textual.TextualDataset module

class ducho.multimodal.textual.TextualDataset.TextualDataset(input_directory_path, output_directory_path, columns=None)[source]

Bases: DatasetFather

This class represents the Textual Dataset used for the data loading process.

create_output_file(index, extracted_data, model_layer, fusion=None)[source]

Overrides the method of the Father class because all the strings come from the same file and only the row changes.

Parameters:
  • index – The row of the string.

  • extracted_data – The output to write to the file.

  • model_layer – The layer used, as a string; it appears in the final file name.

  • fusion – Not used for TextualDataset objects.

Returns:

None

set_preprocessing_flag(preprocessing_flag)[source]

set_type_of_extraction(type_of_extraction)[source]

Sets the origin of the data, either items or user interactions. This is needed later to read the TSV file correctly.

Parameters:

type_of_extraction – ‘items’ or ‘interactions’.

Returns:

None
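
As a rough illustration of why the extraction type matters when the TSV is read, the sketch below picks the text column according to that type. The helper read_rows, the ID_COLUMNS mapping, and the assumed column layout (one leading ID column for items, two for interactions, followed by the text) are hypothetical and are not taken from Ducho's source.

```python
import csv
import io

# Assumption for this sketch only: number of leading ID columns per extraction type.
ID_COLUMNS = {"items": 1, "interactions": 2}


def read_rows(tsv_text, type_of_extraction):
    """Return (ids, text) pairs from TSV content; hypothetical helper."""
    if type_of_extraction not in ID_COLUMNS:
        raise ValueError("type_of_extraction must be 'items' or 'interactions'")
    n_ids = ID_COLUMNS[type_of_extraction]
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    next(reader, None)  # skip the header row, if any
    # The text is assumed to sit in the first column after the ID columns.
    return [(tuple(row[:n_ids]), row[n_ids]) for row in reader]


items_tsv = "item_id\tdescription\n1\tred shoe\n"
interactions_tsv = "user_id\titem_id\treview\nu1\t1\tfits well\n"
items = read_rows(items_tsv, "items")
interactions = read_rows(interactions_tsv, "interactions")
```

Under these assumptions, the same reader handles both layouts, and an unknown extraction type fails early instead of silently reading the wrong column.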

ducho.multimodal.textual.TextualDataset.complex_spit_of_list_of_string(sample, splitter)[source]