ducho.multimodal.multiple.visual_textual package
ducho.multimodal.multiple.visual_textual.VisualTextualFeatureExtractor module
- class ducho.multimodal.multiple.visual_textual.VisualTextualFeatureExtractor.VisualTextualFeatureExtractor(gpu='-1')
This class represents the Visual-Textual Feature Extractor, which extracts features from paired image and textual data.
- extract_feature(sample_input)
Extracts features from the input image and textual data. Before calling this function, the framework, model, and layer must be configured using their respective set methods.
- Parameters:
sample_input – The preprocessed data.
- Returns:
Two NumPy arrays representing the extracted features, which are stored in two .npy files using the appropriate method of the Dataset class.
- set_framework(backend_libraries_list)
Set the framework(s) to use (e.g., tensorflow, pytorch).
- Parameters:
backend_libraries_list (List[str]) – A list of strings naming the framework(s) to use. A single-item list is acceptable.
- Returns:
None
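The configure-then-extract order described for extract_feature can be sketched with a stand-in class. This is a minimal illustration, not Ducho's implementation: StubExtractor is hypothetical, the setter names set_model and set_output_layer are assumptions (the documentation above only names set_framework), 'some_layer' is a placeholder layer name, and the "features" are just the flattened inputs.

```python
import numpy as np


class StubExtractor:
    """Hypothetical stand-in that only mimics the documented call order."""

    def __init__(self, gpu='-1'):
        self.gpu = gpu
        self.frameworks = None
        self.model = None
        self.layer = None

    def set_framework(self, backend_libraries_list):
        # A single-item list is acceptable, as the documentation states.
        self.frameworks = list(backend_libraries_list)

    def set_model(self, model_name):  # assumed setter name
        self.model = model_name

    def set_output_layer(self, layer):  # assumed setter name
        self.layer = layer

    def extract_feature(self, sample_input):
        settings = (("framework", self.frameworks),
                    ("model", self.model),
                    ("layer", self.layer))
        missing = [name for name, value in settings if value is None]
        if missing:
            raise RuntimeError("configure before extracting: " + ", ".join(missing))
        image, text = sample_input
        # Placeholder features: a real extractor would run the model here.
        visual = np.asarray(image, dtype=np.float32).reshape(1, -1)
        textual = np.asarray(text, dtype=np.float32).reshape(1, -1)
        return visual, textual


extractor = StubExtractor(gpu='-1')
extractor.set_framework(['pytorch'])
extractor.set_model('openai/clip-vit-base-patch32')
extractor.set_output_layer('some_layer')  # placeholder layer name
visual, textual = extractor.extract_feature(([[0.1, 0.2], [0.3, 0.4]], [1.0, 2.0]))
```

Calling extract_feature on a freshly constructed StubExtractor raises RuntimeError, which mirrors the requirement that the framework, model, and layer be set first.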
ducho.multimodal.multiple.visual_textual.VisualTextualDataset module
- class ducho.multimodal.multiple.visual_textual.VisualTextualDataset.VisualTextualDataset(input_directory_path, output_directory_path, columns=None, model_name='openai/clip-vit-base-patch32', reshape=(224, 224))
This class represents the Visual-Textual Dataset used for the data loading process.
- create_output_file(index, extracted_data, model_layer, fusion=None)
Writes the extracted features to one or two output files, depending on the fusion setting.
- Parameters:
index – The index of the file to be processed.
extracted_data – A tuple containing the extracted features.
model_layer – The name of the output layer for the selected model.
fusion – A string indicating the type of fusion to perform. If None, the procedure generates two separate output files. Otherwise, it creates a single output file based on the specified fusion type.
- Returns:
None
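The effect of the fusion parameter can be illustrated with a small NumPy helper. This is a sketch under stated assumptions, not the code behind create_output_file: save_features is a hypothetical function, the file naming scheme is invented for the example, and the fusion names 'concat' and 'sum' are assumed values, since the documentation above does not list the supported fusion types.

```python
import os
import tempfile

import numpy as np


def save_features(output_dir, index, extracted_data, model_layer, fusion=None):
    """Hypothetical helper mirroring the documented behaviour."""
    visual, textual = extracted_data
    os.makedirs(output_dir, exist_ok=True)

    if fusion is None:
        # No fusion: write two separate .npy files, one per modality.
        paths = []
        for name, array in (("visual", visual), ("textual", textual)):
            path = os.path.join(output_dir, f"{name}_{model_layer}_{index}.npy")
            np.save(path, array)
            paths.append(path)
        return paths

    if fusion == "concat":  # assumed fusion name
        fused = np.concatenate([visual, textual], axis=-1)
    elif fusion == "sum":  # assumed fusion name
        fused = visual + textual
    else:
        raise ValueError(f"unsupported fusion type in this sketch: {fusion}")

    # Fusion requested: write a single .npy file with the combined features.
    path = os.path.join(output_dir, f"{fusion}_{model_layer}_{index}.npy")
    np.save(path, fused)
    return [path]


features = (np.ones((1, 4), dtype=np.float32), np.zeros((1, 4), dtype=np.float32))
out_dir = tempfile.mkdtemp()
separate_paths = save_features(out_dir, 0, features, "some_layer")
fused_paths = save_features(out_dir, 0, features, "some_layer", fusion="concat")
```

With fusion=None the helper returns two file paths; with a fusion type it returns one, matching the behaviour described for the fusion parameter.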