ducho.multimodal.visual package
ducho.multimodal.visual.VisualFeatureExtractor module
- class ducho.multimodal.visual.VisualFeatureExtractor.VisualFeatureExtractor(gpu='-1')
This class is the Visual Feature Extractor, which extracts features from image data.
- extract_feature(image)
Extracts features from the input image data. Before calling this function, the framework, model, and layer must be configured using their respective set methods.
- Parameters:
image – The preprocessed image data.
- Returns:
A NumPy array containing the extracted features. The array is later stored in a .npy file by the Dataset class (see VisualDataset.create_output_file).
- set_framework(backend_libraries_list)
Set the framework(s) to use (e.g. tensorflow, pytorch).
- Parameters:
backend_libraries_list (List[str]) – A list of strings representing the framework(s) to utilize. It is acceptable to have only one item in the list.
- Returns:
None
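The precondition on extract_feature (framework, model, and layer configured first) can be illustrated with a small stand-in class. This is only a sketch: `ToyExtractor`, `set_model`, and `set_layer` are hypothetical names that are not taken from Ducho, the layer name is made up, and the returned "feature" is a placeholder rather than a real model output.

```python
class ToyExtractor:
    """Hypothetical stand-in, NOT Ducho's VisualFeatureExtractor."""

    def __init__(self):
        self._framework = None
        self._model = None
        self._layer = None

    def set_framework(self, backend_libraries_list):
        # Mirrors the documented signature: a list of framework names.
        self._framework = list(backend_libraries_list)

    def set_model(self, model_name):
        # Hypothetical setter, assumed for illustration only.
        self._model = model_name

    def set_layer(self, layer_name):
        # Hypothetical setter, assumed for illustration only.
        self._layer = layer_name

    def extract_feature(self, image):
        # Refuse to run until all three settings are in place.
        settings = (
            ("framework", self._framework),
            ("model", self._model),
            ("layer", self._layer),
        )
        missing = [name for name, value in settings if value is None]
        if missing:
            raise RuntimeError("configure before extracting: " + ", ".join(missing))
        # Placeholder "feature": the flattened pixel values.
        return [pixel for row in image for pixel in row]


extractor = ToyExtractor()
extractor.set_framework(["pytorch"])
extractor.set_model("VGG19")
extractor.set_layer("some_layer")  # made-up layer name
features = extractor.extract_feature([[0.1, 0.2], [0.3, 0.4]])
```

Calling `extract_feature` on a freshly constructed `ToyExtractor` raises `RuntimeError`, which is one way a caller could be told that a set method was skipped.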
ducho.multimodal.visual.VisualDataset module
- class ducho.multimodal.visual.VisualDataset.VisualDataset(input_directory_path, output_directory_path, model_name='VGG19', reshape=(224, 224))
This class represents the Visual Dataset used for the data loading process.
- create_output_file(index, extracted_data, model_layer, fusion=None)
Create an output NumPy file containing the extracted data (e.g. datasetFolder/framework/modelName/modelLayer/fileName.npy).
- Parameters:
index (int) – The index into the filenames list.
extracted_data (Any) – The data to be stored in the .npy file.
model_layer (str) – The name of the layer.
fusion (str, optional) – The type of fusion for multimodal models.
- Returns:
None
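The example path given for create_output_file can be sketched with the standard library. `build_output_path` is a hypothetical helper, not part of Ducho, and replacing the image extension with `.npy` is an assumption about how the file name is derived.

```python
from pathlib import PurePosixPath


def build_output_path(dataset_folder, framework, model_name, model_layer, file_name):
    """Hypothetical helper following the documented pattern:
    datasetFolder/framework/modelName/modelLayer/fileName.npy
    """
    # Assumption: the original image extension is dropped before adding .npy.
    stem = PurePosixPath(file_name).stem
    return (
        PurePosixPath(dataset_folder)
        / framework
        / model_name
        / model_layer
        / (stem + ".npy")
    )


# The folder, layer, and file names below are made up for illustration.
path = build_output_path("out", "pytorch", "VGG19", "layer_name", "img_001.jpg")
print(path)  # out/pytorch/VGG19/layer_name/img_001.npy
```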
- set_framework(backend_libraries_list)
Set the framework(s) to use.
- Parameters:
backend_libraries_list (List[str]) – A list of strings representing the framework(s) to use. It is acceptable to have only one item in the list.
- Returns:
None
- set_mean_std(mean: Tensor, std: Tensor) → None
Set custom values of mean and std for z-score normalization.
- Parameters:
mean (torch.Tensor) – The desired mean for each of the three channels.
std (torch.Tensor) – The desired standard deviation for each of the three channels.
- Returns:
None
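For reference, per-channel z-score normalization computes `(pixel - mean[c]) / std[c]` for each channel `c`. The sketch below shows that arithmetic in plain Python so it runs without torch. `zscore_normalize` is a hypothetical function, the channels-first nested-list layout is an assumption, and Ducho's own implementation may differ.

```python
def zscore_normalize(image, mean, std):
    """Normalize a channels-first image: (pixel - mean[c]) / std[c].

    Hypothetical helper for illustration; `image` is a list of channels,
    each channel a list of rows, each row a list of pixel values.
    """
    if not (len(image) == len(mean) == len(std)):
        raise ValueError("need one mean and one std per channel")
    return [
        [[(pixel - m) / s for pixel in row] for row in channel]
        for channel, m, s in zip(image, mean, std)
    ]


# A tiny 3-channel, 1x2 image with made-up values.
image = [
    [[0.75, 0.5]],
    [[0.25, 0.5]],
    [[1.0, 0.0]],
]
normalized = zscore_normalize(image, mean=[0.5, 0.5, 0.5], std=[0.25, 0.25, 0.5])
print(normalized)  # [[[1.0, 0.0]], [[-1.0, 0.0]], [[1.0, -1.0]]]
```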