pyiqa.archs.fid_arch
FID and clean-fid metric implementation.
- Code is borrowed from the clean-fid project.
References
- [1] GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium.
Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, Sepp Hochreiter. NeurIPS, 2017.
- [2] On Aliased Resizing and Surprising Subtleties in GAN Evaluation.
Gaurav Parmar, Richard Zhang, Jun-Yan Zhu. CVPR, 2022.
Module Contents
- class pyiqa.archs.fid_arch.ResizeDataset(files, mode, size=(299, 299))
Bases: torch.utils.data.Dataset
A placeholder Dataset that enables parallelizing the resize operation across multiple CPU cores.
files: list of all files in the folder.
mode:
clean: use PIL resize before calculating features
legacy_pytorch: do not resize here, but before the PyTorch model
- pyiqa.archs.fid_arch.get_reference_statistics(name, res, mode='clean', split='test', metric='FID')
Load precomputed reference statistics for commonly used datasets
- pyiqa.archs.fid_arch.frechet_distance(mu1, sigma1, mu2, sigma2, eps=1e-06)
Numpy implementation of the Frechet Distance. The Frechet distance between two multivariate Gaussians X_1 ~ N(mu_1, C_1) and X_2 ~ N(mu_2, C_2) is
d^2 = ||mu_1 - mu_2||^2 + Tr(C_1 + C_2 - 2*sqrt(C_1*C_2)).
Stable version by Danica J. Sutherland.
Params:
- mu1: NumPy array containing the sample mean over activations of a layer of the Inception net (activations as returned by the function ‘get_predictions’) for generated samples.
- mu2: The sample mean over activations, precalculated on a representative data set.
- sigma1: The covariance matrix over activations for generated samples.
- sigma2: The covariance matrix over activations, precalculated on a representative data set.
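The formula above can be sketched in a few lines of NumPy. This is a minimal illustration, not the pyiqa implementation: the name `frechet_distance_sketch` is hypothetical, and instead of a matrix square root with `eps` regularization it uses the identity that Tr(sqrt(C_1*C_2)) equals the sum of the square roots of the eigenvalues of C_1*C_2, which holds for positive semi-definite covariance matrices.

```python
import numpy as np


def frechet_distance_sketch(mu1, sigma1, mu2, sigma2):
    """Squared Frechet distance between N(mu1, sigma1) and N(mu2, sigma2).

    Hypothetical helper for illustration only; it omits the eps-based
    stabilization of the documented function.
    """
    mu1 = np.atleast_1d(mu1).astype(np.float64)
    mu2 = np.atleast_1d(mu2).astype(np.float64)
    sigma1 = np.atleast_2d(sigma1).astype(np.float64)
    sigma2 = np.atleast_2d(sigma2).astype(np.float64)

    diff = mu1 - mu2

    # Eigenvalues of C_1*C_2 are real and non-negative in exact arithmetic;
    # numerical noise can add tiny negative or imaginary parts, so keep the
    # real part and clip at zero before taking the square root.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_covmean = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_covmean)
```

With identical means and covariances the result is zero; with identical covariances it reduces to the squared Euclidean distance between the means.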
- pyiqa.archs.fid_arch.maximum_mean_discrepancy(feats1, feats2, kernel_type='polynomial', num_subsets=100, max_subset_size=1000)
- pyiqa.archs.fid_arch.mmd_polynomial_kernel(feats1, feats2, num_subsets=100, max_subset_size=1000)
Compute the KID score given the sets of features
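As a rough illustration of what a KID computation looks like, the sketch below averages an unbiased MMD estimate with the cubic polynomial kernel k(x, y) = (x·y/d + 1)^3 over random subsets, following the approach used in the clean-fid project. The function name `kid_polynomial_sketch` and the `seed` argument are hypothetical, and the details of the pyiqa function may differ.

```python
import numpy as np


def kid_polynomial_sketch(feats1, feats2, num_subsets=100, max_subset_size=1000, seed=0):
    """Average unbiased MMD^2 with a cubic polynomial kernel over random subsets.

    Hypothetical helper for illustration; `seed` is not a parameter of the
    documented function.
    """
    rng = np.random.default_rng(seed)
    d = feats1.shape[1]  # feature dimension, used to normalize the dot product
    m = min(feats1.shape[0], feats2.shape[0], max_subset_size)

    total = 0.0
    for _ in range(num_subsets):
        # Draw m samples without replacement from each feature set.
        x = feats1[rng.choice(feats1.shape[0], m, replace=False)]
        y = feats2[rng.choice(feats2.shape[0], m, replace=False)]

        k_xx = (x @ x.T / d + 1.0) ** 3
        k_yy = (y @ y.T / d + 1.0) ** 3
        k_xy = (x @ y.T / d + 1.0) ** 3

        # Unbiased estimate: exclude self-pairs (the diagonal) within each set.
        within = (k_xx.sum() - np.trace(k_xx) + k_yy.sum() - np.trace(k_yy)) / (m * (m - 1))
        cross = 2.0 * k_xy.mean()
        total += within - cross

    return float(total / num_subsets)
```

Because the estimate is unbiased, it can come out slightly negative when both feature sets are drawn from the same distribution.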
- pyiqa.archs.fid_arch.mmd_rbf_kernel(x, y, sigma: float = 10.0, scale: int = 1000)
Compute MMD with an RBF kernel; see https://github.com/google-research/google-research/blob/master/cmmd/distance.py
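A minimal NumPy sketch of an RBF-kernel MMD with the same `sigma` and `scale` defaults is shown here. It assumes the simple biased estimate that averages the kernel over all pairs, self-pairs included; the name `mmd_rbf_sketch` is hypothetical, and whether pyiqa computes exactly this estimate is an assumption.

```python
import numpy as np


def mmd_rbf_sketch(x, y, sigma=10.0, scale=1000):
    """Scaled, biased MMD^2 estimate with k(a, b) = exp(-||a - b||^2 / (2 * sigma^2)).

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    gamma = 1.0 / (2.0 * sigma ** 2)

    def mean_kernel(a, b):
        # Pairwise squared distances via ||a||^2 + ||b||^2 - 2 * a.b,
        # clamped at zero to absorb floating-point round-off.
        sq_dists = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
        return np.exp(-gamma * np.maximum(sq_dists, 0.0)).mean()

    return float(scale * (mean_kernel(x, x) + mean_kernel(y, y) - 2.0 * mean_kernel(x, y)))
```

The `scale` factor only rescales the result into a more readable range; it does not change the ordering of scores.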
- pyiqa.archs.fid_arch.get_folder_features(fdir, model=None, num_workers=12, batch_size=32, test_img_size=(299, 299), device=torch.device('cuda'), mode='clean', description='', verbose=True)
Compute the inception features for a folder of image files
- class pyiqa.archs.fid_arch.DINOv2
DINOv2 model for feature extraction.
Provides a wrapper for the DINOv2 vision transformer model for image feature extraction.
- class pyiqa.archs.fid_arch.FID(dims: int = 2048, backbone: str = 'inceptionv3')
Bases: torch.nn.Module
Implements the Fréchet Inception Distance (FID) and Clean-FID metrics.
The FID measures the distance between the feature representations of two sets of images, one generated by a model and the other from a reference dataset.
- model
The feature extraction network.
- Type:
nn.Module
- test_img_size
Default image size for feature extraction.
- Type:
Tuple[int, int]
- forward(fdir1: str | None = None, fdir2: str | None = None, mode: str = 'clean', distance_type: str = 'frechet', kernel_type: str = 'polynomial', dataset_name: str | None = None, dataset_res: int = 1024, dataset_split: str = 'train', num_workers: int = 4, batch_size: int = 8, device: torch.device = torch.device('cuda'), verbose: bool = True, **kwargs: Any) → float
Compute the FID or Clean-FID score between two sets of images.
- Parameters:
fdir1 (Optional[str]) – Path to the first folder of images.
fdir2 (Optional[str]) – Path to the second folder of images.
mode (str, optional) – Calculation mode. Defaults to ‘clean’.
distance_type (str, optional) – Distance metric to use. Defaults to ‘frechet’.
kernel_type (str, optional) – Kernel type for MMD. Defaults to ‘polynomial’.
dataset_name (Optional[str], optional) – Reference dataset name. Defaults to None.
dataset_res (int, optional) – Reference dataset resolution. Defaults to 1024.
dataset_split (str, optional) – Reference dataset split. Defaults to ‘train’.
num_workers (int, optional) – Number of workers for data loading. Defaults to 4.
batch_size (int, optional) – Batch size for processing. Defaults to 8.
device (torch.device, optional) – Computation device. Defaults to cuda.
verbose (bool, optional) – Print progress messages. Defaults to True.
- Returns:
FID or distance score between image sets.
- Return type:
float
- Raises:
ValueError – For invalid input combinations or parameters.