pyiqa.archs.fid_arch

FID and clean-fid metric implementation.

The code is borrowed from the clean-fid project.

References

[1] GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium. Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, Sepp Hochreiter. NeurIPS, 2017.

[2] On Aliased Resizing and Surprising Subtleties in GAN Evaluation. Gaurav Parmar, Richard Zhang, Jun-Yan Zhu. CVPR, 2022.

Module Contents

pyiqa.archs.fid_arch.default_model_urls[source]
class pyiqa.archs.fid_arch.ResizeDataset(files, mode, size=(299, 299))[source]

Bases: torch.utils.data.Dataset

A placeholder Dataset that enables parallelizing the resize operation across multiple CPU cores.

files: list of all files in the folder.

mode:

  • clean: resize with PIL before calculating features

  • legacy_pytorch: do not resize here; resizing is done just before the PyTorch model

pyiqa.archs.fid_arch.get_reference_statistics(name, res, mode='clean', split='test', metric='FID')[source]

Load precomputed reference statistics for commonly used datasets

pyiqa.archs.fid_arch.frechet_distance(mu1, sigma1, mu2, sigma2, eps=1e-06)[source]

Numpy implementation of the Frechet Distance. The Frechet distance between two multivariate Gaussians X_1 ~ N(mu_1, C_1) and X_2 ~ N(mu_2, C_2) is

d^2 = ||mu_1 - mu_2||^2 + Tr(C_1 + C_2 - 2*sqrt(C_1*C_2)).

Stable version by Danica J. Sutherland.

Params:

  • mu1: The sample mean over activations of a layer of the inception net (such as those returned by the function ‘get_predictions’) for generated samples.

  • mu2: The sample mean over activations, precalculated on a representative data set.

  • sigma1: The covariance matrix over activations for generated samples.

  • sigma2: The covariance matrix over activations, precalculated on a representative data set.
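As a rough illustration of the formula above, the following NumPy-only sketch evaluates d^2 for two Gaussians. It is not the library's implementation: the name frechet_distance_sketch is hypothetical, the trace of the matrix square root is taken from the eigenvalues of C_1*C_2 rather than from an explicit matrix square root, and the eps stabilisation in the signature is not reproduced.

```python
import numpy as np


def frechet_distance_sketch(mu1, sigma1, mu2, sigma2):
    """Squared Frechet distance between N(mu1, sigma1) and N(mu2, sigma2).

    Hypothetical helper for illustration only; it omits the eps-based
    stabilisation that the documented function exposes.
    """
    mu1 = np.atleast_1d(mu1).astype(np.float64)
    mu2 = np.atleast_1d(mu2).astype(np.float64)
    sigma1 = np.atleast_2d(sigma1).astype(np.float64)
    sigma2 = np.atleast_2d(sigma2).astype(np.float64)

    diff = mu1 - mu2
    # For positive semi-definite covariances, Tr(sqrt(C1*C2)) equals the sum
    # of the square roots of the eigenvalues of C1*C2. Tiny negative or
    # imaginary parts caused by round-off are discarded.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_covmean = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_covmean)


# Usage: estimate the statistics from two (num_images, dims) feature arrays.
# The random arrays below are stand-ins for real network activations.
rng = np.random.default_rng(0)
feats_gen = rng.normal(size=(500, 16))
feats_ref = rng.normal(loc=0.5, size=(500, 16))
score = frechet_distance_sketch(
    feats_gen.mean(axis=0), np.cov(feats_gen, rowvar=False),
    feats_ref.mean(axis=0), np.cov(feats_ref, rowvar=False),
)
```

The eigenvalue route avoids a SciPy dependency, which is the only reason it is used here; it is a swapped-in technique and may behave differently from a stabilised matrix square root on ill-conditioned covariances.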

pyiqa.archs.fid_arch.maximum_mean_discrepancy(feats1, feats2, kernel_type='polynomial', num_subsets=100, max_subset_size=1000)[source]
pyiqa.archs.fid_arch.mmd_polynomial_kernel(feats1, feats2, num_subsets=100, max_subset_size=1000)[source]

Compute the KID score given the sets of features
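For orientation, the sketch below shows the usual KID estimator: an unbiased squared-MMD estimate with the cubic polynomial kernel k(x, y) = (x·y/d + 1)^3, averaged over random subsets. It assumes the library follows this standard form; the function name kid_sketch and the seed argument are hypothetical and added only for reproducibility.

```python
import numpy as np


def kid_sketch(feats1, feats2, num_subsets=100, max_subset_size=1000, seed=0):
    """Average an unbiased squared-MMD estimate over random feature subsets.

    Hypothetical helper; feats1 and feats2 are (num_images, dims) arrays
    with at least two rows each.
    """
    rng = np.random.default_rng(seed)  # seed is an assumption of this sketch
    feats1 = np.asarray(feats1, dtype=np.float64)
    feats2 = np.asarray(feats2, dtype=np.float64)
    d = feats1.shape[1]
    m = min(feats1.shape[0], feats2.shape[0], max_subset_size)

    total = 0.0
    for _ in range(num_subsets):
        x = feats1[rng.choice(feats1.shape[0], m, replace=False)]
        y = feats2[rng.choice(feats2.shape[0], m, replace=False)]
        k_xx = (x @ x.T / d + 1.0) ** 3
        k_yy = (y @ y.T / d + 1.0) ** 3
        k_xy = (x @ y.T / d + 1.0) ** 3
        # Unbiased estimate: exclude the diagonal (self-similarity) terms
        # from the within-set averages.
        within = (k_xx.sum() - np.trace(k_xx) + k_yy.sum() - np.trace(k_yy)) / (m * (m - 1))
        cross = 2.0 * k_xy.sum() / (m * m)
        total += within - cross
    return float(total / num_subsets)
```

Because the estimate is unbiased, values for two samples from the same distribution scatter around zero and can be slightly negative.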

pyiqa.archs.fid_arch.mmd_rbf_kernel(x, y, sigma: float = 10.0, scale: int = 1000)[source]

Compute MMD with an RBF kernel; see https://github.com/google-research/google-research/blob/master/cmmd/distance.py
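A minimal NumPy sketch of an RBF-kernel MMD is given below to make the roles of sigma and scale concrete. It assumes a Gaussian kernel exp(-||a - b||^2 / (2*sigma^2)) and a biased estimate that keeps the diagonal terms, multiplied by scale; the name mmd_rbf_sketch is hypothetical, and the library's own function may differ in detail.

```python
import numpy as np


def mmd_rbf_sketch(x, y, sigma=10.0, scale=1000):
    """Scaled squared MMD between two (n, dims) feature arrays (illustrative only)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    gamma = 1.0 / (2.0 * sigma ** 2)

    def mean_kernel(a, b):
        # Pairwise squared distances via ||a||^2 + ||b||^2 - 2 a.b,
        # clipped at zero to absorb floating-point round-off.
        sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * (a @ b.T)
        return np.exp(-gamma * np.maximum(sq, 0.0)).mean()

    return float(scale * (mean_kernel(x, x) + mean_kernel(y, y) - 2.0 * mean_kernel(x, y)))
```

A larger sigma makes the kernel less sensitive to small feature differences, while scale only rescales the result for readability.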

pyiqa.archs.fid_arch.get_folder_features(fdir, model=None, num_workers=12, batch_size=32, test_img_size=(299, 299), device=torch.device('cuda'), mode='clean', description='', verbose=True)[source]

Compute the inception features for a folder of image files

class pyiqa.archs.fid_arch.DINOv2[source]

DINOv2 model for feature extraction.

Provides a wrapper for the DINOv2 vision transformer model for image feature extraction.

class pyiqa.archs.fid_arch.FID(dims: int = 2048, backbone: str = 'inceptionv3')[source]

Bases: torch.nn.Module

Implements the Fréchet Inception Distance (FID) and Clean-FID metrics.

The FID measures the distance between the feature representations of two sets of images, one generated by a model and the other from a reference dataset.

model

The feature extraction network.

Type:

nn.Module

test_img_size

Default image size for feature extraction.

Type:

Tuple[int, int]

forward(fdir1: str | None = None, fdir2: str | None = None, mode: str = 'clean', distance_type: str = 'frechet', kernel_type: str = 'polynomial', dataset_name: str | None = None, dataset_res: int = 1024, dataset_split: str = 'train', num_workers: int = 4, batch_size: int = 8, device: torch.device = torch.device('cuda'), verbose: bool = True, **kwargs: Any) → float[source]

Compute the FID or Clean-FID score between two sets of images.

Parameters:
  • fdir1 (Optional[str]) – Path to the first folder of images.

  • fdir2 (Optional[str]) – Path to the second folder of images.

  • mode (str, optional) – Calculation mode. Defaults to ‘clean’.

  • distance_type (str, optional) – Distance metric to use. Defaults to ‘frechet’.

  • kernel_type (str, optional) – Kernel type for MMD. Defaults to ‘polynomial’.

  • dataset_name (Optional[str], optional) – Reference dataset name. Defaults to None.

  • dataset_res (int, optional) – Reference dataset resolution. Defaults to 1024.

  • dataset_split (str, optional) – Reference dataset split. Defaults to ‘train’.

  • num_workers (int, optional) – Number of workers for data loading. Defaults to 4.

  • batch_size (int, optional) – Batch size for processing. Defaults to 8.

  • device (torch.device, optional) – Computation device. Defaults to cuda.

  • verbose (bool, optional) – Print progress messages. Defaults to True.

Returns:

FID or distance score between image sets.

Return type:

float

Raises:

ValueError – For invalid input combinations or parameters.
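The following sketch shows one way the documented forward signature might be called to compare two image folders. It assumes pyiqa and torch are installed; the wrapper name compute_folder_fid and the CPU fallback are hypothetical, and only keyword arguments listed above are used. Constructing FID is likely to load pretrained weights, so the first call may need network access.

```python
from typing import Optional


def compute_folder_fid(fdir1: str, fdir2: str, mode: str = "clean",
                       device_name: Optional[str] = None) -> float:
    """Hypothetical wrapper around FID.forward for two image folders."""
    # Imports are deferred so that defining this helper does not itself
    # require torch or pyiqa.
    import torch
    from pyiqa.archs.fid_arch import FID

    # Fall back to the CPU when CUDA is unavailable (an assumption of this
    # sketch; the documented default device is cuda).
    if device_name is None:
        device_name = "cuda" if torch.cuda.is_available() else "cpu"
    device = torch.device(device_name)

    # Documented constructor defaults: dims=2048, backbone='inceptionv3'.
    metric = FID().to(device)
    score = metric(
        fdir1=fdir1,
        fdir2=fdir2,
        mode=mode,
        distance_type="frechet",
        num_workers=4,
        batch_size=8,
        device=device,
        verbose=False,
    )
    return float(score)
```

A call such as compute_folder_fid("path/to/generated", "path/to/reference") would then return a single float; both paths are placeholders.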