pyiqa.archs.topiq_arch

TOPIQ metric, proposed in

TOPIQ: A Top-down Approach from Semantics to Distortions for Image Quality Assessment. Chaofeng Chen, Jiadi Mo, Jingwen Hou, Haoning Wu, Liang Liao, Wenxiu Sun, Qiong Yan, Weisi Lin. IEEE Transactions on Image Processing, 2024.

Paper link: https://arxiv.org/abs/2308.03060

Module Contents

pyiqa.archs.topiq_arch.default_model_urls[source]
class pyiqa.archs.topiq_arch.TransformerEncoderLayer(d_model, nhead, dim_feedforward=2048, dropout=0.1, activation='gelu', normalize_before=False)[source]

Bases: torch.nn.Module

Transformer encoder layer used in local self-attention blocks.

forward(src)[source]
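
The normalize_before flag in the signature is not described on this page. The following is a minimal sketch, assuming the flag follows the common pre-norm/post-norm convention for Transformer layers (an assumption, not confirmed by this page). The names residual_block, sublayer, and norm are hypothetical, and plain floats stand in for tensors.

```python
def residual_block(x, sublayer, norm, normalize_before=False):
    """Apply one residual sub-block in pre-norm or post-norm order.

    Assumption: normalize_before=True means pre-norm, False means post-norm.
    """
    if normalize_before:
        # Pre-norm: normalize the input, transform it, then add the residual.
        return x + sublayer(norm(x))
    # Post-norm: transform, add the residual, then normalize the sum.
    return norm(x + sublayer(x))


# Toy stand-ins so the ordering difference is visible on scalars.
def double(v):
    return 2.0 * v  # stands in for attention or the feed-forward network


def halve(v):
    return 0.5 * v  # stands in for layer normalization


post = residual_block(4.0, double, halve, normalize_before=False)
pre = residual_block(4.0, double, halve, normalize_before=True)
```

With these toy functions the two orderings give different results, which is why a checkpoint trained with one setting generally cannot be evaluated with the other.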
class pyiqa.archs.topiq_arch.TransformerDecoderLayer(d_model, nhead, dim_feedforward=2048, dropout=0.1, activation='gelu', normalize_before=False)[source]

Bases: torch.nn.Module

Transformer decoder layer used for cross-scale attention.

forward(tgt, memory)[source]
class pyiqa.archs.topiq_arch.TransformerEncoder(encoder_layer, num_layers)[source]

Bases: torch.nn.Module

Stacked wrapper for encoder layers.

forward(src)[source]
class pyiqa.archs.topiq_arch.TransformerDecoder(decoder_layer, num_layers)[source]

Bases: torch.nn.Module

Stacked wrapper for decoder layers.

forward(tgt, memory)[source]
class pyiqa.archs.topiq_arch.GatedConv(weightdim, ksz=3)[source]

Bases: torch.nn.Module

Gated local pooling module for no-reference feature aggregation.

forward(x)[source]
class pyiqa.archs.topiq_arch.CFANet(semantic_model_name='resnet50', model_name='cfanet_fr_kadid_res50', backbone_pretrain=True, in_size=None, use_ref=True, num_class=1, num_crop=1, crop_size=256, inter_dim=256, num_heads=4, num_attn_layers=1, dprate=0.1, activation='gelu', pretrained=True, pretrained_model_path=None, out_act=False, block_pool='weighted_avg', test_img_size=None, align_crop_face=True, default_mean=IMAGENET_DEFAULT_MEAN, default_std=IMAGENET_DEFAULT_STD)[source]

Bases: torch.nn.Module

TOPIQ/CFANet architecture for NR and FR quality prediction.

Parameters:
  • semantic_model_name (str) – Backbone name, for example 'resnet50', 'clip_ViT-B/32', or a Swin variant.

  • model_name (str) – Registered checkpoint key.

  • backbone_pretrain (bool) – Whether to load pretrained backbone weights.

  • in_size (tuple[int, int] | None) – Optional training input size.

  • use_ref (bool) – Whether to use a reference image input.

  • num_class (int) – Number of output dimensions.

  • num_crop (int) – Number of evaluation crops.

  • crop_size (int) – Crop size for multi-crop evaluation.

  • inter_dim (int) – Intermediate feature dimension.

  • num_heads (int) – Attention head count.

  • num_attn_layers (int) – Number of attention layers per block.

  • dprate (float) – Dropout probability.

  • activation (str) – Activation name.

  • pretrained (bool) – Whether to load pretrained CFANet checkpoint.

  • pretrained_model_path (str | None) – Optional local checkpoint path.

  • out_act (bool) – Whether to apply positive output activation for scalar prediction.

  • block_pool (str) – Feature block pooling mode.

  • test_img_size (tuple[int, int] | None) – Optional test-time resize.

  • align_crop_face (bool) – Whether to run face alignment for GFIQA models.

  • default_mean (tuple[float, float, float]) – Input normalization mean.

  • default_std (tuple[float, float, float]) – Input normalization std.
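
The default_mean and default_std parameters describe per-channel input normalization. The following is a rough sketch, assuming inputs are scaled to [0, 1], that the model applies the usual (x - mean) / std per channel, and that the two constants hold the standard ImageNet statistics. The helper normalize_pixel is hypothetical and not part of pyiqa.

```python
# Standard ImageNet channel statistics (assumed values of the two constants).
IMAGENET_DEFAULT_MEAN = (0.485, 0.456, 0.406)
IMAGENET_DEFAULT_STD = (0.229, 0.224, 0.225)


def normalize_pixel(rgb, mean=IMAGENET_DEFAULT_MEAN, std=IMAGENET_DEFAULT_STD):
    """Normalize one RGB pixel with values in [0, 1], channel by channel."""
    return tuple((c - m) / s for c, m, s in zip(rgb, mean, std))


# A pixel equal to the mean maps to zero in every channel.
centered = normalize_pixel(IMAGENET_DEFAULT_MEAN)

# A pixel one standard deviation above the mean maps to about 1.0 per channel.
one_std_above = tuple(m + s for m, s in zip(IMAGENET_DEFAULT_MEAN, IMAGENET_DEFAULT_STD))
shifted = normalize_pixel(one_std_above)
```

A backbone trained with different statistics would need matching values passed for both parameters.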

Notes

Set use_ref=True for full-reference mode and use_ref=False for no-reference mode.

preprocess(x)[source]
fix_bn(model)[source]
get_swin_feature(model, x)[source]
dist_func(x, y, eps=1e-12)[source]
forward_cross_attention(x, y=None)[source]
preprocess_face(x)[source]
forward(x, y=None, return_mos=True, return_dist=False)[source]

Compute quality prediction.

Parameters:
  • x (torch.Tensor) – Distorted image tensor with shape (N, 3, H, W).

  • y (torch.Tensor | None) – Optional reference image tensor with shape (N, 3, H, W). Required when use_ref is True.

  • return_mos (bool) – Whether to return mapped MOS output.

  • return_dist (bool) – Whether to return raw distance/logit output.

Returns:

A single tensor when only one output is requested; otherwise the list [mos, dist], in that order.

Return type:

torch.Tensor | list[torch.Tensor]

Raises:

AssertionError – If use_ref is True but y is not given.
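
The calling contract described above can be sketched in plain Python. This is an illustration of the documented behavior only, not the library's code: pack_outputs and check_inputs are hypothetical helpers, and strings stand in for tensors.

```python
def pack_outputs(mos, dist, return_mos=True, return_dist=False):
    """Mimic the documented return convention of forward()."""
    outputs = []
    if return_mos:
        outputs.append(mos)
    if return_dist:
        outputs.append(dist)
    # One requested output comes back bare; two come back as [mos, dist].
    # Requesting neither is not covered by the description; this sketch
    # would raise IndexError in that case.
    return outputs if len(outputs) > 1 else outputs[0]


def check_inputs(use_ref, y):
    """Mimic the documented check: full-reference mode needs a reference."""
    assert not use_ref or y is not None, "y is required when use_ref is True"


single = pack_outputs("mos", "dist")
both = pack_outputs("mos", "dist", return_dist=True)
only_dist = pack_outputs("mos", "dist", return_mos=False, return_dist=True)

try:
    check_inputs(use_ref=True, y=None)
    raised = False
except AssertionError:
    raised = True
```

Callers that toggle return_dist at runtime should therefore handle both a bare tensor and a two-element list.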