pyiqa.archs.afine_arch

A-FINE architecture for generalized image quality assessment.

Reference:

Chen, D., Wu, T., Ma, K., and Zhang, L. Toward Generalized Image Quality Assessment: Relaxing the Perfect Reference Quality Assumption. CVPR 2025.

Project page:

https://github.com/ChrisDud0257/AFINE

This implementation is intended for inference with pretrained checkpoints.

Module Contents

pyiqa.archs.afine_arch.default_model_urls[source]

pyiqa.archs.afine_arch.scale_finalscore(score, yita1=100, yita2=0, yita3=-1.971, yita4=-2.3734)[source]

Map a raw A-FINE score to a bounded, human-readable range between yita2 and yita1 (0 and 100 with the default arguments).

Parameters:
  • score (torch.Tensor) – Raw score tensor.

  • yita1 (float) – Upper bound of target range.

  • yita2 (float) – Lower bound of target range.

  • yita3 (float) – Logistic midpoint parameter.

  • yita4 (float) – Logistic scale parameter.

Returns:

Scaled score tensor.

Return type:

torch.Tensor
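
The parameter descriptions above suggest a four-parameter logistic curve. The following is a minimal sketch of such a mapping on plain floats, not the library code: the function name logistic_rescale is hypothetical, and the exact expression and sign convention used by scale_finalscore are assumptions (here the magnitude of yita4 is used as the scale, so the mapping is monotonically increasing in the raw score).

```python
import math

def logistic_rescale(score, yita1=100.0, yita2=0.0, yita3=-1.971, yita4=-2.3734):
    """Hypothetical scalar stand-in for scale_finalscore (assumed logistic form)."""
    # Assumption: |yita4| acts as the scale, so the curve increases with score.
    z = (score - yita3) / (abs(yita4) + 1e-10)
    # Numerically stable logistic: never exponentiate a large positive number.
    if z >= 0:
        sig = 1.0 / (1.0 + math.exp(-z))
    else:
        ez = math.exp(z)
        sig = ez / (1.0 + ez)
    # Stretch the (0, 1) logistic output to the (yita2, yita1) target range.
    return (yita1 - yita2) * sig + yita2

mid_value = logistic_rescale(-1.971)  # raw score equal to the midpoint yita3
```

Under this assumed form, a raw score equal to yita3 lands halfway between yita2 and yita1, and the ordering of raw scores is preserved after scaling.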

class pyiqa.archs.afine_arch.AFINEQhead(chns=(3, 768, 768, 768, 768, 768, 768, 768, 768, 768, 768, 768, 768), feature_out_channel=1, input_dim=768, hidden_dim=128, mean=(0.48145466, 0.4578275, 0.40821073), std=(0.26862954, 0.26130258, 0.27577711))[source]

Bases: torch.nn.Module

Naturalness head used by A-FINE.

This head aggregates mean and variance statistics from CLIP feature maps and predicts the no-reference naturalness term.

Parameters:
  • chns (tuple[int, ...]) – Channel dimensions for input image and feature levels.

  • feature_out_channel (int) – Number of output channels for the score head.

  • input_dim (int) – Channel dimension of CLIP feature tokens.

  • hidden_dim (int) – Hidden width for projection layers.

  • mean (tuple[float, float, float]) – RGB normalization mean.

  • std (tuple[float, float, float]) – RGB normalization standard deviation.
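
To illustrate what "mean and variance statistics" of a feature level look like, here is a dependency-free sketch that pools a small grid of token vectors into per-channel statistics. The helper name channel_stats and the toy values are hypothetical; whether the head uses population or sample variance, and how the statistics are fed to its projection layers, is not stated here and is left as an assumption.

```python
from statistics import fmean, pvariance

def channel_stats(tokens):
    """Per-channel mean and population variance over a list of token vectors."""
    channels = list(zip(*tokens))  # transpose: one tuple of values per channel
    means = [fmean(ch) for ch in channels]
    variances = [pvariance(ch) for ch in channels]
    return means, variances

# Toy "feature level": 4 tokens with 2 channels each (made-up values).
tokens = [[1.0, 0.0], [3.0, 0.0], [1.0, 2.0], [3.0, 2.0]]
means, variances = channel_stats(tokens)
```

With the default chns, each real feature level would have 768 channels instead of 2, giving one mean and one variance per channel and per level.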

forward(x, h_list_x)[source]

class pyiqa.archs.afine_arch.AFINEDhead(chns=(3, 768, 768, 768, 768, 768, 768, 768, 768, 768, 768, 768, 768), mean=(0.48145466, 0.4578275, 0.40821073), std=(0.26862954, 0.26130258, 0.27577711))[source]

Bases: torch.nn.Module

Fidelity head used by A-FINE.

The module computes similarity statistics between distorted and reference features and outputs a full-reference fidelity term.

Parameters:
  • chns (tuple[int, ...]) – Channel dimensions for the input image and each feature level.

  • mean (tuple[float, float, float]) – RGB normalization mean.

  • std (tuple[float, float, float]) – RGB normalization standard deviation.
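
One common way to compare feature statistics of a distorted and a reference image is with SSIM-style ratios of means, variances, and covariance. The sketch below shows that idea for a single channel; it is an illustration only, and whether AFINEDhead uses exactly these terms is an assumption. The function name similarity_terms and the stabilizing constants c1 and c2 are hypothetical.

```python
from statistics import fmean

def similarity_terms(x, y, c1=1e-6, c2=1e-6):
    """SSIM-style mean and structure similarity for one channel (assumed form)."""
    mx, my = fmean(x), fmean(y)
    vx = fmean([(a - mx) ** 2 for a in x])          # variance of x
    vy = fmean([(b - my) ** 2 for b in y])          # variance of y
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])  # covariance
    s_mean = (2 * mx * my + c1) / (mx ** 2 + my ** 2 + c1)
    s_struct = (2 * cov + c2) / (vx + vy + c2)
    return s_mean, s_struct

ref = [1.0, 2.0, 3.0, 4.0]
same = similarity_terms(ref, ref)            # identical features
flipped = similarity_terms(ref, ref[::-1])   # same mean, reversed structure
```

Identical inputs give both terms equal to 1, while the reversed input keeps the mean term at 1 but drives the structure term negative.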

forward(x, y, h_list_x, h_list_y)[source]

class pyiqa.archs.afine_arch.AFINENLM_NR_Fit(yita1=2, yita2=-2, yita3=3.7833, yita4=7.5676)[source]

Bases: torch.nn.Module

Nonlinear calibration layer for the naturalness branch.

forward(x)[source]

class pyiqa.archs.afine_arch.AFINENLM_FR_Fit_with_limit(yita1=2, yita2=-2, yita3=-24.1335, yita4=8.1093, yita3_upper=-21, yita3_lower=-27, yita4_upper=9, yita4_lower=7)[source]

Bases: torch.nn.Module

Bounded nonlinear calibration layer for the fidelity branch.
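
The constructor arguments hint at a logistic curve whose midpoint and scale are kept inside fixed limits. A minimal scalar sketch of that idea follows. It assumes the limits are applied by clamping yita3 and yita4 before use and that the curve has the usual four-parameter logistic form; neither detail is confirmed here, and the names clamp and bounded_logistic are hypothetical.

```python
import math

def clamp(value, lower, upper):
    """Restrict value to the closed interval [lower, upper]."""
    return max(lower, min(upper, value))

def bounded_logistic(x, yita1=2.0, yita2=-2.0, yita3=-24.1335, yita4=8.1093,
                     yita3_lower=-27.0, yita3_upper=-21.0,
                     yita4_lower=7.0, yita4_upper=9.0):
    """Hypothetical scalar analogue of a bounded logistic calibration."""
    # Assumption: midpoint and scale are clamped to their limits before use.
    mid = clamp(yita3, yita3_lower, yita3_upper)
    scale = clamp(yita4, yita4_lower, yita4_upper)
    z = (x - mid) / scale
    # Numerically stable logistic.
    if z >= 0:
        sig = 1.0 / (1.0 + math.exp(-z))
    else:
        ez = math.exp(z)
        sig = ez / (1.0 + ez)
    return (yita1 - yita2) * sig + yita2

at_mid = bounded_logistic(-24.1335)               # input equal to the midpoint
clamped = bounded_logistic(-27.0, yita3=-100.0)   # midpoint clamped up to -27
```

In this sketch the output always stays between yita2 and yita1, and an out-of-range midpoint behaves as if it had been set to the nearest limit.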

forward(x)[source]

class pyiqa.archs.afine_arch.AFINELearnLambda(k=5)[source]

Bases: torch.nn.Module

Adaptive fusion layer for naturalness and fidelity terms.

forward(x_nr, ref_nr, xref_fr)[source]

class pyiqa.archs.afine_arch.AFINE(model_type='afine_all_scale', clip_backbone='ViT-B/32', step=32, num_patch=15, pretrained=True, pretrained_model_path=None, url_key='afine')[source]

Bases: torch.nn.Module

A-FINE inference model.

Parameters:
  • model_type (str) – Output type. Supported values are "afine_all_scale", "afine_all", "afine_fr", and "afine_nr".

  • clip_backbone (str) – CLIP backbone identifier.

  • step (int) – Kept for compatibility with the original config interface.

  • num_patch (int) – Kept for compatibility with the original config interface.

  • pretrained (bool) – Kept for compatibility. This implementation expects a pretrained checkpoint.

  • pretrained_model_path (str | None) – Local checkpoint path. If None, the checkpoint is downloaded from the default URL selected by url_key.

  • url_key (str) – Key used to resolve the default checkpoint URL.

Example

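>>> import torch
>>> from pyiqa.archs.afine_arch import AFINE
>>> metric = AFINE(model_type='afine_all_scale')
>>> dis = torch.rand(1, 3, 224, 224)  # distorted image; H and W divisible by 32
>>> ref = torch.rand(1, 3, 224, 224)  # reference image; same shape as dis
>>> score = metric(dis, ref)
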
forward(dis, ref=None)[source]

Run A-FINE scoring.

Parameters:
  • dis (torch.Tensor) – Distorted image tensor with shape (N, 3, H, W).

  • ref (torch.Tensor | None) – Optional reference image tensor with the same shape as dis. If None, dis is reused as the reference.

Returns:

Score tensor according to model_type.

Return type:

torch.Tensor

Raises:
  • AssertionError – If image height or width is not divisible by 32.

  • ValueError – If model_type is unsupported.
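
Because forward rejects inputs whose height or width is not divisible by 32, callers typically need to adjust image sizes beforehand. The helper below is a small sketch of one option, computing the largest size that fits inside the image and satisfies the constraint. The name crop_size_to_multiple is hypothetical and not part of pyiqa.

```python
def crop_size_to_multiple(height, width, multiple=32):
    """Largest (height, width) no bigger than the input and divisible by `multiple`."""
    new_h = (height // multiple) * multiple
    new_w = (width // multiple) * multiple
    if new_h == 0 or new_w == 0:
        # The image is smaller than one block in at least one dimension.
        raise ValueError(f"image {height}x{width} is smaller than {multiple} pixels")
    return new_h, new_w

new_h, new_w = crop_size_to_multiple(500, 333)
```

For a tensor img of shape (N, 3, H, W), slicing img[..., :new_h, :new_w] would then keep the top-left region. Center cropping or resizing are alternatives, and each choice can change the resulting score.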

Notes

Lower values indicate better quality for A-FINE terms.
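
Since lower is better, ranking candidate images means sorting their scores in ascending order. A tiny sketch with made-up scores (the file names and numbers are hypothetical, not real A-FINE outputs):

```python
# Hypothetical A-FINE scores for three candidate images (made-up numbers).
scores = {"a.png": 41.2, "b.png": 17.8, "c.png": 63.5}

# Ascending sort puts the best (lowest-scoring) image first.
ranked = sorted(scores, key=scores.get)
best = ranked[0]
```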