pyiqa.archs.afine_arch¶

A-FINE architecture for generalized image quality assessment.

Reference: Chen, D., Wu, T., Ma, K., and Zhang, L. Toward Generalized Image Quality Assessment: Relaxing the Perfect Reference Quality Assumption. CVPR 2025.
Project page: https://github.com/ChrisDud0257/AFINE

This implementation is intended for inference with pretrained checkpoints.

Module Contents¶

pyiqa.archs.afine_arch.default_model_urls[source]¶

pyiqa.archs.afine_arch.scale_finalscore(score, yita1=100, yita2=0, yita3=-1.971, yita4=-2.3734)[source]¶

Map raw A-FINE score to a bounded, human-readable range.

Parameters:

score (torch.Tensor) – Raw score tensor.
yita1 (float) – Upper bound of target range.
yita2 (float) – Lower bound of target range.
yita3 (float) – Logistic midpoint parameter.
yita4 (float) – Logistic scale parameter.

Returns:

Scaled score tensor.

Return type:

torch.Tensor
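This page does not reproduce the mapping formula. As a rough sketch, assuming a standard four-parameter logistic in which yita1 and yita2 are the bounds, yita3 is the midpoint, and the magnitude of yita4 is the scale, a scalar version could look like the following. The name `logistic_rescale`, the use of `abs()` on the scale, and the increasing direction of the curve are illustrative assumptions, not the library's implementation.

```python
import math


def logistic_rescale(score, upper=100.0, lower=0.0, midpoint=-1.971, scale=-2.3734):
    """Hypothetical scalar four-parameter logistic mapping (assumed form)."""
    # Normalise the distance from the midpoint by the scale magnitude.
    z = (score - midpoint) / (abs(scale) + 1e-10)
    # Numerically stable sigmoid: never exponentiate a large positive number.
    if z >= 0:
        s = 1.0 / (1.0 + math.exp(-z))
    else:
        e = math.exp(z)
        s = e / (1.0 + e)
    # Stretch the (0, 1) sigmoid output to the (lower, upper) range.
    return lower + (upper - lower) * s
```

Under these assumptions, a raw score equal to the midpoint maps to the centre of the target range, and extreme raw scores saturate at the bounds.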

class pyiqa.archs.afine_arch.AFINEQhead(chns=(3, 768, 768, 768, 768, 768, 768, 768, 768, 768, 768, 768, 768), feature_out_channel=1, input_dim=768, hidden_dim=128, mean=(0.48145466, 0.4578275, 0.40821073), std=(0.26862954, 0.26130258, 0.27577711))[source]¶

Bases: torch.nn.Module

Naturalness head used by A-FINE.

This head aggregates mean and variance statistics from CLIP feature maps and predicts the no-reference naturalness term.

Parameters:

chns (tuple[int, ...]) – Channel dimensions for input image and feature levels.
feature_out_channel (int) – Number of output channels for the score head.
input_dim (int) – Channel dimension of CLIP feature tokens.
hidden_dim (int) – Hidden width for projection layers.
mean (tuple[float, float, float]) – RGB normalization mean.
std (tuple[float, float, float]) – RGB normalization standard deviation.

forward(x, h_list_x)[source]¶
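To illustrate the kind of mean and variance aggregation described for this head, here is a dependency-free sketch that pools one feature level into per-channel statistics. The helper `channel_stats`, the list-of-lists token layout, and the choice of population variance are assumptions made for illustration; the actual head operates on tensors and its exact pooling is not specified on this page.

```python
from statistics import fmean, pvariance


def channel_stats(tokens):
    """Pool a (num_tokens x channels) feature level into [means..., variances...].

    Hypothetical helper: `tokens` is a list of equal-length token vectors.
    """
    # Transpose so that each entry holds all token values of one channel.
    channels = list(zip(*tokens))
    means = [fmean(c) for c in channels]
    # Population variance is an assumption; a sample variance would also be plausible.
    variances = [pvariance(c) for c in channels]
    return means + variances
```

For a level with C channels this yields a vector of length 2C, which a small projection network could then map to a scalar naturalness term.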

class pyiqa.archs.afine_arch.AFINEDhead(chns=(3, 768, 768, 768, 768, 768, 768, 768, 768, 768, 768, 768, 768), mean=(0.48145466, 0.4578275, 0.40821073), std=(0.26862954, 0.26130258, 0.27577711))[source]¶

Bases: torch.nn.Module

Fidelity head used by A-FINE.

The module computes similarity statistics between distorted and reference features and outputs a full-reference fidelity term.

Parameters:

chns (tuple[int, ...]) – Channel dimensions for each feature level.
mean (tuple[float, float, float]) – RGB normalization mean.
std (tuple[float, float, float]) – RGB normalization standard deviation.

forward(x, y, h_list_x, h_list_y)[source]¶
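The similarity statistics are not spelled out on this page. One common choice for feature-level comparison, assumed here purely for illustration, is a pair of SSIM-style terms: one comparing channel means and one comparing variances through the covariance. The function `similarity_stats` and the stabilising constants `c1` and `c2` are hypothetical.

```python
from statistics import fmean


def similarity_stats(x, y, c1=1e-6, c2=1e-6):
    """SSIM-style similarity of two equal-length activation lists (assumed form)."""
    mu_x, mu_y = fmean(x), fmean(y)
    # Population variances and covariance of the two channels.
    var_x = fmean([(a - mu_x) ** 2 for a in x])
    var_y = fmean([(b - mu_y) ** 2 for b in y])
    cov_xy = fmean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])
    # Mean similarity: 1.0 when both channels share the same mean.
    mean_sim = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    # Structure similarity: 1.0 for identical channels, negative when anti-correlated.
    struct_sim = (2 * cov_xy + c2) / (var_x + var_y + c2)
    return mean_sim, struct_sim
```

Identical inputs give a similarity of 1 on both terms, so a fidelity score built from them would treat a perfect match as the best case.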

class pyiqa.archs.afine_arch.AFINENLM_NR_Fit(yita1=2, yita2=-2, yita3=3.7833, yita4=7.5676)[source]¶

Bases: torch.nn.Module

Nonlinear calibration layer for the naturalness branch.

forward(x)[source]¶

class pyiqa.archs.afine_arch.AFINENLM_FR_Fit_with_limit(yita1=2, yita2=-2, yita3=-24.1335, yita4=8.1093, yita3_upper=-21, yita3_lower=-27, yita4_upper=9, yita4_lower=7)[source]¶

Bases: torch.nn.Module

Bounded nonlinear calibration layer for the fidelity branch.

forward(x)[source]¶
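The signature suggests that yita3 and yita4 are kept within [yita3_lower, yita3_upper] and [yita4_lower, yita4_upper]. A minimal scalar sketch of that idea follows, assuming the limits are enforced by clamping and that the calibration is a logistic curve between yita2 and yita1. Both points are assumptions, and `bounded_logistic` is a hypothetical name.

```python
import math


def clamp(value, lower, upper):
    """Restrict value to the closed interval [lower, upper]."""
    return max(lower, min(upper, value))


def bounded_logistic(x, yita1=2.0, yita2=-2.0, yita3=-24.1335, yita4=8.1093,
                     yita3_bounds=(-27.0, -21.0), yita4_bounds=(7.0, 9.0)):
    """Logistic calibration whose midpoint and scale are clamped (assumed form)."""
    mid = clamp(yita3, *yita3_bounds)
    scale = clamp(yita4, *yita4_bounds)
    z = (x - mid) / scale
    # sigmoid(z) written via tanh, which is stable for large |z|.
    s = 0.5 * (1.0 + math.tanh(0.5 * z))
    return yita2 + (yita1 - yita2) * s
```

Clamping keeps the curve in a plausible regime even if a parameter drifts outside its limits, for example `bounded_logistic(-27.0, yita3=-100.0)` behaves as if the midpoint were -27.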

class pyiqa.archs.afine_arch.AFINELearnLambda(k=5)[source]¶

Bases: torch.nn.Module

Adaptive fusion layer for naturalness and fidelity terms.

forward(x_nr, ref_nr, xref_fr)[source]¶
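The fusion rule itself is not given on this page. The sketch below shows one plausible form, assumed only for illustration: the naturalness term of the test image is weighted by an exponential of the naturalness gap between reference and test image, then added to the fidelity term. The function `adaptive_fusion` and this specific formula are hypothetical and may differ from what `AFINELearnLambda` computes.

```python
import math


def adaptive_fusion(x_nr, ref_nr, xref_fr, k=5.0):
    """Combine naturalness and fidelity terms with an adaptive weight (assumed form).

    x_nr:    naturalness term of the test image (lower is better)
    ref_nr:  naturalness term of the reference image
    xref_fr: fidelity term between test and reference
    """
    # Weight shrinks towards 0 when the reference is far more natural than the
    # test image, so the result falls back to the fidelity term alone.
    weight = math.exp(k * (ref_nr - x_nr))
    return xref_fr + weight * x_nr
```

With this form, a high-quality reference makes the score behave like a conventional full-reference metric, while a reference that is no better than the test image lets the naturalness term contribute.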

class pyiqa.archs.afine_arch.AFINE(model_type='afine_all_scale', clip_backbone='ViT-B/32', step=32, num_patch=15, pretrained=True, pretrained_model_path=None, url_key='afine')[source]¶

Bases: torch.nn.Module

A-FINE inference model.

Parameters:

model_type (str) – Output type. Supported values are "afine_all_scale", "afine_all", "afine_fr", and "afine_nr".
clip_backbone (str) – CLIP backbone identifier.
step (int) – Kept for compatibility with original config interface.
num_patch (int) – Kept for compatibility with original config interface.
pretrained (bool) – Kept for compatibility. This implementation expects a pretrained checkpoint.
pretrained_model_path (str | None) – Local checkpoint path. If None, the checkpoint is downloaded from the default URL selected by url_key.
url_key (str) – Key used to resolve default checkpoint URL.

Example

>>> import torch
>>> from pyiqa.archs.afine_arch import AFINE
>>> metric = AFINE(model_type='afine_all_scale')
>>> dis = torch.rand(1, 3, 224, 224)
>>> ref = torch.rand(1, 3, 224, 224)
>>> score = metric(dis, ref)

forward(dis, ref=None)[source]¶

Run A-FINE scoring.

Parameters:

dis (torch.Tensor) – Distorted image tensor with shape (N, 3, H, W).
ref (torch.Tensor | None) – Optional reference image tensor with the same shape as dis. If None, dis is used as its own reference.

Returns:

Score tensor according to model_type.

Return type:

torch.Tensor

Raises:

AssertionError – If image height or width is not divisible by 32.
ValueError – If model_type is unsupported.
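Because inputs whose height or width is not a multiple of 32 trigger the AssertionError, callers may want to crop images beforehand. The helper below is a hypothetical sketch, not part of pyiqa, that computes a center-crop window with both sides rounded down to a multiple of 32.

```python
def crop_window_to_multiple(height, width, multiple=32):
    """Return (top, left, new_h, new_w) of a center crop with sides divisible by `multiple`.

    Hypothetical pre-processing helper, not part of the pyiqa API.
    """
    # Round each side down to the nearest multiple.
    new_h = (height // multiple) * multiple
    new_w = (width // multiple) * multiple
    if new_h == 0 or new_w == 0:
        raise ValueError(f"Image {height}x{width} is smaller than {multiple} pixels on one side.")
    # Center the window inside the original image.
    top = (height - new_h) // 2
    left = (width - new_w) // 2
    return top, left, new_h, new_w
```

The same window should be applied to both tensors so they stay aligned, for example `dis[..., top:top + new_h, left:left + new_w]` and likewise for `ref`.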

Notes

Lower values indicate better quality for A-FINE terms.