pyiqa.archs.maclip_arch
=======================

.. py:module:: pyiqa.archs.maclip_arch

.. autoapi-nested-parse::

   Beyond Cosine Similarity: Magnitude-Aware CLIP for No-Reference Image Quality Assessment

   @article{liao2025beyond,
   title={Beyond Cosine Similarity Magnitude-Aware CLIP for No-Reference Image Quality Assessment},
   author={Liao, Zhicheng and Wu, Dongxu and Shi, Zhenshan and Mai, Sijie and Zhu, Hanwei and Zhu, Lingyu and Jiang, Yuncheng and Chen, Baoliang},
   journal={arXiv preprint arXiv:2511.09948},
   year={2025}
   }

   Accepted by AAAI 2026.

   Reference:

   - Arxiv link: https://arxiv.org/abs/2511.09948
   - Official Github: https://github.com/zhix000/MA-CLIP


Module Contents
---------------

.. py:class:: CustomCLIP(backbone: str, device='cpu')

   Bases: :py:obj:`torch.nn.Module`

   Thin wrapper around CLIP image/text encoders used by MACLIP.

   :param backbone: CLIP backbone identifier.
   :type backbone: str
   :param device: Device string used when initializing the model.
   :type device: str

   .. py:method:: forward(image, text, pos_embedding=False, text_features=None)

      Encode image/text and return logits and unnormalized image features.

      :param image: Image tensor with shape ``(N, 3, H, W)``.
      :type image: torch.Tensor
      :param text: Tokenized text tensor.
      :type text: torch.Tensor
      :param pos_embedding: Whether to enable positional embedding branch in
                            the custom CLIP visual encoder.
      :type pos_embedding: bool
      :param text_features: Optional precomputed text features.
      :type text_features: torch.Tensor | None

      :returns: ``(logits_per_image, logits_per_text, image_features_org)``.
      :rtype: tuple[torch.Tensor, torch.Tensor, torch.Tensor]


.. py:class:: MACLIP(model_type='clipiqa', backbone='RN50', pos_embedding=False)

   Bases: :py:obj:`torch.nn.Module`

   Magnitude-Aware CLIP for no-reference image quality assessment.

   :param model_type: Output type identifier.
   :type model_type: str
   :param backbone: CLIP backbone name.
   :type backbone: str
   :param pos_embedding: Whether to enable visual positional embedding in
                         CLIP image encoding.
   :type pos_embedding: bool

   .. rubric:: Notes

   The current implementation runs on CUDA and is intended for inference.

   .. py:method:: preprocess(img)

      Normalize image and build overlapping 224x224 patch set.

      :param img: Input tensor with shape ``(1, 3, H, W)``.
      :type img: torch.Tensor

      :returns: Patch tensor with shape ``(P, 3, 224, 224)``.
      :rtype: torch.Tensor

   .. py:method:: box_cox(x, lam=0.5, epsilon=1e-06)

      Apply Box-Cox-like transform after per-sample standardization.

   .. py:method:: fusion(cos, norm, base_cos=1.0, base_norm=0.6, alpha=1.0)

      Fuse cosine and magnitude cues with adaptive softmax weighting.

      :param cos: Cosine-similarity based quality scores.
      :type cos: torch.Tensor
      :param norm: Magnitude-cue scores.
      :type norm: torch.Tensor
      :param base_cos: Base weight prior for cosine cue.
      :type base_cos: float
      :param base_norm: Base weight prior for magnitude cue.
      :type base_norm: float
      :param alpha: Adaptive weight adjustment factor.
      :type alpha: float

      :returns: Fused score, cosine weight, and magnitude weight.
      :rtype: tuple[torch.Tensor, torch.Tensor, torch.Tensor]

   .. py:method:: forward(x, box_lam=0.5, base_cos=1.0, base_norm=0.6, alpha=1.0)

      Compute MACLIP score.

      :param x: Input image tensor with shape ``(1, 3, H, W)``.
      :type x: torch.Tensor
      :param box_lam: Lambda for Box-Cox transform.
      :type box_lam: float
      :param base_cos: Base weight for cosine cue.
      :type base_cos: float
      :param base_norm: Base weight for magnitude cue.
      :type base_norm: float
      :param alpha: Adaptive fusion factor.
      :type alpha: float

      :returns: Scalar quality score.
      :rtype: torch.Tensor
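
The ``box_cox`` and ``fusion`` entries above give only one-line summaries, so the following plain-Python sketch may help picture what a Box-Cox-like transform and a softmax-weighted fusion could look like. It is not the pyiqa implementation. The function names ``box_cox_like`` and ``fuse`` are hypothetical, as are the choice to transform absolute standardized values and the rule that each softmax logit is the base prior plus ``alpha`` times the cue. Only the parameter names and defaults are taken from the signatures documented here.

```python
import math
from statistics import fmean, pstdev


def box_cox_like(values, lam=0.5, epsilon=1e-6):
    """Standardize one sample, then apply a Box-Cox-style power transform.

    Assumption: the magnitude of each standardized value (plus epsilon) is
    transformed, because Box-Cox needs strictly positive input.
    """
    mean = fmean(values)
    std = pstdev(values) + epsilon  # epsilon avoids division by zero
    out = []
    for v in values:
        z = abs((v - mean) / std) + epsilon
        # Standard Box-Cox: log(z) when lam == 0, else (z**lam - 1) / lam.
        out.append(math.log(z) if lam == 0 else (z ** lam - 1.0) / lam)
    return out


def fuse(cos, norm, base_cos=1.0, base_norm=0.6, alpha=1.0):
    """Softmax-weighted fusion of a cosine cue and a magnitude cue.

    Assumption: each logit is the base prior plus alpha times the cue value.
    """
    logit_cos = base_cos + alpha * cos
    logit_norm = base_norm + alpha * norm
    peak = max(logit_cos, logit_norm)  # subtract the max for stability
    e_cos = math.exp(logit_cos - peak)
    e_norm = math.exp(logit_norm - peak)
    w_cos = e_cos / (e_cos + e_norm)
    w_norm = e_norm / (e_cos + e_norm)
    # Return the same triple shape as the documented method:
    # fused score, cosine weight, magnitude weight.
    return w_cos * cos + w_norm * norm, w_cos, w_norm


transformed = box_cox_like([1.0, 2.0, 3.0])
score, w_cos, w_norm = fuse(cos=0.7, norm=0.4)
```

Under these assumptions the two weights always sum to one, so the fused score stays between the two cue values, and a larger ``alpha`` shifts more weight toward whichever cue is currently stronger.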