pyiqa.archs.qalign_arch
=======================

.. py:module:: pyiqa.archs.qalign_arch

.. autoapi-nested-parse::

   Q-Align: All-in-one Foundation Model for visual scoring.

   Reference::

      @article{wu2023qalign,
        title={Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels},
        author={Wu, Haoning and Zhang, Zicheng and Zhang, Weixia and Chen, Chaofeng and Li, Chunyi and Liao, Liang and Wang, Annan and Zhang, Erli and Sun, Wenxiu and Yan, Qiong and Min, Xiongkuo and Zhai, Guangtao and Lin, Weisi},
        journal={arXiv preprint arXiv:2312.17090},
        year={2023},
        institution={Nanyang Technological University and Shanghai Jiao Tong University and SenseTime Research},
        note={Equal Contribution by Wu, Haoning and Zhang, Zicheng. Project Lead by Wu, Haoning. Corresponding Authors: Zhai, Guangtao and Lin, Weisi.}
      }

   Reference url: https://github.com/Q-Future/Q-Align


Module Contents
---------------

.. py:function:: expand2square(pil_img)

   Pad an image to a square canvas, filling the added border with the CLIP mean color.

   :param pil_img: Input image.
   :type pil_img: PIL.Image.Image

   :returns: Square padded image.
   :rtype: PIL.Image.Image
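
As a rough illustration of the padding that ``expand2square`` describes, the sketch below places an image on a square canvas filled with a CLIP-mean color. It is not the library's code: the names ``BACKGROUND``, ``square_layout`` and ``pad_to_square`` are hypothetical, centering the image and converting it to RGB are assumptions, and the mean values are the commonly published OpenAI CLIP normalization constants, so the actual ``pyiqa`` implementation may differ.

```python
# Assumption: OpenAI CLIP per-channel mean in [0, 1], scaled to 8-bit RGB.
CLIP_MEAN = (0.48145466, 0.4578275, 0.40821073)
BACKGROUND = tuple(int(c * 255) for c in CLIP_MEAN)


def square_layout(width, height):
    """Return the canvas side and the (left, top) offset that centers the image."""
    side = max(width, height)
    return side, ((side - width) // 2, (side - height) // 2)


def pad_to_square(pil_img):
    """Hypothetical stand-in for expand2square: pad a PIL image to a square."""
    from PIL import Image  # Pillow is only needed when this function is called

    pil_img = pil_img.convert("RGB")  # assumption: work in RGB so the fill color applies
    width, height = pil_img.size
    if width == height:
        return pil_img  # already square, nothing to pad
    side, offset = square_layout(width, height)
    canvas = Image.new("RGB", (side, side), BACKGROUND)
    canvas.paste(pil_img, offset)
    return canvas
```

Padding, rather than stretching, keeps the aspect ratio of the content intact before the image is resized for the vision encoder.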


.. py:class:: QAlign(dtype='fp16')

   Bases: :py:obj:`torch.nn.Module`


   Q-Align multimodal visual scoring model.

   :param dtype: Inference precision mode. Supported values are
                 ``'fp16'``, ``'4bit'``, and ``'8bit'``.
   :type dtype: str

   .. rubric:: Notes

   The current preprocessing path supports only a batch size of ``1``.


   .. py:method:: preprocess(x)

      Convert an input image tensor into the tensor produced by the Q-Align CLIP image processor.

      :param x: Input image tensor with shape ``(1, 3, H, W)``.
      :type x: torch.Tensor

      :returns: Processed image tensor suitable for Q-Align.
      :rtype: torch.Tensor

      :raises AssertionError: If batch size is not ``1``.


   .. py:method:: forward(x, task_='quality', input_='image')

      Run Q-Align scoring.

      :param x: Input tensor with shape ``(1, 3, H, W)``.
      :type x: torch.Tensor
      :param task_: Task prompt. Common options are ``'quality'`` and
                    ``'aesthetic'``.
      :type task_: str
      :param input_: Input type. Currently only ``'image'`` is supported.
      :type input_: str

      :returns: Predicted task score.
      :rtype: torch.Tensor

      :raises NotImplementedError: If ``input_`` is not ``'image'``.
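
Because preprocessing accepts only a batch size of ``1``, a larger batch has to be scored one image at a time. The helper below is a minimal sketch of that loop. The name ``score_one_by_one`` is hypothetical and not part of ``pyiqa``. It assumes only that the batch supports ``len()`` and slicing along its first dimension, as a ``torch.Tensor`` of shape ``(N, 3, H, W)`` does.

```python
def score_one_by_one(scorer, batch):
    """Call `scorer` on each element of `batch` as a batch of size 1."""
    scores = []
    for i in range(len(batch)):
        single = batch[i:i + 1]  # slicing keeps the leading batch dimension
        scores.append(scorer(single))
    return scores


# Hypothetical usage, assuming torch and pyiqa are installed, the model
# weights can be loaded, and `images` is a tensor of shape (N, 3, H, W):
#
#   model = QAlign(dtype='fp16')
#   scores = score_one_by_one(lambda x: model(x, task_='quality'), images)
```

The helper returns a plain list with one entry per image and leaves any stacking or averaging of the results to the caller.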