pyiqa.archs.maniqa_swin

Module Contents

class pyiqa.archs.maniqa_swin.Mlp(in_features, hidden_features=None, out_features=None, act_layer=nn.GELU, drop=0.0)[source]

Bases: torch.nn.Module

forward(x)[source]

pyiqa.archs.maniqa_swin.window_partition(x, window_size)[source]
Parameters:
  • x – (B, H, W, C)

  • window_size (int) – window size

Returns:
  windows – (num_windows*B, window_size, window_size, C)
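The shape contract above can be illustrated with a minimal pure-Python sketch. It is not the library code: it assumes nested lists shaped (B, H, W) in place of a tensor (each entry may itself be a length-C channel vector), and the divisibility check is an addition made here for clarity. Windows are emitted batch first, then row by row, then column by column.

```python
def window_partition(x, window_size):
    """Split nested lists shaped (B, H, W) into window_size x window_size tiles.

    Illustrative stand-in for the tensor version; returns a flat list of
    B * (H // window_size) * (W // window_size) windows.
    """
    windows = []
    for image in x:
        height, width = len(image), len(image[0])
        # Assumption of this sketch: reject sizes that do not tile evenly.
        if height % window_size or width % window_size:
            raise ValueError("H and W must be divisible by window_size")
        for top in range(0, height, window_size):
            for left in range(0, width, window_size):
                windows.append([row[left:left + window_size]
                                for row in image[top:top + window_size]])
    return windows


# One 4x4 image with entries 0..15, split into four 2x2 windows.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
windows = window_partition([image], window_size=2)
print(len(windows))  # 4
print(windows[0])    # [[0, 1], [4, 5]]
print(windows[3])    # [[10, 11], [14, 15]]
```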

pyiqa.archs.maniqa_swin.window_reverse(windows, window_size, H, W)[source]
Parameters:
  • windows – (num_windows*B, window_size, window_size, C)

  • window_size (int) – Window size

  • H (int) – Height of image

  • W (int) – Width of image

Returns:
  x – (B, H, W, C)
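As a rough illustration of the inverse mapping, the sketch below stitches a flat list of windows back into (B, H, W) nested lists. It is a pure-Python stand-in, not the library implementation, and it assumes the windows arrive in batch-major, row-major order.

```python
def window_reverse(windows, window_size, H, W):
    """Reassemble window_size x window_size tiles into nested lists (B, H, W)."""
    per_row = W // window_size                  # windows per image row
    per_image = (H // window_size) * per_row    # windows per image
    batch = len(windows) // per_image
    x = []
    for b in range(batch):
        image = [[None] * W for _ in range(H)]
        for k in range(per_image):
            top = (k // per_row) * window_size
            left = (k % per_row) * window_size
            window = windows[b * per_image + k]
            for i in range(window_size):
                for j in range(window_size):
                    image[top + i][left + j] = window[i][j]
        x.append(image)
    return x


# Two 2x2 windows that tile a single 2x4 image.
windows = [[[0, 1], [4, 5]], [[2, 3], [6, 7]]]
x = window_reverse(windows, window_size=2, H=2, W=4)
print(x)  # [[[0, 1, 2, 3], [4, 5, 6, 7]]]
```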

class pyiqa.archs.maniqa_swin.WindowAttention(dim, window_size, num_heads, qkv_bias=True, qk_scale=None, attn_drop=0.0, proj_drop=0.0)[source]

Bases: torch.nn.Module

Window-based multi-head self-attention (W-MSA) module with relative position bias. It supports both shifted and non-shifted windows.

Parameters:
  • dim (int) – Number of input channels.

  • window_size (tuple[int]) – The height and width of the window.

  • num_heads (int) – Number of attention heads.

  • qkv_bias (bool, optional) – If True, add a learnable bias to query, key, value. Default: True

  • qk_scale (float | None, optional) – If set, overrides the default qk scale of head_dim ** -0.5. Default: None

  • attn_drop (float, optional) – Dropout ratio of attention weight. Default: 0.0

  • proj_drop (float, optional) – Dropout ratio of output. Default: 0.0
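The relative position bias is typically looked up through an index table built once per window size. The sketch below follows the scheme used by the reference Swin Transformer; that this module uses the same indexing is an assumption, and `relative_position_index` is a hypothetical helper name, not part of the API.

```python
def relative_position_index(window_size):
    """Map each (query, key) token pair in a window to a bias-table row.

    Assumes the reference Swin layout: the table has
    (2*Wh - 1) * (2*Ww - 1) rows, one per distinct (dh, dw) offset.
    """
    wh, ww = window_size
    coords = [(h, w) for h in range(wh) for w in range(ww)]
    index = []
    for hi, wi in coords:
        row = []
        for hj, wj in coords:
            dh = hi - hj + wh - 1   # shift offsets so they start at 0
            dw = wi - wj + ww - 1
            row.append(dh * (2 * ww - 1) + dw)
        index.append(row)
    return index


index = relative_position_index((2, 2))
table_rows = (2 * 2 - 1) * (2 * 2 - 1)
print(table_rows)  # 9
print(index[0])    # [4, 3, 1, 0]
```

Every diagonal entry maps to the same row (offset zero), which is why tokens share a single learned bias for attending to themselves.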

forward(x, mask=None)[source]
Parameters:
  • x – input features with shape of (num_windows*B, N, C)

  • mask – (0/-inf) mask with shape of (num_windows, Wh*Ww, Wh*Ww) or None
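To show how a 0/-inf mask affects attention, here is a minimal pure-Python sketch for a single window and head. It is not the module's code: `masked_softmax` is a hypothetical helper, the scores are made up, and the scaling line only mirrors the documented default of head_dim ** -0.5.

```python
import math

NEG_INF = float("-inf")


def masked_softmax(scores, mask=None):
    """Row-wise softmax over (N, N) scores after adding an optional mask.

    Entries masked with -inf receive exactly zero weight. Each row must
    keep at least one finite entry (the diagonal is never masked here).
    """
    out = []
    for i, row in enumerate(scores):
        if mask is not None:
            row = [s + m for s, m in zip(row, mask[i])]
        peak = max(row)                       # subtract max for stability
        exps = [math.exp(s - peak) for s in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


head_dim = 4
scale = head_dim ** -0.5                      # default when qk_scale is None
raw = [[2.0, 2.0, 2.0] for _ in range(3)]     # made-up q.k dot products
scores = [[s * scale for s in row] for row in raw]

# Tokens 0 and 1 belong to one region, token 2 to another.
mask = [[0, 0, NEG_INF], [0, 0, NEG_INF], [NEG_INF, NEG_INF, 0]]
attn = masked_softmax(scores, mask)
print(attn[0])  # [0.5, 0.5, 0.0]
print(attn[2])  # [0.0, 0.0, 1.0]
```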

extra_repr() → str[source]

flops(N)[source]

class pyiqa.archs.maniqa_swin.SwinBlock(dim, input_resolution, num_heads, window_size=7, shift_size=0, dim_mlp=1024.0, qkv_bias=True, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=nn.GELU, norm_layer=nn.LayerNorm)[source]

Bases: torch.nn.Module

Swin Transformer Block.

Parameters:
  • dim (int) – Number of input channels.

  • input_resolution (tuple[int]) – Input resolution.

  • num_heads (int) – Number of attention heads.

  • window_size (int) – Window size.

  • shift_size (int) – Shift size for SW-MSA.

  • dim_mlp (float) – Hidden dimension of the MLP (listed as mlp_ratio in the upstream docstring). Default: 1024.0

  • qkv_bias (bool, optional) – If True, add a learnable bias to query, key, value. Default: True

  • qk_scale (float | None, optional) – If set, overrides the default qk scale of head_dim ** -0.5. Default: None

  • drop (float, optional) – Dropout rate. Default: 0.0

  • attn_drop (float, optional) – Attention dropout rate. Default: 0.0

  • drop_path (float, optional) – Stochastic depth rate. Default: 0.0

  • act_layer (nn.Module, optional) – Activation layer. Default: nn.GELU

  • norm_layer (nn.Module, optional) – Normalization layer. Default: nn.LayerNorm
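For a nonzero shift_size, shifted-window attention (SW-MSA) in the reference Swin Transformer rolls the feature map before partitioning it into windows and rolls it back afterwards. The sketch below imitates that cyclic shift on a plain 2D list; it assumes this block follows the same scheme, and `cyclic_shift` is a hypothetical helper, not part of the API.

```python
def cyclic_shift(grid, shift_size):
    """Roll a 2D grid up and left by shift_size with wrap-around.

    Mirrors rolling both spatial axes by -shift_size; a negative
    shift_size rolls down and right, which undoes the shift.
    """
    s = shift_size % len(grid)
    rows = grid[s:] + grid[:s]                # roll along the height axis
    out = []
    for row in rows:
        t = shift_size % len(row)
        out.append(row[t:] + row[:t])         # roll along the width axis
    return out


grid = [[r * 4 + c for c in range(4)] for r in range(4)]
shifted = cyclic_shift(grid, 1)
print(shifted[0])                        # [5, 6, 7, 4]
print(cyclic_shift(shifted, -1) == grid)  # True
```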

forward(x)[source]

class pyiqa.archs.maniqa_swin.BasicLayer(dim, input_resolution, depth, num_heads, window_size=7, dim_mlp=1024, qkv_bias=True, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, norm_layer=nn.LayerNorm, downsample=None, use_checkpoint=False)[source]

Bases: torch.nn.Module

A basic Swin Transformer layer for one stage.

Parameters:
  • dim (int) – Number of input channels.

  • input_resolution (tuple[int]) – Input resolution.

  • depth (int) – Number of blocks.

  • num_heads (int) – Number of attention heads.

  • window_size (int) – Local window size.

  • dim_mlp (float) – Hidden dimension of the MLP (listed as mlp_ratio in the upstream docstring). Default: 1024

  • qkv_bias (bool, optional) – If True, add a learnable bias to query, key, value. Default: True

  • qk_scale (float | None, optional) – If set, overrides the default qk scale of head_dim ** -0.5. Default: None

  • drop (float, optional) – Dropout rate. Default: 0.0

  • attn_drop (float, optional) – Attention dropout rate. Default: 0.0

  • drop_path (float | tuple[float], optional) – Stochastic depth rate. Default: 0.0

  • norm_layer (nn.Module, optional) – Normalization layer. Default: nn.LayerNorm

  • downsample (nn.Module | None, optional) – Downsample layer at the end of the layer. Default: None

  • use_checkpoint (bool) – Whether to use checkpointing to save memory. Default: False.
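A stage is a stack of depth blocks, and drop_path accepts either one rate or one rate per block. The sketch below shows how per-block settings could be derived. It assumes the reference Swin convention of alternating unshifted and shifted blocks (shift_size of 0, then window_size // 2); `block_configs` is a hypothetical helper and not part of this module.

```python
def block_configs(depth, window_size=7, drop_path=0.0):
    """Return one settings dict per block in a stage.

    Assumption: even-indexed blocks use regular windows (shift 0) and
    odd-indexed blocks use shifted windows (window_size // 2).
    """
    configs = []
    for i in range(depth):
        # A sequence supplies one rate per block; a float is shared.
        if isinstance(drop_path, (list, tuple)):
            rate = drop_path[i]
        else:
            rate = drop_path
        configs.append({
            "shift_size": 0 if i % 2 == 0 else window_size // 2,
            "drop_path": rate,
        })
    return configs


for cfg in block_configs(depth=2, window_size=7, drop_path=(0.0, 0.1)):
    print(cfg)
# {'shift_size': 0, 'drop_path': 0.0}
# {'shift_size': 3, 'drop_path': 0.1}
```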

forward(x)[source]
extra_repr() str[source]
flops()[source]
class pyiqa.archs.maniqa_swin.SwinTransformer(patches_resolution, depths=[2, 2, 6, 2], num_heads=[3, 6, 12, 24], embed_dim=256, drop=0.1, drop_rate=0.0, drop_path_rate=0.1, dropout=0.0, window_size=7, dim_mlp=1024, qkv_bias=True, qk_scale=None, attn_drop_rate=0.0, norm_layer=nn.LayerNorm, downsample=None, use_checkpoint=False, scale=0.8, **kwargs)[source]

Bases: torch.nn.Module

forward(x)[source]
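In the reference Swin Transformer, drop_path_rate is the maximum stochastic depth rate: rates increase linearly from 0 across all blocks and are then sliced per stage according to depths. The sketch below reproduces that schedule in plain Python; whether this class applies exactly the same rule is an assumption, and `drop_path_schedule` is a hypothetical helper.

```python
def drop_path_schedule(depths, drop_path_rate):
    """Spread stochastic depth rates linearly over all blocks, per stage.

    Assumption: block 0 gets rate 0 and the last block gets
    drop_path_rate, as in the reference Swin implementation.
    """
    total = sum(depths)
    if total == 1:
        rates = [0.0]
    else:
        rates = [drop_path_rate * i / (total - 1) for i in range(total)]
    stages, start = [], 0
    for depth in depths:
        stages.append(rates[start:start + depth])
        start += depth
    return stages


stages = drop_path_schedule(depths=[2, 2, 6, 2], drop_path_rate=0.1)
for i, rates in enumerate(stages):
    print(i, [round(r, 3) for r in rates])
```

Each inner list has one rate per block, which is the per-block form that a stage's drop_path argument accepts.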