pyiqa.archs.q_align.modeling_attn_mask_utils
============================================

.. py:module:: pyiqa.archs.q_align.modeling_attn_mask_utils

Module Contents
---------------

.. py:class:: AttentionMaskConverter(is_causal: bool, sliding_window: Optional[int] = None)

   A utility attention mask class that allows one to:

   - Create a causal 4D mask
   - Create a causal 4D mask with a sliding window
   - Convert a 2D attention mask (batch_size, query_length) to a 4D attention mask
     (batch_size, 1, query_length, key_value_length) that can be added to attention scores

   :param is_causal: Whether the attention mask should be a uni-directional (causal) or bi-directional mask.
   :type is_causal: `bool`
   :param sliding_window: Optional. If set to a positive integer, sliding window masks are created.
   :type sliding_window: `int`, *optional*

   .. py:method:: to_causal_4d(batch_size: int, query_length: int, key_value_length: int, dtype: torch.dtype = torch.float32, device: Union[torch.device, str] = 'cpu') -> torch.Tensor

      Creates a causal 4D mask of shape (bsz, head_dim=1, query_length, key_value_length) and adds
      a large negative bias to the upper right-hand triangle of the matrix (the causal mask).

   .. py:method:: to_4d(attention_mask_2d: torch.Tensor, query_length: int, key_value_length: Optional[int] = None, dtype: torch.dtype = torch.float32) -> torch.Tensor

      Converts a 2D attention mask to a 4D attention mask by expanding it to shape
      (bsz, head_dim=1, query_length, key_value_length) and adding a large negative bias to
      positions that are not attended to. If ``is_causal`` is true, a causal mask is applied as well.
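
The masking rule described for ``to_causal_4d`` and ``to_4d`` can be sketched in plain Python. This is a rough illustration under stated assumptions, not the module's implementation. Nested lists stand in for ``torch`` tensors, and the names ``NEG``, ``sketch_causal_mask`` and ``sketch_to_4d`` are hypothetical. The value used for the large negative bias, the sliding window convention (keep the most recent ``sliding_window`` keys, including the current one), and the choice of one 2D mask entry per key position are all assumptions made for the example.

```python
# Illustrative sketch only: nested Python lists stand in for torch tensors,
# and every name below is hypothetical (not part of pyiqa).
NEG = -1e9  # assumed stand-in for the "large negative bias"


def sketch_causal_mask(query_length, key_value_length, sliding_window=None):
    """Build a (query_length, key_value_length) additive causal mask."""
    # Keys that precede the first query (for example, cached positions).
    past = key_value_length - query_length
    mask = []
    for i in range(query_length):
        q_pos = past + i  # absolute position of query i
        row = []
        for j in range(key_value_length):
            is_future = j > q_pos
            # Assumed window convention: keep the last `sliding_window` keys.
            is_too_old = sliding_window is not None and j <= q_pos - sliding_window
            row.append(NEG if (is_future or is_too_old) else 0.0)
        mask.append(row)
    return mask


def sketch_to_4d(attention_mask_2d, query_length, key_value_length,
                 is_causal=True, sliding_window=None):
    """Expand a 2D keep-mask (1 = attend, 0 = ignore) to (bsz, 1, q_len, kv_len)."""
    if is_causal:
        base = sketch_causal_mask(query_length, key_value_length, sliding_window)
    else:
        base = [[0.0] * key_value_length for _ in range(query_length)]
    out = []
    # Assumption: each row of the 2D mask has one entry per key position.
    for keep in attention_mask_2d:
        per_item = [
            [NEG if keep[j] == 0 else base[i][j] for j in range(key_value_length)]
            for i in range(query_length)
        ]
        out.append([per_item])  # singleton dimension, broadcast over heads
    return out


# One sequence of length 3 whose first position is padding.
mask_4d = sketch_to_4d([[0, 1, 1]], query_length=3, key_value_length=3)
for row in mask_4d[0][0]:
    print(row)
```

Because the result is additive, adding it to the raw attention scores before the softmax drives the masked positions toward zero weight, while entries equal to ``0.0`` leave the scores unchanged.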