textformer.models.layers

Pre-defined layers.

A package containing custom layers for all common textformer modules.

class textformer.models.layers.Attention(n_hidden_enc, n_hidden_dec)

Bases: torch.nn.Module

An Attention class provides an attention-based mechanism for a neural network layer.

References

  1. D. Bahdanau, K. Cho, Y. Bengio. Neural machine translation by jointly learning to align and translate. Preprint arXiv:1409.0473 (2014).

__init__(self, n_hidden_enc, n_hidden_dec)

Initialization method.

Parameters
  • n_hidden_enc (int) – Number of hidden units in the Encoder.

  • n_hidden_dec (int) – Number of hidden units in the Decoder.

forward(self, o, h)

Performs a forward pass over the layer.

Parameters
  • o (torch.Tensor) – Tensor containing the encoded outputs.

  • h (torch.Tensor) – Tensor containing the hidden states.

Returns

The attention-based weights.

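The following is a minimal sketch of how such an additive (Bahdanau-style) attention layer might be implemented and called with PyTorch; the class name AdditiveAttention, the tensor shapes, and the internals are illustrative assumptions rather than the library's exact code.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class AdditiveAttention(nn.Module):
        """Illustrative Bahdanau-style attention; not the library's exact implementation."""

        def __init__(self, n_hidden_enc, n_hidden_dec):
            super().__init__()

            # Projects concatenated encoder outputs and decoder state into an energy space
            self.energy = nn.Linear(n_hidden_enc + n_hidden_dec, n_hidden_dec)

            # Scores each source position with a single value
            self.score = nn.Linear(n_hidden_dec, 1, bias=False)

        def forward(self, o, h):
            # o: encoder outputs, shape (batch, src_len, n_hidden_enc)
            # h: current decoder hidden state, shape (batch, n_hidden_dec)
            src_len = o.size(1)

            # Repeat the decoder state across every source position
            h = h.unsqueeze(1).repeat(1, src_len, 1)

            # Energy and per-position scores
            e = torch.tanh(self.energy(torch.cat((o, h), dim=-1)))
            scores = self.score(e).squeeze(-1)

            # Normalized attention weights over the source sequence
            return F.softmax(scores, dim=-1)


    # Usage: attention weights over a batch of encoded sequences
    attention = AdditiveAttention(n_hidden_enc=256, n_hidden_dec=512)
    weights = attention(torch.randn(8, 20, 256), torch.randn(8, 512))
    print(weights.shape)  # torch.Size([8, 20])
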
class textformer.models.layers.MultiHeadAttention(n_hidden, n_heads, dropout)

Bases: torch.nn.Module

A MultiHeadAttention class provides a multi-head attention-based mechanism for a neural network layer.

References

  1. A. Vaswani, et al. Attention is all you need. Advances in Neural Information Processing Systems (2017).

__init__(self, n_hidden, n_heads, dropout)

Initialization method.

Parameters
  • n_hidden (int) – Number of hidden units.

  • n_heads (int) – Number of attention heads.

  • dropout (float) – Dropout probability.

forward(self, query, key, value, mask=None)

Performs a forward pass over the layer.

Parameters
  • query (torch.Tensor) – Tensor containing the queries.

  • key (torch.Tensor) – Tensor containing the keys.

  • value (torch.Tensor) – Tensor containing the values.

  • mask (torch.Tensor, optional) – Tensor containing the mask. Defaults to None.

Returns

The multi-head attention-based weights.

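A minimal sketch of multi-head scaled dot-product attention along these lines; the class name MultiHeadSelfAttention, the returned (output, weights) pair, and the tensor shapes are illustrative assumptions, not the library's exact implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class MultiHeadSelfAttention(nn.Module):
        """Illustrative multi-head scaled dot-product attention; not the library's exact code."""

        def __init__(self, n_hidden, n_heads, dropout):
            super().__init__()
            assert n_hidden % n_heads == 0

            self.n_heads = n_heads
            self.head_dim = n_hidden // n_heads

            # Linear projections for queries, keys, values and the final output
            self.w_q = nn.Linear(n_hidden, n_hidden)
            self.w_k = nn.Linear(n_hidden, n_hidden)
            self.w_v = nn.Linear(n_hidden, n_hidden)
            self.w_o = nn.Linear(n_hidden, n_hidden)
            self.drop = nn.Dropout(dropout)

        def forward(self, query, key, value, mask=None):
            batch_size = query.size(0)

            def split_heads(x, proj):
                # (batch, length, n_hidden) -> (batch, n_heads, length, head_dim)
                return proj(x).view(batch_size, -1, self.n_heads, self.head_dim).transpose(1, 2)

            q = split_heads(query, self.w_q)
            k = split_heads(key, self.w_k)
            v = split_heads(value, self.w_v)

            # Scaled dot-product attention scores over the source positions
            scores = torch.matmul(q, k.transpose(-2, -1)) / self.head_dim ** 0.5
            if mask is not None:
                scores = scores.masked_fill(mask == 0, -1e10)
            weights = self.drop(F.softmax(scores, dim=-1))

            # Weighted sum of values, merged back to (batch, length, n_hidden)
            out = torch.matmul(weights, v).transpose(1, 2).reshape(batch_size, -1, self.n_heads * self.head_dim)

            return self.w_o(out), weights


    # Usage: self-attention over a batch of sequences
    mha = MultiHeadSelfAttention(n_hidden=512, n_heads=8, dropout=0.1)
    x = torch.randn(4, 10, 512)
    out, weights = mha(x, x, x)
    print(out.shape, weights.shape)  # torch.Size([4, 10, 512]) torch.Size([4, 8, 10, 10])
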
class textformer.models.layers.PositionWideForward(n_hidden, n_forward, dropout)

Bases: torch.nn.Module

A PositionWideForward class provides a position-wise feed-forward layer for a neural network.

References

  1. A. Vaswani, et al. Attention is all you need. Advances in Neural Information Processing Systems (2017).

__init__(self, n_hidden, n_forward, dropout)

Initialization method.

Parameters
  • n_hidden (int) – Number of hidden units.

  • n_forward (int) – Number of forward units.

  • dropout (float) – Dropout probability.

forward(self, x)

Performs a forward pass over the layer.

Parameters
  • x (torch.Tensor) – Tensor containing the input states.

Returns

The feed-forward activations.

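A minimal sketch of a position-wise feed-forward block of this kind; the class name PositionWiseFeedForward, the ReLU activation, and the tensor shapes are illustrative assumptions.

    import torch
    import torch.nn as nn


    class PositionWiseFeedForward(nn.Module):
        """Illustrative position-wise feed-forward block; not the library's exact code."""

        def __init__(self, n_hidden, n_forward, dropout):
            super().__init__()

            # Two linear maps applied independently at every position
            self.fc_1 = nn.Linear(n_hidden, n_forward)
            self.fc_2 = nn.Linear(n_forward, n_hidden)
            self.drop = nn.Dropout(dropout)

        def forward(self, x):
            # x: (batch, seq_len, n_hidden) -> (batch, seq_len, n_hidden)
            return self.fc_2(self.drop(torch.relu(self.fc_1(x))))


    # Usage
    pwff = PositionWiseFeedForward(n_hidden=512, n_forward=2048, dropout=0.1)
    print(pwff(torch.randn(4, 10, 512)).shape)  # torch.Size([4, 10, 512])
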
class textformer.models.layers.ResidualAttention(n_hidden, n_embedding, scale)

Bases: torch.nn.Module

A ResidualAttention class provides an attention-based mechanism with residual connections for a neural network layer.

References

  1. F. Wang, et al. Residual attention network for image classification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017).

__init__(self, n_hidden, n_embedding, scale)

Initialization method.

Parameters
  • n_hidden (int) – Number of hidden units.

  • n_embedding (int) – Number of embedding units.

  • scale (float) – Scaling factor applied to the residual connections.

forward(self, emb, c, enc_c, enc_o)

Performs a forward pass over the layer.

Parameters
  • emb (torch.Tensor) – Tensor containing the embedded outputs.

  • c (torch.Tensor) – Tensor containing the decoder convolved features.

  • enc_c (torch.Tensor) – Tensor containing the encoder convolved features.

  • enc_o (torch.Tensor) – Tensor containing the encoder outputs.

Returns

The attention-based weights, as well as the residual attention-based weights.
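
A minimal sketch of dot-product attention combined with residual connections in this spirit; the class name ResidualDotAttention, the projection layout, and the tensor shapes are illustrative assumptions and may differ from the library's implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class ResidualDotAttention(nn.Module):
        """Illustrative attention with residual connections; not the library's exact code."""

        def __init__(self, n_hidden, n_embedding, scale):
            super().__init__()
            self.scale = scale

            # Maps between the convolutional (hidden) space and the embedding space
            self.hidden_to_emb = nn.Linear(n_hidden, n_embedding)
            self.emb_to_hidden = nn.Linear(n_embedding, n_hidden)

        def forward(self, emb, c, enc_c, enc_o):
            # emb:   decoder embeddings,         (batch, trg_len, n_embedding)
            # c:     decoder convolved features, (batch, n_hidden, trg_len)
            # enc_c: encoder convolved features, (batch, src_len, n_embedding)
            # enc_o: encoder outputs,            (batch, src_len, n_embedding)

            # Residual combination of decoder features and embeddings
            combined = (self.hidden_to_emb(c.permute(0, 2, 1)) + emb) * self.scale

            # Attention weights over the source positions
            energy = torch.matmul(combined, enc_c.permute(0, 2, 1))
            attention = F.softmax(energy, dim=-1)

            # Attended encoder representation, mapped back and combined residually
            attended = torch.matmul(attention, enc_o)
            attended = (self.emb_to_hidden(attended).permute(0, 2, 1) + c) * self.scale

            return attention, attended


    # Usage (shapes are illustrative)
    attn = ResidualDotAttention(n_hidden=512, n_embedding=256, scale=0.5)
    attention, attended = attn(torch.randn(4, 12, 256), torch.randn(4, 512, 12),
                               torch.randn(4, 20, 256), torch.randn(4, 20, 256))
    print(attention.shape, attended.shape)  # torch.Size([4, 12, 20]) torch.Size([4, 512, 12])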