textformer.models.layers

Pre-defined layers.

A package containing custom layers for all common textformer modules.

class textformer.models.layers.Attention(n_hidden_enc, n_hidden_dec)

Bases: torch.nn.Module

An Attention class provides an attention-based mechanism for a neural network layer.

References

  1. D. Bahdanau, K. Cho, Y. Bengio. Neural machine translation by jointly learning to align and translate. Preprint arXiv:1409.0473 (2014).

__init__(self, n_hidden_enc, n_hidden_dec)

Initialization method.

Parameters
  • n_hidden_enc (int) – Number of hidden units in the Encoder.

  • n_hidden_dec (int) – Number of hidden units in the Decoder.

forward(self, o, h)

Performs a forward pass over the layer.

Parameters
  • o (torch.Tensor) – Tensor containing the encoded outputs.

  • h (torch.Tensor) – Tensor containing the hidden states.

Returns

The attention-based weights.

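The following is a minimal sketch of how such an additive (Bahdanau-style) attention layer might be implemented and called with PyTorch; the class name AdditiveAttention, the tensor shapes, and the internals are illustrative assumptions rather than the library's exact code.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class AdditiveAttention(nn.Module):
        """Illustrative Bahdanau-style attention; not the library's exact implementation."""

        def __init__(self, n_hidden_enc, n_hidden_dec):
            super().__init__()

            # Projects concatenated encoder outputs and decoder state into an energy space
            self.energy = nn.Linear(n_hidden_enc + n_hidden_dec, n_hidden_dec)

            # Scores each source position with a single value
            self.score = nn.Linear(n_hidden_dec, 1, bias=False)

        def forward(self, o, h):
            # o: encoder outputs, shape (batch, src_len, n_hidden_enc)
            # h: current decoder hidden state, shape (batch, n_hidden_dec)
            src_len = o.size(1)

            # Repeat the decoder state across every source position
            h = h.unsqueeze(1).repeat(1, src_len, 1)

            # Energy and per-position scores
            e = torch.tanh(self.energy(torch.cat((o, h), dim=-1)))
            scores = self.score(e).squeeze(-1)

            # Normalized attention weights over the source sequence
            return F.softmax(scores, dim=-1)


    # Usage: attention weights over a batch of encoded sequences
    attention = AdditiveAttention(n_hidden_enc=256, n_hidden_dec=512)
    weights = attention(torch.randn(8, 20, 256), torch.randn(8, 512))
    print(weights.shape)  # torch.Size([8, 20])
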
class textformer.models.layers.MultiHeadAttention(n_hidden, n_heads, dropout)

Bases: torch.nn.Module

A MultiHeadAttention class provides a multi-head attention-based mechanism for a neural network layer.

References

  1. A. Vaswani, et al. Attention is all you need. Advances in Neural Information Processing Systems (2017).

__init__(self, n_hidden, n_heads, dropout)

Initialization method.

Parameters
  • n_hidden (int) – Number of hidden units.

  • n_heads (int) – Number of attention heads.

  • dropout (float) – Dropout probability.

forward(self, query, key, value, mask=None)

Performs a forward pass over the layer.

Parameters
  • query (torch.Tensor) – Tensor containing the queries.

  • key (torch.Tensor) – Tensor containing the keys.

  • value (torch.Tensor) – Tensor containing the values.

  • mask (torch.Tensor, optional) – Tensor containing the mask. Defaults to None.

Returns

The multi-head attention-based weights.

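A minimal sketch of multi-head scaled dot-product attention along these lines; the class name MultiHeadSelfAttention, the returned (output, weights) pair, and the tensor shapes are illustrative assumptions, not the library's exact implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class MultiHeadSelfAttention(nn.Module):
        """Illustrative multi-head scaled dot-product attention; not the library's exact code."""

        def __init__(self, n_hidden, n_heads, dropout):
            super().__init__()
            assert n_hidden % n_heads == 0

            self.n_heads = n_heads
            self.head_dim = n_hidden // n_heads

            # Linear projections for queries, keys, values and the final output
            self.w_q = nn.Linear(n_hidden, n_hidden)
            self.w_k = nn.Linear(n_hidden, n_hidden)
            self.w_v = nn.Linear(n_hidden, n_hidden)
            self.w_o = nn.Linear(n_hidden, n_hidden)
            self.drop = nn.Dropout(dropout)

        def forward(self, query, key, value, mask=None):
            batch_size = query.size(0)

            def split_heads(x, proj):
                # (batch, length, n_hidden) -> (batch, n_heads, length, head_dim)
                return proj(x).view(batch_size, -1, self.n_heads, self.head_dim).transpose(1, 2)

            q = split_heads(query, self.w_q)
            k = split_heads(key, self.w_k)
            v = split_heads(value, self.w_v)

            # Scaled dot-product attention scores over the source positions
            scores = torch.matmul(q, k.transpose(-2, -1)) / self.head_dim ** 0.5
            if mask is not None:
                scores = scores.masked_fill(mask == 0, -1e10)
            weights = self.drop(F.softmax(scores, dim=-1))

            # Weighted sum of values, merged back to (batch, length, n_hidden)
            out = torch.matmul(weights, v).transpose(1, 2).reshape(batch_size, -1, self.n_heads * self.head_dim)

            return self.w_o(out), weights


    # Usage: self-attention over a batch of sequences
    mha = MultiHeadSelfAttention(n_hidden=512, n_heads=8, dropout=0.1)
    x = torch.randn(4, 10, 512)
    out, weights = mha(x, x, x)
    print(out.shape, weights.shape)  # torch.Size([4, 10, 512]) torch.Size([4, 8, 10, 10])
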
class textformer.models.layers.PositionWideForward(n_hidden, n_forward, dropout)

Bases: torch.nn.Module

A PositionWideForward class provides a position-wise feed-forward layer for a neural network.

References

  1. A. Vaswani, et al. Attention is all you need. Advances in Neural Information Processing Systems (2017).

__init__(self, n_hidden, n_forward, dropout)

Initialization method.

Parameters
  • n_hidden (int) – Number of hidden units.

  • n_forward (int) – Number of forward units.

  • dropout (float) – Dropout probability.

forward(self, x)

Performs a forward pass over the layer.

Parameters
  • x (torch.Tensor) – Tensor containing the input states.

Returns

The feed-forward activations.

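A minimal sketch of a position-wise feed-forward block of this kind; the class name PositionWiseFeedForward, the ReLU activation, and the tensor shapes are illustrative assumptions.

    import torch
    import torch.nn as nn


    class PositionWiseFeedForward(nn.Module):
        """Illustrative position-wise feed-forward block; not the library's exact code."""

        def __init__(self, n_hidden, n_forward, dropout):
            super().__init__()

            # Two linear maps applied independently at every position
            self.fc_1 = nn.Linear(n_hidden, n_forward)
            self.fc_2 = nn.Linear(n_forward, n_hidden)
            self.drop = nn.Dropout(dropout)

        def forward(self, x):
            # x: (batch, seq_len, n_hidden) -> (batch, seq_len, n_hidden)
            return self.fc_2(self.drop(torch.relu(self.fc_1(x))))


    # Usage
    pwff = PositionWiseFeedForward(n_hidden=512, n_forward=2048, dropout=0.1)
    print(pwff(torch.randn(4, 10, 512)).shape)  # torch.Size([4, 10, 512])
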
class textformer.models.layers.ResidualAttention(n_hidden, n_embedding, scale)

Bases: torch.nn.Module

A ResidualAttention class provides an attention-based mechanism with residual connections for a neural network layer.

References

  1. F. Wang, et al. Residual attention network for image classification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017).

__init__(self, n_hidden, n_embedding, scale)

Initialization method.

Parameters
  • n_hidden (int) – Number of hidden units.

  • n_embedding (int) – Number of embedding units.

  • scale (float) – Scaling factor applied to the residual connections.

forward(self, emb, c, enc_c, enc_o)

Performs a forward pass over the layer.

Parameters
  • emb (torch.Tensor) – Tensor containing the embedded outputs.

  • c (torch.Tensor) – Tensor containing the decoder convolved features.

  • enc_c (torch.Tensor) – Tensor containing the encoder convolved features.

  • enc_o (torch.Tensor) – Tensor containing the encoder outputs.

Returns

The attention-based weights, as well as the residual attention-based weights.
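
A minimal sketch of dot-product attention combined with residual connections in this spirit; the class name ResidualDotAttention, the projection layout, and the tensor shapes are illustrative assumptions and may differ from the library's implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class ResidualDotAttention(nn.Module):
        """Illustrative attention with residual connections; not the library's exact code."""

        def __init__(self, n_hidden, n_embedding, scale):
            super().__init__()
            self.scale = scale

            # Maps between the convolutional (hidden) space and the embedding space
            self.hidden_to_emb = nn.Linear(n_hidden, n_embedding)
            self.emb_to_hidden = nn.Linear(n_embedding, n_hidden)

        def forward(self, emb, c, enc_c, enc_o):
            # emb:   decoder embeddings,         (batch, trg_len, n_embedding)
            # c:     decoder convolved features, (batch, n_hidden, trg_len)
            # enc_c: encoder convolved features, (batch, src_len, n_embedding)
            # enc_o: encoder outputs,            (batch, src_len, n_embedding)

            # Residual combination of decoder features and embeddings
            combined = (self.hidden_to_emb(c.permute(0, 2, 1)) + emb) * self.scale

            # Attention weights over the source positions
            energy = torch.matmul(combined, enc_c.permute(0, 2, 1))
            attention = F.softmax(energy, dim=-1)

            # Attended encoder representation, mapped back and combined residually
            attended = torch.matmul(attention, enc_o)
            attended = (self.emb_to_hidden(attended).permute(0, 2, 1) + c) * self.scale

            return attention, attended


    # Usage (shapes are illustrative)
    attn = ResidualDotAttention(n_hidden=512, n_embedding=256, scale=0.5)
    attention, attended = attn(torch.randn(4, 12, 256), torch.randn(4, 512, 12),
                               torch.randn(4, 20, 256), torch.randn(4, 20, 256))
    print(attention.shape, attended.shape)  # torch.Size([4, 12, 20]) torch.Size([4, 512, 12])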