textformer.models.layers

Pre-defined layers.

A package containing custom layers for all common textformer modules.
class textformer.models.layers.Attention(n_hidden_enc, n_hidden_dec)

Bases: torch.nn.Module

An Attention class provides an attention-based mechanism for a neural network layer.
References
D. Bahdanau, K. Cho, Y. Bengio. Neural machine translation by jointly learning to align and translate. Preprint arXiv:1409.0473 (2014).
__init__(self, n_hidden_enc, n_hidden_dec)

Initialization method.

Parameters:
    n_hidden_enc (int) – Number of hidden units in the Encoder.
    n_hidden_dec (int) – Number of hidden units in the Decoder.
forward(self, o, h)

Performs a forward pass over the layer.

Parameters:
    o (torch.Tensor) – Tensor containing the encoded outputs.
    h (torch.Tensor) – Tensor containing the hidden states.

Returns:
    The attention-based weights.
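As a rough illustration of what this layer computes, here is a minimal additive (Bahdanau-style) attention sketch in plain PyTorch. It is not textformer's exact internals: the batch-first tensor layouts and the single decoder hidden state are assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AdditiveAttention(nn.Module):
        """Illustrative additive (Bahdanau) attention; not textformer's exact code."""

        def __init__(self, n_hidden_enc, n_hidden_dec):
            super().__init__()
            # Scores come from the concatenated decoder state and encoder output.
            self.energy = nn.Linear(n_hidden_enc + n_hidden_dec, n_hidden_dec)
            self.v = nn.Linear(n_hidden_dec, 1, bias=False)

        def forward(self, o, h):
            # o: encoded outputs, assumed shape (batch, src_len, n_hidden_enc)
            # h: decoder hidden state, assumed shape (batch, n_hidden_dec)
            src_len = o.shape[1]
            # Repeat the hidden state so it pairs with every encoder output.
            h = h.unsqueeze(1).repeat(1, src_len, 1)
            scores = self.v(torch.tanh(self.energy(torch.cat((h, o), dim=2))))
            # Softmax over the source positions yields the attention weights.
            return F.softmax(scores.squeeze(2), dim=1)

    # Usage: weights sum to 1 over the 10 source positions.
    attn = AdditiveAttention(n_hidden_enc=256, n_hidden_dec=512)
    weights = attn(torch.randn(4, 10, 256), torch.randn(4, 512))  # -> (4, 10)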
class textformer.models.layers.MultiHeadAttention(n_hidden, n_heads, dropout)

Bases: torch.nn.Module

A MultiHeadAttention class provides a multi-head attention mechanism for a neural network layer.
References
A. Vaswani, et al. Attention is all you need. Advances in neural information processing systems (2017).
__init__(self, n_hidden, n_heads, dropout)

Initialization method.

Parameters:
    n_hidden (int) – Number of hidden units.
    n_heads (int) – Number of attention heads.
    dropout (float) – Dropout probability.
forward(self, query, key, value, mask=None)

Performs a forward pass over the layer.

Parameters:
    query (torch.Tensor) – Tensor containing the queries.
    key (torch.Tensor) – Tensor containing the keys.
    value (torch.Tensor) – Tensor containing the values.
    mask (torch.Tensor, optional) – Tensor containing the mask.

Returns:
    The multi-head attention-based weights.
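For intuition, the sketch below implements standard scaled dot-product multi-head attention in the spirit of Vaswani et al. It is an illustration rather than textformer's implementation; the batch-first shapes and the returned (output, weights) pair are assumptions.

    import torch
    import torch.nn as nn

    class MultiHeadAttentionSketch(nn.Module):
        """Illustrative scaled dot-product multi-head attention (Vaswani et al.)."""

        def __init__(self, n_hidden, n_heads, dropout):
            super().__init__()
            assert n_hidden % n_heads == 0, 'n_hidden must be divisible by n_heads'
            self.n_heads = n_heads
            self.head_dim = n_hidden // n_heads
            self.w_q = nn.Linear(n_hidden, n_hidden)
            self.w_k = nn.Linear(n_hidden, n_hidden)
            self.w_v = nn.Linear(n_hidden, n_hidden)
            self.w_o = nn.Linear(n_hidden, n_hidden)
            self.dropout = nn.Dropout(dropout)
            self.scale = self.head_dim ** 0.5

        def _split(self, x, batch):
            # (batch, seq_len, n_hidden) -> (batch, n_heads, seq_len, head_dim)
            return x.view(batch, -1, self.n_heads, self.head_dim).transpose(1, 2)

        def forward(self, query, key, value, mask=None):
            # query/key/value: assumed shape (batch, seq_len, n_hidden)
            batch = query.shape[0]
            q = self._split(self.w_q(query), batch)
            k = self._split(self.w_k(key), batch)
            v = self._split(self.w_v(value), batch)
            # Scaled dot-product scores between every query and key position.
            scores = torch.matmul(q, k.transpose(-2, -1)) / self.scale
            if mask is not None:
                # Masked positions are pushed to -inf so softmax zeroes them.
                scores = scores.masked_fill(mask == 0, float('-inf'))
            attention = self.dropout(torch.softmax(scores, dim=-1))
            out = torch.matmul(attention, v).transpose(1, 2).contiguous()
            out = out.view(batch, -1, self.n_heads * self.head_dim)
            return self.w_o(out), attention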
class textformer.models.layers.PositionWideForward(n_hidden, n_forward, dropout)

Bases: torch.nn.Module

A PositionWideForward class provides a position-wise feed-forward layer for a neural network.
References
A. Vaswani, et al. Attention is all you need. Advances in neural information processing systems (2017).
__init__(self, n_hidden, n_forward, dropout)

Initialization method.

Parameters:
    n_hidden (int) – Number of hidden units.
    n_forward (int) – Number of forward units.
    dropout (float) – Dropout probability.
forward(self, x)

Performs a forward pass over the layer.

Parameters:
    x (torch.Tensor) – Tensor containing the input states.

Returns:
    The feed-forward activations.
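A position-wise feed-forward block applies the same two-layer MLP independently at every sequence position. The sketch below shows the standard pattern; the ReLU activation and the placement of dropout are assumptions, not textformer's exact code.

    import torch
    import torch.nn as nn

    class PositionWiseFeedForward(nn.Module):
        """Illustrative position-wise feed-forward block (Vaswani et al.)."""

        def __init__(self, n_hidden, n_forward, dropout):
            super().__init__()
            # Expands to the inner `n_forward` dimension, then projects back.
            self.fc1 = nn.Linear(n_hidden, n_forward)
            self.fc2 = nn.Linear(n_forward, n_hidden)
            self.dropout = nn.Dropout(dropout)

        def forward(self, x):
            # x: assumed shape (batch, seq_len, n_hidden); nn.Linear acts on the
            # last dimension, so every position is transformed independently.
            return self.fc2(self.dropout(torch.relu(self.fc1(x))))

    # Usage: the shape is preserved, (2, 5, 512) -> (2, 5, 512).
    ff = PositionWiseFeedForward(n_hidden=512, n_forward=2048, dropout=0.1)
    y = ff(torch.randn(2, 5, 512))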
class textformer.models.layers.ResidualAttention(n_hidden, n_embedding, scale)

Bases: torch.nn.Module

A ResidualAttention class provides an attention-based mechanism with residual connections for a neural network layer.
References
F. Wang, et al. Residual attention network for image classification. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017).
__init__(self, n_hidden, n_embedding, scale)

Initialization method.

Parameters:
    n_hidden (int) – Number of hidden units.
    n_embedding (int) – Number of embedding units.
    scale (float) – Scaling factor for the residual connections.
forward(self, emb, c, enc_c, enc_o)

Performs a forward pass over the layer.

Parameters:
    emb (torch.Tensor) – Tensor containing the embedded outputs.
    c (torch.Tensor) – Tensor containing the decoder convolved features.
    enc_c (torch.Tensor) – Tensor containing the encoder convolved features.
    enc_o (torch.Tensor) – Tensor containing the encoder outputs.

Returns:
    The attention-based weights, as well as the residual attention-based weights.
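The parameter names (embedded outputs, convolved decoder and encoder features, a residual scale) suggest the scaled residual-attention pattern used in convolutional sequence-to-sequence decoders. The sketch below follows that pattern under assumed shape conventions; it is illustrative, not textformer's implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ResidualAttentionSketch(nn.Module):
        """Illustrative attention with scaled residual connections; shapes assumed."""

        def __init__(self, n_hidden, n_embedding, scale):
            super().__init__()
            self.hid2emb = nn.Linear(n_hidden, n_embedding)
            self.emb2hid = nn.Linear(n_embedding, n_hidden)
            self.scale = scale

        def forward(self, emb, c, enc_c, enc_o):
            # emb:   embedded outputs, assumed (batch, trg_len, n_embedding)
            # c:     decoder convolved features, assumed (batch, n_hidden, trg_len)
            # enc_c: encoder convolved features, assumed (batch, src_len, n_embedding)
            # enc_o: encoder outputs, assumed (batch, src_len, n_embedding)
            # First residual: combine the convolved features with the embeddings.
            combined = (self.hid2emb(c.permute(0, 2, 1)) + emb) * self.scale
            # Attention energies between decoder and encoder representations.
            energy = torch.matmul(combined, enc_c.permute(0, 2, 1))
            attention = F.softmax(energy, dim=2)
            # Attend over the encoder outputs and map back to the hidden space.
            attended = self.emb2hid(torch.matmul(attention, enc_o))
            # Second residual: add back the convolved features, rescaled.
            out = (c + attended.permute(0, 2, 1)) * self.scale
            return attention, out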