textformer.models.decoders

Pre-defined decoder architectures.

A package for already-implemented decoder models.

class textformer.models.decoders.BiGRUDecoder(n_output=128, n_hidden_enc=128, n_hidden_dec=128, n_embedding=128, dropout=0.5)

Bases: textformer.core.Decoder

A BiGRUDecoder class is used to supply the decoding part of the Attention-based Seq2Seq architecture.

__init__(self, n_output=128, n_hidden_enc=128, n_hidden_dec=128, n_embedding=128, dropout=0.5)

Initialization method.

Parameters
  • n_output (int) – Number of output units.

  • n_hidden_enc (int) – Number of hidden units in the Encoder.

  • n_hidden_dec (int) – Number of hidden units in the Decoder.

  • n_embedding (int) – Number of embedding units.

  • dropout (float) – Amount of dropout to be applied.

forward(self, x, o, h)

Performs a forward pass over the architecture.

Parameters
  • x (torch.Tensor) – Tensor containing the input data.

  • o (torch.Tensor) – Tensor containing the encoded outputs.

  • h (torch.Tensor) – Tensor containing the hidden states.

Returns

The prediction and hidden state.
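As a rough illustration of the `(x, o, h)` signature above, here is a minimal, hypothetical sketch of one attention-based GRU decoding step. The class name and all internals are assumptions for illustration, not textformer's actual implementation; it only mirrors the documented shapes (a bidirectional encoder, so encoded outputs carry `2 * n_hidden_enc` features).

```python
import torch
from torch import nn
import torch.nn.functional as F

class TinyAttnGRUDecoder(nn.Module):
    """Hypothetical one-step attention decoder (illustrative only)."""

    def __init__(self, n_output=128, n_hidden_enc=128, n_hidden_dec=128,
                 n_embedding=128, dropout=0.5):
        super().__init__()
        self.embedding = nn.Embedding(n_output, n_embedding)
        # Encoder is bidirectional, so its outputs have 2 * n_hidden_enc features
        self.attn = nn.Linear(2 * n_hidden_enc + n_hidden_dec, 1)
        self.rnn = nn.GRU(2 * n_hidden_enc + n_embedding, n_hidden_dec)
        self.fc = nn.Linear(2 * n_hidden_enc + n_hidden_dec + n_embedding, n_output)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, o, h):
        # x: (batch,) current tokens; o: (src_len, batch, 2 * n_hidden_enc)
        # h: (batch, n_hidden_dec)
        emb = self.dropout(self.embedding(x.unsqueeze(0)))    # (1, batch, emb)
        src_len = o.shape[0]
        h_rep = h.unsqueeze(0).repeat(src_len, 1, 1)          # (src_len, batch, dec)
        scores = self.attn(torch.cat((o, h_rep), dim=2))      # (src_len, batch, 1)
        weights = F.softmax(scores, dim=0)
        context = (weights * o).sum(dim=0, keepdim=True)      # (1, batch, 2 * enc)
        out, h = self.rnn(torch.cat((emb, context), dim=2), h.unsqueeze(0))
        pred = self.fc(torch.cat((out, context, emb), dim=2).squeeze(0))
        return pred, h.squeeze(0)
```

In practice such a step is called once per target position, feeding each prediction (or the ground-truth token, under teacher forcing) back in as the next `x`.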

class textformer.models.decoders.ConvDecoder(n_output=128, n_hidden=128, n_embedding=128, n_layers=1, kernel_size=3, dropout=0.5, scale=0.5, max_length=100, pad_token=None)

Bases: textformer.core.Decoder

A ConvDecoder is used to supply the decoding part of the Convolutional Seq2Seq architecture.

__init__(self, n_output=128, n_hidden=128, n_embedding=128, n_layers=1, kernel_size=3, dropout=0.5, scale=0.5, max_length=100, pad_token=None)

Initialization method.

Parameters
  • n_output (int) – Number of output units.

  • n_hidden (int) – Number of hidden units.

  • n_embedding (int) – Number of embedding units.

  • n_layers (int) – Number of convolutional layers.

  • kernel_size (int) – Size of the convolutional kernels.

  • dropout (float) – Amount of dropout to be applied.

  • scale (float) – Value for the residual learning.

  • max_length (int) – Maximum length of positional embeddings.

  • pad_token (int) – The index of a padding token.

forward(self, y, c, o)

Performs a forward pass over the architecture.

Parameters
  • y (torch.Tensor) – Tensor containing the true labels.

  • c (torch.Tensor) – Tensor containing the convolutional features.

  • o (torch.Tensor) – Tensor containing combined outputs.

Returns

The output and attention values.
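The core building block of such a convolutional decoder is a causally padded convolution with a gated linear unit (GLU) and a scaled residual connection, which is what the `kernel_size` and `scale` parameters above configure. The following is a minimal sketch of that block under those assumptions; it is illustrative and not textformer's exact code.

```python
import torch
from torch import nn
import torch.nn.functional as F

class TinyConvDecoderBlock(nn.Module):
    """Hypothetical causal conv + GLU block with scaled residual (illustrative)."""

    def __init__(self, n_hidden=128, kernel_size=3, dropout=0.5, scale=0.5):
        super().__init__()
        self.kernel_size = kernel_size
        self.scale = scale
        # GLU halves the channel dimension, so the conv emits 2 * n_hidden channels
        self.conv = nn.Conv1d(n_hidden, 2 * n_hidden, kernel_size)
        self.dropout = nn.Dropout(dropout)

    def forward(self, y):
        # y: (batch, n_hidden, trg_len)
        # Left-pad so each position only attends to previous tokens (causality)
        padded = F.pad(self.dropout(y), (self.kernel_size - 1, 0))
        out = F.glu(self.conv(padded), dim=1)    # back to n_hidden channels
        return (out + y) * self.scale            # scaled residual connection
```

Stacking `n_layers` of these blocks, interleaved with attention over the encoder's convolutional features `c`, yields the decoder's output and attention values.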

class textformer.models.decoders.GRUDecoder(n_output=128, n_hidden=128, n_embedding=128, dropout=0.5)

Bases: textformer.core.Decoder

A GRUDecoder class is used to supply the decoding part of the JointSeq2Seq architecture.

__init__(self, n_output=128, n_hidden=128, n_embedding=128, dropout=0.5)

Initialization method.

Parameters
  • n_output (int) – Number of output units.

  • n_hidden (int) – Number of hidden units.

  • n_embedding (int) – Number of embedding units.

  • dropout (float) – Amount of dropout to be applied.

forward(self, x, h, c)

Performs a forward pass over the architecture.

Parameters
  • x (torch.Tensor) – Tensor containing the input data.

  • h (torch.Tensor) – Tensor containing the hidden states.

  • c (torch.Tensor) – Tensor containing the context vector.

Returns

The prediction and hidden state values.
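The distinguishing feature of this decoder is that the encoder's context vector `c` is re-fed at every decoding step, alongside the current token embedding. A minimal hypothetical sketch of that step follows; the class name and internals are assumptions for illustration, not textformer's actual code.

```python
import torch
from torch import nn

class TinyGRUDecoder(nn.Module):
    """Hypothetical context-conditioned GRU decoding step (illustrative)."""

    def __init__(self, n_output=128, n_hidden=128, n_embedding=128, dropout=0.5):
        super().__init__()
        self.embedding = nn.Embedding(n_output, n_embedding)
        # The context vector is concatenated to the embedding at every step
        self.rnn = nn.GRU(n_embedding + n_hidden, n_hidden)
        self.fc = nn.Linear(n_embedding + 2 * n_hidden, n_output)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, h, c):
        # x: (batch,) tokens; h, c: (1, batch, n_hidden)
        emb = self.dropout(self.embedding(x.unsqueeze(0)))   # (1, batch, emb)
        out, h = self.rnn(torch.cat((emb, c), dim=2), h)
        pred = self.fc(torch.cat((emb, h, c), dim=2).squeeze(0))
        return pred, h
```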

class textformer.models.decoders.LSTMDecoder(n_output=128, n_hidden=128, n_embedding=128, n_layers=1, dropout=0.5)

Bases: textformer.core.Decoder

An LSTMDecoder class is used to supply the decoding part of the Seq2Seq architecture.

__init__(self, n_output=128, n_hidden=128, n_embedding=128, n_layers=1, dropout=0.5)

Initialization method.

Parameters
  • n_output (int) – Number of output units.

  • n_hidden (int) – Number of hidden units.

  • n_embedding (int) – Number of embedding units.

  • n_layers (int) – Number of RNN layers.

  • dropout (float) – Amount of dropout to be applied.

forward(self, x, h, c)

Performs a forward pass over the architecture.

Parameters
  • x (torch.Tensor) – Tensor containing the input data.

  • h (torch.Tensor) – Tensor containing the hidden states.

  • c (torch.Tensor) – Tensor containing the cell state.

Returns

The prediction, hidden state and cell values.
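To make the `(x, h, c)` contract concrete, here is a minimal, hypothetical sketch of a single LSTM decoding step with the same constructor parameters; it is an illustrative assumption, not textformer's actual implementation.

```python
import torch
from torch import nn

class TinyLSTMDecoder(nn.Module):
    """Hypothetical one-step LSTM decoder (illustrative only)."""

    def __init__(self, n_output=128, n_hidden=128, n_embedding=128,
                 n_layers=1, dropout=0.5):
        super().__init__()
        self.embedding = nn.Embedding(n_output, n_embedding)
        # Inter-layer dropout only applies when there is more than one layer
        self.rnn = nn.LSTM(n_embedding, n_hidden, n_layers,
                           dropout=dropout if n_layers > 1 else 0)
        self.fc = nn.Linear(n_hidden, n_output)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, h, c):
        # x: (batch,) token indices for a single decoding step
        # h, c: (n_layers, batch, n_hidden) hidden and cell states
        emb = self.dropout(self.embedding(x.unsqueeze(0)))   # (1, batch, emb)
        out, (h, c) = self.rnn(emb, (h, c))
        pred = self.fc(out.squeeze(0))                       # (batch, n_output)
        return pred, h, c
```

The returned `h` and `c` are passed back in on the next step, which is how the decoder carries state across the generated sequence.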

class textformer.models.decoders.SelfAttentionDecoder(n_output=128, n_hidden=128, n_forward=256, n_layers=1, n_heads=3, dropout=0.1, max_length=100)

Bases: textformer.core.Decoder

A SelfAttentionDecoder is used to supply the decoding part of the Transformer architecture.

__init__(self, n_output=128, n_hidden=128, n_forward=256, n_layers=1, n_heads=3, dropout=0.1, max_length=100)

Initialization method.

Parameters
  • n_output (int) – Number of output units.

  • n_hidden (int) – Number of hidden units.

  • n_forward (int) – Number of feed forward units.

  • n_layers (int) – Number of attention layers.

  • n_heads (int) – Number of attention heads.

  • dropout (float) – Amount of dropout to be applied.

  • max_length (int) – Maximum length of positional embeddings.

forward(self, y, y_mask, x, x_mask)

Performs a forward pass over the architecture.

Parameters
  • y (torch.Tensor) – Tensor containing the true labels.

  • y_mask (torch.Tensor) – Mask tensor for the labels.

  • x (torch.Tensor) – Tensor containing the encoded data.

  • x_mask (torch.Tensor) – Mask tensor for the encoded data.

Returns

The output and attention values.
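A single decoder layer of this kind combines masked self-attention over the targets with encoder-decoder attention over the encoded inputs, which is what the `(y, y_mask, x, x_mask)` signature reflects. The sketch below mirrors that call pattern with `torch.nn.MultiheadAttention`; the class name, mask conventions, and internals are assumptions for illustration, not textformer's actual code.

```python
import torch
from torch import nn

class TinyAttnDecoderLayer(nn.Module):
    """Hypothetical single Transformer decoder layer (illustrative only)."""

    def __init__(self, n_hidden=128, n_heads=4, n_forward=256, dropout=0.1):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(n_hidden, n_heads, dropout=dropout)
        self.cross_attn = nn.MultiheadAttention(n_hidden, n_heads, dropout=dropout)
        self.ff = nn.Sequential(nn.Linear(n_hidden, n_forward), nn.ReLU(),
                                nn.Linear(n_forward, n_hidden))
        self.norm1 = nn.LayerNorm(n_hidden)
        self.norm2 = nn.LayerNorm(n_hidden)
        self.norm3 = nn.LayerNorm(n_hidden)

    def forward(self, y, y_mask, x, x_mask):
        # y: (trg_len, batch, n_hidden) target states
        # y_mask: (trg_len, trg_len) causal mask; x: (src_len, batch, n_hidden)
        # x_mask: (batch, src_len) padding mask over the encoded inputs
        s, _ = self.self_attn(y, y, y, attn_mask=y_mask)
        y = self.norm1(y + s)
        c, attention = self.cross_attn(y, x, x, key_padding_mask=x_mask)
        y = self.norm2(y + c)
        return self.norm3(y + self.ff(y)), attention
```

Stacking `n_layers` of these layers, with token and positional embeddings (up to `max_length`) feeding the first layer and a linear projection to `n_output` after the last, yields the output and attention values described above.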