textformer.models.decoders

Pre-defined decoder architectures.

A package for already-implemented decoder models.

class textformer.models.decoders.BiGRUDecoder(n_output=128, n_hidden_enc=128, n_hidden_dec=128, n_embedding=128, dropout=0.5)

Bases: textformer.core.Decoder

A BiGRUDecoder class is used to supply the decoding part of the Attention-based Seq2Seq architecture.

__init__(self, n_output=128, n_hidden_enc=128, n_hidden_dec=128, n_embedding=128, dropout=0.5)

Initialization method.

Parameters
  • n_output (int) – Number of output units.

  • n_hidden_enc (int) – Number of hidden units in the Encoder.

  • n_hidden_dec (int) – Number of hidden units in the Decoder.

  • n_embedding (int) – Number of embedding units.

  • dropout (float) – Amount of dropout to be applied.

forward(self, x, o, h)

Performs a forward pass over the architecture.

Parameters
  • x (torch.Tensor) – Tensor containing the input data.

  • o (torch.Tensor) – Tensor containing the encoded outputs.

  • h (torch.Tensor) – Tensor containing the hidden states.

Returns

The prediction and hidden state.
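As a rough illustration of the `(x, o, h)` signature above, here is a minimal, hypothetical sketch of one attention-based GRU decoding step. The class name and all internals are assumptions for illustration, not textformer's actual implementation; it only mirrors the documented shapes (a bidirectional encoder, so encoded outputs carry `2 * n_hidden_enc` features).

```python
import torch
from torch import nn
import torch.nn.functional as F

class TinyAttnGRUDecoder(nn.Module):
    """Hypothetical one-step attention decoder (illustrative only)."""

    def __init__(self, n_output=128, n_hidden_enc=128, n_hidden_dec=128,
                 n_embedding=128, dropout=0.5):
        super().__init__()
        self.embedding = nn.Embedding(n_output, n_embedding)
        # Encoder is bidirectional, so its outputs have 2 * n_hidden_enc features
        self.attn = nn.Linear(2 * n_hidden_enc + n_hidden_dec, 1)
        self.rnn = nn.GRU(2 * n_hidden_enc + n_embedding, n_hidden_dec)
        self.fc = nn.Linear(2 * n_hidden_enc + n_hidden_dec + n_embedding, n_output)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, o, h):
        # x: (batch,) current tokens; o: (src_len, batch, 2 * n_hidden_enc)
        # h: (batch, n_hidden_dec)
        emb = self.dropout(self.embedding(x.unsqueeze(0)))    # (1, batch, emb)
        src_len = o.shape[0]
        h_rep = h.unsqueeze(0).repeat(src_len, 1, 1)          # (src_len, batch, dec)
        scores = self.attn(torch.cat((o, h_rep), dim=2))      # (src_len, batch, 1)
        weights = F.softmax(scores, dim=0)
        context = (weights * o).sum(dim=0, keepdim=True)      # (1, batch, 2 * enc)
        out, h = self.rnn(torch.cat((emb, context), dim=2), h.unsqueeze(0))
        pred = self.fc(torch.cat((out, context, emb), dim=2).squeeze(0))
        return pred, h.squeeze(0)
```

In practice such a step is called once per target position, feeding each prediction (or the ground-truth token, under teacher forcing) back in as the next `x`.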

class textformer.models.decoders.ConvDecoder(n_output=128, n_hidden=128, n_embedding=128, n_layers=1, kernel_size=3, dropout=0.5, scale=0.5, max_length=100, pad_token=None)

Bases: textformer.core.Decoder

A ConvDecoder is used to supply the decoding part of the Convolutional Seq2Seq architecture.

__init__(self, n_output=128, n_hidden=128, n_embedding=128, n_layers=1, kernel_size=3, dropout=0.5, scale=0.5, max_length=100, pad_token=None)

Initialization method.

Parameters
  • n_output (int) – Number of output units.

  • n_hidden (int) – Number of hidden units.

  • n_embedding (int) – Number of embedding units.

  • n_layers (int) – Number of convolutional layers.

  • kernel_size (int) – Size of the convolutional kernels.

  • dropout (float) – Amount of dropout to be applied.

  • scale (float) – Value for the residual learning.

  • max_length (int) – Maximum length of positional embeddings.

  • pad_token (int) – The index of a padding token.

forward(self, y, c, o)

Performs a forward pass over the architecture.

Parameters
  • y (torch.Tensor) – Tensor containing the true labels.

  • c (torch.Tensor) – Tensor containing the convolutional features.

  • o (torch.Tensor) – Tensor containing combined outputs.

Returns

The output and attention values.
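The core building block of such a convolutional decoder is a causally padded convolution with a gated linear unit (GLU) and a scaled residual connection, which is what the `kernel_size` and `scale` parameters above configure. The following is a minimal sketch of that block under those assumptions; it is illustrative and not textformer's exact code.

```python
import torch
from torch import nn
import torch.nn.functional as F

class TinyConvDecoderBlock(nn.Module):
    """Hypothetical causal conv + GLU block with scaled residual (illustrative)."""

    def __init__(self, n_hidden=128, kernel_size=3, dropout=0.5, scale=0.5):
        super().__init__()
        self.kernel_size = kernel_size
        self.scale = scale
        # GLU halves the channel dimension, so the conv emits 2 * n_hidden channels
        self.conv = nn.Conv1d(n_hidden, 2 * n_hidden, kernel_size)
        self.dropout = nn.Dropout(dropout)

    def forward(self, y):
        # y: (batch, n_hidden, trg_len)
        # Left-pad so each position only attends to previous tokens (causality)
        padded = F.pad(self.dropout(y), (self.kernel_size - 1, 0))
        out = F.glu(self.conv(padded), dim=1)    # back to n_hidden channels
        return (out + y) * self.scale            # scaled residual connection
```

Stacking `n_layers` of these blocks, interleaved with attention over the encoder's convolutional features `c`, yields the decoder's output and attention values.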

class textformer.models.decoders.GRUDecoder(n_output=128, n_hidden=128, n_embedding=128, dropout=0.5)

Bases: textformer.core.Decoder

A GRUDecoder class is used to supply the decoding part of the JointSeq2Seq architecture.

__init__(self, n_output=128, n_hidden=128, n_embedding=128, dropout=0.5)

Initialization method.

Parameters
  • n_output (int) – Number of output units.

  • n_hidden (int) – Number of hidden units.

  • n_embedding (int) – Number of embedding units.

  • dropout (float) – Amount of dropout to be applied.

forward(self, x, h, c)

Performs a forward pass over the architecture.

Parameters
  • x (torch.Tensor) – Tensor containing the input data.

  • h (torch.Tensor) – Tensor containing the hidden states.

  • c (torch.Tensor) – Tensor containing the context vector.

Returns

The prediction and hidden state values.
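The distinguishing feature of this decoder is that the encoder's context vector `c` is re-fed at every decoding step, alongside the current token embedding. A minimal hypothetical sketch of that step follows; the class name and internals are assumptions for illustration, not textformer's actual code.

```python
import torch
from torch import nn

class TinyGRUDecoder(nn.Module):
    """Hypothetical context-conditioned GRU decoding step (illustrative)."""

    def __init__(self, n_output=128, n_hidden=128, n_embedding=128, dropout=0.5):
        super().__init__()
        self.embedding = nn.Embedding(n_output, n_embedding)
        # The context vector is concatenated to the embedding at every step
        self.rnn = nn.GRU(n_embedding + n_hidden, n_hidden)
        self.fc = nn.Linear(n_embedding + 2 * n_hidden, n_output)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, h, c):
        # x: (batch,) tokens; h, c: (1, batch, n_hidden)
        emb = self.dropout(self.embedding(x.unsqueeze(0)))   # (1, batch, emb)
        out, h = self.rnn(torch.cat((emb, c), dim=2), h)
        pred = self.fc(torch.cat((emb, h, c), dim=2).squeeze(0))
        return pred, h
```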

class textformer.models.decoders.LSTMDecoder(n_output=128, n_hidden=128, n_embedding=128, n_layers=1, dropout=0.5)

Bases: textformer.core.Decoder

An LSTMDecoder class is used to supply the decoding part of the Seq2Seq architecture.

__init__(self, n_output=128, n_hidden=128, n_embedding=128, n_layers=1, dropout=0.5)

Initialization method.

Parameters
  • n_output (int) – Number of output units.

  • n_hidden (int) – Number of hidden units.

  • n_embedding (int) – Number of embedding units.

  • n_layers (int) – Number of RNN layers.

  • dropout (float) – Amount of dropout to be applied.

forward(self, x, h, c)

Performs a forward pass over the architecture.

Parameters
  • x (torch.Tensor) – Tensor containing the input data.

  • h (torch.Tensor) – Tensor containing the hidden states.

  • c (torch.Tensor) – Tensor containing the cell state.

Returns

The prediction, hidden state and cell values.
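To make the `(x, h, c)` contract concrete, here is a minimal, hypothetical sketch of a single LSTM decoding step with the same constructor parameters; it is an illustrative assumption, not textformer's actual implementation.

```python
import torch
from torch import nn

class TinyLSTMDecoder(nn.Module):
    """Hypothetical one-step LSTM decoder (illustrative only)."""

    def __init__(self, n_output=128, n_hidden=128, n_embedding=128,
                 n_layers=1, dropout=0.5):
        super().__init__()
        self.embedding = nn.Embedding(n_output, n_embedding)
        # Inter-layer dropout only applies when there is more than one layer
        self.rnn = nn.LSTM(n_embedding, n_hidden, n_layers,
                           dropout=dropout if n_layers > 1 else 0)
        self.fc = nn.Linear(n_hidden, n_output)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, h, c):
        # x: (batch,) token indices for a single decoding step
        # h, c: (n_layers, batch, n_hidden) hidden and cell states
        emb = self.dropout(self.embedding(x.unsqueeze(0)))   # (1, batch, emb)
        out, (h, c) = self.rnn(emb, (h, c))
        pred = self.fc(out.squeeze(0))                       # (batch, n_output)
        return pred, h, c
```

The returned `h` and `c` are passed back in on the next step, which is how the decoder carries state across the generated sequence.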

class textformer.models.decoders.SelfAttentionDecoder(n_output=128, n_hidden=128, n_forward=256, n_layers=1, n_heads=3, dropout=0.1, max_length=100)

Bases: textformer.core.Decoder

A SelfAttentionDecoder is used to supply the decoding part of the Transformer architecture.

__init__(self, n_output=128, n_hidden=128, n_forward=256, n_layers=1, n_heads=3, dropout=0.1, max_length=100)

Initialization method.

Parameters
  • n_output (int) – Number of output units.

  • n_hidden (int) – Number of hidden units.

  • n_forward (int) – Number of feed forward units.

  • n_layers (int) – Number of attention layers.

  • n_heads (int) – Number of attention heads.

  • dropout (float) – Amount of dropout to be applied.

  • max_length (int) – Maximum length of positional embeddings.

forward(self, y, y_mask, x, x_mask)

Performs a forward pass over the architecture.

Parameters
  • y (torch.Tensor) – Tensor containing the true labels.

  • y_mask (torch.Tensor) – Mask tensor for the labels.

  • x (torch.Tensor) – Tensor containing the encoded data.

  • x_mask (torch.Tensor) – Mask tensor for the encoded data.

Returns

The output and attention values.
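A single decoder layer of this kind combines masked self-attention over the targets with encoder-decoder attention over the encoded inputs, which is what the `(y, y_mask, x, x_mask)` signature reflects. The sketch below mirrors that call pattern with `torch.nn.MultiheadAttention`; the class name, mask conventions, and internals are assumptions for illustration, not textformer's actual code.

```python
import torch
from torch import nn

class TinyAttnDecoderLayer(nn.Module):
    """Hypothetical single Transformer decoder layer (illustrative only)."""

    def __init__(self, n_hidden=128, n_heads=4, n_forward=256, dropout=0.1):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(n_hidden, n_heads, dropout=dropout)
        self.cross_attn = nn.MultiheadAttention(n_hidden, n_heads, dropout=dropout)
        self.ff = nn.Sequential(nn.Linear(n_hidden, n_forward), nn.ReLU(),
                                nn.Linear(n_forward, n_hidden))
        self.norm1 = nn.LayerNorm(n_hidden)
        self.norm2 = nn.LayerNorm(n_hidden)
        self.norm3 = nn.LayerNorm(n_hidden)

    def forward(self, y, y_mask, x, x_mask):
        # y: (trg_len, batch, n_hidden) target states
        # y_mask: (trg_len, trg_len) causal mask; x: (src_len, batch, n_hidden)
        # x_mask: (batch, src_len) padding mask over the encoded inputs
        s, _ = self.self_attn(y, y, y, attn_mask=y_mask)
        y = self.norm1(y + s)
        c, attention = self.cross_attn(y, x, x, key_padding_mask=x_mask)
        y = self.norm2(y + c)
        return self.norm3(y + self.ff(y)), attention
```

Stacking `n_layers` of these layers, with token and positional embeddings (up to `max_length`) feeding the first layer and a linear projection to `n_output` after the last, yields the output and attention values described above.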