Model Configuration

import torch.nn as nn

class SquareModel(nn.Module):
    def __init__(self, n_features, hidden_dim, n_outputs):
        super(SquareModel, self).__init__()
        self.hidden_dim = hidden_dim
        self.n_features = n_features
        self.n_outputs = n_outputs
        self.hidden = None
        # Simple RNN
        self.basic_rnn = nn.RNN(self.n_features,
                                self.hidden_dim,
                                batch_first=True)
        # Classifier to produce as many logits as outputs
        self.classifier = nn.Linear(self.hidden_dim,
                                    self.n_outputs)

    def forward(self, X):
        # X is batch first (N, L, F)
        # output is (N, L, H)
        # final hidden state is (1, N, H)
        batch_first_output, self.hidden = self.basic_rnn(X)

        # only last item in sequence (N, H)
        last_output = batch_first_output[:, -1]
        # classifier will output (N, n_outputs)
        out = self.classifier(last_output)

        # final output is (N, n_outputs)
        return out.view(-1, self.n_outputs)
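As a quick sanity check, a minimal sketch (not from the book) of calling the model on a dummy batch; the shapes assume the square-corner task, with four corners per sequence and two coordinates per corner, and the hyperparameters below are illustrative:

import torch

torch.manual_seed(21)
dummy_points = torch.randn(2, 4, 2)   # (N=2, L=4, F=2)
model = SquareModel(n_features=2, hidden_dim=2, n_outputs=1)
logits = model(dummy_points)
print(logits.shape)                    # torch.Size([2, 1]) -> (N, n_outputs)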

"Why are we taking the last output instead of the final hidden state?

Aren’t they the same?"

They are the same in most cases, yes, but they are different if you’re using

bidirectional RNNs. By using the last output, we’re ensuring that the code will

work for all sorts of RNNs: simple, stacked, and bidirectional. Besides, we want to

avoid handling the hidden state anyway, because it’s always in sequence-first shape.
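To make the difference concrete, here is a short sketch (an illustration, not from the book) comparing the last output with the final hidden state for a unidirectional and a bidirectional nn.RNN; the layer sizes are arbitrary:

import torch
import torch.nn as nn

torch.manual_seed(21)
x = torch.randn(1, 4, 2)                       # (N=1, L=4, F=2)

rnn = nn.RNN(input_size=2, hidden_size=3, batch_first=True)
output, hidden = rnn(x)                        # output: (N, L, H); hidden: (1, N, H)
# Unidirectional: the last output IS the final hidden state
print(torch.allclose(output[:, -1], hidden.permute(1, 0, 2)[:, -1]))  # True

bi_rnn = nn.RNN(input_size=2, hidden_size=3,
                batch_first=True, bidirectional=True)
bi_output, bi_hidden = bi_rnn(x)               # bi_output: (N, L, 2*H); bi_hidden: (2, N, H)
# Bidirectional: the last output concatenates forward and backward states for
# the last step, while the hidden state keeps the two directions separate and
# stays sequence-first regardless of batch_first
print(bi_output[:, -1].shape, bi_hidden.shape)  # torch.Size([1, 6]) torch.Size([2, 1, 3])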
