


Let’s pause for a moment here. First, the reset gate returns a tensor of size two because we have two hidden dimensions. Second, the two values may be different (duh, I know!). What does it mean?

The reset gate may scale each hidden dimension independently. It can completely suppress the values from one of the hidden dimensions while letting the other pass unchallenged. In geometrical terms, this means that the hidden space may shrink in one direction while stretching in the other. We’ll visualize it shortly in the journey of a (gated) hidden state.
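To make the suppression concrete, here is a minimal sketch with made-up values (the tensors below are illustrative only, not part of our running example). Since the gate multiplies the hidden state element-wise, a reset gate of [0., 1.] zeroes out the first dimension and passes the second through:

import torch

h = torch.tensor([[0.5, -0.3]])  # a made-up hidden state
r = torch.tensor([[0.0, 1.0]])   # made-up reset gate values
r * h                            # zeroes the first dimension, keeps the second

Output

tensor([[ 0.0000, -0.3000]])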

The reset gate is an input for the candidate hidden state (n):

n = candidate_n(initial_hidden, first_corner, r)
n

Output

tensor([[-0.8032, -0.2275]], grad_fn=<TanhBackward>)
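For reference, the candidate hidden state follows the GRU equation n = tanh(W_in x + b_in + r * (W_hn h + b_hn)). A helper like candidate_n could be sketched as below; the layer names are assumptions for illustration, not necessarily the ones defined in the chapter:

import torch
import torch.nn as nn

# hypothetical linear layers standing in for the candidate weights
linear_input_n = nn.Linear(2, 2)   # computes W_in x + b_in
linear_hidden_n = nn.Linear(2, 2)  # computes W_hn h + b_hn

def candidate_n(h, x, r):
    # the reset gate (r) scales the hidden-state contribution element-wise
    thn = linear_hidden_n(h)
    txn = linear_input_n(x)
    return torch.tanh(txn + r * thn)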

That would be the end of it, and n would be the new hidden state, if it weren’t for the update gate (z):

z = update_gate(initial_hidden, first_corner)
z

Output

tensor([[0.2984, 0.3540]], grad_fn=<SigmoidBackward>)
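The update gate itself follows z = sigmoid(W_iz x + b_iz + W_hz h + b_hz); unlike the candidate, it does not use the reset gate. A sketch under the same assumptions (hypothetical layer names):

import torch
import torch.nn as nn

# hypothetical linear layers standing in for the update weights
linear_input_z = nn.Linear(2, 2)   # computes W_iz x + b_iz
linear_hidden_z = nn.Linear(2, 2)  # computes W_hz h + b_hz

def update_gate(h, x):
    # the sigmoid squashes values into (0, 1), so z acts like a percentage
    thz = linear_hidden_z(h)
    txz = linear_input_z(x)
    return torch.sigmoid(txz + thz)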

Another short pause here: the update gate is telling us to keep 29.84% of the first and 35.40% of the second dimension of the initial hidden state. The remaining 70.16% and 64.60%, respectively, are coming from the candidate hidden state (n). So, the new hidden state (h_prime) is computed accordingly:
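That corresponds to the GRU update equation, h' = (1 - z) * n + z * h. Element-wise, and using the tensors we already have, a one-line sketch would be:

h_prime = n * (1 - z) + initial_hidden * z  # keep z of the old state, (1 - z) of the candidate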
