
Considering this, the not "unrolled" ("rolled" doesn't sound right!) representation is a better characterization of the internal structure of an RNN.

Let’s dive deeper into the internals of an RNN cell and look at it at the neuron level:

Figure 8.7 - RNN cell at neuron level

Since one can choose the number of hidden dimensions, I chose two dimensions, simply because I want to be able to easily visualize the results. Hence, two blue neurons are transforming the hidden state.

The number of red neurons transforming the data point will necessarily be the same as the chosen number of hidden dimensions, since both transformed outputs need to be added together. But this doesn't mean the data points must have the same number of dimensions.

Coincidentally, our data points have two coordinates, but even if we had 25 dimensions, these 25 features would still be mapped into two dimensions by the two red neurons.
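A minimal sketch of these two transformations, assuming a 25-dimensional data point and using names of my own choosing (linear_input for the red neurons, linear_hidden for the blue ones), could look like this:

import torch
import torch.nn as nn

torch.manual_seed(19)

n_features = 25  # the data points could have any number of dimensions
hidden_dim = 2   # our choice: two hidden dimensions

# "red" neurons: map the data point into hidden_dim dimensions
linear_input = nn.Linear(n_features, hidden_dim)
# "blue" neurons: transform the previous hidden state
linear_hidden = nn.Linear(hidden_dim, hidden_dim)

x = torch.randn(1, n_features)       # a single 25-feature data point
h_prev = torch.zeros(1, hidden_dim)  # the initial hidden state

# both outputs have hidden_dim dimensions, so they can be added together
added = linear_input(x) + linear_hidden(h_prev)
print(added.shape)  # torch.Size([1, 2])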

The only operation left is the activation function, most likely the hyperbolic tangent, which will produce the updated hidden state.
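Putting the pieces together, a self-contained sketch of the full update (same assumed names as above) would apply the hyperbolic tangent to the sum of the two transformations. PyTorch's nn.RNNCell performs exactly this computation, with tanh as its default nonlinearity, so copying our weights and biases into one should reproduce the same hidden state:

import torch
import torch.nn as nn

torch.manual_seed(19)
n_features, hidden_dim = 25, 2

linear_input = nn.Linear(n_features, hidden_dim)   # "red" neurons
linear_hidden = nn.Linear(hidden_dim, hidden_dim)  # "blue" neurons
x = torch.randn(1, n_features)
h_prev = torch.zeros(1, hidden_dim)

# manual update: tanh of the sum of both transformations
h_new = torch.tanh(linear_input(x) + linear_hidden(h_prev))

cell = nn.RNNCell(input_size=n_features, hidden_size=hidden_dim)
with torch.no_grad():  # copy our parameters into the cell
    cell.weight_ih.copy_(linear_input.weight)
    cell.bias_ih.copy_(linear_input.bias)
    cell.weight_hh.copy_(linear_hidden.weight)
    cell.bias_hh.copy_(linear_hidden.bias)

print(torch.allclose(h_new, cell(x, h_prev)))  # True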

"Why hyperbolic tangent? Isn’t ReLU a better activation function?"

The hyperbolic tangent has a "competitive advantage" here since it maps the feature space to clearly defined boundaries: the interval (-1, 1). This guarantees that, at every step of the sequence, the hidden state is always within these boundaries.
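To see this advantage in practice, here is a quick sketch of my own (not from the chapter): feeding a hidden state through the same recurrence fifty times, once with the hyperbolic tangent and once with ReLU. The tanh state is guaranteed to stay strictly inside (-1, 1), while the ReLU state has no upper bound and, depending on the weights, may explode or die out:

import torch

torch.manual_seed(42)
# a fixed 2x2 recurrence matrix, scaled up so it tends to amplify the state
W = torch.randn(2, 2) * 1.5
h_tanh = torch.ones(1, 2)
h_relu = torch.ones(1, 2)

# iterate the same recurrence for 50 steps with each activation
for _ in range(50):
    h_tanh = torch.tanh(h_tanh @ W.T)
    h_relu = torch.relu(h_relu @ W.T)

print(h_tanh.abs().max())  # guaranteed to be below 1
print(h_relu.abs().max())  # unbounded: may have exploded (or vanished)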

