
Daniel Voigt Godoy, Deep Learning with PyTorch Step-by-Step: A Beginner's Guide (Leanpub)


• training the Transformer to tackle our sequence-to-sequence problem

• understanding that the validation loss may be much lower than the training loss due to the regularizing effect of dropout

• training another model using PyTorch's (norm-last) Transformer class

• using the Vision Transformer architecture to tackle an image classification problem

• splitting an image into flattened patches by either rearranging or embedding them

• adding a special classifier token to the embeddings

• using the encoder's output corresponding to the special classifier token as features for the classifier
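The dropout point above is easy to verify in isolation: in training mode, dropout zeroes out units (and scales the survivors by 1/(1-p)), adding noise that inflates the training loss; in evaluation mode, it is a no-op, so the validation loss is computed on the "clean" model. A minimal sketch of this train/eval asymmetry:

```python
import torch
import torch.nn as nn

torch.manual_seed(11)

drop = nn.Dropout(p=0.5)
x = torch.ones(1, 10)

# training mode: roughly half the units are zeroed,
# and the survivors are scaled by 1/(1-p) = 2.0
drop.train()
out_train = drop(x)

# evaluation mode: dropout does nothing, output == input
drop.eval()
out_eval = drop(x)
```

This is why a model regularized with dropout can show a validation loss below its training loss: the training loss is measured on a handicapped (noisy) version of the model.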
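The last three items can be sketched in a few lines. This is a minimal illustration with hypothetical sizes (a single-channel 12x12 image and 4x4 patches), not the book's full implementation; it uses the common trick that a Conv2d with kernel size equal to its stride is equivalent to flattening each patch and applying a linear projection:

```python
import torch
import torch.nn as nn

torch.manual_seed(17)

img = torch.randn(1, 1, 12, 12)   # (N, C, H, W): one 1-channel 12x12 image
patch_size, d_model = 4, 16       # 4x4 patches -> a 3x3 grid of 9 patches

# "embedding" the patches: kernel_size == stride means each patch is
# projected independently into a d_model-dimensional embedding
embed = nn.Conv2d(1, d_model, kernel_size=patch_size, stride=patch_size)
patches = embed(img)                          # (1, 16, 3, 3)
patches = patches.flatten(2).transpose(1, 2)  # (1, 9, 16): sequence of 9 embeddings

# prepend a learnable special classifier token to the sequence
cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
seq = torch.cat([cls_token.expand(1, -1, -1), patches], dim=1)  # (1, 10, 16)

# after running seq through the encoder, its first output, seq[:, 0]
# (the one corresponding to the classifier token), feeds the classifier head
```

Feeding only the classifier token's output to the classification head works because self-attention lets that token gather information from every patch in the image.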

Congratulations! You’ve just assembled and trained your first Transformer (and even a cutting-edge Vision Transformer!). This is no small feat. Now you know what "layers" and "sub-layers" stand for and how they’re brought together to build a Transformer. Keep in mind, though, that you may find slightly different implementations around: they may be norm-first or norm-last, or include yet another customization. The details may differ, but the overall concept remains: it is all about stacking attention-based "layers."
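The norm-first vs. norm-last distinction (whether layer normalization is applied before or after each sub-layer) is exposed directly by PyTorch's own building blocks via the norm_first argument (available since PyTorch 1.9; nn.Transformer defaults to norm-last). A quick sketch with hypothetical sizes:

```python
import torch
import torch.nn as nn

# norm-first: LayerNorm is applied BEFORE attention / feed-forward sub-layers
norm_first = nn.TransformerEncoderLayer(
    d_model=16, nhead=2, norm_first=True, batch_first=True
)

# norm-last (PyTorch's default): LayerNorm is applied AFTER each sub-layer
norm_last = nn.TransformerEncoderLayer(
    d_model=16, nhead=2, norm_first=False, batch_first=True
)

x = torch.randn(1, 5, 16)  # (N, L, d_model): a batch of one 5-step sequence
out_first = norm_first(x)  # same shape in, same shape out
out_last = norm_last(x)
```

Both conventions produce outputs of the same shape; they differ only in where normalization sits inside each residual branch, which mostly affects training stability.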

"Hey, what about BERT? Shouldn’t we use Transformers to tackle NLP problems?"

I was actually waiting for this question: yes, we should, and we will, in the next chapter. As you have seen, it is already hard enough to understand the Transformer even when it’s used to tackle a sequence-to-sequence problem as simple as ours. Trying to train a model to handle a more complex natural language processing problem would only make it even harder.

In the next chapter, we’ll start with some NLP concepts and techniques like tokens, tokenization, word embeddings, and language models, and work our way up to contextual word embeddings, GPT-2, and BERT. We’ll be using several Python packages, including the famous HuggingFace :-)


