
Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step A Beginner’s Guide-leanpub


• learning that a language model is used to estimate the probability of a token, pretty much like filling in the blanks in a sentence
• understanding the general idea behind the Word2Vec model and its common implementation, the CBoW (continuous bag-of-words)
• learning that word embeddings are basically a lookup table used to retrieve the vector corresponding to a given token
• using pre-trained embeddings like GloVe to perform embedding arithmetic
• loading GloVe embeddings and using them to train a simple classifier
• using a Transformer encoder together with GloVe embeddings to classify sentences
• understanding the importance of contextual word embeddings for distinguishing between different meanings of the same word
• using flair to retrieve contextual word embeddings from ELMo
• getting an overview of ELMo's architecture and its hidden states (the embeddings)
• using flair to preprocess sentences into BERT embeddings and train a classifier
• learning about the WordPiece tokenization used by BERT
• computing BERT's input embeddings as the sum of token, position, and segment embeddings
• understanding BERT's pre-training tasks: masked language model (MLM) and next sentence prediction (NSP)
• exploring the different outputs from BERT: hidden states, pooler output, and attentions
• training a classifier using pre-trained BERT as a layer
• fine-tuning BERT using HuggingFace's models for sequence classification
• remembering to always use a matching pre-trained model and tokenizer
• exploring and using the Trainer class to fine-tune large models using gradient accumulation
• combining tokenizer and model into a pipeline to easily deliver predictions
• loading pre-trained pipelines to perform typical tasks, like sentiment analysis
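The first recap point — a language model estimating the probability of a token, like filling in a blank — can be sketched with nothing more than a softmax over per-token scores. The vocabulary and logits below are made up for illustration; a real model would produce the logits from the surrounding context.

```python
import torch
import torch.nn.functional as F

# Toy sketch: a language model's output head produces one logit per
# vocabulary token for the blank position ("The ___ barked").
vocab = ['the', 'dog', 'barked', 'meowed', 'cat']
logits = torch.tensor([0.1, 2.0, 0.3, -1.0, 1.5])  # made-up scores

# Softmax turns the logits into a probability distribution over the vocab
probs = F.softmax(logits, dim=0)
best = vocab[int(probs.argmax())]
print(best)  # 'dog' — the highest-probability filler for the blank
```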
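The "embeddings are basically a lookup table" point is literal in PyTorch: `nn.Embedding` holds a weight matrix and indexes into it with token IDs. A minimal sketch (the vocabulary size and dimensions are arbitrary):

```python
import torch
import torch.nn as nn

# Hypothetical vocabulary of five tokens, each mapped to a 3-dim vector
torch.manual_seed(42)
embedding = nn.Embedding(num_embeddings=5, embedding_dim=3)

# Token IDs act as row indices into the weight matrix (the lookup table)
token_ids = torch.tensor([0, 2, 2])
vectors = embedding(token_ids)

# Looking up the same ID twice returns the same row of the table
assert torch.equal(vectors[1], vectors[2])
assert torch.equal(vectors[0], embedding.weight[0])
print(vectors.shape)  # torch.Size([3, 3])
```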
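Embedding arithmetic of the GloVe kind (king − man + woman ≈ queen) boils down to vector addition plus a cosine-similarity search. The vectors below are toy stand-ins, not real GloVe vectors, which would have to be loaded from the downloaded GloVe files:

```python
import numpy as np

# Toy stand-ins for GloVe vectors (real ones are 50- to 300-dimensional)
vectors = {
    'king':  np.array([0.9, 0.8, 0.1]),
    'man':   np.array([0.5, 0.2, 0.1]),
    'woman': np.array([0.5, 0.2, 0.9]),
    'queen': np.array([0.9, 0.8, 0.9]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Classic embedding arithmetic: king - man + woman ≈ queen
target = vectors['king'] - vectors['man'] + vectors['woman']
closest = max(vectors, key=lambda w: cosine(vectors[w], target))
print(closest)  # queen
```

In practice the query words themselves are usually excluded from the nearest-neighbor search; the toy values here are chosen so 'queen' wins regardless.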
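The recap point about BERT's input embeddings — token, position, and segment embeddings — means an element-wise sum of three lookup tables. A minimal sketch with illustrative sizes (BERT-base actually uses a hidden size of 768, a 30,522-token vocabulary, and applies LayerNorm and dropout after the sum, all omitted here):

```python
import torch
import torch.nn as nn

# Illustrative sizes, not BERT's real hyper-parameters
vocab_size, max_len, n_segments, hidden = 100, 16, 2, 8

token_emb = nn.Embedding(vocab_size, hidden)
position_emb = nn.Embedding(max_len, hidden)
segment_emb = nn.Embedding(n_segments, hidden)

# A toy batch: one sequence of five token IDs, all in segment 0
token_ids = torch.tensor([[1, 5, 20, 3, 2]])
positions = torch.arange(token_ids.size(1)).unsqueeze(0)  # [[0, 1, 2, 3, 4]]
segments = torch.zeros_like(token_ids)

# BERT's input embedding is the element-wise sum of the three lookups
input_embeddings = (token_emb(token_ids)
                    + position_emb(positions)
                    + segment_emb(segments))
print(input_embeddings.shape)  # torch.Size([1, 5, 8])
```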

Recap | 1017
