Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step: A Beginner's Guide (Leanpub)

N-grams

The structure in the examples above is composed of three words and a blank: a four-gram. If we were using two words and a blank, that would be a trigram and, in general, for a given number of words (n-1) followed by a blank, an n-gram.

Figure 11.8 - N-grams
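As a quick sketch of what these n-grams look like in code (the helper name and the example sentence are my own, not from the book), we can slide a window of size n over a list of tokens:

```python
# Minimal sketch: extract n-grams by sliding a window of size n
# over a tokenized sentence (the sentence is just an illustration)
def build_ngrams(tokens, n):
    """Return all n-grams (as tuples) found in a sequence of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "the quick brown fox jumps".split()
print(build_ngrams(tokens, 3))
# [('the', 'quick', 'brown'), ('quick', 'brown', 'fox'), ('brown', 'fox', 'jumps')]
```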

N-gram models are based on pure statistics: they fill in the blank using the most common sequence that matches the words preceding the blank (that's called the context). On the one hand, larger values of n (longer sequences of words) may yield better predictions; on the other hand, they may yield no predictions at all, since a particular sequence of words may never have been observed. In the latter case, one can always fall back to a shorter n-gram and try again (that's called stupid back-off, by the way).
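A minimal sketch of this idea, using a toy corpus and hypothetical names (`counts`, `predict` — none of them from the book): count, for every context of up to n-1 words, which word most often follows it, and drop the leftmost context word whenever the full context was never observed:

```python
from collections import Counter, defaultdict

# Toy corpus, purely illustrative
corpus = "the cat sat on the mat the cat ran on the mat".split()

# Count continuations for every context of length 1 .. n-1
n = 3
counts = defaultdict(Counter)
for size in range(1, n):
    for i in range(len(corpus) - size):
        context = tuple(corpus[i:i + size])
        counts[context][corpus[i + size]] += 1

def predict(context):
    """Fill in the blank after `context`, backing off to a shorter
    context whenever the longer one was never observed."""
    context = tuple(context)
    while context:
        if context in counts:
            # most common word following this context
            return counts[context].most_common(1)[0][0]
        context = context[1:]  # stupid back-off: drop the leftmost word
    return None  # not even the single-word context was ever seen

print(predict(["zebra", "on"]))
# → 'the' (backs off from the unseen pair to just 'on')
```

Note that this sketch backs off whenever the full context is *unseen*; real implementations (like the one described in the course linked below) also weight the back-off counts.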

For a more detailed explanation of n-gram models, please check the "N-gram Language Models" [178] section of Lena Voita's amazing "NLP Course | For You." [179]

These models are simple, but they are somewhat limited because they can only look back.

"Can we look ahead too?"

Sure, we can!

918 | Chapter 11: Down the Yellow Brick Rabbit Hole
