22.02.2024 Views

Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step A Beginner’s Guide-leanpub


[215]. For a demo of GPT-2’s capabilities, please check AllenNLP’s Language Modeling Demo, [216] which uses GPT-2’s medium model (345 million parameters).

You can also check GPT-2’s documentation [217] and model card, [218] available at HuggingFace, for a quick overview of the model and its training procedure.

For a general overview of GPT-2, see this great post by Jay Alammar: "The Illustrated GPT-2 (Visualizing Transformer Language Models)." [219]

To learn more details about GPT-2’s architecture, please check "The Annotated GPT-2" [220] by Aman Arora.

There is also Andrej Karpathy’s minimalistic implementation of GPT, minGPT, [221] if you feel like trying to train a GPT model from scratch.

Let’s load the GPT-2-based text generation pipeline:

from transformers import pipeline

text_generator = pipeline("text-generation")

Then, let’s use the first two paragraphs from Alice’s Adventures in Wonderland as our base text:

base_text = """
Alice was beginning to get very tired of sitting by her sister on
the bank, and of having nothing to do: once or twice she had peeped
into the book her sister was reading, but it had no pictures or
conversations in it, `and what is the use of a book,' thought Alice
`without pictures or conversation?' So she was considering in her
own mind (as well as she could, for the hot day made her feel very
sleepy and stupid), whether the pleasure of making a daisy-chain
would be worth the trouble of getting up and picking the daisies,
when suddenly a White Rabbit with pink eyes ran close by her.
"""
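Since the base text itself counts toward the length of whatever the generator returns, it helps to know how many tokens the prompt takes up before picking a value for max_length. The snippet below is only a minimal sketch of that arithmetic: the helper names (count_tokens, choose_max_length) and the whitespace-based stand-in tokenizer are hypothetical and exist for illustration. In practice, the counts would come from the pipeline's own tokenizer (for example, text_generator.tokenizer), which splits text into subword tokens and will generally report a larger number than a word count. The cap of 1,024 reflects GPT-2's maximum sequence length.

```python
GPT2_CONTEXT_SIZE = 1024  # GPT-2's maximum sequence length, in tokens


def count_tokens(text, tokenizer):
    """Return how many tokens the given tokenizer produces for the text.

    Works with any callable that returns a mapping with an "input_ids"
    entry, which is what HuggingFace tokenizers return.
    """
    return len(tokenizer(text)["input_ids"])


def choose_max_length(prompt_tokens, new_tokens, context_size=GPT2_CONTEXT_SIZE):
    """Total length budget: prompt plus continuation, capped at the context size."""
    if new_tokens < 1:
        raise ValueError("new_tokens must be at least 1")
    if prompt_tokens >= context_size:
        raise ValueError("the prompt alone fills the model's context")
    return min(prompt_tokens + new_tokens, context_size)


def whitespace_tokenizer(text):
    """Hypothetical stand-in: one "token" per whitespace-separated word."""
    return {"input_ids": text.split()}


prompt = "Alice was beginning to get very tired"
n_prompt = count_tokens(prompt, whitespace_tokenizer)
max_length = choose_max_length(n_prompt, new_tokens=50)
print(n_prompt, max_length)
```

With a real tokenizer, the same two calls would apply unchanged: count the tokens of base_text, then pass the resulting budget as max_length when calling the generator.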

The generator will produce a text of size max_length, including the base text, so this value has to be larger than the length of the base text. By default, the model in the

