
Training Arguments

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir='output',
    num_train_epochs=1,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    evaluation_strategy='steps',
    eval_steps=300,
    logging_steps=300,
    gradient_accumulation_steps=8,
)

"Batch size ONE?! You gotta be kidding me!"

Well, I would, if it were not for the gradient_accumulation_steps argument. That's how we can make the mini-batch size larger even if we're using a low-end GPU that is capable of handling only one data point at a time.

The Trainer can accumulate the gradients computed at every training step (each taking only one data point) and, after eight steps, use the accumulated gradients to update the parameters. For all intents and purposes, it is as if the mini-batch had size eight. Awesome, right?
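Under the hood, gradient accumulation amounts to something like the manual training loop sketched below. This is an illustration only, not the Trainer's actual implementation; the model, optimizer, and data are toy stand-ins:

import torch
import torch.nn as nn

# toy setup, just for illustration
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
dataloader = [(torch.randn(1, 10), torch.randint(0, 2, (1,)))
              for _ in range(16)]  # mini-batches of size ONE

accumulation_steps = 8

model.train()
optimizer.zero_grad()
for step, (x, y) in enumerate(dataloader):
    loss = loss_fn(model(x), y)
    # scale the loss so the accumulated gradients match the
    # average loss over a mini-batch of size eight
    (loss / accumulation_steps).backward()
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()       # update using the accumulated gradients
        optimizer.zero_grad()  # reset for the next "virtual" mini-batch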

Moreover, let's set logging_steps to three hundred, so it prints the training losses every three hundred mini-batches (and it counts the mini-batches as having size eight due to the gradient accumulation).

"What about validation losses?"

The evaluation_strategy argument allows you to run an evaluation after every eval_steps steps (if set to 'steps', like in the example above) or after every epoch (if set to 'epoch').
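For instance, a minimal sketch of the per-epoch variant (only the relevant arguments shown) would be:

training_args = TrainingArguments(
    output_dir='output',
    evaluation_strategy='epoch',  # evaluate at the end of every epoch
)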

"Can I get it to print accuracy or other metrics too?"

Sure, you can! But, first, you need to define a function that takes an instance of EvalPrediction (returned by the internal validation loop), computes the desired metrics, and returns a dictionary:
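A minimal sketch of such a function, assuming a classification task and using scikit-learn's accuracy_score (the function name, compute_metrics, is our choice):

import numpy as np
from sklearn.metrics import accuracy_score

def compute_metrics(eval_pred):
    # eval_pred.predictions holds the logits, eval_pred.label_ids the labels
    logits, labels = eval_pred.predictions, eval_pred.label_ids
    preds = np.argmax(logits, axis=-1)  # predicted class per data point
    return {'accuracy': accuracy_score(labels, preds)}

Then you'd hand this function to the Trainer via its compute_metrics argument, and the returned dictionary gets reported alongside the validation loss.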
