
"Which BERT is that? DistilBERT?!"

DistilBERT is a smaller, faster, cheaper, and lighter version of BERT, introduced by Sanh, V. et al. in their paper "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter."[209] We're not going into any details about it here, but we're using this version because it's also friendlier for fine-tuning on low-end GPUs.
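In case you'd like to see how such a checkpoint could be loaded, here is a minimal sketch using Hugging Face's transformers library, assuming the distilbert-base-uncased checkpoint; the names below are illustrative and not necessarily the exact objects built earlier in the chapter:

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Illustrative sketch: load the distilled checkpoint (assumed name)
auto_tokenizer = AutoTokenizer.from_pretrained('distilbert-base-uncased')
# A single output logit pairs naturally with nn.BCEWithLogitsLoss()
model = AutoModelForSequenceClassification.from_pretrained(
    'distilbert-base-uncased', num_labels=1
)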

We also need to convert the labels to float so they are compatible with the nn.BCEWithLogitsLoss() we'll be using:

Data Preparation

train_dataset_float = train_dataset.map(
    lambda row: {'labels': [float(row['labels'])]}
)
test_dataset_float = test_dataset.map(
    lambda row: {'labels': [float(row['labels'])]}
)

train_tensor_dataset = tokenize_dataset(train_dataset_float,
                                        'sentence',
                                        'labels',
                                        auto_tokenizer,
                                        **tokenizer_kwargs)
test_tensor_dataset = tokenize_dataset(test_dataset_float,
                                       'sentence',
                                       'labels',
                                       auto_tokenizer,
                                       **tokenizer_kwargs)
generator = torch.Generator()
train_loader = DataLoader(
    train_tensor_dataset, batch_size=4,
    shuffle=True, generator=generator
)
test_loader = DataLoader(test_tensor_dataset, batch_size=8)
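To make the dtype requirement concrete, here is a tiny standalone sketch (not part of the book's pipeline) showing that nn.BCEWithLogitsLoss() expects float targets with the same shape as the logits:

import torch
import torch.nn as nn

loss_fn = nn.BCEWithLogitsLoss()
logits = torch.tensor([[0.7], [-1.3]])        # raw model outputs, one logit per example
labels = torch.tensor([[1.0], [0.0]])         # float labels, same shape as the logits
print(loss_fn(logits, labels))                # works fine
# loss_fn(logits, torch.tensor([[1], [0]]))   # integer labels would raise a dtype error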

"Batch size FOUR?!"

Yes, four! DistilBERT is still kinda large, so we’re using a very small batch size such

that it will fit a low-end GPU with 6 GB RAM. If you have more powerful hardware

at your disposal, by all means, try larger batch sizes :-)
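If you'd like to adapt the batch size to whatever GPU you happen to have, here is a rough, purely illustrative sketch; the memory thresholds are arbitrary assumptions, not recommendations from the book:

import torch

# Illustrative only: pick a batch size from the total GPU memory (arbitrary thresholds)
if torch.cuda.is_available():
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1e9
    batch_size = 16 if total_gb >= 16 else (8 if total_gb >= 8 else 4)
else:
    batch_size = 4  # CPU fallback: keep it small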
