
Model Configuration & Training

We’re using pretty much the same Sequential model as before, except that it doesn’t have an embedding layer anymore, and we’re using only three hidden units instead of 128:

Model Configuration

torch.manual_seed(41)
model = nn.Sequential(
    # Classifier
    nn.Linear(bert_doc.embedding_length, 3),
    nn.ReLU(),
    nn.Linear(3, 1)
)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

"Isn’t that too few? Three?! Really?"

Really! It isn’t too few: if you try using 128 hidden units like in the previous model, it will immediately overfit within a single epoch. Given the embedding length (768), the model becomes overparameterized (a situation in which there are many more parameters than data points), and it ends up memorizing the training set.
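To see why 128 units would be so much worse, here is a quick parameter count. This is a back-of-the-envelope sketch, not part of the book’s code; it only assumes the 768-dimensional document embeddings mentioned above:

import torch.nn as nn

def count_parameters(model):
    # total number of trainable parameters
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

wide = nn.Sequential(nn.Linear(768, 128), nn.ReLU(), nn.Linear(128, 1))
narrow = nn.Sequential(nn.Linear(768, 3), nn.ReLU(), nn.Linear(3, 1))
print(count_parameters(wide))    # 98,561 parameters
print(count_parameters(narrow))  # 2,311 parameters

The wide version has roughly forty times more parameters than the narrow one, which is why it can memorize the training set so quickly.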

This is a simple feed-forward classifier with a single hidden layer. It doesn’t get much simpler than that!

Model Training

sbs_doc_emb = StepByStep(model, loss_fn, optimizer)
sbs_doc_emb.set_loaders(train_loader, test_loader)
sbs_doc_emb.train(20)

fig = sbs_doc_emb.plot_losses()
