
Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step A Beginner’s Guide-leanpub

discussing it, let me illustrate it.

Ball Dataset and Block Model

Let’s use a dataset of 1,000 random points drawn from a ten-dimensional ball (this seems fancier than it actually is; you can think of it as a dataset with 1,000 points with ten features each) such that each feature has zero mean and unit standard deviation. In this dataset, points situated within half of the radius of the ball are labeled as negative cases, while the remaining points are labeled as positive cases. It is a familiar binary classification task.

Data Generation

X, y = load_data(n_points=1000, n_dims=10)
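The load_data() helper ships with the book's companion code. As a rough, hypothetical sketch of what it could do (the radii are drawn uniformly here so the two classes come out roughly balanced — an assumption, not necessarily the book's construction):

```python
import numpy as np

def load_data(n_points=1000, n_dims=10, seed=13):
    # Hypothetical sketch -- the book provides its own load_data()
    rng = np.random.default_rng(seed)
    # Random directions on the unit sphere
    directions = rng.standard_normal((n_points, n_dims))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Radii drawn uniformly in [0, 1), so roughly half of the points
    # fall within half of the ball's radius
    radii = rng.random((n_points, 1))
    points = directions * radii
    # Points within half of the radius are negative (0), the rest positive (1)
    y = (np.linalg.norm(points, axis=1) > 0.5).astype(int)
    # Standardize each feature to zero mean and unit standard deviation
    X = (points - points.mean(axis=0)) / points.std(axis=0)
    return X, y
```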

Next, we can use these data points to create a dataset and a data loader (no minibatches this time):

Data Preparation

ball_dataset = TensorDataset(
    torch.as_tensor(X).float(), torch.as_tensor(y).float()
)
ball_loader = DataLoader(ball_dataset, batch_size=len(X))
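Since batch_size equals the number of points, the loader yields the entire dataset in a single batch. A quick self-contained check (using random stand-in data of the same shape):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Stand-in data with the same shape as the ball dataset
X = torch.randn(1000, 10)
y = (torch.rand(1000) > 0.5).float()

ball_dataset = TensorDataset(X, y)
ball_loader = DataLoader(ball_dataset, batch_size=len(X))

# One iteration returns every point at once -- a single full batch
x_batch, y_batch = next(iter(ball_loader))
```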

The data preparation part is done. What about the model configuration? To illustrate the vanishing gradients problem, we need a deeper model than the ones we’ve built so far. Let’s call it the "block" model: it is a block of several hidden layers (and activation functions) stacked together, every layer containing the same number of hidden units (neurons).

Instead of building the model manually, I’ve created a function, build_model(), that allows us to configure a model like that. Its main arguments are the number of features, the number of layers, the number of hidden units per layer, the activation function to be placed after each hidden layer, and whether it should add a batch normalization layer after every activation function:
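As a hedged sketch of what such a function could look like (the argument names here are assumptions based on the description above, not the book's actual signature):

```python
import torch
import torch.nn as nn

def build_model(n_features, n_layers, hidden_units,
                activation_fn=nn.ReLU, use_bn=False):
    # Hypothetical sketch -- the book provides its own build_model()
    layers = []
    in_features = n_features
    for _ in range(n_layers):
        layers.append(nn.Linear(in_features, hidden_units))
        layers.append(activation_fn())
        if use_bn:
            # Batch normalization after every activation function
            layers.append(nn.BatchNorm1d(hidden_units))
        in_features = hidden_units
    # Single output unit (logit) for binary classification
    layers.append(nn.Linear(in_features, 1))
    return nn.Sequential(*layers)
```

Stacking many such identical hidden layers is exactly what makes the gradients of the earliest layers shrink, which is the effect the block model is meant to expose.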

Vanishing and Exploding Gradients | 563
