Deep-Learning-with-PyTorch

86 CHAPTER 4 Real-world data representation using tensors

# In[16]:
total_sulfur_threshold = 141.83
total_sulfur_data = data[:,6]
predicted_indexes = torch.lt(total_sulfur_data, total_sulfur_threshold)
predicted_indexes.shape, predicted_indexes.dtype, predicted_indexes.sum()

# Out[16]:
(torch.Size([4898]), torch.bool, tensor(2727))
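The same pattern can be tried in isolation on a tiny hand-made tensor (the values below are made up for illustration, not taken from the wine data); `torch.lt` is equivalent to the `<` operator and produces a Boolean mask:

```python
import torch

# Hypothetical mini-example: five sulfur readings and the same threshold
sulfur = torch.tensor([120.0, 150.0, 90.0, 141.83, 200.0])
threshold = 141.83

mask = torch.lt(sulfur, threshold)  # same as: sulfur < threshold
print(mask)        # tensor([ True, False,  True, False, False])
print(mask.dtype)  # torch.bool
print(mask.sum())  # tensor(2) -- True counts as 1 when summed
```

Note that a value exactly equal to the threshold (141.83 here) comes out `False`, since `lt` is a strict less-than.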

This means our threshold implies that just over half of all the wines are going to be high quality. Next, we’ll need to get the indexes of the actually good wines:

# In[17]:
actual_indexes = target > 5
actual_indexes.shape, actual_indexes.dtype, actual_indexes.sum()

# Out[17]:
(torch.Size([4898]), torch.bool, tensor(3258))

Since there are about 500 more actually good wines than our threshold predicted, we already have hard evidence that it’s not perfect. Now we need to see how well our predictions line up with the actual rankings. We will perform a logical “and” between our prediction indexes and the actual good indexes (remember that each is just an array of Boolean values) and use that intersection of wines-in-agreement to determine how well we did:

# In[18]:
n_matches = torch.sum(actual_indexes & predicted_indexes).item()
n_predicted = torch.sum(predicted_indexes).item()
n_actual = torch.sum(actual_indexes).item()
n_matches, n_matches / n_predicted, n_matches / n_actual

# Out[18]:
(2018, 0.74000733406674, 0.6193984039287906)
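These two ratios are what classification literature calls precision and recall. A minimal sketch on six made-up wines (the masks below are hypothetical, not from the dataset) shows how the counts combine:

```python
import torch

# Hypothetical masks: which of six wines we predicted good vs. which actually are
predicted = torch.tensor([True, True, True, False, False, True])
actual    = torch.tensor([True, False, True, True, False, True])

n_matches   = (predicted & actual).sum().item()  # wines we got right (true positives)
n_predicted = predicted.sum().item()             # wines we called good
n_actual    = actual.sum().item()                # wines that are good

precision = n_matches / n_predicted  # of our "good" calls, how many were right
recall    = n_matches / n_actual     # of the good wines, how many we found
print(precision, recall)  # 0.75 0.75
```

With the real numbers from Out[18], precision is 2018/2727 ≈ 0.74 and recall is 2018/3258 ≈ 0.62, matching the tuple above.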

We got around 2,000 wines right! Since we predicted 2,700 wines, this gives us a 74% chance that if we predict a wine to be high quality, it actually is. Unfortunately, there are 3,200 good wines, and we only identified 61% of them. Well, we got what we signed up for; that’s barely better than random! Of course, this is all very naive: we know for sure that multiple variables contribute to wine quality, and the relationships between the values of these variables and the outcome (which could be the actual score, rather than a binarized version of it) are likely more complicated than a simple threshold on a single value.

Indeed, a simple neural network would overcome all of these limitations, as would a lot of other basic machine learning methods. We’ll have the tools to tackle this problem after the next two chapters, once we have learned how to build our first neural network.