Daniel Voigt Godoy - Deep Learning with PyTorch Step-by-Step A Beginner’s Guide-leanpub

• What does the right-most point in the PR curve represent?

• If I raise the threshold, how do I move along the curve?

You should be able to answer all of these questions by referring to the "Metrics" section. But, if you are eager to get the answers, here they are:

• The threshold of zero corresponds to the right-most point in both curves.

• The threshold of one corresponds to the left-most point in both curves.

• The right-most point in the PR curve is the one where every point is classified as positive: recall is one, and precision equals the proportion of positive examples in the dataset.

• If I raise the threshold, I am moving to the left along both curves.
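If you would like to check these answers by hand, here is a minimal sketch. It assumes a tiny made-up dataset (y_true and probs are hypothetical, not the book's validation set) and a helper, metrics_at(), written just for this illustration. It also assumes that precision is set to one when no point is classified as positive, which is a convention and not the only possible choice.

```python
import numpy as np

# Hypothetical toy data (not from the book): 2 positives among 5 points
y_true = np.array([0, 0, 1, 0, 1])
probs = np.array([0.1, 0.4, 0.35, 0.6, 0.8])

def metrics_at(threshold):
    """Return (FPR, recall, precision) when points with
    probability >= threshold are classified as positive."""
    y_pred = probs >= threshold
    tp = np.sum(y_pred & (y_true == 1))
    fp = np.sum(y_pred & (y_true == 0))
    fn = np.sum(~y_pred & (y_true == 1))
    tn = np.sum(~y_pred & (y_true == 0))
    fpr = fp / (fp + tn)
    recall = tp / (tp + fn)  # recall is the same as TPR
    # Assumed convention: precision is 1.0 if nothing is predicted positive
    precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
    return fpr, recall, precision

# Raising the threshold from zero to one
for threshold in [0.0, 0.5, 1.0]:
    fpr, recall, precision = metrics_at(threshold)
    print(f'threshold={threshold:.1f} FPR={fpr:.2f} '
          f'recall={recall:.2f} precision={precision:.2f}')
```

At a threshold of zero, both FPR and recall are one (the right-most point), and precision equals y_true.mean(), the proportion of positive examples. At a threshold of one, both FPR and recall drop to zero (the left-most point).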

Now, let’s double-check our curves with Scikit-Learn’s roc_curve() and precision_recall_curve() functions:

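from sklearn.metrics import roc_curve, precision_recall_curve

# ROC curve: false positive rates, true positive rates, and thresholds
fpr, tpr, thresholds1 = roc_curve(y_val, probabilities_val)
# PR curve: precisions, recalls, and thresholds
prec, rec, thresholds2 = precision_recall_curve(y_val, probabilities_val)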

Figure 3.18 - Scikit-Learn’s curves

Same shapes, different points.

"Why do these curves have different points than ours?"

Simply put, Scikit-Learn uses only meaningful thresholds; that is, those thresholds that actually make a difference to the metrics. If moving the threshold a bit does not modify the classification of any points, it doesn’t matter for building a curve.
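To see why most thresholds do not matter, here is a small sketch of the idea. It is not Scikit-Learn's actual implementation, and it assumes a tiny made-up dataset (probs is hypothetical) and a predict() helper written for this illustration. It compares a dense grid of thresholds with a short list of candidates built from the predicted probabilities themselves.

```python
import numpy as np

# Hypothetical predicted probabilities (not from the book)
probs = np.array([0.1, 0.4, 0.35, 0.6, 0.8])

def predict(threshold):
    """Classify a point as positive if its probability >= threshold."""
    return tuple((probs >= threshold).tolist())

# A dense grid of 101 thresholds: 0.00, 0.01, ..., 1.00
grid = np.linspace(0, 1, 101)
grid_outcomes = {predict(t) for t in grid}
# 101 thresholds, but only 6 distinct sets of predictions
print(len(grid), len(grid_outcomes))

# Predictions only change when the threshold crosses a probability,
# so the distinct probabilities are enough as candidate thresholds
# (plus one value above the maximum, where every point is negative)
candidates = np.append(np.unique(probs), np.inf)
candidate_outcomes = {predict(t) for t in candidates}

# Both sets of thresholds lead to exactly the same classifications
print(grid_outcomes == candidate_outcomes)
```

Any two thresholds that fall between the same pair of consecutive probabilities classify every point the same way, so they produce the same point on a curve.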

Also, notice that the two curves have a different number of points because different metrics have different sets of meaningful thresholds. Moreover, these

254 | Chapter 3: A Simple Classification Problem
