Advanced Data Analytics Using Python: With Machine Learning, Deep Learning and NLP Examples (2023)
Chapter 3
Supervised Learning Using Python
Intentionally Bias the Model to Over-Fit or Under-Fit
Sometimes you need to over- or under-predict intentionally. In an
auction, when you are predicting from the buy side, it is always good
if your bid is a little lower than the original price. Similarly, on the sell
side, it is desirable to set the price a little higher than the original. You can
do this in two ways. In regression, when you are selecting features
using correlation, you can over-predict intentionally by dropping some
variables with negative correlation; similarly, you can under-predict by
dropping some variables with positive correlation. There is another way
of dealing with this. Along with the value, you can also predict the error
of the prediction with a second model. Taking the error as the predicted
value minus the actual value, to under-predict you reduce the prediction
by the error amount when the predicted error is positive. Similarly, to
over-predict, you increase the prediction by the error amount when the
predicted error is negative.
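The error-correction idea above can be sketched as follows; this is a minimal illustration, assuming scikit-learn and a synthetic dataset (the linear model, feature shapes, and noise level are all made up for the example):

```python
# Sketch: bias a regression model by predicting its own error.
# A second model learns the signed error (prediction - actual) of the
# first model; we then shift predictions only in one direction.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                      # synthetic features
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)
pred = model.predict(X)

# Second model predicts the signed error of the first one.
error_model = LinearRegression().fit(X, pred - y)
pred_error = error_model.predict(X)

# To under-predict: reduce the prediction where the predicted error
# is positive (the model is already predicting too high).
under = pred - np.clip(pred_error, 0, None)

# To over-predict: increase the prediction where the predicted error
# is negative (the model is predicting too low).
over = pred - np.clip(pred_error, None, 0)
```

By construction, `under` is never above the raw prediction and `over` is never below it, which gives the systematic buy-side or sell-side bias the text describes.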
Another problem in classification is biased training data. Suppose
you have two target classes, A and B, and the majority (say 90 percent) of
the training data is class A. If you train your model on this data, nearly all
of your predictions will be class A. One solution is biased sampling of the
training data: intentionally remove some of the class A examples from the
training set. Another approach works for binary classification. Because
class B is the minority, the predicted probability of class B for any sample
will almost always be less than 0.5, so the default threshold rarely selects
it. Instead, calculate the average class B probability over all points; for any
point, if its class B probability is greater than that average, mark it as
class B, and otherwise class A.
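A minimal sketch of the average-probability threshold, assuming scikit-learn and a synthetic 90/10 imbalanced dataset (the class distributions here are invented for illustration):

```python
# Sketch: classify an imbalanced binary problem by thresholding at the
# average minority-class probability instead of the default 0.5.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_a, n_b = 900, 100                                # 90% class A, 10% class B
X = np.vstack([rng.normal(0.0, 1.0, size=(n_a, 2)),
               rng.normal(1.0, 1.0, size=(n_b, 2))])
y = np.array([0] * n_a + [1] * n_b)                # 1 = minority class B

clf = LogisticRegression().fit(X, y)
p_b = clf.predict_proba(X)[:, 1]                   # probability of class B

threshold = p_b.mean()                             # average class B probability
pred = (p_b > threshold).astype(int)               # B if above average, else A

# The adjusted threshold recovers far more class B predictions than 0.5.
print((p_b > 0.5).sum(), pred.sum())
```

For a calibrated model the mean predicted probability sits near the minority base rate (about 0.1 here), so the adjusted threshold labels many more points as class B than the default 0.5 cutoff would.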