12.07.2015 Views

I t + - Technion moodle


Demand Estimation
• What will happen to quantity demanded, total revenue and profit if we increase prices?
• What will happen to demand if consumer incomes increase or decrease due to an economic expansion or contraction?
• What effect will a tuition increase have on Marquette’s revenue?


Practical Example: Port Authority Transit Case
• How will the fare price increase affect demand and overall revenues?
• What other factors, besides fares, affect demand?


Demand Estimation Using Market Research Techniques
• How do we estimate the Demand Function?
  ‣ Econometric Techniques (Your Project)
  ‣ Non-econometric Techniques
• Look first at Non-econometric Approaches
• What are these?


Consumer Surveys: Just Ask Them
• Question customers to estimate demand
  ‣ “How many bags of chips would you buy if the price was $2.29/bag?”
  ‣ “How many cases of beer would you buy if the price of beer was $11.99/case?”
• Compare different individuals’ responses
• WES example
• Advantages:
  ‣ Flexible
  ‣ Relatively inexpensive to conduct
• Disadvantages:
  ‣ Many potential biases: strategic, information, hypothetical, interviewer


Market Experiments
• Firms vary prices and/or advertising and compare consumer behavior
  ‣ Over time (e.g., before and after rebate offer)
  ‣ Over space (e.g., compare Milwaukee and Minneapolis consumption when prices are varied between two regions)
• Potential Problems
  ‣ Control of other factors not guaranteed.
  ‣ “Playing” with market prices may be risky.
  ‣ Expensive


Consumer Clinics and Focus Groups
• Simulated market setting in which consumers are given income to spend on a variety of goods
• The experimenters control income, prices, advertising, packaging, etc.
• Advantages
  ‣ Flexibility
• Disadvantages
  ‣ Selectivity bias
  ‣ Very expensive


Econometrics
• “Economic Measurement”
• Collection of statistical techniques available for testing economic theories by empirically measuring relationships among economic variables.
• Quantify economic reality – bridge the gap between abstract theory and real world human activity.


Practical Example
• How does the state of Wisconsin set a budget?
• What is the process?


The Econometric Modeling Process
1. Specification of the theoretical model
2. Identification of the variables
3. Collection of the data
4. Estimation of the parameters of the model and their interpretation
5. Development of forecasts (estimates) based on the model


Numbers Instead of Symbols!
• Normal model of consumer demand
• $Q = f(P, P_s, I_d)$
• $Q$ = quantity demanded of good, $P$ = good price, $P_s$ = price of substitute good, $I_d$ = disposable income
• Econometrics allows us to estimate the relationship between $Q$ and $P$, $P_s$ and $I_d$ based on past data for these variables


$Q = 31.5 - 0.73P + 0.11P_s + 0.23I_d$
• Instead of just expecting $Q$ to “increase” if there is an increase in $I_d$, we estimate that $Q$ will increase by 0.23 units per 1 dollar of increased disposable income
• 0.23 is called an estimated regression coefficient
• The ability to estimate these coefficients is what makes econometrics useful
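To make the "numbers instead of symbols" point concrete, the estimated equation can be evaluated directly. The sketch below is only an illustration: the function name and the input values for price, substitute price and disposable income are hypothetical, and only the coefficients come from the slide.

```python
def quantity_demanded(price, substitute_price, disposable_income):
    """Evaluate the estimated demand equation from the slide."""
    return 31.5 - 0.73 * price + 0.11 * substitute_price + 0.23 * disposable_income


# Hypothetical inputs, chosen only to illustrate the calculation.
base = quantity_demanded(price=10.0, substitute_price=8.0, disposable_income=100.0)
richer = quantity_demanded(price=10.0, substitute_price=8.0, disposable_income=101.0)

# Raising income by one unit, with both prices held constant,
# changes the estimate by the income coefficient.
income_effect = richer - base
```

Holding the two prices fixed while changing income by one unit is what "per 1 dollar of increased disposable income" means in practice.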


Regression Analysis
• One econometric approach
• Most popular among economists, business analysts and social scientists
• Allows quantitative estimates of economic relationships that previously had been completely theoretical
• Answers “what if” questions


Regression Analysis Continued
• Regression analysis is a statistical technique that attempts to “explain” movements in one variable, the dependent variable, as a function of movements in a set of other variables, called the independent (or explanatory) variables, through the quantification of a single equation.
• $Q = f(P, P_s, Y_d)$
• $Q$ = dependent variable
• $P$, $P_s$, $Y_d$ = independent variables
• Deals with the frequent questions of cause and effect in business


What is Regression Really Doing?
• Regression is the fitting of curves to data.
[Figure: scatter of data points with a fitted curve; axes P and Q. More later!]


Gathering Data
• Once the model is specified, we must collect data.
  ‣ Time-series data: e.g., sales for my company over time. What most of you will be using in your projects.
  ‣ Cross-sectional data: e.g., sales of 10 companies in the food processing industry at one point in time.


Garbage In, Garbage Out
• Your empirical estimates will be only as reliable as your data.
  ‣ Look at the two quotes from Stamp and Valavanis that follow.
• You will want to take particular care in developing your databases.


Sir Josiah Stamp, “Some Economic Factors in Modern Life”
• The government are very keen on amassing statistics. They collect them, add them, raise them to the n’th power, take the cube root and prepare wonderful diagrams. But you must never forget that every one of those figures comes in the first instance from the village watchman, who just puts down what he damn well pleases.
• Moral: Know where your data comes from!


Valavanis
• “Econometric theory is like an exquisitely balanced French recipe, spelling out precisely with how many turns to mix the sauce, how many carats of spice to add, and for how many milliseconds to bake the mixture at exactly 474 degrees of temperature.”


Valavanis - continued
• “But when the statistical cook turns to raw materials, he finds that hearts of cactus fruit are unavailable, so he substitutes chunks of cantaloupe; where the recipe calls for vermicelli he uses shredded wheat; and he substitutes green garment dye for curry, ping-pong balls for turtle’s eggs, and, for Chaligougnac vintage 1883, a can of turpentine.”
• Moral: Be careful in your choice of proxy variables


Economic Data
• You are in the process of gathering economic data.
• Some will come from your firm.
• Some may come from trade publications.
• Some will come from the government.
• Must be of the same time scale (monthly, quarterly, yearly, etc.)


Always be Skeptical
• Always approach your data with a critical eye.
  ‣ Remember the quotes
  ‣ Just because something appears in a table somewhere does not mean it is necessarily correct.
  ‣ Government data revisions.
  ‣ Does your data pass the “smell test”?


How to Begin the Data Exercise
• First question you should ask yourself is: “If money were no object, what would be the perfect data for my demand model?”
• From that basis, you can then start finding what actual data you can get your hands on.
  ‣ There will be compromises that you have to make. These are called proxy variables!
  ‣ Remember the Valavanis quote.


How to Choose a Good Proxy
• Proxy variables should be variables whose movements closely mirror the desired variable for which you do not have a measure.
• For example: Tastes of consumers are difficult to measure.
  ‣ May use a time trend variable if you suspect these are changing over time.
  ‣ May include demographic characteristics of the population.


Dummy Variables
• Binary Variable
• Take on a “1” or a “0”
• Example: Trying to model salaries
  ‣ 1 if you have a college degree, 0 if you don’t
• Example: Model effect of Harley-Davidson reunion years on demand
  ‣ 1 for reunion years, 0 otherwise
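A dummy variable is just a column of 0s and 1s built from some yes/no condition. The following sketch builds one for the reunion-year example; the list of years and the set of reunion years are hypothetical placeholders, not real data.

```python
# Hypothetical sample period and reunion years, for illustration only.
years = [2000, 2001, 2002, 2003, 2004, 2005]
reunion_years = {2003}

# 1 for reunion years, 0 otherwise.
reunion_dummy = [1 if year in reunion_years else 0 for year in years]
```

The resulting list would enter the regression as one more independent variable alongside price and income.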


Back to Regression Analysis
• Theoretical Model: $Y = \beta_0 + \beta_1 X + \varepsilon$
• Y is dependent variable
• X is independent variable
• Linear Equation (no powers greater than 1)
• $\beta$’s are coefficients – determine coordinates of the straight line at any point
• $\beta_0$ is the constant term – value of Y when X is 0 (more on this later - no economic meaning but required)
• $\beta_1$ is the slope term – amount Y will change when X increases by one unit (with more X’s there can be $\beta_2 \dots \beta_n$), holding all other X’s constant (except those not in the model!)
• More about $\varepsilon$, the error term, later


The Error Term
• $Y = \beta_0 + \beta_1 X + \varepsilon$
• $\varepsilon$ is purely theoretical
• Stochastic Error Term Needed Because:
  ‣ Minor influences on Y are omitted from equation (data not available)
  ‣ Impossible not to have some measurement error in one of the equation’s variables
  ‣ Different functional form (not linear)
  ‣ Pure randomness of variation (remember human behavior!)


Example of Error
• Trying to estimate demand for SUVs
• Demand may fall because of uncertainty about the economy (what data do we use for uncertainty?)
• Other independent variables may be omitted
• Demand function may be non-linear
• Demand for SUVs is determined by human behavior – some purely random variation
• All end up in error term


The Estimated Regression Equation
• Theoretical Regression Equation: $Y = \beta_0 + \beta_1 X + \varepsilon$
• Estimated Regression Equation: $\hat{Y} = 103.40 + 6.38X$, so that $Y = 103.40 + 6.38X + e$
• Observed, real world X and Y values are used to calculate coefficient estimates 103.40 and 6.38
• Estimates are used to determine Y-hat, the fitted value of Y
• “Plug-in” X and get estimate of Y


Differences Between Theoretical and Estimated Regression Equations
• $\beta_0$, $\beta_1$ replaced with estimates $\hat{\beta}_0$, $\hat{\beta}_1$ (103.40 and 6.38)
• Can’t observe true coefficients, so we make estimates
• Best guesses given data for X and Y
• $\hat{Y}$ is estimated value of Y – calculated from the regression equation (line through Y data)
• Residual $e = Y - \hat{Y}$
• Residual is difference between Y (data) and $\hat{Y}$ (estimated Y with regression)
• Theoretical model has error, estimated model has residual
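As a small illustration of the fitted value and the residual, the sketch below plugs a few X values into the estimated equation with intercept 103.40 and slope 6.38 and subtracts the fitted value from the observed Y. The three (X, Y) observations are hypothetical; only the two coefficient estimates come from the slides.

```python
def fitted_value(x):
    """Y-hat from the estimated regression equation on the slide."""
    return 103.40 + 6.38 * x


# Hypothetical observed (X, Y) pairs, for illustration only.
observations = [(1.0, 111.0), (2.0, 115.0), (3.0, 123.5)]

# Residual e = Y - Y-hat for each observation.
residuals = [y - fitted_value(x) for x, y in observations]
```

Positive residuals mean the observation lies above the fitted line, negative ones mean it lies below.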


A Simple Regression Example in Eviews
• Demand for Ford Taurus


Ordinary Least Squares Regression
• OLS Regression
• Most Common
• Easy to use
• Estimates have useful characteristics


How Does Ordinary Least Squares Regression Work?
• We attempt to find the curve that best fits the data among all possibilities
• While there are a number of ways of doing this, OLS minimizes the sum of the squared residuals


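Finding Best Fitting Line using Ordinary Least Squares
• Theoretical: $Y = \beta_0 + \beta_1 X + \varepsilon$
• Estimated: $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$, so that $Y = \hat{\beta}_0 + \hat{\beta}_1 X + e$
• “hat” is sample estimate of true value
• Actual data points are dependent variable (Y’s)
• $e = Y - \hat{Y}$
• OLS minimizes $\sum e^2 = \sum (Y - \hat{Y})^2$
• Best possible straight line through data
[Figure: scatter of data points with the fitted line; axes P and Q.]


True vs. Estimated Regression Line
• No one knows the parameters of the true regression line: $Y_t = \alpha + \beta X_t + \varepsilon_t$ (theoretical)
• We must come up with estimates: $Y_t = \hat{\alpha} + \hat{\beta} X_t + e_t$ (estimated)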


So how does OLS work?
• OLS selects the estimates of $\beta_0$ and $\beta_1$ that minimize the sum of squared residuals
• Minimize difference between Y and $\hat{Y}$
• Statistical Software
• Complex math behind the scenes
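For a single independent variable, the "math behind the scenes" reduces to two closed-form formulas: the slope is the sum of cross-deviations divided by the sum of squared X deviations, and the intercept makes the line pass through the sample means. The sketch below is a plain-Python illustration of those standard formulas, not the routine Eviews uses; the data and function names are hypothetical.

```python
def ols_simple(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of (Y - Y-hat)^2 for a candidate line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_simple(xs, ys)
ssr_ols = sum_squared_residuals(xs, ys, b0, b1)

# Any other line, e.g. one with a slightly different slope, fits worse.
ssr_other = sum_squared_residuals(xs, ys, b0, b1 + 0.01)
```

Comparing `ssr_ols` with `ssr_other` shows the defining property: no other straight line gives a smaller sum of squared residuals on the same data.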


OLS Regression Coefficient Interpretation
• Regression coefficients ($\beta$’s) indicate the change in the dependent variable associated with a one-unit increase in the independent variable in question, holding constant the other independent variables in the equation (but not those not in the equation)
• A controlled economic experiment?


Another Example
• The demand for beef
• $B = \beta_0 + \beta_1 P + \beta_2 Y_d$
• $B$ = per capita consumption of beef per year
• $Y_d$ = per capita disposable income per year
• $P$ = price of beef (cents/pound)
• Estimate this using Eviews


Overall Fit of the Model
• Need a way to evaluate model
• Compare one model with another
• Compare one functional form with another
• Compare combinations of independent variables
• Use coefficient of determination $r^2$


$r^2$ – The Coefficient of Determination
• Reported by Eviews every time you run a regression
• Between 0 and 1
• The larger the better
• Close to one shows an excellent fit
• Near zero shows failure of estimated regression to explain variance in Y
• Relative term
• $r^2 = .85$ says that 85% of the variation in the dependent variable is explained by the independent variables
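The "share of variation explained" reading comes from the usual definition of the coefficient of determination as one minus the ratio of the residual sum of squares to the total sum of squares around the mean of Y. The sketch below illustrates that definition on hypothetical numbers; the function name is an assumption, and Eviews reports this statistic for you.

```python
def r_squared(actual, fitted):
    """1 - (sum of squared residuals) / (total variation of Y around its mean)."""
    mean_y = sum(actual) / len(actual)
    ss_total = sum((y - mean_y) ** 2 for y in actual)
    ss_resid = sum((y - f) ** 2 for y, f in zip(actual, fitted))
    return 1 - ss_resid / ss_total


# Hypothetical data, for illustration only.
actual = [2.0, 4.0, 6.0, 8.0]

# A regression that reproduces every observation explains all the variation.
perfect_fit = r_squared(actual, actual)

# A "regression" that always predicts the mean of Y explains none of it.
mean_only_fit = r_squared(actual, [5.0, 5.0, 5.0, 5.0])
```

The two extreme cases correspond to the endpoints of the 0 to 1 range described above.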


Graphical $r^2$
[Figure: three scatter plots with fitted lines, labeled $r^2 = 0$, $r^2 = .95$ and $r^2 = 1$.]


The Adjusted $r^2$
• Problem with $r^2$: Adding another independent variable never decreases $r^2$
• Even a nonsensical variable
• Need to account for a decrease in “degrees of freedom”
• Degrees of freedom = data observations – coefficients estimated
• Example: 100 years of data, 3 coefficients estimated (including constant)
• DF = 97


Adjusted $r^2$
• Slightly negative to 1
• Accounts for degrees of freedom
• Better estimate of fit
• Don’t rely on any one statistic
• Common sense and theory more important
• Same interpretation as $r^2$
• Use adjusted $r^2$ from now on!
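The slides do not give the formula, but the standard degrees-of-freedom adjustment is: adjusted r-squared = 1 - (1 - r-squared)(n - 1)/(n - k), where n is the number of observations and k the number of estimated coefficients including the constant. The sketch below applies that textbook formula; the function name and the input values are illustrative assumptions.

```python
def adjusted_r_squared(r2, n, k):
    """Standard adjustment: penalize r2 for the degrees of freedom used (n - k)."""
    return 1 - (1 - r2) * (n - 1) / (n - k)


# 100 observations and 3 estimated coefficients, as in the slide's DF example.
adj_high = adjusted_r_squared(r2=0.85, n=100, k=3)

# With a very poor fit and few observations the adjusted value can turn negative.
adj_negative = adjusted_r_squared(r2=0.01, n=10, k=3)
```

Because (n - 1)/(n - k) grows when k grows, an extra variable raises the adjusted statistic only if it improves the fit by enough to offset the lost degree of freedom.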


The Classical Linear Regression (CLR) Model
• These are some basic assumptions which, when met, make the Ordinary Least Squares procedure the “Best Linear Unbiased Estimator” (aka BLUE).
• When one or more of these assumptions is violated, it is sometimes necessary to make adjustments to our model.


Assumptions ($Y_t = \alpha + \beta X_{1t} + \gamma X_{2t} + \dots + \varepsilon_t$)
• Linearity in coefficients and error term
• $\varepsilon$ has zero population mean
• All independent variables are independent of $\varepsilon$
• Error term observations are uncorrelated with each other (no serial correlation)
• $\varepsilon$ has constant variance (no heteroskedasticity)
• No independent variables are perfectly correlated (multicollinearity)
Will come back to some of these when we test our models


1st Assumption: Linearity
• We assume that the model is linear (additive) in the coefficients and in the error term, and that the specification is correct.
  ‣ e.g., $Y_t = \alpha + \beta X_1 + \gamma X_2 + \varepsilon$ is linear in both, whereas $Y_t = \alpha + \beta X_1 + X_2^{\gamma} + \varepsilon$ is not.
• Some nonlinear models can be transformed into linear models.
  ‣ e.g., $Y_t = \alpha X_1^{\beta} X_2^{\gamma} \varepsilon$
  ‣ We showed this can be transformed using logs to: $\ln Y_t = \ln\alpha + \beta \ln X_1 + \gamma \ln X_2 + \ln\varepsilon$


Hypothesis Testing
• In statistics we cannot “prove” a theory is correct
• Can “reject” a hypothesis with a certain degree of confidence


Common Hypothesis Test
• $H_0: \beta = 0$ – Null Hypothesis
• $H_A: \beta \neq 0$ – Alternative Hypothesis
• Test whether or not the coefficient is statistically significantly different from zero
• Does the coefficient affect demand?
• Two-tailed test


Does Rejecting the Null Hypothesis Guarantee that the Theory is Correct?
• NO! It is possible that we are committing what is known as a Type I error.
  ‣ A Type I error is rejecting the Null hypothesis when it is in fact correct.
• Likewise, we may also commit a Type II error.
  ‣ A Type II error is failing to reject the Null hypothesis when the alternative hypothesis is correct.


Type I and Type II Error Example
• Presumption of innocence until proven guilty
• $H_0$: The defendant is innocent
• $H_A$: The defendant is guilty
• Type I error: sending an innocent defendant to jail
• Type II error: freeing a guilty defendant


The t-Test, and the t-Statistic
• We can use the t-Test to do hypothesis testing on individual coefficients.
• Given the linear regression model: $Y_t = \alpha + \beta X_{1t} + \gamma X_{2t} + \dots + \varepsilon_t$
  ‣ We can calculate the t-statistic for each estimated coefficient (i.e., each “hat” value), and test hypotheses on that estimate.


Setting up the Null and Alternative Hypotheses
• $H_0: \beta_1 = 0$ (i.e., $X_1$ is not important)
• $H_A: \beta_1 \neq 0$ (i.e., $X_1$ is important, either positively or negatively)


Testing the Hypothesis
• Set up null and alternative hypotheses
• Run regression and generate t-score.
• Look up the critical value of the t-statistic ($t_c$), given the degrees of freedom (n-k), in a two-tailed test using X% level of significance (1%, 5%, 10%)
• n = sample size, k = estimated coefficients (including intercept)
• Reject null ($\beta = 0$) if $|t_k| > t_c$
• t Statistic Table on Page 754 of Hirschey
• Interpretation of level of significance: 5% means that, if the true coefficient were actually zero, there would be only a 5% chance of wrongly rejecting the null (this corresponds to 95% confidence)
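The decision rule above can be written out in a few lines. In this sketch the t-statistic for the null of a zero coefficient is the estimate divided by its standard error, which is the usual definition; the standard error of 0.08 and the choice of 60 degrees of freedom are hypothetical, and the critical value 2.000 is the tabulated two-tailed 5% value for 60 degrees of freedom.

```python
def t_statistic(coefficient, standard_error):
    """t-score for H0: coefficient = 0."""
    return coefficient / standard_error


def reject_null(t_stat, t_critical):
    """Two-tailed test: reject H0 when |t| exceeds the critical value."""
    return abs(t_stat) > t_critical


# Two-tailed 5% critical value for 60 degrees of freedom (from a t table).
T_CRITICAL_5PCT_DF60 = 2.000

# Hypothetical estimate and standard error, for illustration only.
t_price = t_statistic(coefficient=-0.73, standard_error=0.08)
price_significant = reject_null(t_price, T_CRITICAL_5PCT_DF60)

# A small t-score, such as 0.82, does not lead to rejection.
weak_variable_significant = reject_null(0.82, T_CRITICAL_5PCT_DF60)
```

The sign of the t-score does not matter in a two-tailed test, which is why the absolute value is compared with the critical value.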


Example
• Taurus example with t-stats


Limitations of the t-Test
1. Does not indicate theoretical validity
2. Does not test importance of the variable. The size of the coefficient does this.


F-test and the F-statistic
• You can also test whether a group of coefficients is statistically significant.
• Look at the F-test for all of the independent variable coefficients.
• First set up the null and alternative hypotheses.


$H_0$ and $H_A$ for the F-test
• $H_0: \beta_1 = \beta_2 = \beta_3 = \dots = \beta_k = 0$, i.e., all of the slope coefficients are simultaneously zero.
• $H_A$: not $H_0$, i.e., at least one of the slope coefficients is nonzero.
Note: It does not indicate which one or ones of the coefficients are nonzero.


The Critical F
• As with the t-statistic, you must compare the actual value of F with its critical value ($F_c$):
• Actual value from EVIEWS or $F_{k-1,\,n-k} = \dfrac{r^2/(k-1)}{(1-r^2)/(n-k)}$
  ‣ $F_c$ must be looked up in a table, using the appropriate degrees of freedom for the numerator (k-1) and the denominator (n-k)
  ‣ Table on Page 751 (10%), 752 (5%) and 753 (1%) of Hirschey
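The F formula on this slide needs only the r-squared, the sample size n and the number of estimated coefficients k. A minimal sketch follows, with a hypothetical function name and illustrative inputs; the critical value still has to come from the table.

```python
def f_statistic(r2, n, k):
    """F with (k - 1, n - k) degrees of freedom, computed from r-squared."""
    return (r2 / (k - 1)) / ((1 - r2) / (n - k))


# Illustrative inputs: r2 = 0.85, 100 observations, 3 coefficients including the constant.
f_value = f_statistic(r2=0.85, n=100, k=3)
numerator_df = 3 - 1
denominator_df = 100 - 3
```

The two degrees-of-freedom values are the ones used to find the critical F in the table.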


The F-Test
• If $F > F_c$ then you reject $H_0$ (at least one slope coefficient is not equal to zero)
• If $F < F_c$ then you cannot reject $H_0$


Specification Errors
• Suppose that you make a mistake in your choice of independent variables. There are 2 possibilities:
  ‣ You omit an important variable
  ‣ You include an extraneous variable
• There are consequences in both cases.


Omitting an Important Variable
• Suppose your true regression model is: $Q_t = \alpha + \beta P_t + \gamma I_t + \varepsilon_t$
• Suppose you specify the model as: $Q_t = \alpha + \beta P_t + \varepsilon_t^*$
• Thus, the error term of the misspecified model captures the influence of income, $I_t$: $\varepsilon_t^* = \gamma I_t + \varepsilon_t$


Consequences
• Prevents you from getting a coefficient for income
• Causes bias in the price estimate
• Violates classical assumption of error term not being correlated with an explanatory (independent) variable
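The size and direction of the bias in the price estimate can be written down using the standard omitted-variable result from econometrics textbooks (it is not derived in these slides). Writing the true model as $Q_t = \alpha + \beta P_t + \gamma I_t + \varepsilon_t$ and estimating the short model without $I_t$, the expected value of the OLS price coefficient is approximately:

```latex
E[\hat{\beta}] \;=\; \beta \;+\; \gamma \,\frac{\operatorname{Cov}(P_t, I_t)}{\operatorname{Var}(P_t)}
```

The second term is the bias. It disappears only if income has no effect on demand ($\gamma = 0$) or if price and income are uncorrelated in the sample; otherwise its sign is the sign of $\gamma$ times the sign of the correlation between price and income.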


Inclusion of an Irrelevant Variable
• A variable that is included but does not belong in your model also has consequences.
• Does NOT bias the other coefficients.
• Lowers t-scores of other coefficients (so you might reject a variable that belongs)
• Will raise $r^2$ but will likely decrease the adjusted $r^2$ (this helps you identify the problem)


Example
• Annual Consumption of Chicken
• Y = consumption of chicken, PC = price of chicken, PB = price of beef, I = disposable income
• $\hat{Y} = 31.5 - 0.73\,PC + 0.11\,PB + 0.23\,I$
• PC t-stat = -9.12, PB t-stat = 2.50, I t-stat = 14.22
• Adjusted $r^2$ = 0.986
• Interpretation?


Example
• Add interest rate, R, to the equation (YD = disposable income)
• $\hat{Y} = 30 - 0.73\,PC + 0.12\,PB + 0.22\,YD + 0.17\,R$
• PC t-stat = -9.10, PB t-stat = 2.08, YD t-stat = 11.05, R t-stat = 0.82
• Adjusted $r^2$ = .985
• Lowers t-stats and adjusted $r^2$
• The t-stat on R suggests rejecting it, and so does the adjusted $r^2$


How do you decide whether a variable should be included?
• Trial and Error – Many EVIEWS runs!
• Start with THEORY!
  ‣ Use your judgement here!
• If theory does not provide a clear answer, then:
  ‣ Look at t-test
  ‣ Look at adjusted $r^2$
  ‣ Look at whether other coefficients appear to be biased when you exclude the variable from the model.


Inclusion of Lagged Variables
• Some independent variables influence demand with a lag. For example, advertising may primarily influence demand in the following month, rather than the current month.
  ‣ Thus, $Q_t = \beta_0 + \beta_1 P_t + \beta_2 I_t + \beta_3 A_{t-1} + \varepsilon_t$
• When there is a good reason to suspect a lag (i.e., when theory suggests a lagged relationship), you can investigate this option.
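Building a lagged variable just means shifting the series down by one period so that each observation is paired with the previous period's value. The sketch below shows this for advertising; both series are hypothetical, and in Eviews the same thing is done for you by the lag notation.

```python
# Hypothetical monthly data, for illustration only.
sales = [100, 104, 98, 110, 107]      # Q_t
advertising = [10, 12, 9, 15, 11]     # A_t

# Shift advertising down one period: the first month has no previous value.
lagged_advertising = [None] + advertising[:-1]   # A_{t-1}

# Keep only periods where the lag exists; one observation is lost.
usable_rows = [
    (q, a_prev)
    for q, a_prev in zip(sales, lagged_advertising)
    if a_prev is not None
]
```

Each lag used costs one observation at the start of the sample, which slightly reduces the degrees of freedom.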


Eviews Lagged Variable
• Is unemployment in previous time periods important to current demand for Taurus?


Functional Form
• Don’t forget the constant term – no meaning but required for classical assumptions
• Linear Form
• Double Log Form
• There are many others that we won’t discuss


Linear Form
• $Y = \beta_0 + \beta_1 X + \varepsilon$
• What we have looked at thus far
• Constant slope is assumed
• $\Delta Y / \Delta X = \beta_1$


Double-Log Form
• Second most common
• Natural log of Y is the dependent variable and natural logs of the X’s are the independent variables
• $\ln Y = \beta_0 + \beta_1 \ln X_1 + \beta_2 \ln X_2 + \varepsilon$
• Elasticities of the model are constant
• Elasticity$_{Y,X_1}$ = $\%\Delta Y / \%\Delta X_1 = \beta_1$ = constant
• Interpretation of coefficients: if $X_1$ increases by 1% while $X_2$ is held constant, Y will change by $\beta_1$%
• Can’t be any negative or 0 observations in your data set (natural log not defined)
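The "coefficient equals elasticity" reading is an approximation that is very accurate for small percentage changes. The sketch below checks it for a hypothetical price coefficient of -0.5: in a double-log model, multiplying X by (1 + p/100) multiplies Y by (1 + p/100) raised to the coefficient, holding the other variables constant. The function name and the coefficient value are illustrative assumptions.

```python
def pct_change_in_y(beta, pct_change_in_x):
    """Exact % change in Y implied by a double-log model, other X's held constant."""
    return ((1 + pct_change_in_x / 100) ** beta - 1) * 100


# Hypothetical elasticity, for illustration only.
beta_price = -0.5

# A 1% price increase changes Y by roughly beta percent.
small_change = pct_change_in_y(beta_price, 1.0)

# For a large change (50%) the simple "beta times the change" rule is less exact.
large_change = pct_change_in_y(beta_price, 50.0)
rule_of_thumb_large = beta_price * 50.0
```

An absolute value of the coefficient below 1 indicates inelastic demand, and above 1 indicates elastic demand.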


Violations of the Classical Model
• Multicollinearity
• Serial Correlation
• Others


Problem of Multicollinearity
• Recall the CLR assumption that the independent variables are not perfectly correlated with each other
• Violation of this is called “perfect multicollinearity”
  ‣ Easy to detect
  ‣ OLS cannot estimate parameters in this situation (put in the same independent variable twice and Eviews can’t do it)
• Look at problem of imperfect multicollinearity


Imperfect Multicollinearity
• This occurs when two or more independent variables are highly, but not perfectly, correlated with each other!
  ‣ If this is severe enough, it can influence the estimation of the $\beta$’s in the model.


How to Detect the Problem
• There are some formal tests
  ‣ Beyond scope of this course
• Look for the tell-tale signs of the problem:
  ‣ High adjusted $r^2$, high F-statistics, and low t-scores on suspected collinear variables.
  ‣ Eviews example with Taurus


Remedies
• Possibly do nothing!
  ‣ If t-scores are at or near significance levels, you may want to “live with it”.
• Drop one or more collinear variables.
  ‣ Let the remaining variable pick up the joint impact.
  ‣ This is ok if you have redundancies.


Remedies - continued
• Form a new variable:
  ‣ e.g., if income and population are correlated, you could form per capita income: I/Pop.
  ‣ Other solutions I can help with on your projects


The Problem of Serial Correlation
• The fourth assumption of the CLR model is: “Observations of the error term are uncorrelated with each other”
• When this is not satisfied, we have a problem known as serial correlation.


Examples of Serial Correlation
[Figure: two scatter plots with fitted lines, labeled “Positive Serial Correlation” and “Negative Serial Correlation”; axes P and Q.]


Consequences of Serial Correlation
• Pure serial correlation does not bias the coefficient estimates
• Serial correlation tends to distort t-scores
• Serial correlation results in a pattern of observations in which OLS gives a better fit to the data than would be obtained in the absence of the problem (t-scores higher).
• Uses error to explain dependent variable


QUESTION: Why is this a problem?
• This suggests that t-statistics are overestimated!
• Type I error: You may falsely reject the null hypothesis when it is in fact true.
• Neither F-statistics nor t-statistics can be trusted in the presence of serial correlation.


Detection: The Durbin-Watson d-test
• This is a test for first order serial correlation
  ‣ This is the most common type in economic models.
  ‣ Note that there are other tests (Q-test, Breusch-Godfrey LM test), but we will not cover them here.
• The d-statistic is derived from the regression residuals (e).


Theoretical range of d-statistic
• If there is perfect positive serial correlation then d = 0.
• If there is perfect negative serial correlation then d = 4.
• If there is no serial correlation, then d = 2.
• Check this statistic in Eviews on your project
• If near 2 no problem, if different from 2 then …
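The slides do not show the formula, but the standard Durbin-Watson statistic is the sum of squared changes in successive residuals divided by the sum of squared residuals. The sketch below computes it in plain Python on two hypothetical residual series to show why positive correlation pushes d below 2 and negative correlation pushes it above 2.

```python
def durbin_watson(residuals):
    """d = sum over t >= 2 of (e_t - e_{t-1})^2, divided by the sum of e_t^2."""
    numerator = sum(
        (current - previous) ** 2
        for previous, current in zip(residuals[:-1], residuals[1:])
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals, for illustration only.
# Runs of same-signed residuals: positive serial correlation, d well below 2.
d_positive = durbin_watson([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

# Residuals that flip sign every period: negative serial correlation, d above 2.
d_negative = durbin_watson([1.0, -1.0, 1.0, -1.0])
```

When neighboring residuals are similar, their differences are small and d falls toward 0; when they alternate in sign, the differences are large and d rises toward 4.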


Correction for Serial Correlation using GLS
• Adding an autoregressive term solves serial correlation problem
• Details are outside the scope of this class
• Soviet Defense spending model
• If your original regression model was:
  LS SDH C USD SY SP
  DW = 0.62: a problem
• Simply add an AR(1) term to your command line:
  LS SDH C USD SY SP AR(1)
  DW = 1.97: problem solved


Summary Steps for Project
• Think about theoretical model: what independent variables make sense based on theory? (already doing this)
• Collect data and examine it (already doing this)
• Choose a functional form (likely linear)
• Run regression models in Eviews
• Examine adjusted $r^2$, t-stats, F-stat and exclude or include variables based on these and theory
• Do you need lagged variables?
• Look for evidence of (and correct for) multicollinearity or serial correlation


Summary Steps for Project (continued)
• Interpret your results
• Use model to forecast demand (next topic)
• I’ll do a “sample project” next time using the Taurus data


Problem Set #4 (DUE Monday, May 24th)
• Go to classwork directory and get BEEF2.wf1
• B = demand for beef (pounds/person)
• P = price of beef (cents/pound)
• Yd = per capita disposable income ($1000’s)
• Estimate the equation with Beef as the dependent variable.
• Interpret your results (coefficients of P and Yd)
• Are they statistically significant (t-stats)?
• At what levels?
• Interpret $r^2$


Homework Continued
• Interpret adjusted $r^2$
• Which is a better measure of overall fit? Why?
• Is F-Stat significant? What does it mean?
• Any evidence of serial correlation?
• How could this be corrected for?
• Estimate the equation as a log-log model
• Interpret the results
• Is beef a normal good?
• Is demand elastic or inelastic (for price and income)?
