This is the most crucial assumption because a mistake in specifying the

equation for regression is the responsibility of the statistician. One

cannot blame the nature of the data for this problem. One type of

specification bias is the use of an incorrect functional form. For example,

you have a specification bias if you use a linear function when a

logarithmic or exponential function should be used.
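
To make the functional-form problem concrete, here is a minimal sketch in Python (not part of the Excel procedure in this chapter). The data are made up to follow y = 2 * exp(0.3 * x), and the helper `simple_ols` is a hypothetical name introduced only for this illustration. The same series is fitted twice: once with a straight line, and once after taking the natural log of y.

```python
import math

def simple_ols(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return a, b, 1 - ss_res / ss_tot

# Made-up data that grows exponentially: y = 2 * exp(0.3 * x)
x = list(range(1, 21))
y = [2 * math.exp(0.3 * xi) for xi in x]

# Wrong functional form: a straight line through exponential data
_, _, r2_linear = simple_ols(x, y)

# Correct functional form: ln(y) is linear in x
log_y = [math.log(v) for v in y]
a_log, b_log, r2_log = simple_ols(x, log_y)

print(f"R-squared, linear form:     {r2_linear:.3f}")
print(f"R-squared, log-linear form: {r2_log:.3f}")
print(f"Recovered growth rate: {b_log:.3f}, scale: {math.exp(a_log):.3f}")
```

The two R-squared values are shown only for illustration, since they describe different dependent series (y and ln y) and are not strictly comparable. The more telling result is that the log-linear form recovers the growth rate and scale used to build the data, while the straight line cannot.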

The other type of specification bias is when the model does not include a

relevant data series. This is the most common error of oversight, and it

stems from the bad habit of creating a hypothesis only after

looking at the available data. This approach may result in the exclusion

of an important series that may not be in the available data set.

Chapter 12: Regression

Remember that a regression is based on a hypothesis: you always define

the hypothesis first. After that, look for data that can capture all of the

variables in the hypothesis. If you do not find the data to represent an

important factor, then you should not use regression analysis. Another

bad habit is the dropping of variables from a model if the coefficient is

seen to have no impact on the dependent series. It is better to have an

irrelevant or excess series than to drop a relevant series. In fact, the

result that a factor has no impact on the dependent series often provides

compelling insight.
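
The cost of dropping a relevant series, compared with keeping an irrelevant one, can be sketched with a small Python example. Everything here is hypothetical: the data are constructed without any random disturbance so that the arithmetic is easy to follow, and the helper names (`centered_sums`, `slope_one`, `slopes_two`) are introduced only for this illustration.

```python
def centered_sums(u, v):
    """Sum of cross-products of deviations from the means of u and v."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))

def slope_one(x, y):
    """Slope from regressing y on a single series x (with an intercept)."""
    return centered_sums(x, y) / centered_sums(x, x)

def slopes_two(x1, x2, y):
    """Slopes from regressing y on x1 and x2 together (with an intercept)."""
    s11 = centered_sums(x1, x1)
    s22 = centered_sums(x2, x2)
    s12 = centered_sums(x1, x2)
    s1y = centered_sums(x1, y)
    s2y = centered_sums(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Hypothetical data: x2 is correlated with x1 but is not a copy of it
x1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
wiggle = [1, -1, 2, -2, 0, 1, -1, 2, -2, 0]
x2 = [0.5 * a + w for a, w in zip(x1, wiggle)]

# Case 1: y truly depends on both series (y = 1 + 2*x1 + 3*x2)
y_both = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]
b1_short = slope_one(x1, y_both)               # x2 wrongly left out
b1_full, b2_full = slopes_two(x1, x2, y_both)  # both series included

# Case 2: y depends on x1 only, but x2 is included anyway
y_x1_only = [1 + 2 * a for a in x1]
b1_extra, b2_extra = slopes_two(x1, x2, y_x1_only)

print(f"x2 omitted:  slope on x1 = {b1_short:.3f} (true value 2)")
print(f"x2 included: slopes = {b1_full:.3f}, {b2_full:.3f} (true values 2, 3)")
print(f"x2 irrelevant but kept: slopes = {b1_extra:.3f}, {b2_extra:.3f}")
```

In this constructed case, leaving out x2 pushes the slope on x1 well away from its true value, because x1 picks up the effect of the missing series. Keeping an irrelevant x2 leaves the slope on x1 intact and gives x2 a coefficient of zero. With real, noisy data the results would not be this clean.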

Assumption 5e: The disturbance terms have a Normal Density Function

The use of the F-test for validating the model and the T-tests for

validating individual coefficients is predicated on the presumption that

the disturbance terms follow a Normal Density Function.
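
One common way to check this assumption on the residuals of a fitted model is the Jarque-Bera statistic, which combines skewness and excess kurtosis. This test is not described in this book; the sketch below, in Python, is one possible check under that assumption. The residual values are made up, and the critical value 5.991 is the 5% point of the chi-square distribution with 2 degrees of freedom.

```python
def jarque_bera(residuals):
    """Return (skewness, excess kurtosis, Jarque-Bera statistic)."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n  # second central moment
    m3 = sum((e - mean) ** 3 for e in residuals) / n  # third central moment
    m4 = sum((e - mean) ** 4 for e in residuals) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    jb = n / 6 * (skewness ** 2 + excess_kurtosis ** 2 / 4)
    return skewness, excess_kurtosis, jb

# Made-up residuals, for illustration only
residuals = [-2, -1, -1, 0, 0, 0, 1, 1, 2]
CHI2_CRIT_5PCT = 5.991  # chi-square, 2 degrees of freedom, 5% level

skewness, excess_kurtosis, jb = jarque_bera(residuals)
print(f"skewness = {skewness:.3f}, excess kurtosis = {excess_kurtosis:.3f}")
print(f"Jarque-Bera = {jb:.3f}")
if jb > CHI2_CRIT_5PCT:
    print("Normality of the disturbances is rejected at the 5% level.")
else:
    print("No evidence against normality at the 5% level.")
```

The statistic relies on a large-sample approximation, so with a handful of observations, as here, treat the result as a rough indication only.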

12.1.F ASSUMPTION 6: THERE ARE NO STRONG LINEAR RELATIONSHIPS AMONG THE INDEPENDENT VARIABLES

If the relationships are strong, then the regression estimation will not be

able to isolate the impact of each independent series. Related to this is

another rule: there should be no endogeneity in the model. This means

that none of the independent variables should be dependent on other

variables. An independent series should not be a function of another

independent series.
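
A quick first screen for strong linear relationships is to compute the correlation between every pair of independent series before running the regression. The Python sketch below does this for three made-up series. The series names, the data, and the cut-off of 0.9 are all assumptions chosen for illustration; the book does not prescribe a threshold.

```python
import math
from itertools import combinations

def pearson(u, v):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

# Made-up independent series: x2 is roughly twice x1, x3 is unrelated
independents = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.9, 16.2],
    "x3": [5, 1, 4, 2, 6, 3, 7, 2],
}
THRESHOLD = 0.9  # illustrative cut-off, not a rule from the text

flagged = []
for name_a, name_b in combinations(independents, 2):
    r = pearson(independents[name_a], independents[name_b])
    print(f"corr({name_a}, {name_b}) = {r:.3f}")
    if abs(r) > THRESHOLD:
        flagged.append((name_a, name_b))

print("Pairs with a strong linear relationship:", flagged)
```

A pairwise check of this kind can miss a relationship that involves three or more series at once, so a clean result does not prove that the assumption holds.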

Every estimate in a regression is more than a point estimate of the

expected value of the parameter. The regression

estimates the expected value (mean) of the parameter, its variance, and

its Density Function (the assumption of normality provides the shape of

Statistical Analysis with Excel

the Density Function). The mean and standard error are estimated by the

model. There is a pair of such estimates for each coefficient (each BETA),

each disturbance term, and each predicted value of the dependent series.
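
The pair of estimates for each coefficient can be sketched for the simplest case, a regression with one independent series. The Python code below uses the standard textbook formulas for the standard errors of the intercept and slope; the data and the function name `ols_with_standard_errors` are made up for illustration, and this is not a description of how Excel computes its output.

```python
import math

def ols_with_standard_errors(x, y):
    """Fit y = a + b*x and return (a, se_a, b, se_b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual variance, with n - 2 degrees of freedom (two estimated parameters)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = ss_res / (n - 2)
    se_b = math.sqrt(s2 / sxx)
    se_a = math.sqrt(s2 * (1 / n + mean_x ** 2 / sxx))
    return a, se_a, b, se_b

# Made-up data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a, se_a, b, se_b = ols_with_standard_errors(x, y)
print(f"intercept: estimate = {a:.3f}, standard error = {se_a:.3f}")
print(f"slope:     estimate = {b:.3f}, standard error = {se_b:.3f}")
```

Each coefficient comes out as a pair: the estimate is the expected value of the parameter, and the standard error measures the spread of its Density Function.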

Note: The dependent series is that whose values you are trying to predict

(or whose dependence on the independent variables is being studied). It is

also referred to as the “Explained” or “Endogenous” series, or as the

“Regressand.”

The independent variables are used to explain the values of the dependent

series. The values of the independent variables are not being

explained/determined by the model; thus, they are “independent” of the

model. The independent variables are also called “Explanatory” or

“Exogenous” variables. They are also referred to as “Regressors.”

I do not show the details of regression analysis. Please refer to

our book “Interpreting regression Output” available at

http://www.vjbooks.net.

12.2 CONDUCTING THE REGRESSION

Go to the menu option TOOLS/DATA ANALYSIS [33]. Select the option

“Regression” as shown in Figure 154.

[33] If you do not see this option, then use TOOLS / ADD-INS to activate the Add-In for data analysis. Refer to section 41.4.

Figure 154: Selecting the regression procedure

Choose the exact cell references for the Y and X ranges. Do not choose

whole columns such as “C:D”; instead, choose an exact range such as C1:D235, as shown in Figure 155.

Other restrictions:

- All the X variables have to be in adjacent columns.

- The data cannot have missing values.

Choose all other options as shown in Figure 155.

Figure 155: The completed Regression dialog