Chapter 3: Least Squares Regression#
Note
The purpose of these examples is to demonstrate the applications found in William Greene’s Econometric Analysis. They follow, as directly as possible, the steps in the textbook and do not always present the most efficient manner to implement these techniques in GAUSS.
Application 3.2.2 An Investment Equation#
This example demonstrates how to manually compute least squares estimates from the multivariate macroeconomic linear equation:
Getting Started#
To run this example on your own you will need to install the greeneLib package. This package houses all examples and associated data.
How to#
Step One: Loading data#
To start, load the relevant variables from the dataset using loadd()
and a formula string.
To replicate Table 3.1 and compute the regression coefficients manually we will load the following variables:
Constant
Trend
Real Investment
Interest Rate
Inflation Rate
RealGNP
// Load the relevant data
// Create filename with full path
fname = getGAUSShome() $+ "pkgs/GreeneLib/examples/TableF3-1-mod.csv";
// Load data
invest_data = loadd(fname, "date(Year, %Y) + Real Investment + Constant + Trend + Real GDP +
Interest Rate + Inflation Rate + RealGNP");
// Print the dataframe
invest_data;
Year Real Constant Trend Real Interest Inflation Real
Investment GDP Rate Rate GNP
1999 2.484 1 1 87.1 9.23 3.40 87.1
2000 2.311 1 2 88.0 6.91 1.60 88.0
2001 2.265 1 3 89.5 4.67 2.40 89.5
2002 2.339 1 4 92.0 4.12 1.90 92.0
2003 2.556 1 5 95.5 4.34 3.30 95.5
2004 2.750 1 6 98.7 6.19 3.40 98.7
2005 2.828 1 7 101.4 7.96 2.50 101.4
2006 2.717 1 8 103.2 8.05 4.10 103.2
2007 2.445 1 9 102.9 5.09 0.10 102.9
2008 1.878 1 10 100.0 3.25 2.70 100.0
2009 2.076 1 11 102.5 3.25 1.50 102.5
2010 2.168 1 12 104.2 3.25 3.00 104.2
2011 2.356 1 13 106.5 3.25 1.70 105.6
2012 2.482 1 14 108.1 3.25 1.50 109.0
2013 2.637 1 15 110.7 3.25 0.80 111.6
Step Two: Transforming data#
The computations of multivariate coefficients require that we first compute the deviations of our variables from their means. This can be done in GAUSS using the meanc()
procedure.
// Calculate means
y_bar = meanc(invest_data[., "Real Investment"]);
t_bar = meanc(invest_data[., "Trend"]);
g_bar = meanc(invest_data[., "RealGNP"]);
// Calculate deviations from the mean
y = invest_data[., "Real Investment"] - y_bar;
t = invest_data[., "Trend"] - t_bar;
g = invest_data[., "RealGNP"] - g_bar;
The results y, t, and g correspond to the in-text variables \(y\) , \(t\), and \(g\), respectively.
Step Three: Computing coefficients#
The coefficients \(b_2\), and \(b_3\) are computed following Eq. 3-8:
// Calculate b2
b2 = ((t'y)*(g'g) - (g'y)*(t'g))/((t't)*(g'g) - (g't)^2);
print "b2 :"; b2;
// Calculate b3
b3 = ((g'y)*(t't) - (t'y)*(t'g))/((t't)*(g'g) - (g't)^2);
print "b3 :"; b3;
Once \(b_2\), and \(b_3\) are calculated, when can compute \(b_1\) following Eq. 3-7:
// Calculate b1
b1 = y_bar - b2*t_bar - b3*g_bar;
print "b1 :"; b1;
This prints the computed coefficients to the Program Input/Output window:
b2 :
-0.18002371
b3 :
0.10778411
b1 :
-6.8490543
Step Four: Estimating the full model#
It is worth noting that though we just computed the coefficients manually, GAUSS has built-in procedures for least squares regression. For example, we will use olsmt()
to compute the full model:
We will continue with our example from above and use the previously defined fname to estimate our the model using olsmt()
:
call olsmt(fname, "Real Investment ~ Trend + RealGNP + Interest Rate + Inflation Rate");
Standard Prob Standardized Cor with
Variable Estimate Error t-value >|t| Estimate Dep Var
-------------------------------------------------------------------------------------------------
CONSTANT -6.21967 1.93045 -3.22188 0.009 --- ---
Trend -0.160885 0.0472355 -3.40603 0.007 -2.7478 -0.103635
RealGNP 0.0990842 0.024132 4.10592 0.002 2.84769 0.14879
Interest Rate 0.0201716 0.0336915 0.598714 0.563 0.160339 0.553021
Inflation Rate -0.0116592 0.0397682 -0.293179 0.775 -0.0486547 0.191923
Using internal GAUSS procedures, like olsmt()
greatly reduces time and effort for estimation.
Note
When calling olsmt()
we don’t need to include the Constant variable. A constant is automatically included in the regression unless otherwise specified.
Exercise 3.1 Partial Correlations#
This example compares the least squares coefficients estimates with simple correlation and partial correlation.
Getting Started#
To run this example on your own you will need to install the greeneLib package. This package houses all examples and associated data.
How to#
Step One: Loading data#
To start, load the relevant variables from the dataset using loadd()
and a formula string.
To replicate the results in Table 3.2 we will load the following variables:
Constant
Trend
Real Investment
Interest Rate
Inflation Rate
RealGNP
// Filename
fname = getGAUSSHome $+ "pkgs/GreeneLib/examples/TableF3-1-mod.csv";
// Load data
invest_data = loadd(fname, "date(Year, %Y) + Real Investment + Constant + Trend + Real GDP + Interest Rate + Inflation Rate + RealGNP");
Step Two: Estimate least squares regression#
Next, we estimate the OLS and store the results using olsmt()
. We will use the stored coefficients and standard errors for computing the partial correlations.
// Declare o_out to be an olsmtOut structure to hold estimation results
struct olsmtOut o_out;
// Estimate linear model using least squares and store results
o_out = olsmt(fname, "Real Investment ~ Trend + RealGNP + Interest Rate + Inflation Rate");
Standard Prob Standardized Cor with
Variable Estimate Error t-value >|t| Estimate Dep Var
-------------------------------------------------------------------------------------------------
CONSTANT -6.21967 1.93045 -3.22188 0.009 --- ---
Trend -0.160885 0.0472355 -3.40603 0.007 -2.7478 -0.103635
RealGNP 0.0990842 0.024132 4.10592 0.002 2.84769 0.14879
Interest Rate 0.0201716 0.0336915 0.598714 0.563 0.160339 0.553021
Inflation Rate -0.0116592 0.0397682 -0.293179 0.775 -0.0486547 0.191923
Step Three: Extract the simple correlations#
Note that the printed output table includes the correlations between the independent variables and the dependent variables. These are stored in the olsmtOut structure in the o_out.cx member. Let’s extract these to include in our comparison table:
/*
** The simple correlations
** between the dependent and
** independent variables are
** computed and stored when
** olsmt is called
*/
simple_cor = o_out.cx[1:4, cols(o_out.cx)];
Step Four: Compute the partial correlations#
To compute the partial correlations we need to :
Compute the t ratios for the variables using the stored estimates and standard errors.
Calculate the partial correlations using Eq. 3-22
Setting the signs of the partial correlations to be the same as the estimates.
/*
** Calculate the partial
** correlations using equation 3-22
*/
// Find t ratio using olsmt results
t_stats = o_out.b ./ o_out.stderr;
// Calculate partial correlations using equation 3-22
df = 10;
p_cor = sqrt((t_stats.^2) ./ (t_stats.^2 + df));
Coeff t ratio Simple Corr Partial Corr
Trend -0.16089 -3.41 -0.10363 -0.73284
RealGDP 0.09908 4.11 0.14879 0.79226
Interest 0.02017 0.60 0.55302 0.18603
Inflation -0.01166 -0.29 0.19192 -0.09232