# Chapter 3: Simple Linear Regression: Estimation of an Optimal Hedge Portfolio

## Example 1: Estimate Univariate Regression of Spot on Futures

This example demonstrates how to compute ordinary least squares (OLS) estimates of the equation:

$\text{Spot} = \alpha + \beta_1\text{Futures} + \epsilon$

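For reference, the OLS estimates of this equation have a closed form. The following is a sketch in notation that is not part of the original example: $S_t$ and $F_t$ stand for the Spot and Futures observations at time $t$, $\bar{S}$ and $\bar{F}$ for their sample means, and $T$ for the number of observations.

$$
\hat{\beta}_1 = \frac{\sum_{t=1}^{T} (F_t - \bar{F})(S_t - \bar{S})}{\sum_{t=1}^{T} (F_t - \bar{F})^2}, \qquad \hat{\alpha} = \bar{S} - \hat{\beta}_1 \bar{F}
$$

The slope is the sample covariance of Spot and Futures divided by the sample variance of Futures, which is why, in a hedging context, it is usually read as the estimated hedge ratio.
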
### Getting Started

To run this example on your own, you will need to install the BrooksEcoFinLib package, which houses all of the examples and their associated data.

### How to

#### Step One: Load data

To start, load the relevant variables from the dataset using loadd() and a formula string.

To replicate this example, we will load the following variables:

• Date

• Spot

• Futures

// Create file name with full path
data_set = getGAUSSHome() $+ "pkgs/BrooksEcoFinLib/examples/Sandphedge.csv";

// Use formula string to specify the variables to load and to tell
// GAUSS that Date is a date variable
data = loadd(data_set, "date(Date) + Spot + Futures");

// Print the first 5 observations of all columns of our data
head(data);

      Date         Spot      Futures
1979-09-01    947.28003    954.50000
1979-10-01    914.62000    924.00000
1979-11-01    955.40002    955.00000
1979-12-01    970.42999    979.25000
1980-01-01    980.28003    987.75000

Because CSV files do not store variable types, we wrap the name of our date variable in date() so that GAUSS treats it as a date variable. The dates here are in a standard format that GAUSS detects automatically. If you need to read an uncommon date format, GAUSS allows you to specify it in your formula string. You can read more about this in the Programmatic Data Import section of the GAUSS Data Management Guide.

#### Step Two: Perform OLS estimation

We pass the dataframe, data, and a formula string to the olsmt() procedure to perform the estimation and print an output table. The call keyword tells GAUSS to discard the procedure's return values, so only the report is printed.

// Perform OLS estimation and print report
call olsmt(data, "Spot ~ Futures");

Valid cases:                   247      Dependent variable:                Spot
Missing cases:                   0      Deletion method:                   None
Total SS:             47960127.957      Degrees of freedom:                 245
R-squared:                   1.000      Rbar-squared:                     1.000
Residual SS:             11692.797      Std error of est:                 6.908
F(1,245):              1004666.955      Probability of F:                 0.000

                            Standard                 Prob   Standardized  Cor with
Variable        Estimate      Error      t-value     >|t|     Estimate    Dep Var
-------------------------------------------------------------------------------

CONSTANT        -2.83784     1.48897     -1.9059     0.058       ---         ---
Futures          1.00161 0.000999277     1002.33     0.000    0.999878    0.999878

## Example 2: Estimate Univariate Regression of Spot Returns on Futures Returns

This example demonstrates how to transform the variables into logarithmic returns and estimate the equation:

$\text{Ret_Spot} = \alpha + \beta_1\text{Ret_Futures} + \epsilon$

### Getting Started

To run this example on your own, you will need to follow the data loading steps from the example above.

### How to

#### Step One: Compute log returns

Our first step is to define the procedure we will use to compute the log returns and apply it to our data. Our blog, The Basics of GAUSS Procedures, explains everything you need to know to understand this procedure.

// Define procedure to compute log returns
proc (1) = lnDiff(x);
    local x_diff;

    // Compute percentage log returns
    x_diff = 100 * ln(x ./ lagn(x, 1));

    // Remove all rows with missing values
    x_diff = packr(x_diff);

    retp(x_diff);
endp;

// Create new dataframe that contains the log difference of our variables
ret_data = lnDiff(data[., "Spot" "Futures"]);

#### Step Two: Change variable names

We could have combined this with the previous step, but we will do each step separately for clarity.

// Create a 2x1 string array using the string concatenation operator
names = "ret_spot" $| "ret_futures";

// Set variable names
ret_data = dfname(ret_data, names);


#### Step Three: Compute descriptive statistics

We can compute descriptive statistics on our new dataframe with the dstatmt() procedure as shown below.

// Compute descriptive statistics and print them
call dstatmt(ret_data);


The code above will print the following:

--------------------------------------------------------------------------------------------
Variable            Mean     Std Dev      Variance     Minimum     Maximum     Valid Missing
--------------------------------------------------------------------------------------------

ret_spot          0.4168       4.333         18.78      -18.56       10.23       246    0
ret_futures        0.414       4.419         19.53      -18.94       10.39       246    0

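The Valid count of 246 follows from how lnDiff() builds the series. As a sketch, writing $P_t$ for the level of Spot or Futures in period $t$ (notation assumed here, not used in the original), each return is

$$
r_t = 100 \times \ln\left(\frac{P_t}{P_{t-1}}\right), \qquad t = 2, \dots, 247
$$

The first observation has no lagged value, so lagn() returns a missing value there and packr() drops that row, leaving $247 - 1 = 246$ returns. Because of the factor of 100, the means and standard deviations above are expressed in percent.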

#### Step Four: Estimate linear model on return data

Finally, we regress ret_spot on ret_futures.

// Estimate the linear model and print the results
call olsmt(ret_data, "ret_spot ~ ret_futures");


The code above will print the following:

Valid cases:                   246      Dependent variable:            ret_spot
Missing cases:                   0      Deletion method:                   None
Total SS:                 4600.534      Degrees of freedom:                 244
R-squared:                   0.989      Rbar-squared:                     0.989
Residual SS:                51.684      Std error of est:                 0.460
F(1,244):                21474.923      Probability of F:                 0.000

                            Standard                 Prob   Standardized  Cor with
Variable        Estimate      Error      t-value     >|t|     Estimate    Dep Var
----------------------------------------------------------------------------------

CONSTANT       0.0130773   0.0294729    0.443707     0.658       ---         ---
ret_futures     0.975077  0.00665385     146.543     0.000    0.994367    0.994367
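
Several entries in this report can be cross-checked by hand. The following is a sketch using the rounded values printed above, with $\text{SE}$, $RSS$, $TSS$, and $T$ as shorthand (not in the original) for the standard error, Residual SS, Total SS, and the number of valid cases:

$$
\begin{aligned}
t_{\hat{\beta}_1} &= \frac{\hat{\beta}_1}{\text{SE}(\hat{\beta}_1)} = \frac{0.975077}{0.00665385} \approx 146.54 \\
R^2 &= 1 - \frac{RSS}{TSS} = 1 - \frac{51.684}{4600.534} \approx 0.989 \\
\hat{\sigma} &= \sqrt{\frac{RSS}{T - 2}} = \sqrt{\frac{51.684}{244}} \approx 0.460
\end{aligned}
$$

These agree with the reported t-value, R-squared, and Std error of est up to rounding.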