dummy

Purpose

Creates a set of dummy (0/1) variables by breaking up a variable into specified categories. The highest (rightmost) category is unbounded on the right.

Format

y = dummy(x, v)

Parameters:

x (Nx1 vector) – Data that is to be broken up into dummy variables
v ((K-1)x1 vector) – Specifies the \(K-1\) breakpoints (these must be in ascending order) that determine the \(K\) categories to be used. These categories should not overlap.

Returns:

y (NxK matrix) – Contains the \(K\) dummy variables, one column per category.

Examples

// Set seed for repeatable random numbers
rndseed 135345;

// Create uniform random integers between 1 and 9
x = ceil(9*rndu(5, 1));

// Set the breakpoints
v = { 1, 5, 7 };

dm = dummy(x, v);

The code above produces four dummies based upon the breakpoints in the vector v:

x <= 1
1 < x <= 5
5 < x <= 7
7 < x

which look like:

     0 1 0 0       2
     0 0 0 1       9
dm = 0 1 0 0   x = 4
     0 0 1 0       7
     1 0 0 0       1
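
As a cross-check, the same four dummies can be built by hand with element-wise comparisons. This is a minimal sketch of the category rule, not the library's implementation. It hard-codes the x values shown above, and the names d1 to d4 and dm_manual are illustrative only.

```gauss
// The x values from the example output
x = { 2, 9, 4, 7, 1 };

// One 0/1 column per category for breakpoints 1, 5 and 7
d1 = x .<= 1;                    // x <= 1
d2 = (x .> 1) .and (x .<= 5);    // 1 < x <= 5
d3 = (x .> 5) .and (x .<= 7);    // 5 < x <= 7
d4 = x .> 7;                     // 7 < x (unbounded on the right)

// Concatenate the columns horizontally into a 5x4 matrix
dm_manual = d1 ~ d2 ~ d3 ~ d4;

print dm_manual;
```

Each row of dm_manual contains exactly one 1, because the categories do not overlap and together cover every value of x.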

Remarks

Missing values are deleted before the dummy variables are created.
All categories are open on the left (i.e., do not contain their left boundaries) and all but the highest are closed on the right (i.e., do contain their right boundaries). The highest (rightmost) category is unbounded on the right. Thus, only \(K-1\) breakpoints are required to specify \(K\) dummy variables.
The function dummybr() is similar to dummy(), but in that function the highest category is bounded on the right. The function dummydn() is also similar to dummy(), but in that function a specified column of dummies is dropped.
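
The three functions can be compared on the same data. The sketch below assumes the call forms dummybr(x, v) and dummydn(x, v, p), where p is the index of the column to drop. Neither signature is given on this page, so check them against the reference pages for those functions. The variable names are illustrative.

```gauss
x = { 2, 9, 4, 7, 1 };
v = { 1, 5, 7 };

// dummy: the highest category (7 < x) is unbounded on the right
d_all = dummy(x, v);

// dummybr (assumed signature): the highest category is bounded on the right,
// so a value above the last breakpoint, such as x = 9, falls in no category
d_br = dummybr(x, v);

// dummydn (assumed signature): like dummy, but one column is dropped;
// here the third argument is taken to select the first column
d_dn = dummydn(x, v, 1);

print d_all;
print d_br;
print d_dn;
```

Dropping one column is the usual way to avoid perfect collinearity with an intercept when the dummies are used as regressors.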

Source

datatran.src