strreplace#
Purpose#
Replace all matches of a substring with a replacement string.
Format#
- str_new = strreplace(str, search, replace)#
- Parameters:
str (string, string array, or categorical variable of a dataframe) – to be searched and modified.
search (string) – the substring to search for in str.
replace (string) – the substring with which to replace all instances of search found in str.
- Returns:
str_new (string) – new string, string array of categorical variable of a dataframe which is the same as str, except that all instances of search have been replaced with replace.
Examples#
Basic search and replace#
// String to be searched in
str = "My doctor recommends more chocolates, because chocolates are healthy.";
// String to be searched for
search = "chocolate";
// String to be replaced with
replace = "vegetable";
// Build new string
new_str = strreplace(str, search, replace);
After the code above, new_str will be set to:
My doctor recommends more vegetables, because vegetables are healthy.
Regularize addresses in string array#
// String array to be searched
str = "100 Main Ave" $|
"112 Charles Avenue" $|
"49 W State St" $|
"24 Third Avenue";
// String to search for
search = "Avenue";
// String to replace with
replace = "Ave";
// Build new string
new_str = strreplace(str, search, replace);
After the code above, new_str will be set to:
"100 Main Ave"
"112 Charles Ave"
"49 W State St"
"24 Third Ave"
Change the name of categories in a dataframe#
// Create 5x1 string array
states = "CA" $| "FL" $| "California" $| "California" $| "FL";
// Convert the string array to a dataframe
// with the variable name 'States'
df_states = asdf(states, "States");
print df_states;
States
CA
FL
California
California
FL
When asdf()
created the dataframe df_states
from the string array, states
, it made a categorical variable and assigned a separate category to each different string it found in states
.
We can use the getcollabels()
function to show the categories and labels found.
// Get the category labels and keys from the
// first (and only) variable in 'df_states'
{ label, keys } = getcollabels(df_states, 1);
After the above code:
labels = "CA" keys = 0
"California" 1
"FL" 2
// Replace the "California" label with "CA"
// and remove the "California" category
df_states = strreplace(df_states, "California", "CA");
print df_states;
States
CA
FL
CA
CA
FL
// Get the new category labels and keys
{ label, keys } = getcollabels(df_states, 1);
As we see below, the observations that previously had the label "California"
and a key value of 1, have now been merged with the "CA"
category.
labels = "CA" keys = 0
"FL" 2
See also
Functions recodecatlabels()
, reordercatlabels()
, setbasecat()
, strrindx()
, strsect()