Splitting a string variable in Stata is generally easy. In my case, however, I am having trouble reorganizing the values that result from the split. The variable holds a list of characteristics associated with each observation and looks like this:
Variable_Name
No Phosphates
No Perfumes; No Phosphates; Private Label
No Perfumes; Private Label
Private Label
If I use the code split Variable_Name, p("; "), I get
Variable_Name1   Variable_Name2   Variable_Name3
No Phosphates
No Perfumes      No Phosphates    Private Label
No Perfumes      Private Label
Private Label
How can I rearrange the values so that the result looks something like this?
Variable_Name1   Variable_Name2   Variable_Name3
No Phosphates
No Phosphates    No Perfumes      Private Label
                 No Perfumes      Private Label
                                  Private Label
In other words, how can I group the same characteristic under the same column?
Here is the full code:
clear
input str50 Variable_Name
"No Phosphates"
"No Perfumes; No Phosphates; Private Label"
"No Perfumes; Private Label"
"Private Label"
end
split Variable_Name, p("; ")
The challenge is that the number of distinct characteristics is unknown, so I cannot identify them by hand and sort them into columns, nor can I look up specific string values one by one.
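For reference, here is a minimal sketch of one way this might be approached: reshape the split pieces to long form, number each distinct characteristic with egen's group() function, and reshape back to wide using that number as the column index. The names id, part, seq and col are hypothetical and introduced only for this illustration. The sketch assumes every observation has at least one characteristic, since observations with an empty Variable_Name would be dropped along the way.

```stata
clear
input str50 Variable_Name
"No Phosphates"
"No Perfumes; No Phosphates; Private Label"
"No Perfumes; Private Label"
"Private Label"
end

* Hypothetical observation identifier, needed by reshape
gen long id = _n

* Split into part1, part2, ... (a stub other than Variable_Name,
* so that reshape long does not collide with the original variable)
split Variable_Name, p("; ") gen(part)

* One row per (observation, characteristic)
reshape long part, i(id) j(seq)
drop if missing(part)

* Guard against a characteristic listed twice in the same observation,
* which would otherwise make reshape wide fail
duplicates drop id part, force

* Number each distinct characteristic; this number becomes the column index
egen col = group(part)
drop seq

* Back to one row per observation, one column per characteristic
reshape wide part, i(id) j(col)
```

Because group() numbers the distinct values in sorted order, the columns come out alphabetically (part1 for No Perfumes, part2 for No Phosphates, part3 for Private Label) rather than in the exact order shown in the desired layout above. The number of columns adapts to however many distinct characteristics the data contain.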