Splitting a string variable in Stata, and placing values in order

Question

Splitting a string variable in Stata is generally easy to do. However, in my case, I am having trouble rearranging the resulting values. The variable holds a list of characteristics associated with each observation and looks like this:

Variable_Name
No Phosphates
No Perfumes; No Phosphates; Private Label
No Perfumes; Private Label
Private Label

If I use the code split Variable_Name, p("; "), I get

Variable_Name1      Variable_Name2      Variable_Name3
No Phosphates              
No Perfumes         No Phosphates       Private Label
No Perfumes         Private Label       
Private Label

How can I rearrange the values so that they look something like this?

Variable_Name1      Variable_Name2        Variable_Name3        
No Phosphates              
No Phosphates       No Perfumes            Private Label
                    No Perfumes            Private Label      
                                           Private Label

In other words, how can I group the same characteristics under the same column?

Here is the full code:

clear
input str50 Variable_Name 
"No Phosphates"
"No Perfumes; No Phosphates; Private Label"
"No Perfumes; Private Label"
"Private Label"
end

split Variable_Name, p("; ")

The challenge is that I have an unknown number of characteristics. It would be impossible for me to identify them and sort them into columns by hand, or to look up particular string values.

Nick Cox · Accepted Answer · 2016-06-04 22:19:48Z


The code below shows one technique using reshape. Note that the result is entirely sensitive to small differences in spelling, etc.: two pieces that differ in any character end up in different columns.

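clear
input str100 what
"No Phosphates"
"No Perfumes; No Phosphates; Private Label"
"No Perfumes; Private Label"
"Private Label"
end
* split on ";" alone; the leading blanks left behind are trimmed below
split what, p(;)
rename what original
gen id = _n
* one observation per (id, piece); _j records the position of the piece
reshape long what, i(id)
replace what = trim(what)
* number the distinct characteristics (alphabetical order for strings)
egen group = group(what)
* empty pieces get a missing group and are dropped
drop if missing(group)
drop _j
* one column per distinct characteristic
reshape wide what, i(id) j(group)
list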
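
Regarding the sensitivity to spelling noted above, one possible mitigation is to normalise each piece before calling egen. The following is only a sketch: it assumes the differences are limited to letter case and stray blanks, and the messy input lines are made up for illustration. Genuine misspellings would still end up in separate columns.

```stata
clear
input str100 what
"No Phosphates"
"no perfumes;  No phosphates; Private Label"
"No Perfumes ;Private  label"
"PRIVATE LABEL"
end
split what, p(;)
rename what original
gen id = _n
reshape long what, i(id)
* lower case, collapse runs of internal blanks, strip leading/trailing blanks
replace what = trim(itrim(lower(what)))
egen group = group(what)
drop if missing(group)
drop _j
reshape wide what, i(id) j(group)
list
```

The only change from the code above is the replace line, so the normalised spelling is what appears in the new columns; the untouched text remains available in original.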
