I have a nested loop in Stata with four levels of foreach statements. With this loop, I am trying to create a new variable named strata that ranges from 1 to 40.

    foreach x in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 {
        foreach r in 1 2 3 4 5 {
            foreach s in 1 2 {
                foreach a in 1 2 3 4 {
                    gen strata = `x' if race==`r' & sex==`s' & age==`a'
                }
            }
        }
    }

I get an error:

"variable strata already defined"

Even with the error, the loop does assign strata = 1, but not the rest of the strata. All other cells are missing/empty.

Example data:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(age sex race)
1 2 2
1 2 1
1 1 1
1 1 1
1 2 1
2 2 1
2 2 1
4 2 1
1 2 1
4 2 1
3 2 1
2 2 1
4 2 1
4 2 2
3 2 1
4 1 3
4 2 1
4 2 1
2 1 2
4 2 1
2 2 1
3 2 1
3 2 1
1 2 3
4 2 1
1 2 5
4 2 1
4 2 1
4 2 2
4 2 1
2 2 1
4 1 1
3 2 1
1 2 1
2 2 1
4 2 1
1 2 2
2 2 3
1 1 3
4 2 1
2 2 3
1 2 1
1 1 1
2 2 3
1 2 1
1 1 3
1 2 1
2 2 1
3 2 1
1 2 1
4 2 1
1 2 2
1 2 1
2 2 1
4 2 1
4 2 1
1 2 1
1 2 1
4 2 1
2 2 1
4 2 1
1 2 1
1 1 3
2 2 1
1 1 1
4 1 1
3 2 1
2 2 1
1 2 1
1 1 1
2 2 3
4 2 2
2 2 1
2 2 1
3 2 1
2 2 2
3 2 1
2 1 1
1 1 1
3 2 1
1 2 3
4 2 1
4 2 1
2 2 1
1 2 1
1 1 1
3 2 1
4 2 1
2 2 3
1 2 3
4 2 1
3 2 1
2 2 1
4 2 1
3 2 1
2 1 1
1 2 1
2 2 1
2 2 3
1 1 1
end
label values sex sex
label def sex 1 "male (1)", modify
label def sex 2 "female (2)", modify
label values race race
label def race 1 "non-Hispanic white (1)", modify
label def race 2 "black (2)", modify
label def race 3 "AAPI/other (3)", modify
label def race 5 "Hispanic (5)", modify

1 Answer

generate is for generating new variables. The second time your code reaches a generate statement, the code fails for the reason given.

One answer is that you need to generate your variable outside the loops and then replace inside.

For other reasons your code can be rewritten in stages.

First, integer sequences can be more easily and efficiently specified with forvalues, which can be abbreviated: I tend to write forval.

gen strata = . 
forval x = 1/40 {
    forval r = 1/5 {
        forval s = 1/2 {
            forval a = 1/4 {
                replace strata = `x' if race==`r' & sex==`s' & age==`a'
            }
        }
    }
}

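Second, the code is flawed anyway. Everything ends up as 40! For each value of x, the inner loops visit every race-sex-age cell, so each pass overwrites the previous one and the last pass, with x equal to 40, wins.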
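If the intent was to give each race-sex-age cell its own number, one way to keep the loops is to drop the outer loop over 1/40 and increment a counter instead. This is only a sketch of my reading of the goal: the order in which cells are numbered (race outermost, age innermost) is an assumption, and the first few lines just build a toy dataset with one observation per cell so that the block runs on its own.

```stata
* Toy data: one observation per race-sex-age cell (5 x 2 x 4 = 40 cells)
clear
set obs 5
gen race = _n
expand 2
bysort race : gen sex = _n
expand 4
bysort race sex : gen age = _n

* Number the cells with a counter rather than an outer loop over 1/40
gen strata = .
local x = 0
forval r = 1/5 {
    forval s = 1/2 {
        forval a = 1/4 {
            local ++x
            quietly replace strata = `x' if race == `r' & sex == `s' & age == `a'
        }
    }
}

* Every cell should now have a distinct identifier between 1 and 40
isid strata
assert inrange(strata, 1, 40)
```

Here the counter advances once per cell, so no cell is overwritten by a later pass.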

Third, you can do allocations much more directly, say by

gen strata = 8 * (race - 1) + 4 * (sex - 1) + age  

This is a self-contained reproducible demonstration:

clear
set obs 5
gen race = _n
expand 2
bysort race : gen sex = _n
expand 4
bysort race sex : gen age = _n
gen strata  = 8 * (race - 1) + 4 * (sex - 1) + age
isid strata

Clearly you can and should vary the recipe for a different preferred scheme.
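A different technique, not the formula above, is the group() function of egen, which numbers the distinct combinations of its arguments. The sketch below uses five rows taken from the example data in the question. Note the difference in behaviour: group() counts only the combinations that actually occur, in sorted order, so its values need not match the formula when some cells are empty, and observations with a missing value in any of the variables get a missing result unless the missing option is specified.

```stata
* Five rows from the question's example data
clear
input byte(age sex race)
1 2 2
1 2 1
1 1 1
4 1 3
1 2 5
end

* group() assigns 1, 2, ... to the distinct combinations present,
* ordered by race, then sex, then age
egen strata = group(race sex age)

sort strata
list race sex age strata
```

With these five rows there are five distinct combinations, so strata runs from 1 to 5 rather than up to 40.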
