Adding second variable in foreach command - Stata

Question

I have datasets like this:

C:\temp\SalesFigures FY13.dta
C:\temp\SalesFigures FY14.dta
C:\temp\SalesFigures FY15.dta
etc.

Each file contains sales data from 50 states. I often need to run a block of code for just some of the states in these files. I specify those states in a file called StatesToRun.dta (e.g., AK, CA, WA) and use a foreach command to loop through each state. I also use a macro to specify the FY .dta file I want to use.

For example:

* Specify file to run. 
local FY "FY14"

* Run code only for the states I list in StatesToRun.dta.
use "C:/temp/StatesToRun.dta", clear
levelsof state, local(statelist)

foreach MyState of local statelist 

{

use "C:/temp/SalesFigures `FY'.dta", clear
keep if state == `"`MyState'"' 
* etc. ...

}

THE NEED

I sometimes need to run my code for several of the FY files in C:\temp. So I'd like to create a loop for that, too. For example, if I wanted to run the code for AK, CA, and WA, for the FY14 and FY15 .dta files, I'd enter "AK", "CA", and "WA" for state in StatesToRun.dta, and "FY14" and "FY15" for a variable I could call "FY" in StatesToRun.dta. I'm just not sure how to incorporate this second variable into the loop. I read you can nest foreach statements, but I'm not sure if that's the best approach.

Being rather new to Stata, this is my best guess:

* Run code only for the states and FYs I list in StatesToRun.dta.
use "C:/temp/StatesToRun.dta", clear
levelsof state, local(statelist)
levelsof FY, local(FYlist)

foreach MyState of local statelist {
foreach MyFY of local FYlist {

use "C:/temp/SalesFigures `MyFY'.dta", clear
keep if state == `"`MyState'"' 
* etc. ...

}
}

Am I on the right path?

The first example won't run; the open curly brace must be on the same line as the foreach.
– Roberto Ferrer, Commented Feb 27, 2016 at 6:31
It seems needlessly indirect to put e.g. AK CA WA in a dataset just to take them out again. Why not type them directly?
– Nick Cox, Commented Feb 27, 2016 at 10:04
Roberto - correct. That was a typo. Nick - The list of states to run is used for about 8 other routines and separate syntaxes, so it's easy to store them all in one place (an external file), so each routine can reference that single file.
– Larry, Commented Feb 27, 2016 at 18:24

Roberto Ferrer · Accepted Answer · 2016-02-27 06:21:21Z

You don't need a loop (or a macro) to keep the observations dictated by a "list" in another dataset. You can use merge:

clear
set more off

*----- example file with list of interest ----

sysuse auto
keep make
drop in 6/69

list

tempfile MakesToRun
save "`MakesToRun'"

*---- work with selected observations ----

clear
set more off

sysuse auto
keep make price mpg rep78

list

// keep only the observations that appear in the list of interest
merge 1:1 make using "`MakesToRun'", keep(matched)

list

Check help merge and the corresponding manual entry to get a good grasp of how it works.

You can do this for multiple files using a loop.
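
A minimal sketch of such a loop, using the file names from the question. It assumes that StatesToRun.dta has exactly one row per state (so that m:1 is valid) and that the fiscal years are typed directly into the loop; both are assumptions, not details confirmed by the question.

```stata
* Sketch only: paths and the FY14/FY15 list are borrowed from the question.
foreach MyFY in FY14 FY15 {
    use "C:/temp/SalesFigures `MyFY'.dta", clear

    * Assumption: state uniquely identifies rows in StatesToRun.dta.
    * keep(match) drops every state that is not in the list of interest.
    merge m:1 state using "C:/temp/StatesToRun.dta", keep(match) nogenerate

    * etc. ...
}
```

After the merge, the data in memory holds only the listed states for that fiscal year, so the remaining commands run on all selected states at once rather than one state per iteration.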

Maybe there's a better way to set up the whole thing, but we don't have enough information.


1 Comment

Larry Over a year ago

The full code inside the loop is a few hundred lines long. The keep if command is just the first of many different commands that make use of the MyState value (like tempfile holding, save "`holding'", opening another file, dropping the MyState, appending, replacing, etc.). MyState is also used later for saving files with the state name (e.g., save "... Outcomes for `MyState'.dta"), etc.
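
When the state name is needed throughout the block, as described in this comment, nested foreach loops along the lines of the question's second attempt should work. The sketch below assumes StatesToRun.dta contains string variables named state and FY, as proposed in the question. The two details that matter are that each opening brace sits on the same line as its foreach, and that local macros are referenced with a backtick and an apostrophe.

```stata
* Read both lists once, before any other dataset is loaded.
* Assumption: state and FY are string variables in StatesToRun.dta.
use "C:/temp/StatesToRun.dta", clear
levelsof state, local(statelist)
levelsof FY, local(FYlist)

foreach MyFY of local FYlist {
    foreach MyState of local statelist {
        use "C:/temp/SalesFigures `MyFY'.dta", clear
        keep if state == `"`MyState'"'

        * ... the rest of the block can refer to MyState and MyFY ...
        display "Finished `MyState' for `MyFY'"
    }
}
```

levelsof skips missing values by default, so blank cells in the shorter of the two columns do not add empty entries to either list. Every state is run for every fiscal year; the order of nesting only changes the order in which the combinations are processed.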
