I have datasets like this:

C:\temp\SalesFigures FY13.dta
C:\temp\SalesFigures FY14.dta
C:\temp\SalesFigures FY15.dta
etc. 

Each file contains sales data from 50 states. I often need to run a block of code for just some of the states in these files. I specify those states in a file called StatesToRun.dta (e.g., AK, CA, WA) and use a foreach command to loop through each state. I also use a macro to specify the FY .dta file I want to use.

For example:

* Specify file to run. 
local FY "FY14"

* Run code only for the states I list in StatesToRun.dta.
use "C:/temp/StatesToRun.dta", clear
levelsof state, local(statelist)

foreach MyState of local statelist 

{

use "C:/temp/SalesFigures `FY'.dta", clear
keep if state == `"`MyState'"' 
* etc. ...

} 

THE NEED

I sometimes need to run my code for several of the FY files in C:\temp. So I'd like to create a loop for that, too. For example, if I wanted to run the code for AK, CA, and WA, for the FY14 and FY15 .dta files, I'd enter "AK", "CA", and "WA" for state in StatesToRun.dta, and "FY14" and "FY15" for a variable I could call "FY" in StatesToRun.dta. I'm just not sure how to incorporate this second variable into the loop. I read you can nest foreach statements, but I'm not sure if that's the best approach.

Being rather new to Stata, this is my best guess:

* Run code only for the states and FYs I list in StatesToRun.dta.
use "C:/temp/StatesToRun.dta", clear
levelsof state, local(statelist)
levelsof FY, local(FYlist)

foreach MyState of local statelist {
foreach MyFY of local FYlist {

use "C:/temp/SalesFigures `MyFY'.dta", clear
keep if state == `"`MyState'"' 
* etc. ...

}
}

Am I on the right path?

  • The first example won't run; the open curly brace must be on the same line as the foreach. Commented Feb 27, 2016 at 6:31
  • It seems needlessly indirect to put e.g. AK CA WA in a dataset just to take them out again. Why not type them directly? Commented Feb 27, 2016 at 10:04
  • Roberto - correct. That was a typo. Nick - The list of states to run is used for about 8 other routines and separate syntaxes, so it's easy to store them all in one place (an external file), so each routine can reference that single file. Commented Feb 27, 2016 at 18:24

1 Answer

You don't need a loop (or a macro) to keep only the observations listed in another dataset. You can use merge:

clear
set more off

*----- example file with list of interest ----

sysuse auto
keep make
drop in 6/69

list

tempfile MakesToRun
save "`MakesToRun'"

*---- work with selected observations ----

clear
set more off

sysuse auto
keep make price mpg rep78

list

// keep only the observations that appear in the list of interest
merge 1:1 make using "`MakesToRun'", keep(matched)

list

Check help merge and the corresponding manual entry to get a good grasp of how it works.

You can do this for multiple files using a loop.
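
As a rough sketch of that idea, the same merge can be repeated once per file inside a foreach loop. The two temporary files below are hypothetical stand-ins for the yearly files (both are just copies of the auto data), and the names fy and data_FY14/data_FY15 are made up for illustration. With real sales files that hold several rows per state, m:1 would probably be the merge type to use instead of 1:1, but that is an assumption about data we haven't seen.

```stata
clear
set more off

* list of interest (stand-in for the external list file)
sysuse auto
keep make
drop in 6/69
tempfile MakesToRun
save "`MakesToRun'"

* two hypothetical stand-ins for the yearly files
foreach fy in FY14 FY15 {
    sysuse auto, clear
    keep make price mpg rep78
    tempfile data_`fy'
    save "`data_`fy''"
}

* repeat the same merge once per file
foreach fy in FY14 FY15 {
    use "`data_`fy''", clear
    merge 1:1 make using "`MakesToRun'", keep(match) nogenerate
    display "`fy': " _N " observations kept"
}
```

Note that macro references use a backtick to open and an apostrophe to close (`fy'), not two apostrophes.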

Maybe there's a better way to set up the whole thing, but we don't have enough information.


1 Comment

The full code inside the loop is a few hundred lines long. The keep if command is just the first of many commands that use the MyState value (tempfile holding, save `holding', opening another file, dropping MyState, appending, replacing, etc.). MyState is also used later for saving files with the state name (e.g., save "... Outcomes for `MyState'.dta").
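
If the state value has to stay available throughout the loop body, as described here, the nested foreach from the question is one workable route. The sketch below is only an illustration under stated assumptions: the toy data, the sales variable, and writing the files to the current working directory are all made up so that the example runs on its own. Two details are worth noticing. levelsof skips missing values, so the FY column can be shorter than the state column. The nested loops also run every state with every FY, not row-by-row pairs.

```stata
clear
set more off

* toy stand-in for the yearly files (hypothetical data, saved to the working directory)
input str2 state sales
"AK" 10
"CA" 20
"TX" 30
"WA" 40
end
foreach fy in FY14 FY15 {
    save "SalesFigures `fy'.dta", replace
}

* toy stand-in for StatesToRun.dta: three states, two fiscal years
clear
input str2 state str4 FY
"AK" "FY14"
"CA" "FY15"
"WA" ""
end
levelsof state, local(statelist)
levelsof FY, local(FYlist)

* every state is run against every listed fiscal year
foreach MyState of local statelist {
    foreach MyFY of local FYlist {
        use "SalesFigures `MyFY'.dta", clear
        keep if state == `"`MyState'"'
        display "`MyFY' `MyState': " _N " observation(s)"
        * ... rest of the routine, using `MyState' and `MyFY' ...
    }
}

* remove the toy files again
foreach fy in FY14 FY15 {
    erase "SalesFigures `fy'.dta"
}
```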
