Stata: Efficient way to replace numerical values with string values

Question

I have code that currently looks like this:

replace fname = "JACK" if id==103
replace lname = "MARTIN" if id==103

replace fname = "MICHAEL" if id==104
replace lname = "JOHNSON" if id==104

This goes on for multiple pages, assigning a first-name and last-name string to each ID. Is there a more efficient way to do this en masse, perhaps with the recode command?

Nick Cox · Accepted Answer · 2013-10-01 23:07:43Z

I will echo the other answers that suggest a merge is the best way to do this.

But if you absolutely must code the lines item-wise (again, messy), you can generate a long list ("pages") of replace commands by using MS Excel to "help" you write the code. Here is a sketch of such an Excel sheet with one example row, showing the MS Excel formula:

        columns:
          A         B      C     D
row: 1  last      first    id   code
     2  MARTIN    JACK    103   ="replace fname=^"&B2&"^ if id=="&C2

Type that in, check that the formula result looks like Stata code (apart from the carets), and copy the formula in column D down to the end of your list. Then copy the whole block of generated code from column D into your do-file and find-and-replace every ^ with ", which yields proper Stata syntax. Be careful if the do-file also uses the caret for exponentiation: restrict the replacement to the pasted block. Note that the formula shown handles only fname; lname needs an analogous formula in another column.
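For the example row above, column D should come out as shown in the first comment below before the find-and-replace, and as the first replace command after it. The lname line is an assumption: it is what an analogous formula for the last name (not given in the answer) would produce.

```stata
* Column D as pasted from Excel, before find-and-replace (not yet valid Stata):
*   replace fname=^JACK^ if id==103

* After replacing every ^ with a double quote:
replace fname="JACK" if id==103

* Assumed output of an analogous lname formula (hypothetical, not in the answer):
replace lname="MARTIN" if id==103
```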

(This is truly a brute force way of doing this, and is less dynamic in the case that there are subsequent changes to your generation list. All--apologies in advance for answering a question here advocating use of Excel :) )

Nick Cox · Accepted Answer · 2013-09-30 07:50:56Z

You don't explain where the strings you want to add come from, but what is generally the best technique is explained at

http://www.stata.com/support/faqs/data-management/group-characteristics-for-subsets/index.html
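A minimal sketch of the merge approach, assuming the names are kept in a small lookup dataset keyed on id. The main dataset here is a made-up stand-in, and the tempfile name `names` is only illustrative. The names are the two from the question.

```stata
* Lookup ("dictionary") dataset: one row per id, names taken from the question
clear
input int id str10 fname str10 lname
103 "JACK"    "MARTIN"
104 "MICHAEL" "JOHNSON"
end
tempfile names
save `names'

* Stand-in for the main dataset (hypothetical; ids may repeat)
clear
input int id
103
104
103
105
end

* Attach the names; ids with no lookup row (105 here) keep empty names
merge m:1 id using `names'
drop if _merge == 2    // discard lookup rows that match no observation
drop _merge
list
```

If fname and lname already exist in the main data, as the replace commands in the question suggest, adding the `update replace` options to merge should let the lookup values overwrite the existing ones.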

3 Comments

Parseltongue Over a year ago

I'm not comfortable with putting it as a separate file-- it increases the likelihood of error with colleagues. Do you know how to implement Ram's strategy?

Peter Dutton Over a year ago

A merge process is probably the quickest solution. You can always save the merged data files and provide them to your colleagues, removing the need for them to repeat the merge.

Nick Cox Over a year ago

I don't see how using a separate file is any more error-prone than typing in the strings directly. Either way, this is exactly @Ram's strategy.

Ram (edited by Nick Cox) · Accepted Answer · 2013-10-01 23:05:56Z

Create an associative array of ids vs Fname,Lname

103 => JACK,MARTIN
104 => MICHAEL,JOHNSON
...

Replace id => hash{id} ( fname & lname )

The efficiency of doing this will be taken care of by the programming language used.
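One way to approximate such an id-to-name mapping inside a single Stata do-file, without a separate lookup file, is to hold parallel lists in local macros and loop over them. This is only a sketch, not code from any of the answers. The macro names and the stand-in dataset are illustrative, and names containing spaces would need extra quoting.

```stata
* Stand-in data (hypothetical): one row per id, empty name variables
clear
input int id
103
104
end
generate str1 fname = ""
generate str1 lname = ""

* Parallel lists: the i-th word of each list belongs to the same person
local ids    103 104
local fnames JACK MICHAEL
local lnames MARTIN JOHNSON

local n : word count `ids'
forvalues i = 1/`n' {
    local id : word `i' of `ids'
    local f  : word `i' of `fnames'
    local l  : word `i' of `lnames'
    replace fname = "`f'" if id == `id'
    replace lname = "`l'" if id == `id'
}
list
```

The lists still have to be typed by hand, so for more than a handful of ids the merge approach recommended in the other answers is likely easier to maintain.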

edited Oct 1, 2013 at 23:05

Nick Cox

answered Sep 30, 2013 at 6:32

Ram

3 Comments

Parseltongue Over a year ago

How do you create 'associative arrays' in Stata? I've never seen this syntax.

Nick Cox Over a year ago

The principle is good. Note that this is not, and is not presented as, any kind of Stata syntax.

Fr. Over a year ago

The principle works efficiently only if you have a vectorized paste function to concatenate without looping. Can't remember if Stata does that -- it's not in h string_functions. All of this, of course, is to escape the good "dictionary" strategy (Nick's answer).
