I have data in the following format:

data have;
input id rtl_apples rtl_oranges rtl_berries;
    datalines;
1 50 60 10
2 10 30 80
3 40 8 1
;

I'm trying to create new variables, PCT_APPLES, PCT_ORANGES, and PCT_BERRIES, each holding the corresponding RTL variable's share of the sum of the RTL variables. The problem is that I'm doing this within a macro, so the names and number of RTL variables will vary with each iteration, and the new variable names need to be generated dynamically.

This data step essentially gets what I need, but the new variables are named PCT1, PCT2, ..., PCTn, so it's difficult to know which RTL variable each PCT corresponds to.

data want;
set have;
array rtls[*] rtl_:;
total_sales = sum(of rtl_:);
call symput("dim",dim(rtls));
array pct[&dim.];
do i=1 to dim(rtls);
    pct[i] = rtls[i] / total_sales;
end;
drop i;
run;

I also tried creating the new variable name by using a macro variable, but only the last variable in the array is created (in this case, PCT_BERRIES).

data want;
set have;
array rtls[*] rtl_:;
total_sales = sum(of rtl_:);
do i=1 to dim(rtls);
    var_name = compress(tranwrd(upcase(vname(rtls[i])),'RTL','PCT'));
    call symput("var_name",var_name);
    &var_name. = rtls[i] / total_sales;
end;
drop i var_name;
run;

I have a feeling I'm overcomplicating this, so any help would be appreciated.

  • What is the source of the existing names? Are the PCT_BERRIES etc names being generated by a program? Or are they just created by some external process that is not under your control? Commented Nov 16, 2021 at 17:49
  • The apples, oranges, and berries names are outside of my control, but I'm adding the RTL prefix myself, using something like proc transpose prefix=RTL_. Commented Nov 16, 2021 at 17:55
  • So you have the list in data already? Just use that to make a macro variable with the list of names. Commented Nov 16, 2021 at 18:08
  • @Tom I just realized that as I was posting the comment. I was hoping to do it all within the data step but that's just as easy. Thanks! Commented Nov 16, 2021 at 18:10

3 Answers

If you have the list of names in data already then use the list to create the names you need for your arrays.

proc sql noprint;
  select distinct cats('RTL_',name),cats('PCT_',name)
  into :rtl_list separated by ' '
     , :pct_list separated by ' '
  from dataset_with_names
  ;
quit;

data want;
  set have;
  array rtls &rtl_list;
  array pcts &pct_list;
  total_sales = sum(of rtls[*]);
  do index=1 to dim(rtls);
    pcts[index] = rtls[index] / total_sales;
  end;
  drop index ;
run;
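
If no separate data set of names is at hand, the same two lists could probably be built from the metadata of the table itself. The following is only a sketch under stated assumptions: the table is WORK.HAVE (as in the question), every variable of interest starts with RTL_, and the macro variable names rtl_list and pct_list are reused from the code above. DICTIONARY.COLUMNS is a standard PROC SQL metadata view; it stores LIBNAME and MEMNAME in uppercase.

```sas
/* Assumption: the input table is WORK.HAVE and the variables of    */
/* interest all begin with RTL_ (case-insensitive).                 */
proc sql noprint;
  select name
       , cats('PCT_', substr(name, 5))   /* drop the 4-char RTL_ prefix */
    into :rtl_list separated by ' '
       , :pct_list separated by ' '
  from dictionary.columns
  where libname = 'WORK'
    and memname = 'HAVE'
    and upcase(substr(name, 1, 4)) = 'RTL_'
  order by varnum                        /* keep both lists in table order */
  ;
quit;

data want;
  set have;
  /* Both lists were built from the same rows, so element i of pcts */
  /* corresponds to element i of rtls.                              */
  array rtls[*] &rtl_list;
  array pcts[*] &pct_list;
  total_sales = sum(of rtls[*]);
  do i = 1 to dim(rtls);
    pcts[i] = rtls[i] / total_sales;
  end;
  drop i;
run;
```

Because the PROC SQL step finishes before the data step is compiled, both macro variables are already resolved when the ARRAY statements are read, which is the timing the attempts in the question were missing.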

You can't create variables while a data step is executing. This program uses PROC TRANSPOSE to create a new data set whose variables are the RTL_ variables "renamed" to PCT_.

data have;
   input id rtl_apples rtl_oranges rtl_berries;
   datalines;
1 50 60 10
2 10 30 80
3 40 8 1
;;;;
   run;
proc transpose data=have(obs=0) out=names;
   var rtl_:;
   run;
data pct;
   set names;
   _name_ = transtrn(_name_,'rtl_','PCT_');
   y = .;
   run;
proc transpose data=pct out=pct2;
   id _name_;
   var y;
   run;
data want;
   set have;
   if 0 then set pct2(drop=_name_);
   array _rtl[*] rtl_:;
   array _pct[*] pct_:;
   call missing(of _pct[*]);
   total = sum(of _rtl[*]);
   do i = 1 to dim(_rtl);
      _pct[i] = _rtl[i]/total*1e2;
      end;
   drop i;
   run;

proc print;
   run;

You may want to just report the row percents:

  proc transpose data=&data out=&data.T;
    by id;
    var rtl_:;
  run;

  proc tabulate data=&data.T;
    class id _name_;
    var col1;
    table 
      id=''
    , _name_='Result'*col1=''*sum=''
      _name_='Percent'*col1=''*rowpctsum=''
    / nocellmerge;
  run;
