This is a newbie SAS question. I have a dataset with numerical variables v1-v120, V and a categorical variable Z(with say three possible values). For each possible value of Z, I would like to get another set of variables w1-w120, where w{i}=sum(v{i}}/V, where the sum is a sum over a given value of Z. Thus I am looking for 3*120 matrix in this case. I can do this in data step, but would like to do it by Proc SQL or Proc MEANS, as the number of categorical variables in the actual dataset is moderately large. Thanks in advance.
Add a comment
|
2 Answers
Here's a solution using proc sql. You could probably also do something similar with proc means using an output dataset and a 'by' statement.
data t1;
input z v1 v2 v3;
datalines;
1 2 3 4
2 3 4 5
3 4 5 6
1 7 8 9
2 4 7 9
3 2 2 2
;
run;
%macro listForSQL(varstem1, varstem2, numvars);
%local numWithCommas;
%let numWithCommas = %eval(&numvars - 1);
%local i;
%do i = 1 %to &numWithCommas;
mean(&varstem1.&i) as &varstem2.&i,
%end;
mean(&varstem1.&numvars) as &varstem2.&numvars
%mend listForSQL;
proc sql;
create table t2 as
select
z,
%listForSQL(v, z, 3)
from t1
group by z
;
quit;
1 Comment
user227290
Thank you ..I thought we need to use Macro for this
It's easy to do this with proc means. Using the t1 data set from Louisa Grey's answer:
proc means data=t1 nway noprint;
class z;
var v1-v3;
output out=t3 mean=w1-w3;
run;
This creates an table of results that match the SQL results.
2 Comments
user227290
Proc means will work.. but if you need to create a new variable (within the proc statement) using two existing variables then proc means fails
itzy
Not sure what you mean -- are you using two variables in the question example? You can always you a data view to generate variables without actually using I/O resources.