I have a person-level dataset with three categorical variables, V1, V2 and V3. I want to use Proc Tabulate to calculate the means of variables X1, X2 and X3 by the three categories listed above, as well as a count of persons and the percentage out of V1 and V2 (i.e., when V3 is all). Here is my first attempt.

Proc tabulate data = in_data
              Out = out_data;
     Var X1 X2 X3; 
     Class V1 V2 V3;
     Table (V1 all) * (V2 all) * (V3 all), N mean pctn<V1 V2>;
Run;

This gives me the error message “Statistics other than N was requested without analysis variable in the following nesting V1*V2*V3*Mean”. I don’t think I have the syntax quite right. Any ideas on how I can fix it? Thanks.

1 Answer

You need to include the analysis variables (X1, X2, X3) in the table statement and cross them with the statistics. I think this should work:

Proc tabulate data = in_data
          Out = out_data;
 Var X1 X2 X3; 
 Class V1 V2 V3;
 Table (V1 all) * (V2 all) * (V3 all), (X1 X2 X3)*(N mean);
Run;

This works for me:

Proc tabulate data = sashelp.class
          Out = out_data;
 Var age weight height; 
 Class sex;
 Table (sex all), (age weight height)*( N mean);
Run;

EDIT:

Your issue seems to be specific to your data. Either post sample data that reproduces it, or there's something else going on.

Here's a reproduction in which the class variables take the value 0, and there are no issues in the summary.

data have;
    do i=1 to 1000;
        v1=rand('bernoulli', 0.4);
        v2=rand('bernoulli', 0.7);
        x1=rand('uniform')*3+1;
        x2=rand('uniform')*9+1;
        output;
    end;
    drop i;
run;

proc print data=have(obs=10);
run;

proc tabulate data=have out=check;
class v1 v2;
var x1 x2;
table (v1 all) (v2 all), (x1 x2)*(n mean);
run;
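
If the dropped rows turn out to be missing rather than 0, the MISSING option on the Proc Tabulate statement should bring them back into the counts. Here is a minimal sketch of that, reusing the have data set from above. The have_miss data set and the rule that blanks out every 10th value of v2 are made up purely for illustration.

```sas
/* Hypothetical setup: copy HAVE and set v2 to missing on every 10th row. */
data have_miss;
    set have;
    if mod(_n_, 10) = 0 then v2 = .;
run;

/* Default behaviour: any row with a missing class value is left out, */
/* so N in the ALL rows comes to 900 instead of 1000.                 */
proc tabulate data=have_miss;
    class v1 v2;
    var x1 x2;
    table (v1 all) (v2 all), (x1 x2)*(n mean);
run;

/* MISSING treats missing as a valid class level, so those rows are   */
/* kept and v2 gets an extra row for the missing level.               */
proc tabulate data=have_miss missing;
    class v1 v2;
    var x1 x2;
    table (v1 all) (v2 all), (x1 x2)*(n mean);
run;
```

Comparing N in the ALL rows of the two tables shows whether missing class values account for the shortfall. If both runs give the same totals on your real data, missing values are not the cause.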

Comments

I discovered that the numbers did not come out right when I had more than one class. For a single class, as you provided in the example above, it worked fine but not when you have multiple classes. For some reason, the numbers are smaller than what they are supposed to be. But there was no clear reason/pattern why. Any ideas? Thanks.
Can you post sample data that demonstrates your issue?
I think the root of the problem is that it's excluding observations where V2 = "0" (that's why my totals are lower). Even though there are a number of observations where V2 = "0" in my data set, that value does not appear in the output data set. Any ideas as to why that might be the case?
Is it 0 or missing? By default, Proc Tabulate excludes any observation with a missing value for a class variable, and that applies to every statistic.
It's actually 0. The all rows fall short of the expected value by exactly the number of records where V2 is 0.