I am using proc tabulate to create an output dataset with statistics (n mean std min max p25 p75 median) for a variable with a long name (close to the 32-character maximum). The output dataset appends _n, _std, etc. to the variable name, but the median variable is just named "Median", because appending "_median" would make the resulting variable name longer than 32 characters.

Is there a way to specify the names of the variables in the output dataset from within the proc tabulate step? I am looping through thousands of variables for this procedure, so it's not feasible to rename each variable in a data step. Also, it must be proc tabulate and not proc freq, because we need to output a row for every possible value of each variable, not just the values that exist in the data.

proc tabulate data=DATA out=OUT;
  var VERY_LONG_VARIABLE_NAME;
  table VERY_LONG_VARIABLE_NAME*(n mean std min max p25 p75 median) / printmiss;
run;
  • Proc tabulate is what I would call a reporting procedure: it is designed to output tables to a report, not to create summary datasets. PROC MEANS and PROC FREQ are summary procs designed to create summary tables. Commented Jan 6, 2017 at 5:02

1 Answer

Unfortunately I don't know of a way to override the tabulate names. Even transposing the tabulate doesn't fix that - you still get the same result, sadly.

My suggestion is to use a different proc. Almost all of the procs you might use have an equivalent of PRINTMISS that gets you what you want. For example, PROC FREQ has the SPARSE option, which does basically the same thing (despite its odd name), and PROC SUMMARY or PROC MEANS might be even better (with COMPLETETYPES on the PROC statement, optionally combined with PRELOADFMT on the CLASS statement), depending on your data. PROC MEANS also lets you name each output statistic yourself on the OUTPUT statement.
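As a rough sketch of the PROC MEANS route: the OUTPUT statement accepts explicit names for each statistic, so the 32-character limit is avoided by construction. COMPLETETYPES goes on the PROC statement and PRELOADFMT on the CLASS statement. Everything other than DATA, OUT, and VERY_LONG_VARIABLE_NAME is an assumption made for illustration: GROUP is a hypothetical numeric class variable, grpfmt is a hypothetical format listing every level that should appear, and the v_ prefix is just one possible naming scheme.

```sas
/* Hypothetical format that enumerates every level that should get a row,
   whether or not it occurs in the data. */
proc format;
  value grpfmt 1 = 'A'
               2 = 'B'
               3 = 'C';
run;

/* COMPLETETYPES + PRELOADFMT: emit a row for each formatted level of GROUP.
   NWAY keeps only the rows classified by GROUP (drops the overall row). */
proc means data=DATA noprint nway completetypes;
  class GROUP / preloadfmt;
  format GROUP grpfmt.;
  var VERY_LONG_VARIABLE_NAME;
  /* Each statistic is written to an explicitly chosen, short name. */
  output out=OUT
    n=v_n mean=v_mean std=v_std min=v_min max=v_max
    p25=v_p25 p75=v_p75 median=v_median;
run;
```

Levels that never occur in the data should come out with a zero frequency and missing statistics. Whether this matches the layout you need from PRINTMISS depends on how the class variables are set up in the real process.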

Alternatively, you could reshape your data or your process. If you're really looping through thousands of variables, that's horribly inefficient. It would be better to reshape the data to a vertical variable|value structure and then run a single proc tabulate. That fixes your issue right there, because 'varname' becomes a CLASS or BY variable rather than part of the output variable name, and it makes your process faster.
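A minimal sketch of the vertical approach, assuming all the analysis variables are numeric and that the input has no existing variables called varname, value, or i (those names, and the view name LONG, are made up for this example):

```sas
/* Stack every numeric variable into (varname, value) pairs.
   Defined as a view so the long table is not written to disk. */
data LONG / view=LONG;
  set DATA;
  array nums {*} _numeric_;      /* numeric variables read from DATA */
  length varname $32;
  do i = 1 to dim(nums);
    varname = vname(nums{i});    /* original variable name, stored as data */
    value   = nums{i};
    output;
  end;
  keep varname value;
run;

/* One pass: the long names are now values of a CLASS variable,
   so the statistic columns are named after the short variable VALUE. */
proc tabulate data=LONG out=OUT;
  class varname;
  var value;
  table varname, value*(n mean std min max p25 p75 median) / printmiss;
run;
```

Because the statistic columns are derived from `value` instead of the long name, the suffixes should all fit within 32 characters. If the real job also has classification variables, they would need to be kept in the view and added to the CLASS and TABLE statements.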

You could also add a VIEW step before the tabulate that performs the rename for you; that would cost very little even in a macro loop.
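One way this could look, as a sketch: a data step view renames the variable to something short before tabulate ever sees it. The view name SHORT and the alias v are placeholders; in a macro loop the long name would presumably come from a macro variable.

```sas
/* The view only stores the query, so the rename costs almost nothing. */
data SHORT / view=SHORT;
  set DATA(keep=VERY_LONG_VARIABLE_NAME
           rename=(VERY_LONG_VARIABLE_NAME=v));
run;

/* Tabulate now builds its output names from the short alias. */
proc tabulate data=SHORT out=OUT;
  var v;
  table v*(n mean std min max p25 p75 median) / printmiss;
run;
```

If the original name is still needed downstream, it can be carried along separately, for instance as a character column added to OUT after each iteration.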

Either way, supply some sample data and an example of the total process you're doing and likely you can get a better answer.
