Author: Bailey DeBarmore

Short post today on how to use the MEAN function in SAS 9.4. Let's get started.

It seems like every time I need to calculate a mean variable in SAS, I find myself looking up which functions deal with missing values in this way, and which in that way. For example, blood pressure readings are often taken 3 times, and then we average those 3 readings together for a mean value.

In some code I ran earlier this morning, I kept getting negative values in my "avg_bp" variable. What's up with that? The code I had used was:

    avg_bp=mean(bp1-bp3)

Looks OK, right? But without "of", SAS reads bp1-bp3 as a subtraction, so this returns the mean of a single value (bp1 minus bp3), which is negative whenever the third reading is higher than the first. Then I tried:

    avg_bp=average(bp1-bp3)

Guess what? There's no AVERAGE function in SAS. Oops! Then I tried:

    avg_bp=mean(of bp1-bp3)

and ta da! Perfecto!

Syntax for MEAN function
The syntax of the MEAN function varies depending on how you list your variables.
1) If you use a dash list, like I did, you need to include "of".
2) If you separate variables with commas, you do not need to include "of".

SUM versus MEAN
So why wouldn't you just use

    avg_bp=SUM(of bp1-bp3)/3

Well, SUM ignores missing values, which amounts to treating a missing value as zero. Usually if we only have 2 out of 3 values, when we calculate the average, we would do bp1+bp2 divided by 2, right? If you use SUM and a fixed divisor, you need to be sure that you have NO MISSINGNESS. In contrast, the MEAN function manages missing values appropriately: it divides by the number of nonmissing values. That is, if you're missing 1 of 3 values, it will divide by 2 (not 3).

NB: Don't just exclude all your missing values! That's a post for another day, but briefly, if you exclude observations with missing values you can introduce bias. Persons missing all 3 values versus 1 or 2 of the 3 may be different in some systematic way. You're also reducing your precision by excluding those.
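As a rough sketch of the points above, the DATA step below compares the three expressions on a few made-up readings. The dataset names (bp, bp_avg), the id variable, the new variable names, and the values themselves are all hypothetical, chosen only for illustration.

```sas
/* Hypothetical example data: three blood pressure readings per person */
data bp;
  input id bp1 bp2 bp3;
  datalines;
1 120 124 122
2 130 . 134
3 118 . .
;
run;

data bp_avg;
  set bp;
  wrong_diff = mean(bp1-bp3);        /* one argument: bp1 minus bp3 */
  avg_of     = mean(of bp1-bp3);     /* mean of bp1, bp2, bp3; missing values ignored */
  avg_comma  = mean(bp1, bp2, bp3);  /* same result with a comma-separated list */
  sum_div3   = sum(of bp1-bp3) / 3;  /* fixed divisor: too low if a reading is missing */
  n_readings = n(of bp1-bp3);        /* count of nonmissing readings */
run;

proc print data=bp_avg noobs;
run;
```

With these made-up values, person 2 (second reading missing) should get avg_of = 132 but sum_div3 = 88, while wrong_diff is -4, the kind of negative value described earlier. The N function is included only as a convenient way to see how many readings went into each mean.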
Take Home Message
In summary, to calculate a mean value, use the MEAN function so that missing values are handled appropriately, and use the correct syntax to avoid wonky results: use "of" with a dash list, or otherwise use a comma-separated list.

    NEWVAR = mean(of VAR1-VAR3)

That's all for today - quick and dirty, but useful. Now every time I look this up, I'll just come back to my own blog post! Just kidding - hopefully writing this post for you guys has solidified the concept.

Bailey