BAILEY DEBARMORE

Calculate Mean or Average Value in SAS

6/14/2018

Author: Bailey DeBarmore
Short post today on how to use the MEAN function in SAS 9.4. Let's get started.

It seems like every time I need to calculate a mean variable in SAS, I find myself looking up which functions deal with missing values in this way, and which in that way.

For example, blood pressure readings are often taken 3 times, and then we average those 3 readings together for a mean value. In some code I ran earlier this morning, I kept getting negative values in my "avg_bp" variable. What's up with that? 
The code I had used was:
avg_bp=mean(bp1-bp3);
Looks OK, right?

Then I tried:
avg_bp=average(bp1-bp3);

Guess what? There's no AVERAGE function in SAS. Oops!

Then I tried:
avg_bp=mean(of bp1-bp3);

and ta da! Perfecto!

Syntax for MEAN function

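The syntax of the MEAN function varies depending on how you list your variables.
1) If you use a dash list, like I did, you need to include "of". Without it, SAS reads bp1-bp3 as bp1 minus bp3 and takes the mean of that single difference, which is where my negative values came from.
2) If you separate variables with a comma, you do not need to include "of".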
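
To see the three forms side by side, here is a minimal sketch on made-up data. The dataset names (bp_demo, bp_means) and the output variable names (wrong_bp, of_bp, comma_bp) are hypothetical, chosen only for illustration.

```sas
/* Hypothetical toy data: three blood pressure readings per person */
data bp_demo;
  input id bp1 bp2 bp3;
  datalines;
1 120 124 128
2 118 122 121
;
run;

data bp_means;
  set bp_demo;
  wrong_bp = mean(bp1-bp3);       /* one argument: bp1 minus bp3 */
  of_bp    = mean(of bp1-bp3);    /* variable list: bp1, bp2, bp3 */
  comma_bp = mean(bp1, bp2, bp3); /* explicit comma-separated arguments */
run;

proc print data=bp_means noobs;
run;
```

For id 1, wrong_bp is 120 - 128 = -8, while of_bp and comma_bp are both 124.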

SUM versus MEAN

So why wouldn't you just use
avg_bp=SUM(of bp1-bp3)/3;
Well, SUM skips missing values, which with a fixed divisor has the same effect as treating them as zero. Usually if we only have 2 out of 3 values, when we calculate the average, we would do bp1+bp2 divided by 2, right? If you use SUM and a fixed divisor, you need to be sure that you have NO MISSINGNESS. In contrast, the MEAN function will manage missing values appropriately. That is, if you're missing 1 of 3 values, it will divide by 2 (not 3). If all 3 values are missing, both SUM and MEAN return missing.
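
A small sketch of the comparison, again on made-up data. The dataset and variable names (bp_missing, bp_compare, n_bp, sum3_bp, mean_bp) are hypothetical; the N function, which counts non-missing arguments, is included only to show the divisor MEAN is effectively using.

```sas
/* Hypothetical toy data with 0, 1, 2, and 3 missing readings */
data bp_missing;
  input id bp1 bp2 bp3;
  datalines;
1 120 124 128
2 120 124 .
3 120 . .
4 . . .
;
run;

data bp_compare;
  set bp_missing;
  n_bp    = n(of bp1-bp3);        /* number of non-missing readings */
  sum3_bp = sum(of bp1-bp3) / 3;  /* fixed divisor of 3 */
  mean_bp = mean(of bp1-bp3);     /* divisor is the non-missing count */
run;

proc print data=bp_compare noobs;
run;
```

For id 2, sum3_bp is 244/3 (about 81.3) while mean_bp is 122. For id 3, sum3_bp is 40 while mean_bp is 120. For id 4, both are missing.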

NB: Don't just exclude all your missing values! That's a post for another day, but briefly, if you exclude observations with missing values you can introduce bias. Persons with all missing values versus 1 versus 2 out of the 3 may be different in some systematic way. You're also reducing your precision by excluding those. 
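
One way to keep everyone in the dataset while still seeing how much missingness you have is to store the number of available readings next to the mean. This is only a sketch on made-up data, not a recommendation for how to handle missing data in your analysis; the names bp_flagged, n_bp, and nmiss_bp are hypothetical.

```sas
/* Hypothetical toy data; no observations are dropped */
data bp_flagged;
  input id bp1 bp2 bp3;
  avg_bp   = mean(of bp1-bp3);   /* mean of whatever readings exist */
  n_bp     = n(of bp1-bp3);      /* readings available: 0 to 3 */
  nmiss_bp = nmiss(of bp1-bp3);  /* readings missing: 0 to 3 */
  datalines;
1 120 124 128
2 120 124 .
3 120 . .
4 . . .
;
run;

/* How many people have 0, 1, 2, or 3 readings? */
proc freq data=bp_flagged;
  tables n_bp / missing;
run;
```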

Take Home Message

In summary, to calculate the mean value, be sure to use the MEAN function so that it will manage missing values appropriately, and be sure to use the correct syntax to avoid wonky results (use "of" with a dash list, otherwise use a comma-separated list, see below).
NEWVAR = mean(of VAR1-VAR3);
NEWVAR = mean(VAR1, VAR2, VAR3);

That's all for today - quick and dirty, but useful. Now every time I look this up, I'll just come back to my own blog post! Just kidding - hopefully writing this post for you guys has solidified the concept.

Bailey

About the Author
Bailey DeBarmore is a doctoral student at the University of North Carolina at Chapel Hill studying epidemiology. Find her on Twitter @BaileyDeBarmore and blogging for the American Heart Association on the Early Career Voice blog.
Copyright Bailey DeBarmore © 2020