Week 6 Homework
SAS Arrays (Help in doing repetitive operations)
Question
Is there a difference in the receipt of coronary artery bypass surgery between men and women with ischemic heart disease?
Procedure
Sub-setting the ‘qtr1’ SAS file.
Taking a sample of 10,000 records and only a few variables.
Selecting only ICD-9 codes 410.00 to 414.99 (Ischemic heart disease).
Identifying how many persons had bypass grafts (the usual way).
Using a SAS array to do the same thing.
Taking a Sample and Selecting the Variables
We will use a number of commands to do this.
Proc surveyselect to select our sample of 10,000 records.
Use the ‘if’ statement to select only those with ischemic heart disease (ICD-9 codes 410.00 to 414.99) and who are residents of the state of Florida.
Use the keep statement to keep only a limited number of variables for this example.
Proc Survey Select
In this procedure we established a library called ‘phc’.
Then via a simple random sample we selected a sample of 10,000 records.
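A minimal sketch of what this step might look like. The library path, the output data set name (sample), and the seed value are assumptions made for illustration; only the library name phc, the file qtr1, and the sample size come from the text above.

```sas
/* Assign the library; the path is a placeholder, not the real location */
libname phc 'path-to-your-data';

/* Draw a simple random sample (method=srs) of 10,000 records */
proc surveyselect data=phc.qtr1
                  out=work.sample    /* hypothetical output data set name */
                  method=srs
                  sampsize=10000
                  seed=12345;        /* arbitrary seed so the sample can be reproduced */
run;
```

Fixing the seed is optional, but without it each run draws a different sample and the record counts quoted later would not be reproducible.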
Selecting Records
So here we established a new data set called ihd (ischemic heart disease).
We restricted records to those for whom the principal diagnosis was between 410.00 and 414.99.
We kept only 33 variables and 206 records; for this example, the result was a vastly reduced file size.
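The subsetting step could be sketched roughly as follows. This assumes the diagnosis codes are stored as numeric values, that the principal diagnosis variable is named prindiag, that gender is stored in a variable named sex, and that the sampled records sit in work.sample; none of these names are confirmed by the text. The Florida residency condition is left out because the residency variable is not named.

```sas
data ihd;
    set work.sample;                         /* hypothetical name of the sampled file */

    /* Subsetting if: keep only ischemic heart disease records.
       prindiag is an assumed variable name. */
    if 410.00 <= prindiag <= 414.99;

    /* Keep only the variables needed later; this list is illustrative */
    keep prindiag sex prinproc othproc1-othproc30;
run;
```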
Identifying Bypass Procedures
The procedure code for a bypass graft is between 36.10 and 36.20.
So, we want to know if this code is present in any of the procedure codes from prinproc to othproc30 (that’s 31 places).
Currently there are 35 records where the principal procedure code is a bypass graft.
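One way to check the count for the principal procedure alone is a quick frequency table. This is a sketch that assumes prinproc is numeric and that the subset is stored in a data set named ihd.

```sas
/* Tabulate only the records whose principal procedure is a bypass graft */
proc freq data=ihd;
    where 36.10 <= prinproc <= 36.20;
    tables prinproc;
run;
```

The total frequency in the output is the number of records with a bypass graft as the principal procedure.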
So, What’s Next
The logical thing would be to write code to identify bypass codes in each of the 31 variables.
We use if-then statements (many of them).
THIS IS A LOT OF CODE AND PLENTY OF OPPORTUNITIES... FOR A MISTAKE.
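To make the point concrete, here is an abbreviated sketch of the long way. The flag name cabg and the output data set name ihd_long are assumptions, as is the numbering othproc1 through othproc30. As written, the step only checks four of the 31 variables; the comment marks where the other 27 statements would go.

```sas
data ihd_long;                 /* hypothetical output data set name */
    set ihd;
    cabg = 0;                  /* assumed flag name: 0 = no bypass graft */

    if 36.10 <= prinproc  <= 36.20 then cabg = 1;
    if 36.10 <= othproc1  <= 36.20 then cabg = 1;
    if 36.10 <= othproc2  <= 36.20 then cabg = 1;
    /* ... one nearly identical statement each for othproc3 through othproc29 ... */
    if 36.10 <= othproc30 <= 36.20 then cabg = 1;
run;
```

Every one of those lines differs only in the variable name, so a single mistyped number silently skips a variable.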
Results?
OR
We can use an array to do repetitive operations.
ARRAYS are constructed in the data step.
Because of the repetitive nature of the array process a ‘do loop’ is required.
Structure
An array must have a name.
An array must be told how many times to repeat a command.
An array must be told what the command is.
An array must be told to end.
To repeat our 31 commands and identify the 37 records with a bypass graft, we shall use an array.
The Array
What is here?
Our variable for having a graft is called cabg2, and we set its initial value to ‘0’. Everyone is coded ‘0’ initially.
We then tell SAS that the name of the array is graph and that there are 31 variables to be analyzed.
Then we tell SAS to assign a ‘1’ to cabg2 if the condition on line 60 is met.
SAS will do this for each of the 31 variables (1 prinproc and 30 othproc).
Then we tell SAS to end.
Then we tell SAS to drop the variable called ‘i’. If we don’t, it’s no biggie: ‘i’ will simply remain as a variable in our data file.
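The steps just described can be sketched as the data step below. The names cabg2, graph, and i come from the description; the data set names ihd and ihd2, the numbering othproc1 through othproc30, and the numeric coding of the procedure variables are assumptions.

```sas
data ihd2;                                   /* hypothetical output data set name */
    set ihd;

    cabg2 = 0;                               /* everyone starts as 'no bypass graft' */

    /* The array groups the 31 procedure variables under one name */
    array graph{31} prinproc othproc1-othproc30;

    /* The do loop visits each of the 31 variables in turn */
    do i = 1 to 31;
        if 36.10 <= graph{i} <= 36.20 then cabg2 = 1;
    end;

    drop i;                                  /* the loop counter is not needed in the output */
run;
```

Note that the array only names the group of variables; the do loop is what actually repeats the if-then statement 31 times.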
10 LINES OF CODE VS. 34!
Result
So, Back to Our Original Question
Is there a difference in the receipt of bypass grafts by gender?
Let’s first format our variables.
Then we run our code.
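A sketch of how the formatting and the cross-tabulation might look. The gender variable name (sex), its coding (1 = male, 2 = female), the format names, and the data set name ihd2 are all assumptions; check the file's codebook before relying on them.

```sas
/* Define labels for the coded values (the coding shown is assumed) */
proc format;
    value sexfmt  1 = 'Male'
                  2 = 'Female';
    value cabgfmt 0 = 'No bypass graft'
                  1 = 'Bypass graft';
run;

/* Cross-tabulate gender by bypass graft and request a chi-square test */
proc freq data=ihd2;
    tables sex*cabg2 / chisq;
    format sex sexfmt. cabg2 cabgfmt.;
run;
```

The row percentages in the resulting table show the share of men and of women who received a bypass graft, and the chi-square test indicates whether the difference is larger than chance alone would suggest.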
Ta Da, The Answer