There are so many different ways to enter data, and some of us like to find shortcuts to minimize typing and reduce errors 🙂 A common way of cutting down on data entry is to use one line of data to hold several individual measures. For example, we may have data collected on 4 animals that are housed in the same cage and exposed to the same treatment. For our trial we have 18 cages, each with 4 animals. Rather than entering 72 lines of data, why not enter 18 lines of data, each line containing data for the 4 animals? Your data would appear as follows (the first 4 cages are shown):
CageID  Treatment  Fibre1  Fibre2  Fibre3  Fibre4
101     DietA      5       8       9       6
102     DietA      6       7       8       5
201     DietB      10      9       11      14
202     DietB      15      12      11      10
Recall that each measurement under Fibre1, Fibre2, Fibre3, and Fibre4 is a measure taken on an individual animal. To do any analysis, we need to read this data in a way that adds a new variable called Animal and a variable called Fibre, so that we have 72 lines of data rather than the original 18.
To accomplish this we take advantage of some DATA step programming and use the “@”. The “@” essentially tells SAS to stop reading the data line and hold its spot, wherever that may be in the dataline. Let’s try a DATA step and walk through it step by step.
Data fibretrial;
    input CageID Treatment $ @;
    do Animal = 1 to 4;
        input Fibre @;
        output;
    end;
    datalines;
101 DietA 5 8 9 6
102 DietA 6 7 8 5
201 DietB 10 9 11 14
202 DietB 15 12 11 10
;
Also remember from previous SASsy Fridays sessions that the DATA step itself acts like a DO LOOP: it runs through its statements once for each line of data it reads.
Input CageID Treatment$ @;
CageID = 101
Treatment = DietA
@ = hold your spot in the dataline and move to the next line in the SAS program
Do animal = 1 to 4;
Animal = 1
Input Fibre @;
SAS sees this INPUT statement and goes back to where it stopped reading in the first INPUT statement. In other words, it goes back to where it stopped previously, reads the next value, and stops again.
Fibre = 5
Output;
With the OUTPUT statement, we are now saving the new dataline 101 DietA 1 5 in the dataset called fibretrial.
End; tells SAS to go back to the top of our DO LOOP
Animal = 2
Fibre = 8
Save new dataline 101 DietA 2 8
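Repeat for the next 2 data values. Once the value for Animal = 4 has been saved, the DO LOOP is finished and SAS reaches the end of the DATA step. It then releases the dataline and starts again at the first INPUT statement with the next line of data (CageID = 102).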
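If you would like to watch this walk-through happen, one option is to write each step to the SAS log instead of saving it. The following is only a sketch: it uses the first two datalines from above, and DATA _NULL_ with a PUT statement are my additions for illustration, not part of the original program.

```sas
/* Trace sketch: same reading logic as the walk-through,          */
/* but each pass of the DO LOOP is written to the log.            */
data _null_;                        /* _null_ = run the step without creating a dataset */
    input CageID Treatment $ @;     /* read cage and treatment, then hold the line */
    do Animal = 1 to 4;
        input Fibre @;              /* read the next value on the held line */
        put CageID= Treatment= Animal= Fibre=;  /* show what OUTPUT would save */
    end;
    datalines;
101 DietA 5 8 9 6
102 DietA 6 7 8 5
;
run;
```

The log should show one line per animal, 8 lines in total for these two cages, in the same order as the steps described above.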
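So we now have a dataset that looks like this:
CageID  Treatment  Animal  Fibre
101     DietA      1       5
101     DietA      2       8
101     DietA      3       9
101     DietA      4       6
102     DietA      1       6
102     DietA      2       7
102     DietA      3       8
102     DietA      4       5
The pattern continues for the remaining cages, with 4 lines per cage.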
To recap: the “@” is used in the INPUT statement and simply tells SAS to hold its spot in the dataline, because you may have more work for SAS to do before you want it to go back and continue reading the data.
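That "more work" does not have to be a DO LOOP. As a rough sketch, here is the same data read with a check on Treatment before SAS finishes reading the line. The dataset name dietb_only and the idea of keeping only DietB are assumptions made up for this illustration.

```sas
/* Sketch: hold the line, test a value, then decide whether to keep reading. */
data dietb_only;                          /* hypothetical dataset name */
    input CageID Treatment $ @;           /* read two fields, then hold the line */
    if Treatment ne 'DietB' then delete;  /* not a DietB cage: drop it and move on */
    input Fibre1-Fibre4;                  /* finish reading the held line */
    datalines;
101 DietA 5 8 9 6
102 DietA 6 7 8 5
201 DietB 10 9 11 14
202 DietB 15 12 11 10
;
run;

proc print data=dietb_only;
run;
```

With these four datalines, only cages 201 and 202 should end up in the dataset. The held line is released when SAS goes back to the top of the DATA step, so the DietA lines are simply skipped.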
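Another shortcut we might take when entering data is to create columns of data rather than one long list. As an example:
ID  age  height  weight  ID  age  height  weight
10  35   175     200     11  37   173     150
12  31   180     195     13  29   174     180
14  33   171     210     15  39   171     175
16  39   185     225     17  36   172     188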
Rather than creating a dataset that has 8 lines, we’ve created a short dataset with only 4 lines, but the information is presented in 2 columns. Now we need to create the dataset with 8 lines in order to be able to analyse it appropriately. Yes, you can copy and paste in Excel, but why? Let’s use SAS to do it for us.
input ID age height weight @@;
10 35 175 200 11 37 173 150
12 31 180 195 13 29 174 180
14 33 171 210 15 39 171 175
16 39 185 225 17 36 172 188
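If we remove the “@@”, SAS will read only the first 4 values from each line of data. From the first line, as an example, it reads ID = 10, age = 35, height = 175, and weight = 200.
SAS will then jump to the next line to read ID = 12, age = 31, height = 180, and weight = 195. The second set of values on each line (IDs 11, 13, 15, and 17) is never read, so we end up with only 4 lines of data.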
The INPUT statement makes it clear that you are only reading those 4 variables. When you add the “@@” at the end of the INPUT statement, you are letting SAS know that there is more data beyond the first set of variables. By adding the “@@” you are telling SAS to stay on the same dataline and keep reading across, 4 values at a time, until it finds no more data. Only when it reaches the end of the line does SAS move on to the next one.
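Putting the pieces together, a complete version of this step might look like the sketch below. The dataset name people is a placeholder of mine, and the PROC PRINT at the end is only there so you can check the result.

```sas
/* Sketch: read two sets of ID/age/height/weight from each dataline. */
data people;                          /* hypothetical dataset name */
    input ID age height weight @@;    /* @@ = stay on this line and keep reading */
    datalines;
10 35 175 200 11 37 173 150
12 31 180 195 13 29 174 180
14 33 171 210 15 39 171 175
16 39 185 225 17 36 172 188
;
run;

proc print data=people;
run;
```

You should see 8 lines of data, one for each ID from 10 to 17. Try removing the “@@” and running it again to see the dataset shrink to 4 lines.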
There are SO many ways to enter data initially, so don’t restrict yourself to only entering the data in a columnar format. SAS can transform your data into the format you need for analysis by using some of these tools available to you in the DATA step.