Sas Lab Session 1

SOR 202

SAS Laboratory Session 1

Introduction to SAS: Tutorial 1

SAS (Statistical Analysis System) is a statistical analysis and data management software package. SAS can read data from files of almost any type and size. The analyses it can perform are extensive, ranging from basic descriptive statistics to complex statistical analyses. The software can also produce tabulated reports, charts, and plots of distributions and trends.

SAS for Windows

To open SAS, either:

• Double-click the shortcut icon on the Desktop (if available), or
• From the Start Menu select: Start > Programs > General Apps > SAS > SAS 9.1 (English)

Usually a “Getting Started with SAS” dialog box appears on opening SAS. Selecting “Start Guides” will open up the “SAS Help and Documentation” window. Feel free to explore these “Getting Started with SAS” help-pages (select “New SAS Programmer (quick-start guide)” from the drop-down menu).

NB: While these hand-outs aim to give an introductory description of using SAS, the help facilities within SAS will provide a more complete description of all of the procedures that can be utilized within it, and contain many illustrative examples. The SAS Help and Documentation Window can be opened from the main SAS window through the “Help” menu, selecting “SAS Help and Documentation”.


Overview of SAS Windows

When you first start SAS, the five main SAS windows open, namely the:

• Explorer,
• Results,
• Program Editor or Editor,
• Log, and
• Output windows.

This quick walkthrough shows you how each of these windows is used.

The Results and Explorer windows are usually ‘docked’ on the left-hand side. NB: If you accidentally close any of these windows, they can be reopened through the “View” menu. An individual window can also be docked by first selecting it and then selecting “Docked” from the “Window” menu.


The Explorer Window

In the Explorer window, you can view and manage your SAS files and create shortcuts to files that are not formatted by SAS. Use this window to:

• create new SAS libraries and SAS files
• open any SAS file
• perform most file management tasks such as moving, copying, and deleting files
• create file shortcuts.

Double-clicking the “Libraries” icon allows you to view the libraries (and within these the datasets) you are able to access.

The Work library is the default library in which new datasets are stored (when no library is named). Note, however, that the data stored in this library is removed at the end of each SAS session.

The Sashelp library is a permanent library that contains sample data and other files that control how SAS works at your site. This is a read-only library.

The Sasuser library is a permanent library, and is a convenient place to store your own files.
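The difference between the temporary Work library and a permanent library such as Sasuser can be seen in how a dataset is named in code. The following is a minimal sketch only: the dataset name marks and its two records are hypothetical, and it assumes Sasuser is writable at your site.

```sas
/* No library named: the dataset is stored in Work and is removed
   when the SAS session ends. "data work.marks;" is equivalent. */
data marks;
    input id mark;
    datalines;
1 62
2 71
;
run;

/* Library named explicitly, in the form library.dataset: the dataset
   is written to Sasuser and is still available in the next session. */
data sasuser.marks;
    input id mark;
    datalines;
1 62
2 71
;
run;
```

After submitting this, both datasets should be visible in the Explorer window under their respective libraries.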

Use the “Up one level” icon to navigate through the libraries as necessary (also available through the “View” menu).

NB: The following icon is used to illustrate a SAS dataset:


The Editor Window

You can use the Editor Window to enter, edit, and submit SAS programs.

The initial Editor Window title is “Editor - Untitledn”, where n is a number. When you open a file or save the contents of the Editor window to a file (a *.sas file), the window title changes to reflect that file name. When the contents of the Editor window are modified, an asterisk is added to the title to indicate that the contents have not been saved in their current form. You can have multiple Editor windows open at the same time.

The Log Window

The Log Window displays messages about your SAS session and any SAS programs that you submit.


The Output Window

The Output Window displays the output from SAS programs that you submit. It automatically opens or moves to the front of your display when you create output.

The Results Window

The Results Window helps you navigate and manage output from SAS programs that you submit. You can view, save, and print individual items of output.


Data Entry in SAS

Before you can work with your data in SAS, it must be in a special form called a SAS dataset. Understanding SAS datasets is therefore the first step in learning about SAS programming.

Importing Data

If you have PC database files such as Microsoft Excel spreadsheets or Microsoft Access files, you can use SAS to import these files and create SAS datasets. Once you have the data in SAS datasets, you can process them as needed in SAS. (NB: In a similar way, you can also export SAS data to a number of PC file formats.)

To read PC database files, you use the IMPORT procedure. PROC IMPORT reads the input file and writes the data to a SAS dataset, with the SAS variables defined based on the input records. There is a SAS Import Wizard available from the main “File” menu.

Example: import the Excel file ExamResults.xls into SAS using the Import Wizard.

• ExamResults.xls can be downloaded from Queen’s Online and saved within your home directory (or a sub-directory of your home directory, e.g. a sub-directory called “SOR202”).
• To import this data into SAS, select “Import Data…” from the “File” menu.
• The data source type in this case is “Microsoft Excel 97, 2000 or 2002 Workbook”. Select “Next” once it has been chosen from the drop-down menu.

• Select the “Browse” button and navigate to where you have stored the Excel file within your home directory.
• The workbook contains a single worksheet named “Results”, and this is the table you wish to import (keep default settings within “Options…”).
• The next step is to indicate where within SAS you wish to import this data to. Selecting the Work library will store the SAS dataset in this temporary library. The name to be given to the dataset is indicated under “Member” (here the dataset has been named “EXAMRESULTS”).

The final component of the Import Wizard enables the appropriate SAS code to be generated - in case you wish to import this data again, without having to use the Import Wizard again. Save the code within your home directory to a *.sas file e.g. ImportResults.sas. Select “Finish” in order to import the data.

There are several things to note now that this data has been imported:

• The Log Window indicates whether or not the dataset was successfully created.
• Navigating through the Libraries via the Explorer Window, you should now be able to see the dataset “Examresults” within the “Work” library.
• Double-clicking on the “Examresults” dataset icon enables you to view the contents of the imported data within the dataset.

  NB: This Viewtable MUST be closed if you wish to perform other manipulations on the dataset.

• It is possible to delete this dataset from the Work library by right-clicking on the “Examresults” dataset icon and selecting “Delete” (ensure the dataset is not being viewed as a Viewtable at the same time).
• The code used to import the Excel data into the SAS dataset “ExamResults” can be viewed by opening the .sas file. Select “Open Program…” under the “File” menu, and navigate to where the file ImportResults.sas has been stored in your home directory.
• To regenerate the dataset “ExamResults”, highlight the portion of SAS code which you wish to run and then select the Submit icon on the toolbar.


The dataset “ExamResults” should be re-generated through this code (the Log Window will indicate if the code was successful) and can be viewed via the Explorer Window in the Work library.
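For reference, the code saved by the Import Wizard is a PROC IMPORT step. The sketch below shows roughly what ImportResults.sas might contain. The file location H:\SOR202\ExamResults.xls is an assumption (use the folder where you saved the workbook), getnames=yes assumes the first row of the worksheet holds the variable names, and the wizard may write further option statements that are omitted here.

```sas
/* Sketch of wizard-style import code. The path is an assumption:
   change it to the folder where you saved ExamResults.xls. */
proc import out=work.examresults
            datafile="H:\SOR202\ExamResults.xls"
            dbms=excel replace;   /* replace overwrites an existing dataset */
    sheet="Results";              /* worksheet chosen in the wizard */
    getnames=yes;                 /* assumed: row 1 holds variable names */
run;
```

Highlighting these lines in the Editor window and submitting them should recreate Work.Examresults without opening the wizard again. Check the Log Window afterwards to confirm that the import succeeded.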

Creating a New Dataset using the DATA statement

Data can also be input as a SAS dataset from the keyboard or read from a file using a DATA statement. How the SAS code will look depends upon the structure of the data being input.

Example: An example piece of SAS code is given below for creating a SAS dataset from the data stored in the (space-delimited) text file cancer.txt (downloadable from Queen’s Online):

    data cancer;
        infile "H:\SOR202\cancer.txt" firstobs=2;
        input obs_no id time status stain $ ;
    run;

• The first line of the code above tells SAS that you want to generate a SAS dataset called cancer. Use an appropriate name for the SAS datasets you create (e.g. trial, company, drug); the name must have 32 or fewer characters.
• NB: Note the semicolon ; at the end of each statement.
• The INFILE statement indicates the file from which the data is read, and is of the form "pathname\filename".
• NB: It is good programming practice to indent lines between the DATA statement and the RUN statement. Note that to execute any series of commands or statements within SAS you must include a RUN statement.
• The firstobs=2 option lets SAS know that the first observation for this data occurs in row 2 of the text file (row 1 of the text file contains the variable names).


• The INPUT statement provides SAS with details of the variables contained within the file: here, this dataset contains 5 variables.
• The $ sign following the final variable stain indicates that this is an alphanumeric (character) variable.
• Variables in a dataset: variable names can contain from 1 to 32 characters. They can contain numbers, but must begin with a letter (or an underscore).
• Here SAS assumes that the values in each row of the text file are separated by blank spaces (this is its default assumption). However, this can be changed using the delimiter option within the INFILE statement.

For example, the text file cancer_commas.txt (downloadable from Queen’s Online) contains the same data as cancer.txt, but the values in each row are separated by commas. To load this text file into SAS the INFILE statement would need to be modified as follows:

    data cancer2;
        infile "H:\SOR202\cancer_commas.txt" firstobs=2 delimiter=",";
        input obs_no id time status stain $ ;
    run;

Other types/formats of Data

i) Fixed Format

In the case of files where variables are in a fixed format (particular columns of the file correspond to particular variables, and this holds for every row of the file), the INPUT statement seen above can be modified to specify where each variable can be found in each row of the file.

Example: An example piece of SAS code is given below for creating a SAS dataset from the data stored in the text file hypernephroma.txt (downloadable from Queen’s Online):

    data hyper;
        infile "H:\SOR202\hypernephroma.txt" firstobs=2;
        input treatment $1-14 status 17 time 25-27 age $32-36;
    run;

• The INPUT statement above indicates that the treatment variable is located in columns 1 to 14 in each of the rows (and is alphanumeric).
• The status variable is located in column 17 of the text file, and so on.
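To see column input working without the external file, the same INPUT statement can be tried on a couple of records typed after a DATALINES statement. This is an illustrative sketch only: the dataset name hyper_demo and both records are hypothetical and are not taken from hypernephroma.txt. The values are positioned so that they fall in the columns named in the INPUT statement.

```sas
/* Hypothetical records laid out in fixed columns:
   treatment in 1-14, status in 17, time in 25-27, age in 32-36.
   The data lines must start in column 1, so they are not indented. */
data hyper_demo;
    input treatment $1-14 status 17 time 25-27 age $32-36;
    datalines;
nephrectomy     1       104    60-69
combined        0        52    70+
;
run;
```

Because the positions are fixed, a value shorter than its field (such as 52 in columns 25-27) can be padded with blanks.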

ii) More than one line per observation

If a datafile contains more than one line per observation, the INPUT statement needs to indicate the line number (using a # symbol) before specifying the variables on that line, e.g.

    input id 1-3 company 8-10
          #2 insal 6-10 finalsal 18-23
          #3 retire 15-19;

• The above INPUT statement informs SAS that there are 3 lines of variables for each observation.
• In the first row of each observation the variable id is located in columns 1 to 3 while the variable company is located in columns 8 to 10.
• In the second row of each observation, the variable insal is located in columns 6 to 10 while the variable finalsal is located in columns 18 to 23, and so on.
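The line-pointer form can be tried on a small in-line example. The sketch below is for illustration only: the dataset name salary_demo and the six data lines (two observations of three lines each) are hypothetical, with values placed in the columns the INPUT statement expects.

```sas
/* Each observation occupies three data lines:
   line 1: id in columns 1-3, company in 8-10
   line 2: insal in 6-10, finalsal in 18-23
   line 3: retire in 15-19
   The data lines must start in column 1, so they are not indented. */
data salary_demo;
    input id 1-3 company 8-10
          #2 insal 6-10 finalsal 18-23
          #3 retire 15-19;
    datalines;
101    205
     21000        45500
                 65
102    311
     18500        39900
                 60
;
run;
```

The resulting dataset should contain two observations, one for each group of three lines.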

iii) Mixed style

Input statements can also be written in a shorter form with a mixed style, e.g.

    input id 1-2 sex $ 3 (exp school) (1.) (C1-C10) (1.) (M1-M10) (1.)
          (MATHSCOR COMPSCOR) (2.);

• For the above statement the variable id is read from columns 1-2 and sex from column 3.
• The next two variables exp and school have a width of 1 column each and start at column 4.
• The variables C1-C10 (10 variables in sequential order) have a width of one column each (in columns 6-15 of the datafile).
• The variables M1-M10 have a width of one column each (in columns 16-25 of the datafile).
• The last two variables MATHSCOR and COMPSCOR have a width of two columns each, starting at column 26.
• If you wish to skip data within a datafile (e.g. in the above, read in only the variable id and the last two variables MATHSCOR and COMPSCOR) you could use the @ symbol within the INPUT statement (here @26 moves the pointer to column 26):

    input id 1-2 @26 (MATHSCOR COMPSCOR) (2.);

Data input directly from the keyboard

To input data directly from within SAS, the DATALINES statement is used. For example, the statements below read in a dataset called hsb10 which contains 10 records and 11 variables, 10 of which are numeric and 1 of which is of type character.

    data hsb10;
        input id female race ses schtype $ prog read write math science socst;
        datalines;
    147 1 1 3 pub 1 47 62 53 53 61
    108 0 1 2 pub 2 34 33 41 36 36
    18 0 3 2 pri 3 50 33 49 44 36
    153 0 1 2 pub 3 39 31 40 39 51
    50 0 2 2 pub 2 50 59 42 53 61
    51 1 2 1 pub 2 42 36 42 31 39
    102 0 1 1 pub 1 52 41 51 53 56
    57 1 1 2 pub 1 71 65 72 66 56
    160 1 1 2 pub 1 55 65 55 50 61
    136 0 1 2 pub 1 65 59 70 63 51
    ;
    run;


Other Useful Statements

Label Statement

You can use a LABEL statement to give labels to variables. While a SAS variable name is limited to 32 characters, the label (which is used in any output for this variable) can have up to 256 characters including blanks. Labels should be enclosed in quotes and the LABEL statement terminated by a semicolon, e.g. add the following LABEL statements to the DATA step for the hsb10 dataset:

    label schtype="School type";
    label math="Mathematics score";
    label science="Science score";
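As a sketch of where these statements go, the LABEL statements can be placed inside the DATA step after the INPUT statement and before DATALINES, since DATALINES must be the last statement in the step. Only the first two hsb10 records are repeated here, to keep the example short.

```sas
data hsb10;
    input id female race ses schtype $ prog read write math science socst;
    /* Labels are stored with the dataset and used in output */
    label schtype="School type";
    label math="Mathematics score";
    label science="Science score";
    datalines;
147 1 1 3 pub 1 47 62 53 53 61
108 0 1 2 pub 2 34 33 41 36 36
;
run;
```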

Proc Format

PROC FORMAT defines formats that can then be associated with variables in a dataset. For example, in the hsb10 dataset the variable female has two values: 1 indicates the person is female, while 0 indicates the person is male. To associate these values with appropriate value labels, a PROC FORMAT step is used, and the formats it defines are then referenced within the DATA step by a FORMAT statement. Note that the PROC FORMAT step (which defines the value labels) must be run before the DATA step that uses the formats.

    proc format;
        value female 1="female" 0="male";
        value $schtype "pub"="public school" "pri"="private school";
    run;

    data hsb10;
        input id female race ses schtype $ prog read write math science socst;
        format female female. schtype $schtype.;
        …………etc
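A complete, runnable version of this pattern might look as follows. It is a sketch that reuses only the first three hsb10 records given earlier, so that both school types appear; the full dataset would list all ten records after DATALINES.

```sas
/* Step 1: define the formats (must be run before the DATA step) */
proc format;
    value female 1="female" 0="male";
    value $schtype "pub"="public school" "pri"="private school";
run;

/* Step 2: attach the formats to the variables. The FORMAT statement
   comes before DATALINES, which must be the last statement. */
data hsb10;
    input id female race ses schtype $ prog read write math science socst;
    format female female. schtype $schtype.;
    datalines;
147 1 1 3 pub 1 47 62 53 53 61
108 0 1 2 pub 2 34 33 41 36 36
18 0 3 2 pri 3 50 33 49 44 36
;
run;
```

The stored values are unchanged (female is still 0 or 1); the formats only affect how the values are displayed, for instance when the dataset is opened from the Explorer window.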

Comment Statements

It is good programming practice to place comments in your code for documentation purposes. Text enclosed in /* ……… */ is ignored by SAS when a program is executed, e.g.

    /* This is a comment */
    /* So is this */
    /* This comment
       spans several lines */
