An Introduction to UNIX SAS Version 8.1 at WPI

About This Document

This document has several purposes:

Unix SAS Update and Quickstart contains information on the latest changes in Unix SAS software and its implementation at WPI. By following the steps in this section, new users will configure their Unix accounts for running SAS and accessing SAS data sets and macros used in statistics course work at WPI.

WPI is currently running release 8.1 of SAS. An Introduction to SAS/EIS and An Introduction to SAS/INSIGHT I: Elementary Concepts are tutorials designed to introduce the novice SAS user to two components of SAS statistical software: SAS/INSIGHT, a graphical environment for interactive data analysis, and SAS/EIS, a component of SAS that is used as an interface between the user and a set of SAS programs, called macros, that are used in statistics courses at WPI. Users who want to do more with SAS/INSIGHT are invited to the continuation of the tutorial found at An Introduction to SAS/INSIGHT II: Advanced Concepts.

 

Unix SAS Update and Quickstart

The purpose of this section is to get you up and running quickly, and ready to begin learning to use SAS statistical software. Unix SAS is available on many DEC Alpha servers on campus, including statlab, mathlab, wpi, and reno. It may be accessed on these machines from any workstation, X-terminal or PC on the WPI network.
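
As a rough illustration, the host names listed above can be checked from a Unix prompt. The helper name is_sas_host is hypothetical, and because the paragraph says "including", the list is assumed to be incomplete.

```shell
# Sketch: check a host name against the SAS 8.1 servers named in this section.
# is_sas_host is a hypothetical helper; the list of servers may not be exhaustive.
is_sas_host() {
    case "$1" in
        statlab|mathlab|wpi|reno) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: strip any domain suffix from the current host name before checking.
host=$(hostname 2>/dev/null || uname -n)
if is_sas_host "${host%%.*}"; then
    echo "$host is one of the listed SAS servers"
else
    echo "$host is not in the list; log in to statlab, mathlab, wpi, or reno"
fi
```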

Unix Update

Unix is the operating system used on workstations at WPI. For those of you familiar with PCs, Unix is to workstations what DOS (or Windows NT or 2000) is to PCs. Users unfamiliar with Unix will find links to introductory information in the Basic Unix Information section of this document.

Unix SAS Quickstart

In order to run SAS version 8.1 as supported at WPI, you need to (1) copy several files into your home directory and (2) configure SAS for printing and accessing macros. The following eight steps will accomplish (1) and (2).

  • Step 1: Log in to a computer that runs Unix SAS 8.1. If you log in from the stat lab, you will be on the server statlab, which runs Unix SAS 8.1. Other computers that run Unix SAS 8.1 include mathlab, wpi, and reno. If you are logging in from a PC, you must use an X-Windows emulator such as PC-Xware.

  • Step 2: Copy SAS configuration files. In this step, you will copy three files to your home directory. These files will (1) initialize Unix SAS to allow you to find existing libraries of SAS data files, (2) set graphics parameters, and (3) set up your default printer. About half of the class (determined by birth date) will be asked to copy one set of files and the rest of the class will be asked to copy another set. The first set of files assigns one of the printers in KH 207 as the default for printing from the SAS Graphics Window, and the other set assigns the second printer. This procedure will help to avoid long waits for printed output. The section Changing Your Graphics Printer, found later in this document, tells how to change your default printer for printing from the SAS Graphics Window.

    If your birthday is between the dates July 1 and December 31 inclusive, do a. Otherwise do b.

    a. Type

    > /math/mathlab/bin/sasetup8a

    (NOTE: the ">" is the prompt supplied by the computer; you just type the "/math/mathlab/bin/sasetup8a" part and hit <enter>.) If the computer responds with anything other than just the prompt, you've probably done this step incorrectly and you should seek help.

    b. Type

    > /math/mathlab/bin/sasetup8b

    (Note: the ">" is the prompt supplied by the computer; you just type the "/math/mathlab/bin/sasetup8b" part and hit <enter>.) If the computer responds with anything other than just the prompt, you've probably done this step incorrectly and you should seek help.

  • Step 3: Start SAS. You are now ready to access Unix SAS 8.1. To start Unix SAS, type

    > sas &

    (NOTE: the ">" is the prompt supplied by the computer; you just type the "sas &" part and hit <enter>.)

  • Step 4: Set up SAS Printing. You now need to configure SAS to print text from the SAS Output Window and graphs from SAS/INSIGHT (in Step 2, you configured printing from the SAS Graphics Window).

    There are two printers in the stat lab. Their names are stat1 and stat2. We recommend that you set up printing for both, so that you can direct output to the less busy one (or the one that hasn't run out of paper). Follow the steps below to set up printing for one of them, and then repeat the instructions with appropriate changes to set up printing for the other. Items a.-c. below will set up printing to one printer, and will enable you to print to that printer from the EDITOR, OUTPUT, LOG, or RESULTS windows as well as from SAS/INSIGHT. You will have to do these items a second time to set up printing to both stat1 and stat2.

    The same steps will set up printing from the two printers in the math lab in SH 306: math3 and math3b.

    a. Choose either the EDITOR, OUTPUT, LOG, or RESULTS window.
    b. From the chosen window, click on File: Print Setup.
    c. A SAS: Print Setup window will appear. Click on New. You will have to navigate through a sequence of five dialog boxes. Respond to them as follows:
    Box 1: Give a name to identify the printer: stat1 or stat2. Click Next.
    Box 2: Select the model to be HP LaserJet 4 PCL. Click Next.
    Box 3: Tell where to route the output.
    i. First select Device type Pipe.
    ii. Then give a path to the printer:
    • If you defined stat1 as the printer in Box 1, the path is /usr/ucb/lpr -h -Pstat1
    • If you defined stat2 as the printer in Box 1, the path is /usr/ucb/lpr -h -Pstat2
    Click Next.
    Box 4: Leave the previewer input field blank. Click Next.
    Box 5: Click Finish.

    To exit the setup, click OK.

  • Step 5: Practice printing from the GRAPHICS window. Some graphical output, such as that produced by macros you will use in labs, is displayed in the SAS/Graphics window. Printing from this window requires no new setup, but is a little different from printing from the other windows or from SAS/INSIGHT. The following will take you through the steps needed to print from the graphics window.
    a. Generate a graphics window by clicking on View: Graph from any of the original SAS windows. (Note: In real applications, the graphics window will appear automatically with a graph in it.)
    b. If a window appears with the words Catalog not found. Do you want to create it? click on Yes.
    c. A graphics window will appear.
    d. Click on File:Print from the graphics window.
    e. A SAS Print dialog box will appear. Make sure Use SAS/Graph Drivers is selected. There will be a Driver selection window immediately below. Clicking on the triangle to the right will display a list of drivers. If you want to print your graph in portrait mode, select the driver PS (PostScript devices) from the list. If you want to print your graph in landscape mode, select PSLL (PostScript devices-landscape n).
    f. At this point, if you were really printing a graph, you would click OK, but for the purposes of this tutorial, just click Cancel.
    When printing from the graphics window, you will always have to go through steps d.-f.

    You cannot change your printer at the SAS/GRAPHICS window, as you can in SAS/INSIGHT, or in the other SAS windows. Directions for changing your printer for printing from the SAS/GRAPHICS window are found in the section Changing Your Graphics Printer.

  • Step 6: Set up EIS applications.
    1. Choose Solutions:EIS/OLAP Application Builder from the menu bar on any of the original SAS windows (PROGRAM EDITOR, LOG, OUTPUT, RESULTS or EXPLORER).
    2. From the resulting window, select Applications.
    3. A small SAS/EIS: Applications window will appear. Click on Set applications....
    4. A SAS/EIS: Set Applications window will appear. Click on Primary private application....
    5. A SAS/EIS: Primary Private Application window will appear. Click on the Library: field and input eisapps, then hit the tab key. In the Application database: field input eisapp, then hit return. Click on OK, and then Goback twice. This returns you to the SAS/EIS main menu.
    6. Close the SAS:EIS window by choosing File: Close.
    Note: You have to do this setup only once, ever.

  • Step 7: Access SAS/INSIGHT. From the menu bar on any of the original SAS windows select Solutions: Analysis: Interactive Data Analysis. A SAS: SAS/INSIGHT: Open window will appear.
    If you want to key in a set of data, select New. A blank SAS/INSIGHT spreadsheet will appear. For details on entering data into SAS/INSIGHT, see An Introduction to SAS/INSIGHT.
    If you want to read an existing SAS data set into SAS/INSIGHT for analysis, select Open. This will bring up a dialog box with choices for SAS data libraries. Among these are:
    o The WORK library. This is the temporary library. All SAS data sets in this area will be erased when you finish your SAS session.
    o The SASUSER library. This is your personal permanent library.
    o The SASDATA library. This is a permanent library of data sets which are accessible to all users. Data for homework assignments will often be found in the SASDATA library.
  • Step 8: Log out of SAS. When you are finished with your SAS session, please log out of SAS. You may do this by using the "SAS Session Management" icon located on your desktop. Double clicking on this icon, or clicking once on this icon and then on "Restore", will produce the SAS Session Management window. Click on the "terminate" box in the SAS Session Management window to terminate your session.
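
The choice in Step 2 between the two setup scripts can be sketched as a small shell function. This is only an illustration: the helper name choose_setup_script is hypothetical, and it prints the path to run rather than running it.

```shell
# Sketch: pick the Step 2 setup script from a birth month (1-12).
# choose_setup_script is a hypothetical helper, not part of the WPI installation.
choose_setup_script() {
    month=${1#0}    # drop a leading zero so "07" is read as 7
    case "$month" in
        7|8|9|10|11|12) echo "/math/mathlab/bin/sasetup8a" ;;  # July 1 to December 31
        1|2|3|4|5|6)    echo "/math/mathlab/bin/sasetup8b" ;;  # January 1 to June 30
        *) echo "month must be between 1 and 12" >&2; return 1 ;;
    esac
}

# Example: a September birthday falls in the July-December group.
choose_setup_script 9
```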

    Other Important Stuff

    Where to learn more about SAS

    For those of you taking a statistics course, your instructor and TA are primary sources for information about SAS. Another resource is email to ma-questions.

    All SAS documentation is available online at http://www.math.wpi.edu/saspdf in the form of pdf files. For those of you taking the introductory courses, MA 2611/12, the only one of these documents likely to be of interest or use is the SAS/INSIGHT User's Guide.

    So SAS isn't behaving, and you don't know why?

    In general, SAS is a pretty reliable program, but there are times when it just seems to have behavior problems. By behavior problems, we don't mean things you may have done, like click on the wrong button, or input the wrong type of response in a macro (like 37.5 when a Y/N was asked for). We mean really strange unexplained phenomena, like the recent example of a student who typed sas & at the unix prompt and got the message

    ERROR: Generic critical error.

    and no SAS windows. Only when she typed sas & a second time did the SAS windows appear. In cases like these, you can try a number of approaches, among them:
    a. If you are doing a lab, flag down a TA.
    b. Send a message to ma-questions.
    c. If host printing is giving you troubles, try the following:
    i. Make sure the printer that is selected is the one you are trying to print to.
    ii. If trying to print to stat1 or stat2, make sure you are running SAS from statlab. The printers stat1 and stat2 cannot be accessed from most other computers.
    iii. Check the print setup to make sure you have selected the proper destination for the printer (e.g., /usr/ucb/lpr -h -Pstat1 (or -Pstat2)), and that the device type is Pipe.
    d. Finally, if all else fails,
    i. Exit SAS.
    ii. Delete your profile file (the file profile.sas7bcat in the directory /home/yourid/sasuser.800).
    iii. Restart SAS.
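
The last-resort step d.ii can be sketched from the Unix prompt as follows. The helper name reset_sas_profile is hypothetical, and the default directory assumes your home directory is /home/yourid as written above. Run it only after exiting SAS.

```shell
# Sketch of step d.ii: delete the SAS profile catalog before restarting SAS.
# reset_sas_profile is a hypothetical helper; the default directory is an assumption.
reset_sas_profile() {
    sasuser_dir=${1:-"$HOME/sasuser.800"}
    profile="$sasuser_dir/profile.sas7bcat"
    if [ -f "$profile" ]; then
        rm -- "$profile" && echo "removed $profile"
    else
        echo "no profile found at $profile"
    fi
}

# Usage (not executed here): reset_sas_profile "$HOME/sasuser.800"
```

The function reports what it did so that you can confirm the file was actually found before restarting SAS.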

    Bailing out

    Sometimes you run a SAS program or procedure that you realize is both wrong (perhaps you gave it a wrong input) and long. To bail out of the program or procedure, you can use the "SAS Session Management" icon (the one on your desktop with the lightning bolt). Clicking on this icon will produce the SAS Session Management window. Click on the "interrupt" box in the window to stop the program or procedure without ending your SAS session. When all else fails, click on "terminate" to bail out of SAS.

    Where Next?

    You are now ready to begin the SAS tutorial.

    Introduction to SAS/EIS, which you'll use to run SAS macros (programs) for labs and specialized applications.

    Introduction to SAS/INSIGHT, a graphically-oriented data analysis system.

    An overview of the Statistics Multimedia Computer Classroom (for new users).

    An overview of the Unix operating system (for new users).

    An Introduction to SAS/EIS  

    EIS stands for Enterprise Information System. SAS/EIS is a component of SAS that enables users to summarize, integrate and display information in easily accessed and easily understood reports. In the introductory statistics courses at WPI, you will use only one of its many capabilities: that of calling SAS macros.

    SAS macros are programs written in the SAS programming language which perform special tasks, some of which are not otherwise available to novice SAS users, and some of which are not otherwise available to any SAS users. Some macros have been written expressly to support computer labs for the introductory statistics courses at WPI. Some provide statistical functions or procedures of interest to general users. EIS provides a simple, menu-driven interface for SAS users of these macros. In addition, the macros themselves are written with a windows interface for data entry and output.

    Running Macros from EIS

    To run macros from EIS proceed as follows:

    1. Choose Solutions:EIS/OLAP Application Builder from the menu bar on any of the original SAS windows (PROGRAM EDITOR, LOG, OUTPUT, RESULTS or EXPLORER).
    2. From the resulting window, select Applications.
    3. A small SAS/EIS: Applications window will appear. Click on Run private applications....
    4. A SAS/EIS Run Private Application window will pop up with the titles and description of each available application. To run an application, click on its name.

    Why not try an application now? Scroll down to the application called ORACLE. (To scroll, place the pointer on the slider bar, hold down the left mouse button, and move the mouse. You can scroll more slowly by clicking with the left mouse button on the arrows at the top or bottom of the scroll bar.) When you see the word ORACLE, click on it (ONCE ONLY, PLEASE!). A window should appear asking for your question. Ask whatever is on your mind, then press enter.

    NOTE: On some programs requiring data entry, a message such as

    ERROR: Required data needed at row 7 column 50

    will appear in red at the upper left of the data entry window when you enter a piece of data. This occurs under certain circumstances when you hit the tab key instead of the enter key after typing some needed input. This message affects nothing, and you can safely ignore it.

    Changing Your Graphics Printer  

    The two printers in the statistics classroom are named 'stat1' and 'stat2'. The autoexec.sas800 file that the Step 2 setup placed in your home directory contains either the line:

    filename gsasfile pipe '/usr/ucb/lpr -h -Pstat1';

    (if your birthday is in the first 6 months of the year), or the line:

    filename gsasfile pipe '/usr/ucb/lpr -h -Pstat2';

    otherwise. The first designates the printer 'stat1' as the printer for your graphics output; the second designates the printer 'stat2'. Splitting up the default printer assignment was done to avoid overloading one printer in the classroom.
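
To see which of the two lines your own file contains, a short shell function can pull out the printer name. This is a sketch only: graphics_printer is a hypothetical helper, and it assumes the line has exactly the form shown above.

```shell
# Sketch: report which printer an autoexec file routes graphics output to.
# graphics_printer is a hypothetical helper; it looks for the -P<name> option
# on a "filename gsasfile pipe" line of the form shown in this section.
graphics_printer() {
    sed -n "s/.*filename gsasfile pipe '.*-P\([A-Za-z0-9_]*\)'.*/\1/p" "$1"
}

# Usage, assuming the file sits in your home directory as described:
# graphics_printer "$HOME/autoexec.sas800"
```

If the function prints nothing, the file does not contain a line of the expected form.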

    Sometimes, however, particularly just before a lab or homework is due, the queue for one of the printers can get very long, resulting in delays. You can get a list of jobs in the printer queue for 'stat1' by typing:

    > lpq -Pstat1


    with a similar command for 'stat2'. If the queue is long, you may want to try using the other printer in the classroom. Or, if you are in the math lab in SH 306 you may want to access one of the printers there: 'math3' or 'math3b'. Or if you are at CCC, you may want (and be willing to pay) to get a copy printed there on the printer 'lps20' (the stat lab may be closed, for example).
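
Once you have read the number of waiting jobs off the lpq listing for each printer, choosing the shorter queue is a simple comparison. The sketch below assumes you supply the counts yourself as "printer count" pairs; least_busy is a hypothetical helper and does not parse lpq output.

```shell
# Sketch: given "printer job_count" pairs on stdin, print the printer with the fewest jobs.
# least_busy is a hypothetical helper; the counts are assumed to be read off lpq by hand.
least_busy() {
    awk 'NR == 1 || $2 + 0 < min { min = $2 + 0; name = $1 } END { if (NR > 0) print name }'
}

# Example with made-up counts: stat1 has 7 jobs waiting, stat2 has 2.
printf 'stat1 7\nstat2 2\n' | least_busy
```

When two printers are tied, the first one listed is reported.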

    Using EIS you can change the printer for your graphics output (i.e., the graph in the SAS/Graphics Window) to any of 'stat1', 'stat2', 'math3', 'math3b' or 'lps20', or to any other printer accessible by you from the Unix machine on which you are running SAS. To do so, follow the directions for invoking EIS applications given in the last section and select the application PCHANGE.

    Saving Graphics Window Contents to a File

    While we're on the topic of the graphics window, we will mention here that you can save the contents of the SAS/Graphics Window to a file in a number of formats (among them: gif, PostScript, bitmap). This is convenient if you like to include graphs in documents. For example, you might want to save a graph in gif format for later inclusion in a lab report written in MS Word.

    To save the graph appearing in the SAS/GRAPHICS Window, choose File: Export as Image. In the resulting dialog box, choose a directory, file name and file type, and then click OK.

    Tips on Using EIS

    While EIS is a convenient way to access SAS macros, it can be finicky at times. Experience with student use of EIS has taught us the following lessons:

    1. Have PATIENCE with EIS. SAS is a very large program and it may tax the system when many users run it at the same time. Consequently, response time for EIS may be slow. Experience shows that students who proceed slowly and patiently through EIS often have no problems getting it to run correctly. Students who try to rush it (e.g., by clicking impatiently on an application; see the next item) often lock up the program and have to start over.
    2. When calling up a macro from EIS, click only once on the name of the macro. Clicking twice before the macro runs can lock up SAS.
    3. Keep your eyes and ears open and FOLLOW DIRECTIONS. (e.g. if a window in an EIS application says "HIT RETURN TO PROCEED", activate that window and hit return).
    4. If you are running a macro from EIS that produces input or output windows and SAS locks up:
      a. Try enlarging the window.
      b. If that doesn't work, click on the SAS Session Management icon and on "interrupt". Then click on the command line at the top of the locked-up window, type in "end" and hit "return". If this doesn't work the first time, try once again. This may kill the window and allow you to start the macro over without exiting SAS.
      c. If neither of these works, kill SAS (using SAS Session Management) and restart.

    5. Take care when using SAS/INSIGHT and EIS at the same time. If an EIS macro uses a SAS data set, and that data set is open in SAS/INSIGHT, the macro may fail to run properly.

    An Introduction to SAS/INSIGHT I: The Basics  

    SAS/INSIGHT is an environment for interactive analysis of data. Its focus is on interactive graphics: graphics which the user can modify at the screen. An example of this is the ability to click on a data point (an unusual observation, for example) on a plot and have it identified with its corresponding observation number. Or to reverse this process, a subset of the data points on an existing plot (say all males) could be easily highlighted. SAS/INSIGHT also has many data-handling and data-analytic capabilities to complement its graphical capabilities.

    This very brief introduction covers only the barest essentials of SAS/INSIGHT. Its goal is to get beginners up and running in the SAS/INSIGHT environment, and to provide a guide to some basic tasks. Full documentation is found online at http://www.math.wpi.edu/saspdf/insight/pdfidx.htm. In addition, SAS/INSIGHT has a very good help system of its own, as will be explained below.

    Invoking SAS/INSIGHT

    To access SAS/INSIGHT, select the "Solutions" entry from the menu bar on any of the original SAS windows: PROGRAM EDITOR, LOG, OUTPUT, RESULTS, or EXPLORER. Then select the "Analysis" and "Interactive Data Analysis" entries in succession. Try this now. A small window entitled "SAS: SAS/INSIGHT: Open" will appear on your screen. We will call all activities you perform in SAS/INSIGHT from the time this window appears until you exit SAS/INSIGHT, a session. The box before you is the initial dialog box. By pressing the "Open" button at the bottom, you may read an existing SAS data set into SAS/INSIGHT. You will be asked to do this later in this tutorial. To begin, however, you will be asked to create your own SAS data set using SAS/INSIGHT. To begin this process, click on the "New" button.

    A new data window, entitled something like "SAS: WORK.A" will appear. This means that the SAS data set you will be creating will be found in the SAS data library "WORK", which is a storage area for temporary SAS data sets (data sets that will be erased when you exit from the current SAS session). The data window is divided into a number of rows and columns of rectangles. Each rectangle, which we will call a cell, will hold one piece of data. The upper left cell should be highlighted, which indicates that it is selected and ready to accept data entry. You will begin entering data soon. First, however, a few details about getting around.

    Choosing from Menus

    In SAS/INSIGHT, operations you can perform include creating graphs and analyses, transforming variables, fitting curves and saving results. These operations are chosen by pulling down a menu from a menu bar. The menu bar is located at the top of SAS/INSIGHT windows (the one on the data window has the items File Edit Analyze Help). To pull down a menu, click on the item of interest from the menu bar. A pop-up menu will appear. Continue holding the mouse button down while you drag it down the pop-up menu until you reach the desired item. If another pop-up menu appears, continue holding the mouse button down and drag to the desired item. Release the mouse button when you have arrived at the desired operation.

    For example, select the "Help" item on the menu bar. A pop-up menu will appear. Drag the mouse down the items to "Reference". Another pop-up menu will appear. Move the pointer to the first item, "Data", and release the mouse button. This activates a help window which explains about the data windows in SAS/INSIGHT. The sequence of steps by which you brought up this help window can be written in shorthand and italicized as Help:Reference:Data. This shorthand and italicized notation will be used in the rest of this tutorial to describe how to move through the menus.

    If you find you have made a mistake and don't want the pop-up menu you've opened, click on some neutral area of the window, such as blank space on the menu bar.

    Help

    You have just seen an example of the extensive help system available in SAS/INSIGHT (and, for that matter, all of SAS). You should use this resource both to learn more about SAS/INSIGHT and to try to figure out what to do when you are stuck. Specifically, Help:Introduction gives an overview of SAS/INSIGHT, Help:Techniques tells how to perform various tasks, Help:Reference will give you detailed information and Help:Index contains a list of all help topics. Take a cruise through these help windows; it will be worth your while.

    There is also context-sensitive help available. For example, if you are displaying a bar chart (a subject considered later in this tutorial) and you want some question answered about bar charts, you can put the pointer on the bar chart and press the F1 key on the keyboard.

    This tutorial will not attempt to duplicate the information found in the help windows. Instead it will focus on some of the features in SAS/INSIGHT which are unique or particularly easy to use.

    Creating New Data

    For this section of the primer we will assume that a project team consisting of three professors has just run the helicopter experiment introduced in Lab 1.2 of the book. If you aren't yet familiar with it, the helicopter experiment consists of timing how long it takes a paper helicopter to drop a specified distance. The experiment requires someone to release the helicopter (the RELEASER) and someone to time the helicopter's stay in the air (the TIMER). The resulting data need to be entered into SAS:

    [Table: helicopter experiment data (releaser, timer, time)]

    If your team has already run the helicopter experiment, you should follow along in this section but enter your team's data instead of the above data.

    Now begin entering the data. Click on the upper-leftmost cell in the data window to select it for the first data value. Type "Moe <enter>". (Note: (1) <keyname> means press the key named keyname on the computer keyboard. On some computers the enter key has the name return. (2) Type what is within the quotes, not the quotes themselves.) The name "Moe" should appear in the selected cell as you type, and <enter> should select the next cell down. Now in succession type "Moe <enter>", "Moe <enter>", and "Curley <enter>". You are on your way to entering the data!

    Defining Variables

    You may have already noticed that the letters "Nom" appeared at the top of the first column, and below them the letter "A". A is the name SAS has given the first variable, and Nom indicates it is a nominal variable. A nominal variable is one which "names". Because the values you have input consist of letters, SAS has concluded (correctly) that the first variable is nominal. We want to name the first variable "RELEASER". To do this click on the triangle in the upper left corner of the data window (right below the "File" entry at the top of the window) with the left mouse button (always select with the left mouse button unless told otherwise). A popup menu will appear. Click on the menu entry "Define Variables...". A "SAS: Define Variables" dialog box will appear. Click on the "A" to the right of "Name:", enter the name "Releaser" (without the quotes), and click on the "OK" button. The name of the variable will now be "RELEASER".

    Notation

    Before we go on, two things. First, a word about notation. In what follows, we will denote the triangle you first clicked on with the symbol ▶. As we go through this tutorial, this triangle button will appear in a variety of windows and locations, but no matter where it appears, it will be referred to as ▶. Thus the two mouse selections you used in changing the name of the variable would be described as "choose ▶: Define Variables...".

    Second, a few comments about the data window. The window should now have four names entered under the variable named RELEASER. Notice the number 1 is to the right of the triangle and the number 4 is below it. The first tells the number of variables (columns) in the data set (there is only RELEASER) and the second tells how many observations. The left column contains small squares. These are the symbols used in plotting. The column to the right of these contains the observation number of each observation.

    Finishing Entry of the Helicopter Data

    Now enter the rest of the data. You may continue entering the rest of the releaser names as you have been doing, or you may click on any cell to enter the value of a single observation, or you may enter rows of data. Let's try the last of these. Click on the cell at the upper left containing the first data value you entered. Now press <Tab>. The next cell to the right should be highlighted. Enter "Larry". Tab over once more, enter "2.15" and press <enter>. Now enter "1.34", and press <shift-tab> (i.e., hold down the "shift" and "tab" keys simultaneously). This will enter the "1.34" and move one column to the left. You may now enter "Larry", press <enter> to move one row down, and continue. You may use this or the column entry you began with to complete entry of the data, or you may devise some other method of your own.

    When you have finished data entry, name the second and third variables TIMER and TIME. Notice that TIMER is a nominal variable, but TIME is an interval variable, which is the default for numerical measurements.

    Creating a SAS Data Set

    So far, the data you have entered are accessible only to SAS/INSIGHT and only during this session. If you exit INSIGHT the data will be lost. However, you can save these data in a SAS data set.

    SAS data sets contain data and information about data such as variable names. They are created by SAS and are readable only by SAS. There are both temporary and permanent SAS data sets. Temporary data sets disappear after you finish your SAS session. They are stored in a library called WORK. Permanent data sets are stored in SAS data libraries in your directory, and may be accessed later. The default data library is SASUSER. Many SAS data sets have been created and stored for your use in the data library SASDATA.

    To save your data to a SAS data set, from the data window choose File: Save: Data. A dialog box will appear offering you your choice of libraries to save to and allowing you to choose a name for the data set. If you want to create a temporary data set, select the library WORK. If you want to create a permanent data set, select the library SASUSER. In either case, call the data set COPTER.
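
If you save COPTER to SASUSER, you can confirm from the Unix prompt that a file was written. The sketch below rests on two assumptions that are not stated in this tutorial: that SASUSER corresponds to the sasuser.800 directory in your home directory, and that SAS 8 stores a data set as a lowercase file name ending in .sas7bdat. The helper name dataset_path is hypothetical.

```shell
# Sketch: where a permanent data set saved to SASUSER would be expected on disk.
# Assumptions: SASUSER maps to ~/sasuser.800 and SAS 8 stores a data set
# as <lowercase name>.sas7bdat. dataset_path is a hypothetical helper.
dataset_path() {
    lib_dir=$1
    name=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    echo "$lib_dir/$name.sas7bdat"
}

# Example: print the expected location of the COPTER data set.
dataset_path "$HOME/sasuser.800" COPTER
```

Passing the printed path to ls -l would then show whether the file exists. A data set saved to WORK will not appear there, since WORK is temporary.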

    Accessing an Existing SAS Data Set

    It may be that you want to use SAS/INSIGHT to analyze data in an existing SAS data set. Data from an existing SAS data set are entered into SAS/INSIGHT through the initial dialog box, which is automatically brought up when entering SAS/INSIGHT. The initial dialog box may also be accessed if you are already in SAS/INSIGHT, by choosing File: Open. Whichever method you use, bring up the initial dialog box now.

    To enter a SAS data set into SAS/INSIGHT, click on the name of the library where the data set resides and then on the data set name. One or both these actions may involve scrolling the names in a window. To scroll, place the pointer on the slider bar, hold down the left mouse button, and move the mouse. You can scroll more slowly by clicking with the left mouse button on the arrows at the top or bottom of the scroll bar.

    For this tutorial, select the library SASDATA and then the data set BASEBALL. A data window containing this data set will appear. Use your mouse to enlarge this window and view its contents.

    This data set consists of performance measures and salary levels for regular hitters and leading substitute hitters in major league baseball for the year 1986 (a year that will live in infamy for all Red Sox fans). The variables are:

    [Table: variables in the BASEBALL data set]

    You may access more than one SAS data set from SAS/INSIGHT at the same time. However, as you may have noticed, when the data window appeared, the initial dialog box window disappeared. To enter other data sets, choose File: Open. The initial dialog box will reappear to allow you to access another data set.

    Exiting SAS/INSIGHT

    To close any SAS/INSIGHT window, choose File:End. When a data window is closed, all windows generated from that window are also closed. When you have closed all data windows, you exit SAS/INSIGHT.

    Selecting and Choosing

    In SAS/INSIGHT, all operations you may want to perform are listed in menus. So to perform any task, you point with the mouse and click the buttons to select objects and choose operations from menus.

    Selecting Objects

    You select an object to indicate that it is an object you want to work with. Objects you can select in a data set in SAS/INSIGHT include variables (such as NAME or NO_ATBAT in the baseball data set), observations (such as all data for Wade Boggs), and individual values (such as Bill Buckner's number of errors). You can also select the results of analyses you conduct in SAS/INSIGHT, such as graphs, curves and tables. Selected objects become highlighted on the display.

    To select an object, move the pointer to it with the mouse and click (i.e., press and then release) the leftmost mouse button. To select multiple objects, click and drag by pressing and holding the left mouse button down while moving the pointer across the objects of interest, then releasing the mouse button. This selects all objects touched by the pointer while the mouse button was held down.

    Try these techniques now on the baseball data. Select the variable NAME by clicking on it. Select observation 2 (Alan Ashby) by clicking on the number 2 next to Alan's name. Select Andre Dawson's number of hits by clicking on the 141 in the appropriate box. Select the observations for the first 6 players by clicking and dragging in the leftmost column.

    When objects are far apart, it is convenient to use modifier keys with the mouse button. The shift key can be used to make an extended selection. For example, to select the observations for the first 100 players, click on the number 1 next to Andy Allanson's name, scroll down to player 100 (Eddie Milner), and click on the number 100 while holding down the shift key.

    To make a non-contiguous selection, use the Ctrl key in a similar way. For example, select the variables NAME, NO_HITS and CR_HOME by clicking on any one of them first, then on a second while holding down the Ctrl key, and again on the third while holding down the Ctrl key. Try it yourself.

    De-selecting Objects

    As you've noticed, selecting another object de-selects previously selected objects.

    Manipulating Data

    In this section, you will learn several features of SAS/INSIGHT for data manipulation.

    Arranging Variables or Observations

    You can easily change the order in which the variables appear in the data window. For example, you can move the variable SALARY from its position at the far right of the baseball data set to the leftmost position. There are two ways to do this:

    1. You may choose Move to First from the data window's pop-up menu which, as long as a variable is not already selected, will bring up a dialog box containing the names of all variables. Scroll down the list in this box, click on "SALARY" and then on "OK".

    2. You may first select the variable SALARY in the data window and then choose Move to First from the data window's pop-up menu. In this case no dialog box will appear. This also works with any pre-selected observation.

    Choosing Move to Last from the same pop-up menu will reverse this operation. These methods also work with several variables or observations: just select the desired variables or observations.

    Sorting Observations

    Sorting observations by values of a variable is easy in SAS/INSIGHT. As an example, suppose you want to sort the data according to players' salaries. To do this, scroll to SALARY using the horizontal scroll bar at the bottom of the data window. Select the variable "SALARY". Now choose Sort from the data window's pop-up menu. The data are now in order of ascending salary. Note that the "."s in the data set stand for missing data. (You could also have done this without selecting "SALARY" first. Then a dialog box would appear and you would select "SALARY" from it.)
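
    The resulting order can be sketched in a few lines of Python. This is only an illustration, not SAS code: the player names and salaries are made up, and the sketch assumes that missing values (the "."s) sort ahead of all numeric values, as they do in SAS generally.

```python
# Hypothetical records; None stands for a missing salary (shown as "." in SAS).
players = [
    {"NAME": "Player A", "SALARY": 475.0},
    {"NAME": "Player B", "SALARY": None},
    {"NAME": "Player C", "SALARY": 90.0},
    {"NAME": "Player D", "SALARY": 750.0},
]

def salary_key(record):
    """Sort key: missing salaries first (assumption), then ascending salary."""
    salary = record["SALARY"]
    return (salary is not None, salary if salary is not None else 0.0)

sorted_players = sorted(players, key=salary_key)
sorted_names = [p["NAME"] for p in sorted_players]
print(sorted_names)
```

    The two-part key keeps missing values together at the top without comparing None to a number, which Python would reject.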

    Finding Observations

    Sometimes you want to find observations that share some characteristic. For example, I know you all want to find all the Red Sox players in this data set. To do this, click on Edit:Observations:Find. A dialog box will appear. Select the variable TEAM from the left box, "=" from the center box, and "Bos." from the right box, then click on "OK". Now all the Red Sox players are highlighted.

    You can do a bit more. If you choose Find Next from the data window's pop-up menu, the Red Sox player closest to the top will be put at the top of the data set, and the order of observations will be maintained. If you choose Move to First from the same menu, all the Red Sox players will be moved to the top of the data set, but of course the order of the observations will be changed.
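
    As a rough sketch of what Find and Move to First do to the data, consider the following Python fragment. The observations are hypothetical, the function names are ours rather than SAS's, and the sketch assumes that moved observations keep their relative order.

```python
# Hypothetical observations: (observation number, NAME, TEAM).
observations = [
    (1, "Player A", "Cle."),
    (2, "Player B", "Bos."),
    (3, "Player C", "Sea."),
    (4, "Player D", "Bos."),
]

def find(rows, team):
    """Return the observation numbers whose TEAM equals the given value."""
    return {number for number, _, row_team in rows if row_team == team}

def move_to_first(rows, selected):
    """Put selected rows first, keeping the relative order within each group."""
    chosen = [row for row in rows if row[0] in selected]
    others = [row for row in rows if row[0] not in selected]
    return chosen + others

selected = find(observations, "Bos.")
reordered = move_to_first(observations, selected)
order = [row[0] for row in reordered]
print(order)
```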

    Transforming Data

    You can transform variables to create new variables in SAS/INSIGHT. For example, though there is no batting average variable in the BASEBALL data set, you can easily create one as follows (for you non-fans, batting average is the number of hits divided by the number of at bats):

    1. Choose Edit:Variables:Other. A dialog box will appear.

    2. In the box with the variables list click on NO_HITS to select it, then click on the "Y" button. "NO_HITS" should appear in the box below it.

    3. Next click on NO_ATBAT to select it, then click on the "X" button. "NO_ATBAT" should appear in the box below it.

    4. Click on the "Y/X" under "Transformation:".

    5. You can use the name SAS assigns the new variable, or you can replace it with a more meaningful one, such as BA (which is what we'll call it in the rest of this document).

    6. Now click on "OK". The variable for batting average will appear last in the data window.
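
    The "Y/X" transformation chosen in these steps is simply a row-by-row division. A minimal Python sketch follows, assuming a result of "missing" (None) when the number of at bats is zero or missing; the function name and the at-bat figure are illustrative only.

```python
def batting_average(hits, at_bats):
    """Return hits / at_bats, or None (missing) when it cannot be computed."""
    if hits is None or not at_bats:
        return None
    return hits / at_bats

# 141 hits over a hypothetical 496 at bats.
ba = batting_average(141, 496)
print(round(ba, 3))
```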

    Despite its appearance, SAS/INSIGHT is not a spreadsheet. However, it does have modest editing capabilities. For instance, you can easily change individual data values. Suppose we don't like Mike Schmidt's .0500 batting average and want to change it to .3500. You can do so by clicking on the cell containing his average and typing in .3500.

    Graphing Data

    One of SAS/INSIGHT's strengths is its ability to create sophisticated graphical displays. To introduce you to SAS/INSIGHT's graphical capabilities, we'll consider the simplest graphical display, the frequency histogram. A frequency histogram is a graphical summary of a data set which creates a number of subgroups of the data based on the value of the variable being plotted. One bar is drawn over the range of values in each subgroup. The height of the bar drawn over a subgroup is equal to the number of data points in that subgroup.
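
    The counting behind a frequency histogram can be sketched in Python as follows. This is an illustration under stated assumptions, not SAS/INSIGHT's actual algorithm: the parameter names are ours, the data values are made up, and each bar is assumed to include its left endpoint but not its right.

```python
def histogram_counts(values, first_tick, tick_increment, n_bars):
    """Count the values falling in each of n_bars equal-width intervals.

    Bar i covers [first_tick + i*tick_increment, first_tick + (i+1)*tick_increment).
    Missing values (None) and values outside the covered range are skipped.
    """
    counts = [0] * n_bars
    for value in values:
        if value is None:
            continue
        index = int((value - first_tick) // tick_increment)
        if 0 <= index < n_bars:
            counts[index] += 1
    return counts

# Hypothetical batting averages, including one missing value.
batting_averages = [0.212, 0.248, 0.251, 0.266, 0.284, 0.301, None]
counts = histogram_counts(batting_averages, first_tick=0.20,
                          tick_increment=0.05, n_bars=3)
print(counts)
```

    Each entry of counts is the height of one bar, so changing the starting point or the bar width changes the picture even though the data stay the same.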

    Draw a frequency histogram for each of the variables SALARY and BA. To do this, choose Analyze:Histogram/Bar Chart (Y) from the menu bar. In the resulting dialog box, choose SALARY and BA as the Y variables, and click "OK". A window containing two frequency histograms will appear. Enlarge this window now. The graphs will remain small.

    To enlarge the graphs, choose Edit:Windows:Renew from the menu bar of the graph window. A dialog box will appear: click on "OK".

    You can move and change the size and/or shape of the graphs using the mouse. To move a graph, click with the left mouse button anywhere (except at a corner) on the side of the frame enclosing the graph. Then, still holding the mouse button down, move the frame to a new location. Release the mouse button when the frame is where you want it. To enlarge (or shrink) the graph, click on a corner of the frame. As you move the mouse, the frame will change shape. Release the mouse button when the graph is the right size. With a little practice, you'll get quite good at this.

    Incidentally, now would be a good time to try out the context-sensitive help facility in SAS/INSIGHT. Put the pointer on one of the frequency histograms and press the F1 key. This will bring up a help window about frequency histograms.

    Customizing the Frequency Histogram

    SAS/INSIGHT will automatically choose the number of groups and the group boundaries on the frequency histogram. You can customize the frequency histogram by altering the number of groups, the group boundaries, or both, as follows:

    1. Select Edit:Windows:Tools from the menu bar of the frequency histogram window. A window will pop up containing three icons at the top, a palette of colors below it and a number of buttons with different symbols below that.

    2. Click on the icon shaped like a hand. This is the move tool.

    3. Click on the frequency histogram. This will change the widths of the bars depending on how far the hand is from the base of the bars: clicking close to the base gives greater width; clicking far from the base gives smaller width.

    4. Similarly, the place where the first bar starts can be changed by varying the horizontal position of the move tool.

    A good way to see how the appearance of the frequency histogram can be changed is to hold down the left mouse button while moving the move tool all around the frequency histogram. Try this now. Does it help you to get a better picture of the data?

    You can more precisely specify the positions of the bars in the bar chart by choosing Ticks from the pop-up menu, whose button is found in the lower left corner of the frequency histogram window. The resulting dialog box allows you to specify the minimum and maximum of the axis as well as the starting and ending location of the bars (first and last ticks) and bar width (tick increment).

    Identifying Observations

    This feature demonstrates some of the power of SAS/INSIGHT. Suppose you want to look at the data in the leftmost bar of the frequency histogram for SALARY. To do this, click on that bar. You will notice that not only does that bar become highlighted, but parts of the frequency histogram for BA do as well. Now look at the data window. You'll notice that the observations of the players whose salaries are displayed in the leftmost bar of the frequency histogram are also highlighted. This illustrates two things. First, you can select observations by clicking on locations on graphs. Graphs with this feature are said to be dynamic. Second, when you select a subset of observations, the selection is displayed on all relevant windows in SAS/INSIGHT. Graphs with this feature are said to be linked. To de-select, just click on an empty region of the histogram window. Try this now.

    You can do this in reverse as well. Go to the data window and select observations 1-10. These will become highlighted in the data window and on your graphs.

    Deleting Graphs from a Window

    To delete a graph, first select it by putting the cursor outside the graph frame and clicking and dragging the cursor inside the frame. The graph will become highlighted. Then choose Edit:Delete. The graph will disappear.

    Plots Broken Down by Groups

    Suppose you want to compare the frequency histograms of batting averages for American and National Leagues. This is easily done as follows. Choose Analyze:Histogram/Bar Chart (Y). From the resulting dialog box, select BA and click on the "Y" button. Next select the variable LEAGUE and click on the "Group" button. Click on "OK". Separate frequency histograms for each League should appear side by side in the resulting window.

    Be careful in comparing them, though! The scale of their axes won't be the same. To get the same horizontal axes and the bars over the same intervals, adjust the ticks as described above. To get the same vertical axes, choose Edit:Windows:Align. Try this now. Do you detect any differences in batting averages between the two leagues?

    Scatter Plots

    A scatter plot or X-Y plot is a graph of bivariate data which plots the X variable on the horizontal axis and the Y variable on the vertical axis. As an example, suppose you are interested in whether there was a relation between a player's salary and his batting average. The best way to see any relationship is to plot SALARY (Y) versus BA (X). To do this, choose Analyze:Scatter Plot ( Y X ) from the menu bar of either the data window or the histogram window. A dialog box will appear. Select BA as the X variable by clicking on BA in the variables box on the left and then clicking on the "X" button at the upper right. Select SALARY as the Y variable by clicking on SALARY in the variables box and then clicking on the "Y" button. Select NAME as the label variable by clicking on it in the variables box and then clicking on the "Label" box. Then click on "OK". The scatter plot will appear. Enlarge the window and renew the plot as desired.

    Do you see a pattern to the data? Are there any unusual points? To find out who they are, click on any of those points on the plot. The player's name will appear because that is the label you gave the data. Who were the most underpaid players in terms of batting average? The most overpaid?

    Perhaps you want to find which variables among NO_RBI, CR_RBI and SALARY were most related. You can use SAS/INSIGHT to produce a scatterplot array. In the data window select the variables NO_RBI, CR_RBI and SALARY. Then from the menu bar choose Analyze:Scatter Plot (Y X). Enlarge the window as desired and renew the plot. Check out the results. Smooth, huh? What do you conclude about the relationships between pairs of these variables?

    Printing Window Contents

    As all SAS/INSIGHT output seen at the screen is written to the SAS/INSIGHT windows, it is important to be able to print the contents of these windows.

    PLEASE NOTE: In order to cut down on wasted paper, we have configured SAS so that no header page is produced. Rather, your user id will appear on the page so that you can identify your output.

    To get a good printed version of the window, follow this five-step procedure:

    1. Unless you select certain objects, the entire window contents (i.e. what you see in the window) will be printed. Objects which do not appear in the window will not be printed. If you select a subset of the objects that you see in the window, only they will be printed. So your first task is to be sure what you see in the window is what you want printed. This may involve selecting a subset of objects, moving and regrouping objects and/or scrolling the window.

    2. Once what you see in the window is exactly what you wish to print, choose File:Print.

    3. A "Print" window will pop up, displaying the default printer. If you want to change your printer or its properties (for example, to print in landscape mode), click on Setup.... When you are ready to print, click on the Print button.

    4. A "SAS:Print" window will pop up. Click on the two buttons "Fill Page" and "Use Titles and Footnotes", then on "OK". We have found that printing sometimes behaves badly unless "Fill Page" is selected. Selecting "Use Titles and Footnotes" is essential to ensure that your user id appears at the bottom of the output page.

    5. The printout will appear at the printer a few minutes later.

    Saving Window Contents to a File

    Sometimes, it is desirable to save SAS/INSIGHT output to a file for inclusion in a document. SAS/INSIGHT allows you to do so in a number of formats (e.g., gif, PostScript, and bitmap).

    To save SAS/INSIGHT output to a file, from the window containing the output to be saved, choose File:Save:Graphics File. In the resulting dialog box, choose the type of file and give the file a name (including the path if it is to be written in another directory). Be sure to check "Titles and Footnotes" if you intend to submit it in a lab report or homework assignment.

    As a default, only what is visible in the window will be saved to the file. If you want to include parts of the output that are not visible or only partly visible, you must select all objects you want to include in the file.

    Saving Data

    The SAS data sets you read into SAS/INSIGHT are not affected by any modifications you may have made during your SAS/INSIGHT session. You can, however, save the data modified in SAS/INSIGHT to a SAS data set. The resulting data set will contain:

    • All data values and variables as they currently appear in the data window.
    • All observation states, including color, marker shape and show/hide.

    To save the baseball data set as it currently exists in SAS/INSIGHT, choose File:Save:Data, and from the resulting dialog select the library where you want the data set stored (usually WORK if you want it to be temporary and SASUSER if permanent). You should also choose a data set name.

    Connection with SAS

    SAS/INSIGHT accesses the same SAS data sets common to all SAS modules. Therefore any output written to a SAS data set by SAS/INSIGHT can be accessed by other SAS modules and vice-versa. Also, SAS/INSIGHT can be run simultaneously with other SAS modules such as SAS/EIS.

    There is one caution, however. If a data set is open in SAS/INSIGHT, other SAS programs may be unable to access or write to it. This is particularly true of the macros in SAS/EIS. In this case a good strategy is to save a copy of the data set to a temporary data set as outlined in "Saving Data", and use one for analysis in SAS/INSIGHT and another for all other SAS analyses.

    An Introduction to SAS/INSIGHT II: Advanced Concepts  

    Examining Data

    SAS/INSIGHT allows you to examine data that you see in graphs. As an example, go back to the scatterplot of SALARY versus BA. Choose an unusual observation and double click on it. A window will appear with the values of all variables for this observation. You can do the same for groups of observations. You can obtain the same results by single clicking on the observation(s) and choosing Edit:Observations:Examine.

    Edit:Observations:Examine is also useful in examining data for observations chosen by Edit:Observations:Find. For example, you can look at the records of all Red Sox players by choosing Edit:Observations:Find, selecting the variable TEAM from the left box, "=" from the center box, and "Bos." from the right box, then clicking on "OK". Now choose Edit:Observations:Examine to get the data on all the Red Sox.

    Slicing

    Slicing is a dynamic technique for viewing subsets of data based on a range of values for one variable. For example, to see how SALARY is related to BA and NO_RBI, create two scatterplots by selecting SALARY as the Y variable and BA and NO_RBI as the X variables.

    Create a rectangular brush by clicking in the middle of the point cloud on the SALARY by BA scatter plot, holding the left mouse button down, and moving the mouse to create a rectangle. When you release the mouse button, all points in the brush are selected and will become highlighted on both graphs. Now move the brush by clicking in it and dragging. As the brush moves, different observations are selected in both graphs. Now to see how the relation between SALARY and NO_RBI changes for changing BA values, make the brush long (in the SALARY direction) and thin (in the BA direction) and move it left to right or right to left on the SALARY by BA scatter plot.
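
    In effect, a long, thin brush selects the observations whose BA falls in a narrow range, and moving the brush moves that range. The Python sketch below illustrates the idea only: the observations and brush positions are made up, and the function name is ours.

```python
# Hypothetical observations: (BA, NO_RBI, SALARY).
points = [
    (0.230, 40, 150.0),
    (0.265, 62, 420.0),
    (0.270, 75, 600.0),
    (0.310, 98, 900.0),
]

def slice_by_ba(rows, ba_low, ba_high):
    """Return indices of rows whose BA lies inside the brush's BA range."""
    return [i for i, (ba, _, _) in enumerate(rows) if ba_low <= ba <= ba_high]

# Three successive positions of a thin brush moving from left to right.
brush_positions = [(0.22, 0.26), (0.26, 0.29), (0.29, 0.32)]
selections = [slice_by_ba(points, low, high) for low, high in brush_positions]
print(selections)
```

    The selected indices are what would be highlighted in the SALARY by NO_RBI plot at each brush position.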

    To make the effect more dramatic, choose Observations from the pop-up menu of the plot and then drag the brush. Now only the selected observations will appear. One final feature you should be aware of, and one that's also kind of fun, is that if you release the mouse button while still dragging the brush, it will continue to move on its own.

    Marking Observations

    You can assign markers to use for displaying observations in scatter plots, boxplots (which you'll learn about later) and rotating 3-D plots (for which you're on your own). The markers appear with each observation in the data window. You can assign markers for observations you select, and you can let SAS/INSIGHT assign markers automatically based on the value of a variable. You can control the size of the markers in any plot.

    Marking Individual Observations

    To see how to mark individual observations, create a scatter plot of NO_RBI versus NO_HITS. Select an observation that interests you by clicking on it. If the SAS:Tools window is not already open, choose Edit:Windows:Tools (if you choose Edit:Windows and see a highlighted square to the left of Tools, the SAS:Tools window is already open). The SAS:Tools window will appear. Click on the shape of the marker you want to denote the chosen observation. The marker will change to the shape you choose in all graphs and in the data window.

    Marking by Nominal Variable

    A nominal variable is a variable whose values stand for names of categories. LEAGUE, DIVISION, TEAM, and POSITION are all nominal variables. SAS/INSIGHT can assign markers based on the value of a nominal variable. Let's mark the National and American League players separately in the NO_RBI versus NO_HITS plot. To do this, select LEAGUE in the data window and click on the multiple marker button at the bottom of the SAS:Tools window.

    Marking by Interval Variable

    You can also assign markers based on the value of an interval variable (i.e., a variable whose values stand for numerical quantities, such as BA and NO_HITS). Let's assign markers in the NO_RBI versus NO_HITS plot based on SALARY. To do this, select SALARY in the data window and click on the multiple marker button at the bottom of the markers window. A different marker will be assigned to the players in the upper, middle and lower third of SALARY values.
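
    One way to picture the grouping into thirds is the Python sketch below. It is an assumption-laden illustration: it splits the observations into thirds by rank, whereas SAS/INSIGHT may instead divide the range of SALARY values into thirds; the salaries and the function name are made up.

```python
def thirds_by_rank(values):
    """Label each non-missing value 'lower', 'middle' or 'upper' by rank.

    Assumes at least one non-missing value; missing values (None) get no label.
    """
    present = sorted(v for v in values if v is not None)
    n = len(present)
    low_cut = present[n // 3]           # first value of the middle third
    high_cut = present[(2 * n) // 3]    # first value of the upper third
    labels = []
    for v in values:
        if v is None:
            labels.append(None)
        elif v < low_cut:
            labels.append("lower")
        elif v < high_cut:
            labels.append("middle")
        else:
            labels.append("upper")
    return labels

# Hypothetical salaries, including one missing value.
salaries = [90.0, 750.0, None, 300.0, 475.0, 120.0, 1000.0]
labels = thirds_by_rank(salaries)
print(labels)
```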

    Adjusting Marker Size

    You can adjust the marker size on the plot by choosing Marker Sizes from the pop-up menu of the plot. Try a few sizes to find one you like.

    Coloring Observations

    If you are using a color monitor or printer, coloring the markers different colors may be a more effective strategy than changing marker shapes (although with the black and white printers in the stat lab, different shapes of markers show up better).

    Basically, coloring observations proceeds in the same way as marking observations. The same SAS:Tools window used in marking is also used in coloring, so make sure it is open.

    Coloring Individual Observations

    To see how to color individual observations, create a scatter plot of NO_RBI versus NO_HITS. Select an observation that interests you by clicking on it. From the SAS:Tools window click on the color you want to denote the chosen observation. The color will change to the shade you choose in all graphs and in the data window.

    Coloring by Nominal Variable

    Let's color the National and American League players separately in the NO_RBI versus NO_HITS plot. To do this, select LEAGUE in the data window and click on the multiple color button (the rectangular colored button) at the bottom of the color palette.

    Coloring by Interval Variable

    Let's assign colors in the NO_RBI versus NO_HITS plot based on SALARY. To do this, select SALARY in the data window and click on the multiple color button. A different color will be assigned to the players in the upper, middle and lower third of SALARY values.

    Hiding Observations

    You can adjust the range of data displayed and show subsets of the data by hiding observations. To illustrate the procedure, display the scatter plot of SALARY versus BA. We would like to investigate this relationship for each league on the same scatter plot (note that we could generate two separate scatter plots by using the variable LEAGUE as a group variable). We need to select the players from the National and American Leagues separately. A clever way to do this is to generate a bar chart of the variable LEAGUE. (Bar charts are the analogue of frequency histograms for nominal data.) By clicking on the bar for the American League, all American League players are selected. Do this now.

    To look at the scatterplot of SALARY versus BA for just National League players, choose Edit:Observations:Hide in Graphs.

    Now look at the data window. De-select the selected observations by clicking on the upper left data cell of the data array. Notice that the previously selected observations now have no markers at all in the far left column. This says that these observations are hidden in all graphs (notice that the frequency histogram of LEAGUE has only the National League bar).

    To make the observations visible in the graphs again, first choose Edit:Observations:Invert Selection, which de-selects all selected observations and selects all de-selected observations. Since all observations were de-selected just prior to this, all observations are now selected. If you now choose Edit:Observations:Show in Graphs, all observations will appear in the graphs.

    Toggling the Display of Observations

    You can show subsets of the data by toggling the display of observations. This causes observations to be displayed only when they are selected. To illustrate this, create two scatter plots: one of SALARY versus BA, and the other of SALARY versus NO_RBI, by choosing Analyze:Scatter Plot ( Y X ), and assigning SALARY the Y role and BA and NO_RBI the X role.

    You will now create a toggle on the value of LEAGUE as follows:

    1. From the pop-up menu at the lower left of one of the scatterplots, choose Observations. All the observation markers will disappear from the two scatter plots.

    2. Choose Edit:Observations:Find from the data window.

    3. In the dialog box, select LEAGUE from the variables list. Select the value you wish to display first: American or National League. Click on "OK".

    Both scatterplots will now display the data for the league you selected. To toggle between the two leagues, choose Edit:Observations:Invert Selection. Each time you do this the data displayed will change to the other league. By doing this quickly, you can detect differences between the leagues.
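
    The toggle amounts to flipping a selected/de-selected flag on every observation while the plots show only the selected ones. A minimal Python sketch, with made-up LEAGUE values and function names of our own:

```python
# Hypothetical LEAGUE values for six observations.
league = ["American", "National", "American", "National", "National", "American"]

# Selection state after Find with LEAGUE = "American".
selected = [value == "American" for value in league]

def invert_selection(flags):
    """Select every de-selected observation and de-select every selected one."""
    return [not flag for flag in flags]

def displayed(flags):
    """With the Observations toggle on, only selected observations are drawn."""
    return [i for i, flag in enumerate(flags) if flag]

first_view = displayed(selected)
second_view = displayed(invert_selection(selected))
print(first_view, second_view)
```

    Inverting twice returns the original selection, which is why repeated use of Invert Selection alternates between the two leagues.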

    To undo the toggling, choose Observations from the pop-up menu of the scatterplot again. Click on an empty area of the graph window to de-select.

    Getting Started in the Statistics Multimedia Computer Classroom

     

    Introduction

    The Mathematical Sciences Department's Statistics Multimedia Computer Classroom (hereinafter known as the stat lab) is located in 207 Kaven Hall. It is equipped with twenty-eight X-terminals networked to the server statlab. There are two networked PostScript printers, stat1 and stat2, available for student use. The classroom also has an A/V system with projector, screen, and PC, VCR, and cable TV input.

    Starting Your X-Terminal Session

    To begin your X-terminal session, move the mouse to activate the screen.

    Each X-terminal is connected to a server, whose name will be displayed on the X-terminal screen. The stat lab X-terminals are connected to the server statlab. The screen will display the message "Please enter user name." Type in your user ID and hit the "Return" key. The message "Please enter password" will appear next. Type in your password and hit the "Return" key (NOTE: the password will not be displayed on the screen).

    After a short wait, the Common Desktop Environment (CDE) will appear on the screen. (Windows users will recognize this as similar to the Windows desktop.) In order to proceed, you need a window in which to enter commands. There will be a taskbar at the bottom of the screen. On the taskbar, there will be an icon consisting of a notepad and pencil. (Clicking on this will bring up the default text editor.) Immediately above this icon, there is an upward-pointing arrow. Click on this arrow. A personal application popup menu will appear. Select terminal from this menu. This will bring up the appropriate window. Using the mouse, position the arrow anywhere in the main part of this window. Then click the left mouse button to activate this window. You'll know the window is active when the bar at the top darkens and the cursor begins flashing on and off. There will be a prompt (a ">") in the window just to the left of the cursor.

    Ending Your X-Terminal Session

    To end your X-terminal session, click on the "Exit" icon on the taskbar in the CDE.

    Sources of Information

    The machines in KH 207 are set up the same way as other networked X-terminals on campus, so the general information found online through the information system xinfo or the WPI web pages (especially the introductory sections and the sections dealing with simple Unix commands) will be useful to you in getting acquainted with the Unix computer systems on campus. To access xinfo you must first log in to a Unix machine.

    Xinfo

    Xinfo is a mode of the editor emacs which serves as an online information system. You do not have to know anything about emacs to use it, though you can access it from inside emacs if you are an emacs user.

    To start xinfo from the command line, just type xinfo at the Unix prompt. An xinfo window will appear with a menu. The information you seek is found under the menu item "Campus Computing (CCC)". Xinfo is easy to use and self-explanatory.

    The WPI Web Pages

    The WPI web pages are a source of information about WPI on the World Wide Web (WWW). Of course, if you are reading these words, you have already accessed part of the WPI web pages. To obtain information about computing at WPI, click on "Services" in the WPI home page, then on "College Computer Center (CCC)" on the next page that appears.

    Classroom Facilities

    During certain hours the stat lab is reserved for classes. You are free to use the machines in the classroom anytime it is open outside those reserved hours. The hours it is available may change from term to term, but they will be posted on the lab door. The hours are also listed on the World Wide Web at http://www.math.wpi.edu/Lab_Admin/statlab_sched.html.

    The main difference between the machines elsewhere on campus and those in KH 207 is that students using SAS or Maple software for coursework in MA courses have priority in KH 207. In addition, students will not be charged the usual $.10 per page fee for printouts produced using SAS or Maple software for MA coursework and submitted to either of the two classroom printers, stat1 or stat2.

     

    Basic Unix Information

    Much of the scientific computing done at WPI (and around the world) is done on computers running the Unix operating system. At WPI, these computers consist mainly of Alpha machines, made by Compaq. The version of SAS supported at WPI and described in this document is found on the Unix network.

    Information on Unix may be found at http://www.wpi.edu/Academics/CCC/Help/Unix and http://www.math.wpi.edu/Doc/help/index.html.


    Joe Petruccelli <jdp@wpi.edu>
    Last modified: Fri Jul 27 15:09:15 EDT 2001