statguru

profilestrength
BreifAndWeek1-4.zip

Assignment brief and tutorials/Assignment Brief and Example presentation/~$C519 DMBI Assignment Brief.doc

Assignment brief and tutorials/Assignment Brief and Example presentation/5CC519 DMBI Assignment Brief.doc

University of Derby

School of Computing and Mathematics

In-course Assignment Specification

Module Code and Title: 5CC519 Data management and Business Intelligence

Assignment No. and Title: A SAS-based information analysis and presentation system

Assessment Tutor: John Panneerselvam

Weighting Towards Module Grade: 100%

Date Set: 25/01/2018

Hand-In Deadline Date:

Various:

Design Justification: 29 March 2018, 11.59pm.

Demonstration: 4 May 2018

Penalty for Late Submission

Assignments are submitted electronically. The electronic submission area will continue to accept work after the due date. The date and time of submission will be recorded to identify late submissions. Please visit the regulations pages (http://www.derby.ac.uk/regs) for more information.

Between the opening of the assignment submission and the deadline date, in most cases, you can submit and overwrite with a resubmission repeatedly. It's important you try to submit as early as you can to take advantage of the plagiarism checking function prior to your final submission. This will also mean that you've submitted something for your lecturer to mark if circumstances beyond your control mean that you can't complete your final submission. For more information on this please see our exceptional extenuating circumstances (http://www.derby.ac.uk/EEC) page.

See http://www.derby.ac.uk/esub for more information.

Level of Collaboration:

None

Learning Outcomes covered in this Assignment:

1. Identify and critically evaluate the key requirements for effective data management, analysis and presentation to meet users’ needs

2. Develop a small management information reporting system that meets the users’ needs using a prototyping / Agile approach

Criteria for Assessment:

See Portfolio Specification

The Task

Overall, you will specify, design and build a small management information analysis and presentation system to analyse the NHS Transparency RTT (Referral to Treatment Time) datasets from the following location https://www.england.nhs.uk/statistics/statistical-work-areas/rtt-waiting-times/rtt-data-2016-17/ and use the XLS spreadsheets for the Admitted Adjusted Commissioner statistics.

The first part of the Portfolio, due on Thursday 29th March 2018 at 11:59 pm in Turnitin, is a justified specification of the Management Information System (MIS).

The second part of the portfolio, which will be assessed on Friday 4th May 2018, will be a demonstration of your SAS based MIS. The demonstration schedule will be published by Week 10.

The relative weightings of the two components is:-

Design Justification 30%

SAS based MIS 70%

Part 1 - Design and Justification - 30%

This will be a short business style report of approximately 1000 words which provides an outline of the MIS and its design. In order to prepare yourself to design the system and write the report, you will need to carry out at the following steps.

You are required to use the template provided for the Outline Design Specification. Please do not remove the section marking criteria. They provide you with a checklist of what you need to achieve for each section.

Step 1

Understand the structure and meaning of the data from the RTT Monthly spreadsheets; you will be able to do this using both Excel and SAS. From this, identify and justify the key stakeholders who have an interest in the data presented in the spreadsheets.

Step 2

Choose one of the stakeholders from the above analysis and identify and fully justify no more than three key questions that can be answered by a suitable selection and presentation of data from up to five of the RTT monthly sets of data.

Step 3

Identify the relevant data and the appropriate selection, summarisation and presentation necessary to allow the stakeholder to pose their question and then gain a suitably presented answer. You may need to consider performance issues in designing a reasonably efficient means of storing your data (this could be done using the Log statistics from the SAS 9.4 system when you experiment with different approaches).

Step 4

Present and justify simple mock-ups of the relevant data entry and presentation interfaces. User interface can be developed using MENU, SELECT, FRAME, PROJECT etc. Select and deploy the appropriate user interface using SAS.

Step 5

Identify and justify how you will prove that the MIS system that will be developed based on this specification and provide suitable test schedule specifications and test conditions.

Report Content

The key aspects that must be addressed in the specification report are:

1) Identification of the Stakeholder chosen for this MIS and justification of up to three types of questions

2) Identification and justification for the chosen Data Architecture to support the selection and presentation of the data necessary to provide answers to the questions

3) Identification and justification for the chosen Systems Logic to support the selection and presentation of the data necessary to provide answers to the questions

4) Identification and justification for the chosen User Interface to support the selection and presentation of the data necessary to provide answers to the questions

5) Identification and justification of the Test Data and Schedule to ensure that the system meets the Stakeholders’ needs

Assessment Criteria for the Design Report

Each of the assessment section will have a weighting of 20%.

Section

100%-70%

60%-69%

59%-50%

49%-40%

39%-35%

34%-0%

Stakeholders

Identifies stakeholder and 3 Key Questions - comprehensive justification

Identifies stakeholder and 3 Key Questions and good justification

Identifies stakeholder and 2 Key Questions and some justification

Identifies stakeholder and 1 Key Question - basic justification

Identifies either stakeholder or some questions

Little of merit

Data Architecture

Justifies best option for RTT data based on SAS Log metrics for best performance

Justifies SAS structure based on 3rd Normal concepts, i.e. storage efficiency

Basic justification of the SAS storage structure in relation to questions

Defines single repository in SAS

Defines RTT sources as XL

Little of merit

Systems Logic

Clearly justified specification of logic, using appropriate techniques such as UML or Flow-charts

Clearly justified specification of key logic

Clear specification of logic

Basic selection and processing criteria

Identification of simple criteria

Little of merit

User Interface

Fully justified definition of all aspects of the user interface including “criteria” input screens and output reports and graphics

Justified definition of most aspects of the user interface including output reports and graphics

Definition of some aspects of the user interface including output reports and graphics

Definition of a few aspects of the user interface including output graphics

Identifies types of graphs that might answer a user’s question

Little of merit

Test Requirements

Comprehensively defined and justified test schedules and test data

Justifies test schedules and test data

Defines test schedules and test data

Defines testing process

Identifies need for testing and suggests how it might be done

Little of merit

Part 2 - Build MIS in SAS - 70%

Part 2.1 (50%):

Using the Outline Specification from part one of the portfolio, you will develop a small MIS application in SAS that implements the intent of your specification.

Part 2.2 (20%):

Each week you have been given a set of exercises to test your understanding of the material covered that week. You are to demonstrate a selection of the exercises that you have done for marking which will be randomly selected from exercises 1-10. Questions will be asked to test your understanding. The assessment will be based on your understanding and the knowledge of SAS programming.

Part 2.1 and 2.2 will be demonstrated together to your tutor during the last week of the semester. The demonstration schedule will be available on Udo Blackboard.

Assessment Criteria for part 2.1

Out of the 50% in part 2.1, coding and use of SAS will have 30% and data will have 20% weighting.

Percentage Band

Coding and use of SAS

Data

90%-100%

Relevant SAS Source code pre-written and stored in SAS system. Code modules chosen and run from a menu system. Data selection menu driven with complex options. Graphic data presentation, good choice of output styles which are menu driven.

Uses pre-imported data in a saved master file using the PCT detail data and SHA ID data. Optimises SAS storage structures to the types of questions provided.

Optimises storage space by use of surrogate keys to natural key tables.

Optimises use of permanent and work file structures.

80%-89%

Relevant SAS Source code pre-written and stored in SAS system. Code modules chosen and run from a menu system. Data selection menu driven with complex options. Graphic data presentation, good choice of output styles.

Uses pre-imported data in a saved master file using the PCT detail data and SHA ID data. Optimises SAS storage structures to the types of questions provided.

Optimises storage space by use of surrogate keys to natural key tables.

70%-79%

Relevant SAS Source code pre-written and stored in SAS system. Code modules chosen and run from a menu system. Data selection code pre-written with complex options. Graphic data presentation, good choice of output styles.

Uses pre-imported data in a saved master file using the PCT detail data and SHA ID data. Optimises storage space by use of surrogate keys to natural key tables

60%-69%

Relevant SAS Source code pre-written and stored in SAS system. Code modules chosen and run manually. Data selection code pre-written but with complex options. Graphic data presentation, good choice of output styles.

Uses pre-imported data in a saved master file using the PCT detail data and SHA ID data.

50%-59%

Relevant SAS Source code pre-written and stored in SAS system. Code modules chosen and run manually. Data selection code pre-written. Simple Graphic data presentation, some choice of output styles.

Uses pre-imported data in a saved master file using the PCT detail data and SHA ID data for several months

40%-49%

Relevant SAS Source code pre-written and copy / pasted into SAS system. Data selection code pre-written. Simple Graphic data presentation, no choice of output styles.

Uses pre-imported data in a saved master file using the PCT detail data for several months

35%-39%

Relevant SAS Source code pre-written and copy / pasted into SAS system. Data selection code written into programme editor window.

Uses data for a single month imported at run time from XLS file using the PCT detail data

0%-34%

Little of merit

Little of merit

6

Assignment brief and tutorials/Assignment Brief and Example presentation/Example.pptx

Historic Mortality Data

By Antony Paul, Adam Langley, Rashidi Ramazani & Ghanem Nasser

Aims and Objectives

Analyse the data

Present the data (SAS/ODS Graphics)

Trends in the data

Identifying key stakeholders

Introduction

Importing the data

Results of the data

Creating tables

Potential stakeholders

Key stakeholders

NHS (Patient records, research, CQC)

Councils (Records, burial records, council tax)

Lawyers/Solicitors (Referrals, wills, attorneys)

Insurance companies (Premium & rates, trends)

Funeral directors/providers (Resource management, location)

Care Quality Commission

National Audit Office (Census)

World Health Organisation

How to import a dataset *Using codes ()

libname home 'U:\Historic Mortality data\Historic Mortality data';

run;

proc import out=home

datafile='U:\Historic Mortality data\Historic Mortality data\PDH7484.csv'

dbms=csv replace;

run;

proc print;

run;

Result

How to Import data without coding

Go file - select import data

How to import data without using Code

-Select a data source

-click next

How to import data without using Code

Select Browse.

Select the file you want to import

How to import data without using Code

-Choose your library.

-Add member name.

-Click next / then click finish.

Merging datasets

Result

Rename Variables

Proc Means Statement

Year from 1963 - 1984

Proc Tabulate Statement

Creating group chart

Create a small dataset

2. PROC CHART

3. RESULT

Comparing gender mortality for ages 10 to 14 years from 1901 to 1910 England and Wales.

FACT: There was a

decrease in death rates by over 1,000

Assignment brief and tutorials/week 1/Exercises(1).pdf

Exercises

Level 1

1. Create a Report in an ODS Destination

a. Open the starter file r101e01. The program is not syntactically correct, so do not

submit it until after you make changes. If you have not already done so, submit the

libname program that is in your working directory in order to ensure that the class

data files are available to you.

Starter Program

** Close the LISTING destination;

ods <destination> file='filename.ext' style=<style>;

proc tabulate data=OrionRPT.customer_orders f=comma8. ;

title 'Quantity Ordered by Country and Type';

title2 'Low Volume Orders';

class customer_country order_type;

var quantity;

table customer_country,

order_type*quantity*sum;

keylabel sum = ' ';

run;

ods <destination> close;

** Reopen the LISTING destination;

b. Use the demonstration program as a model and modify the program as follows:

1) Replace the <destination> placeholder with the ODS destination that

creates a

Printer Control Language result file.

2) Replace the <style> placeholder with the PRINTER style for PCL output.

3) Use a valid name for the FILE= option. The correct file extension for a Printer

Control Language file is .PCL. Name the file r101s01.pcl and store the

output file in your working directory.

4) Close the LISTING destination at the top of the program.

5) Reopen the LISTING destination at the bottom of the program.

6) Submit the program.

c. Double-click the icon in the SAS Results window. Is there an application available to

render

the PCL file? You only need to answer this question Yes or No. (If an application is

available, it opens automatically. If an application is not available, you may see a

warning message or be asked to choose a program to open the file.) The purpose of

this task is to illustrate the behavior that occurs with a less frequently used ODS

destination.)

d. Modify the program code to create an HTML file that uses the EGDEFAULT style.

Name

the output file Exercise1d.HTML. Submit the program.

e. When the job completes, double-click the HTML icon in the SAS Results window to

launch a browser window.

f. Open the HTML file with Excel. Does the HTML output look the same in Excel as it

looks in the browser?

Level 2

2. Creating Two Different HTML Files and a PDF File

a. Open and modify the r101e02 starter program to use ODS HTML, ODS CHTML,

and ODS PDF statements to create three different output files in one job step. If you

have not already done so, submit the libname program that is found in your

working directory to ensure that the class data files are available to you. Code

STYLE=SASWEB for both of the HTML destinations. You can use the default style

for the PDF output file.

b. Open each of the result files. These are the desired results:

HTML Desired Results

CHTML Desired Results:

 Even though you code STYLE=SASWEB for the ODS CHTML destination,

CHTML (or Compact HTML) does not use a full SAS style template

definition. This is one of the differences between CHTML and HTML. For

more information about CHTML, consult the ODS MARKUP

documentation.

PDF Desired Results

Level 3

3. Suppressing ODS Results

This Level 3 exercise shows you a way to completely suppress the icons in the Results

window.

a. In an interactive SAS session, icons for your ODS result files are automatically

tracked in the Results window. As each file is created, it is automatically opened for

viewing. You can also manually open your results by clicking on the output file icon

in the SAS Results window. The ODS NORESULTS statement changes this default

behavior. Look up the ODS RESULTS (or ODS NORESULTS) statement in the

documentation.

b. What is the reason given in the documentation for using the ODS NORESULTS

statement?

c. Use the ODS NORESULTS statement in the starter program, r101e03. What is the

effect of running your code with this statement?

d. How can you find and open your output files when ODS NORESULTS is in effect?

Assignment brief and tutorials/week 1/Introduction to SAS Programming.pptx

Data Management and Business Intelligence 5CC519

www.derby.ac.uk/engtech

Introduction to SAS Programming

Dr John Panneerselvam

College of Engineering and Technology, University of Derby

Contents

Discussion of Data Analysis

Base SAS

Introduction to SAS Programming

Executing the SAS Program

Summary

www.derby.ac.uk/engtech

Discussion of Data Analysis

Base SAS

Introduction to SAS Programming

Executing the SAS Program

Summary

www.derby.ac.uk/engtech

www.derby.ac.uk/engtech

Data Mining

Machine Learning

KDD – Knowledge Discovery in Databases

www.derby.ac.uk/engtech

Data Pre-processing

Data Mining

Post-processing

Raw Data

Information

Discussion of Data Analysis

Base SAS

Introduction to SAS Programming

Running the SAS Program

Summary

www.derby.ac.uk/engtech

SAS programming

SAS is an integrated system of software solutions

that enables you to perform the following tasks:

Data entry, retrieval, and management

Report writing and graphics design

Statistical and mathematical analysis

Business forecasting and decision support

Operations research and project management

Applications development

www.derby.ac.uk/engtech

Base SAS Base SAS is a highly flexible and integrated software environment that can be used in virtually any setting to access, manipulate, manage, store, analyze, and report on data.

www.derby.ac.uk/engtech

Base SAS provides the following:

A graphical user interface for administering SAS tasks

A highly flexible and extensible programming language

A rich library of prewritten, ready-to-use SAS procedures

The flexibility to run on all major operating environments such as Windows, UNIX, and z/OS (OS/390)

The access to virtually any data source such as DB2, Oracle, SYBASE, Teradata, SAP, and Microsoft Excel the support for most widely used character encodings for globalization

www.derby.ac.uk/engtech

Discussion of Data Analysis

Base SAS

Introduction to SAS Programming

Executing the SAS Program

Discussions

www.derby.ac.uk/engtech

Introduction to SAS Programming

SAS environment

Overview of SAS program

Basic steps in writing a SAS program

www.derby.ac.uk/engtech

SAS Windowing Environment

12

In the SAS windowing environment, windows are used to edit and execute programming statements, display the log, output, and Help facility, and more.

www.derby.ac.uk/engtech

Three Primary Windows

13

In the SAS windowing environment, you submit and view the results of a SAS program using three primary windows.

contains the SAS program to submit.
contains information about the processing of the SAS program, including any warning and error messages.
contains reports generated by the SAS program.

www.derby.ac.uk/engtech

Editor Windows

Enhanced Editor Program Editor
Only available in the Windows operating environment Available in all operating environments
Default editor for Windows operating environment Default editor for all operating environments except Windows
Multiple instances of the editor can be open at one time Only one instance of the editor can be open at one time
Code does not disappear after it is submitted Code disappears after it is submitted
Incorporates color-coding as you type Incorporates color-coding after you press ENTER

14

www.derby.ac.uk/engtech

Editor Windows

Enhanced Editor

Program Editor

www.derby.ac.uk/engtech

Log Window

Partial SAS Log

16

33 data work.NewSalesEmps;

34 length First_Name $ 12 Last_Name $ 18

35 Job_Title $ 25;

36 infile 'newemps.csv' dlm=',';

37 input First_Name $ Last_Name $

38 Job_Title $ Salary;

39 run;

NOTE: The infile 'newemps.csv' is:

File Name=S:\Workshop\newemps.csv,

RECFM=V,LRECL=256

NOTE: 71 records were read from the infile 'newemps.csv'.

The minimum record length was 28.

The maximum record length was 47.

NOTE: The data set WORK.NEWSALESEMPS has 71 observations and 4 variables.

40

proc print data=work.NewSalesEmps;

NOTE: Writing HTML Body file: sashtml.htm

42 run;

NOTE: There were 71 observations read from the data set WORK.NEWSALESEMPS.

www.derby.ac.uk/engtech

Results Viewer Window

In SAS®9.3 output is displayed using HTML by default.

www.derby.ac.uk/engtech

Output Window

18

Obs First_Name Last_Name Job_Title Salary

1 Satyakam Denny Sales Rep. II 26780

2 Monica Kletschkus Sales Rep. IV 30890

3 Kevin Lyon Sales Rep. I 26955

4 Petrea Soltau Sales Rep. II 27440

5 Marina Iyengar Sales Rep. III 29715

6 Shani Duckett Sales Rep. I 25795

7 Fang Wilson Sales Rep. II 26810

8 Michael Minas Sales Rep. I 26970

9 Amanda Liebman Sales Rep. II 27465

10 Vincent Eastley Sales Rep. III 29695

11 Viney Barbis Sales Rep. III 30265

12 Skev Rusli Sales Rep. II 26580

13 Narelle James Sales Rep. III 29990

14 Gerry Snellings Sales Rep. I 26445

15 Leonid Karavdic Sales Rep. II 27860

www.derby.ac.uk/engtech

Sas

SAS Program

19

DATA steps are typically used to create SAS data sets.

PROC steps are typically used to process SAS data sets (that is, generate reports and graphs, manage data, and sort data).

A SAS program is a sequence of steps that the user submits for execution.

Raw Data

DATA Step

Report

SAS Data Set

PROC Step

SAS Data Set

www.derby.ac.uk/engtech

Program statements

20

SAS steps begin with either of the following:

a DATA statement

a PROC statement

SAS detects the end of a step when it encounters one of the following:

a RUN statement (for most steps)

a QUIT statement (for some procedures)

the beginning of another step (DATA statement or PROC statement)

www.derby.ac.uk/engtech

This DATA step creates a temporary SAS data set named Work.NewSalesEmps by reading four fields from a raw data file.

The DATA step

21

data work.NewSalesEmps;

length First_Name $ 12

Last_Name $ 18 Job_Title $ 25;

infile 'newemps.csv' dlm=',';

input First_Name $ Last_Name $

Job_Title $ Salary;

run;

proc print data=work.NewSalesEmps;

run;

www.derby.ac.uk/engtech

This PROC PRINT step creates a listing report of the Work.NewSalesEmps data set.

The PROC step

22

data work.NewSalesEmps;

length First_Name $ 12

Last_Name $ 18 Job_Title $ 25;

infile 'newemps.csv' dlm=',';

input First_Name $ Last_Name $

Job_Title $ Salary;

run;

proc print data=work.NewSalesEmps;

run;

www.derby.ac.uk/engtech

The PROC step

This PROC MEANS step creates a summary report of the Work.NewSalesEmps data set with statistics for the variable Salary for each value of Job_Title.

23

data work.NewSalesEmps;

length First_Name $ 12

Last_Name $ 18 Job_Title $ 25;

infile 'newemps.csv' dlm=',';

input First_Name $ Last_Name $

Job_Title $ Salary;

run;

proc print data=work.NewSalesEmps;

run;

proc means data=work.NewSalesEmps;

class Job_Title;

var Salary;

run;

www.derby.ac.uk/engtech

Variables and Observations

www.derby.ac.uk/engtech

Obs First_Name Last_Name Job_Title Salary

1 Satyakam Denny Sales Rep. II 26780

2 Monica Kletschkus Sales Rep. IV 30890

3 Kevin Lyon Sales Rep. I 26955

4 Petrea Soltau Sales Rep. II 27440

5 Marina Iyengar Sales Rep. III 29715

6 Shani Duckett Sales Rep. I 25795

7 Fang Wilson Sales Rep. II 26810

8 Michael Minas Sales Rep. I 26970

9 Amanda Liebman Sales Rep. II 27465

10 Vincent Eastley Sales Rep. III 29695

11 Viney Barbis Sales Rep. III 30265

12 Skev Rusli Sales Rep. II 26580

13 Narelle James Sales Rep. III 29990

14 Gerry Snellings Sales Rep. I 26445

15 Leonid Karavdic Sales Rep. II 27860

Step Boundaries

SAS detects the end of the DATA step when it encounters the RUN statement.

SAS detects the end of the PROC PRINT step when it encounters the beginning of the PROC MEANS step.

25

data work.NewSalesEmps;

length First_Name $ 12

Last_Name $ 18 Job_Title $ 25;

infile 'newemps.csv' dlm=',';

input First_Name $ Last_Name $

Job_Title $ Salary;

run;

proc print data=work.NewSalesEmps;

proc means data=work.NewSalesEmps;

class Job_Title;

var Salary;

www.derby.ac.uk/engtech

Discussion of Data Analysis

Base SAS

Introduction to SAS Programming

Executing the SAS Program

Summary

www.derby.ac.uk/engtech

Program Syntax

The syntax in the program is colour coded to

highlight the following;

Step boundaries.

Keywords

Variables

Data set Names

www.derby.ac.uk/engtech

SAS program execution

Include/write the SAS program in the editor window.

To open a SAS program into your SAS session, select File  Open or click and then select the file that you want to include. To open a program, your Program Editor must be active.

You can also issue the INCLUDE command to open (include) a SAS program into your SAS session.

With the Program Editor active, on the command bar type include and the name of the file containing the program.

Press ENTER.

The program is included in the Program Editor window.

www.derby.ac.uk/engtech

Program Execution (Cont..)

Submit the program.

To submit the program for execution, issue the SUBMIT command, click , or select Run  Submit. The output from the program is displayed in the Output window.

Log window shows any warnings/errors of the program in the editor window.

The reports will be generated in the output window.

To clear the contents of the window, issue the CLEAR command, select Edit  Clear All, or click .

www.derby.ac.uk/engtech

SAS Libraries

Work – temporary and default library

Create your own library and link it to the working folder

SAS datasets will be stored in the physical folder in reference to your library

Naming a SAS dataset

Derby.Name

www.derby.ac.uk/engtech

libref

Dataset

Name

The SAS help facility

31

www.derby.ac.uk/engtech

References

Tan, Steinbach, Kumar, Introduction to Data Mining, Pearson, 2014, ISBN: 10: 1-292-02615-4

SAS Programming 1: Essentials- SAS Institute.

SAS: The Power to Know- Official documentation.

www.derby.ac.uk/engtech

Assignment brief and tutorials/week 1/Lecture 1 - Introduction to ODS.pptx

Data Management and Business Analytics 5CC519

www.derby.ac.uk/engtech

Lecture 1 – Introduction to Output Delivery System

College of Engineering and Technology, University of Derby

Assignment brief and tutorials/week 1/Module Introduction_DBBI.pptx

Data Management and Business Intelligence 5CC519 Course Introduction

www.derby.ac.uk/engtech

www.derby.ac.uk/engtech

About the lecturer

College of Engineering and Technology

Dr. John Panneerselvam

Email: [email protected])

Lives in MS310 STEM Building, Markeaton Street

Telephone: 01332 592404

This module runs every Friday 2 pm to 6 pm, in MS215, Markeaton street

www.derby.ac.uk/engtech

Module Outcomes

Identify and critically evaluate the key requirements for effective data management, analysis and presentation to meet users’ needs

Develop a small management information reporting system that meets the users’ needs using a prototyping / Agile approach.

Proactive

And reactive

3

www.derby.ac.uk/engtech

Module Structure

Scheduled learning and teaching activities: 28%

Guided independent study: 76%

 

Lab based Workshops 24 hours

Lab based tutorials 24 hours

Guided Independent Study 152 hours

Total: 200 hours

Proactive

And reactive

4

www.derby.ac.uk/engtech

What we will cover

The principles of data analysis and normalisation as the foundation of the “store once and once only” principle for data integrity

The principles of data administration and security management

The requirements for optimisation of data management for the conflicting needs of real-time and MIS / EIS and BI purposes

The impact of User-centred analysis and participatory development on the development data structures and stores and systems that meet the needs of the organisation

The role, application and benefits of prototyping in the rapid development of systems, using RAD and Agile approaches

Techniques and practices for effective data analysis and presentation

The impact of cloud computing and storage on the management of and access to organisational data.

Together with a range of current, leading edge topics and technologies that affect data management and usage.

Proactive

And reactive

5

www.derby.ac.uk/engtech

Assessment

100% coursework - Portfolio

The module will be assessed by a single portfolio consisting of an individual application project supported by a short, business report which academically justifies the design decisions that underpin the finished application system, which will be built in the SAS environment.

Proactive

And reactive

6

www.derby.ac.uk/engtech

Assessment

You will be given opportunities, through both formative and

summative assessments, to reflect on your learning to

support professional and personal development

– Formative Assessment - the bidirectional process between

teacher and student to enhance, recognize and respond to the

learning

– Summative Assessment - summarizes the development of

learners at a particular time

Proactive

And reactive

7

www.derby.ac.uk/engtech

Formative Assessment

– In-class group presentation

– A Zip file containing your answers and codes of all your

exercises. (Each week you have been given a set of exercises

to do that tested your understanding of the material covered

that week.)

• You will not get extra marks by the formative assessment

• You WILL lose marks (individually) by NOT completing the

assessment

Proactive

And reactive

8

www.derby.ac.uk/engtech

Summative Assessment

CW: 100%

The assignment represents the coursework element of the module.

Assessment will be by means of a portfolio of learning artefacts demonstrating the achievement of all learning outcomes

The assignment should be carried out independently. That is, no collaboration is allowed.

An electronic copy should be handed-in via Udo.

Proactive

And reactive

9

www.derby.ac.uk/engtech

10

Assignment brief and tutorials/week 2/Lecture 2_Summary Reports with TABULATE Procedure.pptx

Data Management and Business Intelligence 5CC519

www.derby.ac.uk/engtech

Lecture 2 – Getting started with Proc TABULATE

Dr John Panneerselvam

College of Engineering and Technology, University of Derby

Chapter 2: Getting Started with the TABULATE Procedure

2.1 Introduction to the TABULATE Procedure
2.2 Creating the Structure of the Table
2.3 Adding Overall Summarizations
2.4 Adding Additional Statistics

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

2

Summary Reports

r102d01

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

3

TABULATE Procedure

The TABULATE procedure displays descriptive statistics in tabular format.

4

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

4

Dimensional Tables

The structure of the tables produced with the TABULATE procedure is one-, two-, or three-dimensional.

 One variable or multiple variables can be in a dimension.

5

Page dimension Row dimension Column dimension
One-dimensional
Two-dimensional
Three-dimensional

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

5

One-Dimensional Table

A one-dimensional table contains a column dimension.

Country is in the column dimension.

Gender is in the column dimension.

6

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

6

Two-Dimensional Table

A two-dimensional table contains a row and a column dimension.

7

Gender is in the row dimension and Country is in the column dimension.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

7

Three-Dimensional Table

A three-dimensional table contains a page, a row, and a column dimension.

Job_Title is in the page dimension, Gender is in the row dimension, and Country is in the column dimension.

8

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

8

2.01 Quiz

Specify the variables that appear in each dimension.

Page dimension Row dimension Column dimension

Is this table

one-dimensional,

two-dimensional, or three-dimensional?

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

9

2.01 Quiz – Correct Answer

Specify the variables that appear in each dimension.

Page dimension Row dimension Column dimension
Gender Job_Title Country

Is this table

one-dimensional,

two-dimensional, or three-dimensional?

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

10

Concatenating Variables

Concatenating variables places the output for one variable immediately next to the output for another variable in the same dimension.

In the following examples, Gender and Country are concatenated in either the

column or row dimension.

11

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

11

Crossing Variables

Crossing variables places the value of one variable within the value of another variable.

In the following examples, Country is crossed within Gender in either the column or row dimension.

12

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

12

Classification and Analysis Variables

Every variable specified in a dimension must be declared as a classification variable or an analysis variable.

13

Classification Variable Specifies which variable(s) are used to group the data. Can be character or numeric. N, the number of nonmissing values, is the default statistic.
Analysis Variable Identifies the variable(s) for which statistics are calculated. Must be numeric. Tends to be continuous. SUM is the default statistic.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

13

Classification and Analysis Variables

For the following data:

Job_Title, Gender, and Country are reasonable classification variables.

Salary and Bonus are likely analysis variables.

14

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

14

Chapter 2: Getting Started with the TABULATE Procedure

2.1 Introduction to the TABULATE Procedure
2.2 Creating the Structure of the Table
2.3 Adding Overall Summarizations
2.4 Adding Additional Statistics

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

15

Objectives

Use the TABLE statement with commas to create one-, two-, and three-dimensional tables.

Use the CLASS statement to define classification variables.

Concatenate variables by using the blank in the TABLE statement.

Cross variables by using the asterisk in the TABLE statement.

Use the VAR statement to define analysis variables.

16

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

16

TABULATE Procedure

General form of the TABULATE procedure:

The TABLE statement describes the table structure.

Every variable that is specified in the TABLE statement must be defined as a classification variable (CLASS statement) or an analysis variable (VAR statement).

17

PROC TABULATE DATA=SAS­-data-­set <options>;

CLASS classification­-variable(s);

VAR analysis-variable(s);

TABLE <page­-expression>,

<row­-expression>,

column­-expression </ option(s)>;

RUN;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

17

TABLE Statement

In the TABLE statement, the comma is used to separate the dimension expressions.

A one-dimensional table has a column expression only (no commas).

A two-dimensional table has a row expression and a column expression (one comma).

A three-dimensional table has a page, a row, and a column expression (two commas).

18

TABLE , , ;

row- expression

page- expression

column- expression

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

18

TABLE Statement

Examples:

19

TABLE , , ;

table Gender , Country;

table Country;

table Job_Title , Gender , Country;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

19

One-Dimensional Table

A one-dimensional table contains a column dimension.

20

proc tabulate data=orion.sales;

class Country;

table Country;

run;

proc tabulate data=orion.sales;

class Gender;

table Gender;

run;

r102d02

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

20

One-Dimensional Table

Multiple TABLE statements can be in one step.

21

proc tabulate data=orion.sales;

class Country Gender;

table Country;

table Gender;

run;

 N is the default statistic when there is only a CLASS statement and no VAR statement.

r102d02

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

21

Two-Dimensional Table

A two-dimensional table contains a row and a column dimension.

22

proc tabulate data=orion.sales;

class Country Gender;

table Gender, Country;

run;

r102d02

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

22

Three-Dimensional Table

A three-dimensional table contains a page, a row, and a column dimension.

23

r102d02

proc tabulate data=orion.sales;

class Country Gender Job_Title;

table Job_Title, Gender, Country;

run;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

23

Concatenating Variables

Concatenating variables is accomplished by specifying a blank between variables in one dimension.

24

proc tabulate data=orion.sales;

class Country Gender Job_Title;

table Job_Title, Gender Country;

run;

r102d02

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

24

Crossing Variables

Crossing variables is accomplished by specifying an asterisk (*) between variables in one dimension.

25

proc tabulate data=orion.sales;

class Country Gender Job_Title;

table Gender*Country, Job_Title;

run;

r102d02

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

25

Analysis Variable

SUM is the default statistic for an analysis variable that is used in the TABLE statement.

26

proc tabulate data=orion.sales;

class Country Gender Job_Title;

var Salary;

table Job_Title, Country, Gender*Salary;

run;

 Analysis variables are defined in the VAR statement.

r102d02

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

26

Analysis Variable

If you forget to define a variable as an analysis variable, you get an error.

 The same type of error exists if you forget to define a variable as a classification variable.

27

880 proc tabulate data=orion.sales;

881 where Job_Title contains 'Rep';

882 class Country Gender Job_Title;

883 table Job_Title, Country, Gender*Salary;

884 run;

ERROR: The type of name (Salary) is unknown.

NOTE: The SAS System stopped processing this step because of

errors.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

27

Chapter 2: Getting Started with the TABULATE Procedure

2.1 Introduction to the TABULATE Procedure
2.2 Creating the Structure of the Table
2.3 Adding Overall Summarizations
2.4 Adding Additional Statistics

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

28

Objectives

Use the keyword ALL to create overall summarizations.

Include the keyword ALL with concatenated and crossed variables.

Use parentheses with the keyword ALL to summarize within a parenthetical group.

Cross an analysis variable with the keyword ALL.

29

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

29

Business Scenario

Orion Star Country managers want overall summarizations added to the tabular reports about their sales employees.

r102d03

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

30

ALL Keyword

The keyword ALL can be included in the TABLE statement to summarize all of the categories for class variables in the given dimension.

31

proc tabulate data=orion.sales;

class Country Gender;

table Gender All, Country All;

run;

r102d03

 The keyword ALL does not belong in the CLASS or VAR statement.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

31

ALL Keyword

The keyword ALL can be specified in any of the three dimensions.

32

table Job_Title All,

Gender All,

Country All;

r102d03

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

32

Concatenated Variables

The keyword ALL can be included with concatenated variables.

33

proc tabulate data=orion.sales;

class Country Gender Job_Title;

table Job_Title, Gender All Country All;

run;

r102d03

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

33

Crossed Variables

The keyword ALL can be included with crossed variables.

34

proc tabulate data=orion.sales;

class Country Gender Job_Title;

table Gender*Country All, Job_Title;

run;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

34

Parentheses

Parentheses can be used with the keyword ALL to summarize all of the categories for class variables in the same parenthetical group.

35

proc tabulate data=orion.sales;

class Country Gender Job_Title;

table Gender*(Country All) All,

Job_Title;

run;

r102d03

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

35

Analysis Variables

If the keyword ALL is specified in the dimension that contains an analysis variable, ALL must be crossed with the variable to get a statistic such as SUM.

36

ALL*SALARY

ALL

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

36

Analysis Variables

37

proc tabulate data=orion.sales;

class Country Gender Job_Title;

var Salary;

table Job_Title*Gender All,

Country*Salary All*Salary;

run;

r102d03

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

37

Chapter 2: Getting Started with the TABULATE Procedure

2.1 Introduction to the TABULATE Procedure
2.2 Creating the Structure of the Table
2.3 Adding Overall Summarizations
2.4 Adding Additional Statistics

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

38

Objectives

Specify the default statistics in the TABLE statement.

Add additional statistics to the TABLE statement.

Add percentages to the TABLE statement.

39

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

39

Business Scenario

Orion Star Country managers want additional statistics added to the tabular reports about their sales employees.

r102d04

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

40

Default Statistics

N is the default statistic when there is only a CLASS statement and no VAR statement.

41

proc tabulate data=orion.sales;

class Country Gender;

table Gender All, Country All;

run;

r102d04

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

41

Default Statistics

SUM is the default statistic when there is a VAR statement with or without a CLASS statement.

42

proc tabulate data=orion.sales;

class Country Gender Job_Title;

var Salary;

table Job_Title*Gender All,

Country*Salary All*Salary;

run;

r102d04

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

42

Default Statistics

The default statistics can be specified in the TABLE statement. The asterisk (*) is used to associate a statistic keyword with a variable or the keyword ALL.

43

table Gender All,

Country*N All*N;

table Job_Title*Gender All,

Country*Salary*Sum All*Salary*Sum;

r102d04

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

43

Statistic Keywords

The following descriptive statistics and quantile statistics can be specified in the TABLE statement:

44

CSS CV KURTOSIS LCLM MAX
MEAN MIN MODE N NMISS
RANGE SKEWNESS STDDEV STDERR SUM
SUMWGT UCLM USS VAR
PCTN PCTSUM REPPCTN REPPCTSUM PAGEPCTN
PAGEPCTSUM ROWPCTN ROWPCTSUM COLPCTN COLPCTSUM
MEDIAN P1 P5 P10 P25
P75 P90 P95 P99 QRANGE

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Keywords and Formulas page 1666

44

Statistic Keywords

45

proc tabulate data=orion.sales;

class Country Job_Title;

var Salary;

table Job_Title All,

Country*N

Salary*Mean Salary*Max;

run;

r102d04

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

45

Statistic Keywords

The statistic keywords can be specified in the row dimension instead of the column dimension.

46

proc tabulate data=orion.sales;

class Country Job_Title;

var Salary;

table Country*N

Salary*Mean Salary*Max,

Job_Title All;

run;

r102d04

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

46

Statistic Keywords

The statistic keywords can be specified in only one dimension.

47

1269 proc tabulate data=orion.sales;

1270 class Country Job_Title;

1271 var Salary;

1272 table Country*N

1273 Salary*Mean Salary*Max,

1274 Job_Title*N All;

1275 run;

ERROR: There are multiple statistics associated with a single

table cell in the following nesting : Country * N *

Job_Title * N.

ERROR: There are multiple statistics associated with a single

table cell in the following nesting : Salary * Mean *

Job_Title * N.

ERROR: There are multiple statistics associated with a single

table cell in the following nesting : Salary * Max *

Job_Title * N.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

47

Parentheses

Multiple statistics can be specified in parentheses.

48

table Country All,

Gender*Salary*Min

Gender*Salary*Max;

table Country All,

Gender*Salary*(Min Max);

r102d04

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

48

Percentages

The TABULATE procedure can calculate many different percentages.

49

PCTN PCTSUM REPPCTN REPPCTSUM PAGEPCTN
PAGEPCTSUM ROWPCTN ROWPCTSUM COLPCTN COLPCTSUM

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

49

r102d06

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

50

Which statement is true?

A one-dimensional table contains a row dimension.

A two-dimensional table contains a row and a column dimension.

A three-dimensional table contains a row, a column, and a diagonal dimension.

A four-dimensional table contains a row, a column, a diagonal, and a page dimension.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

51

Which statement is false?

Concatenating variables place the output for one variable immediately next to the output for another variable.

Crossing variables place the value of one variable within the value of another variable.

Classification variables are used to group the data and must be character.

Analysis variables are variables for which statistics are calculated and must be numeric.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

52

Which statement accurately represents the correct order of the dimensions in the TABLE statement?

table column, row, page;

table column, page, row;

table page, column, row;

table page, row, column;

table row, column, page;

table row, page, column;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

53

Which of the following statements is false concerning the operators in a TABLE statement?

An asterisk is used to cross variables.

A forward slash is used to divide values.

A blank is used to concatenate variables.

A comma is used to separate the dimensions.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

54

Which step produces the following table?

x

x

proc tabulate data=orion.sales;

class Gender;

table Gender, Country*N;

run;

proc tabulate data=orion.sales;

class Gender Country;

table Gender, Country;

run;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

55

Which statement is false?

The keyword ALL can be crossed with an analysis variable.

The keyword ALL must be declared in the CLASS statement.

The keyword ALL can be specified in any of the three dimensions.

The keyword ALL can be included in the TABLE statement to summarize all of the categories in the given dimension.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

56

Which TABLE statement produces the following report?

table Gender, Country All;

table Gender, Country All*Salary;

table Gender, Country*Salary All;

table Gender, Country*Salary All*Salary;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

57

SUM is the default statistic when an analysis variable is used in the TABLE statement with or without a classification variable.

 True

 False

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

58

Which table is produced from the following step?

.

.

proc tabulate data=orion.sales;

class Gender;

var Salary;

table Gender, Salary*(Min Max);

run;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

59

Which percentage statistic is in the following table?

COLPCTN

PCTN

REPPCTN

ROWPCTN

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

60

Assignment brief and tutorials/week 2/Tuorials - Getting started with PROC TABULATE.docx

Exercisesc:\Program Files\PowerServ\CourseGraphics\exer_arrow.jpg

Level 1

1. Creating Two-Dimensional Tables

a. Retrieve the program r102e01.

r102e01

ods html file='r102e01.html' style=sasweb;

proc tabulate data=orders f=comma8. ;

title 'Requesting Statistics';

class order_type;

** Table 1;

table row-dimension, column-dimension;

** Table 2;

table row-dimension, column-dimension;

run;

ods html close;

b. The program has two TABLE statements. Change the first TABLE statement to produce this output table.

c. Change the second TABLE statement to produce this output table.

The two tables have different headers in the column dimension.

Level 2

2. Nesting Variables in Two Dimensions

a. The starter program, r102e02, contains a PROC SORT step for subsetting the data and contains your ODS statements.

b. Create a PROC TABULATE step that produces the following output.

1) Nest Total_Retail_Price underneath Order_Type in the column dimension. The default statistic of SUM is what you want to see in the data cells, so you do not need to explicitly use the SUM statistic in the TABLE statement.

2) Nest Customer_Age_Group within Customer_Gender in the row dimension.

3) Use the ALL class variable in both dimensions to produce a grand total row and a grand total column. Again, use the default statistic of SUM, so you do not need to explicitly code the SUM statistic in the TABLE statement.

Desired Results:

Level 3

3. Investigating Elementary Statistics

This Level 3 exercise enables you to explore the list of elementary statistics and discover which ones cannot be requested with PROC TABULATE.

a. What are some of the elementary statistics that you cannot request with PROC TABULATE?

Hint: Search in the documentation on the string “elementary statistics TABULATE”. The documentation contains an information table that compares TABULATE statistics to other procedures' statistics.

Assignment brief and tutorials/week 3/Lecture 3_More with TABULATE Procedure.pptx

Data Management and Business Intelligence 5CC519

www.derby.ac.uk/engtech

Lecture 3 – Doing More with Proc TABULATE

Dr John Panneerselvam

College of Engineering and Technology, University of Derby

Recap

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

2

TABULATE Procedure

General form of the TABULATE procedure:

The TABLE statement describes the table structure.

Every variable that is specified in the TABLE statement must be defined as a classification variable (CLASS statement) or an analysis variable (VAR statement).

3

PROC TABULATE DATA=SAS­-data-­set <options>;

CLASS classification­-variable(s);

VAR analysis-variable(s);

TABLE <page­-expression>,

<row­-expression>,

column­-expression </ option(s)>;

RUN;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

3

TABLE Statement

In the TABLE statement, the comma is used to separate the dimension expressions.

A one-dimensional table has a column expression only (no commas).

A two-dimensional table has a row expression and a column expression (one comma).

A three-dimensional table has a page, a row, and a column expression (two commas).

4

TABLE , , ;

row- expression

page- expression

column- expression

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

4

TABLE Statement

Examples:

5

TABLE , , ;

table Gender , Country;

table Country;

table Job_Title , Gender , Country;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

5

One-Dimensional Table

Multiple TABLE statements can be in one step.

6

proc tabulate data=orion.sales;

class Country Gender;

table Country;

table Gender;

run;

 N is the default statistic when there is only a CLASS statement and no VAR statement.

r102d02

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

6

Two-Dimensional Table

A two-dimensional table contains a row and a column dimension.

7

proc tabulate data=orion.sales;

class Country Gender;

table Gender, Country;

run;

r102d02

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

7

Three-Dimensional Table

A three-dimensional table contains a page, a row, and a column dimension.

8

r102d02

proc tabulate data=orion.sales;

class Country Gender Job_Title;

table Job_Title, Gender, Country;

run;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

8

Chapter 3: Doing More with the TABULATE Procedure

3.1 Enhancing with Labels and Formats
3.2 Enhancing with Options
3.3 Adding Trafficlighting
3.4 Adding Additional Style Attributes

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

9

Chapter 3: Doing More with the TABULATE Procedure

3.1 Enhancing with Labels and Formats
3.2 Enhancing with Options
3.3 Adding Trafficlighting
3.4 Adding Additional Style Attributes

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

10

Objectives

Use the LABEL statement to add descriptive headings to variables and the KEYLABEL statement to add descriptive headings to keywords.

Add labels to the TABLE statement to add descriptive headings to variables and keywords.

Use the FORMAT statement to format classification variables and the FORMAT= option in the PROC TABULATE statement to specify a format for the statistical cells.

Add formats to the TABLE statement to format specific statistical cells.

Define the order of precedence for formats.

Use a PICTURE format to format percentages.

11

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Business Scenario

Orion Star Country managers want descriptive headings and formatted values added to the tabular reports about their sales employees.

r103d01

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

LABEL Statement

The LABEL statement can be added to the TABULATE procedure to assign descriptive labels to classification and analysis variables.

13

label Gender='Employee Gender'

Country='Location'

Salary='Annual Salary';

r103d01

LABEL variable-1 = 'description-1 '

<variable-n = 'description-n '>;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

KEYLABEL Statement

The KEYLABEL statement, a statement specific to the TABULATE procedure, can be added to assign descriptive labels to statistic keywords and the ALL keyword.

14

r103d01

keylabel Mean='Average'

All='Total';

KEYLABEL keyword-1 = 'description-1 '

<keyword-n = 'description-n '>;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

KEYLABEL Statement

If the KEYLABEL statement specifies a blank value for a statistic, the cell is removed from the table.

15

r103d01

keylabel Mean='Average'

All='Total';

keylabel Mean=' '

All='Total';

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Labels in the TABLE Statement

If the cell containing a classification or analysis variable is not desired, a blank label for the cell must be specified in the TABLE statement.

16

table Gender=' ' All,

Country*Salary=' '*Mean

All*Salary=' '*Mean;

table Gender All,

Country*Salary*Mean

All*Salary*Mean;

variable='label '

r103d01

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Labels in the TABLE Statement

A label can also be specified in the TABLE statement to change the heading of a keyword.

17

table Gender=' ' All='Row Total',

Country*Salary=' '*Mean

All='Column Total'*Salary=' '*Mean;

keyword='label '

r103d01

Recommendation: Only put a label in the TABLE statement if the label cannot be accomplished in the LABEL or KEYLABEL statement.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

FORMAT Statement

The FORMAT statement can be added to the TABULATE procedure to assign a format to a classification variable.

18

r103d02

FORMAT variable-1 <variable-n> format

<variable-2 <variable-n> format>;

format Gender $genfmt.

Country $ctryfmt.;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

FORMAT Statement

A format is particularly useful for grouping values into fewer categories.

19

format Hire_Date year4.;

format Hire_Date date9.;

r103d02

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Default Format

By default, the cells containing the N value have no decimal places (12.0) and the rest of the statistical cells have two decimal places (12.2).

20

r103d02

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

In the LISTING destination, all statistical cells use the format of 12.2.

20

FORMAT= Option

The FORMAT= option can be added to the PROC TABULATE statement to specify a format for the statistical cells.

21

proc tabulate data=orion.sales

format=comma12.;

FORMAT=format

r103d02

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Formats in the TABLE Statement

A format modifier can be added to the TABLE statement by using the asterisk (*) to associate a format modifier with an analysis variable or keyword.

22

table Gender All,

Country*Salary*(N Mean)

All*Salary*(N Mean*f=dollar12.2);

F=format

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Format Precedence

The following is the order for formats associated with the statistical cells.

23

Default (12.0/12.2)

FORMAT= in PROC

Format in Page

Format in Row

Format in Column

By default, the format specified in the column dimension is used over any other format.

If no format is specified in any of the dimensions, the format specified with the FORMAT= option in the PROC statement is used over the default format.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

FORMAT_PRECEDENCE= Option

The order of precedence can be changed by using the FORMAT_PRECEDENCE= option in the TABLE statement. By default, the FORMAT_PRECEDENCE= option is set to COLUMN.

24

table Gender All*f=dollar12.,

Country*Salary*(N Mean*f=dollar12.2)

/ format_precedence=row;

r103d02

FORMAT_PRECEDENCE=

PAGE | ROW | COLUMN

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

The FORMAT Procedure

The FORMAT procedure is used to create user-defined formats.

General form of the FORMAT procedure with the VALUE statement:

25

PROC FORMAT;

VALUE format-name range1 = 'label ' range2 = 'label ' . . . ;

RUN;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

25

The FORMAT Procedure

A format-name

names the format that you are creating

cannot be more than 32 characters in SAS®9

for character values, must have a dollar sign ($) as the first character, and a letter or underscore as the second character

for numeric values, must have a letter or underscore as the first character

cannot end in a number

cannot be the name of a SAS format

does not end with a period in the VALUE statement.

26

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

26

Character User-Defined Format

The OTHER keyword matches all values that do not match any other value or range.

27

proc format;

value $ctryfmt 'AU' = 'Australia'

'US' = 'United States'

other = 'Miscoded';

run;

character format name

keyword

labels

p112d07

discrete character values

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

27

Character User-Defined Format

28

proc format;

value $ctryfmt 'AU' = 'Australia'

'US' = 'United States'

other = 'Miscoded';

run;

proc print data=orion.sales label;

var Employee_ID Job_Title Salary

Country Birth_Date Hire_Date;

label Employee_ID='Sales ID'

Job_Title='Job Title'

Salary='Annual Salary'

Birth_Date='Date of Birth'

Hire_Date='Date of Hire';

format Salary dollar10.0

Birth_Date Hire_Date monyy7.

Country $ctryfmt.;

run;

Part 2

Part 1

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

28

Notice the two parts. First we create and second we use. No period when you create but a period when you use.

Character User-Defined Format

Partial PROC PRINT Output

29

Annual Date of Date of

Obs Sales ID Job Title Salary Country Birth Hire

60 120178 Sales Rep. II $26,165 Australia NOV1954 APR1974

61 120179 Sales Rep. III $28,510 Australia MAR1974 JAN2004

62 120180 Sales Rep. II $26,970 Australia JUN1954 DEC1978

63 120198 Sales Rep. III $28,025 Australia JAN1988 DEC2006

64 120261 Chief Sales Officer $243,190 United States FEB1969 AUG1987

65 121018 Sales Rep. II $27,560 United States JAN1944 JAN1974

66 121019 Sales Rep. IV $31,320 United States JUN1986 JUN2004

67 121020 Sales Rep. IV $31,750 United States FEB1984 MAY2002

68 121021 Sales Rep. IV $32,985 United States DEC1974 MAR1994

69 121022 Sales Rep. IV $32,210 United States OCT1979 FEB2002

70 121023 Sales Rep. I $26,010 United States MAR1964 MAY1989

71 121024 Sales Rep. II $26,600 United States SEP1984 MAY2004

72 121025 Sales Rep. II $28,295 United States OCT1949 SEP1975

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

29

Numeric User-Defined Format

30

proc format;

value tiers 20000-49999 = 'Tier 1'

50000-99999 = 'Tier 2'

100000-250000 = 'Tier 3';

run;

p112d07

numeric ranges

labels

numeric format name

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

30

We just looked at a character format example, so now we will look at a numeric example…

Numeric User-Defined Format

31

proc format;

value tiers low-<50000 = 'Tier 1'

50000- 100000 = 'Tier 2'

100000<-high = 'Tier 3';

run;

proc print data=orion.sales label;

var Employee_ID Job_Title Salary

Country Birth_Date Hire_Date;

label Employee_ID='Sales ID'

Job_Title='Job Title'

Salary='Annual Salary'

Birth_Date='Date of Birth'

Hire_Date='Date of Hire';

format Birth_Date Hire_Date monyy7.

Salary tiers.;

run;

Part 2

Part 1

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

31

Again, notice the two parts. First we create and second we use. No period when you create but a period when you use.

Numeric User-Defined Format

Partial PROC PRINT Output

32

Annual Date of Date of

Obs Sales ID Job Title Salary Country Birth Hire

60 120178 Sales Rep. II Tier 1 AU NOV1954 APR1974

61 120179 Sales Rep. III Tier 1 AU MAR1974 JAN2004

62 120180 Sales Rep. II Tier 1 AU JUN1954 DEC1978

63 120198 Sales Rep. III Tier 1 AU JAN1988 DEC2006

64 120261 Chief Sales Officer Tier 3 US FEB1969 AUG1987

65 121018 Sales Rep. II Tier 1 US JAN1944 JAN1974

66 121019 Sales Rep. IV Tier 1 US JUN1986 JUN2004

67 121020 Sales Rep. IV Tier 1 US FEB1984 MAY2002

68 121021 Sales Rep. IV Tier 1 US DEC1974 MAR1994

69 121022 Sales Rep. IV Tier 1 US OCT1979 FEB2002

70 121023 Sales Rep. I Tier 1 US MAR1964 MAY1989

71 121024 Sales Rep. II Tier 1 US SEP1984 MAY2004

72 121025 Sales Rep. II Tier 1 US OCT1949 SEP1975

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

32

Percent Format

A picture format can be created in order to add a percent sign to a percentage value.

The PICTURE statement creates a template for printing numbers.

The ROUND option rounds the value before formatting.

33

proc format;

picture pct(round) low-high='009.99%';

run;

PROC FORMAT;

PICTURE name <(format-option(s))> value-or-range-1 <..., value-or-range-n>='picture'

value-or-range-n <..., value-or-range-n>='picture';

RUN;

continued...

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Percent Format

The above step produces the following formatted values based on the stored values:

34

proc format;

picture pct(round) low-high='009.99%';

run;

Stored value .5 1.234 25 99.999
Formatted value 0.50% 1.23% 25.00% 100.00%

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Percent Format

After the picture format is created, the format can be used in the TABULATE procedure.

35

proc format;

picture pct(round) low-high='009.99%';

run;

proc tabulate data=orion.sales;

class Gender Job_Title;

table Gender All,

Job_Title*(N Pctn*f=pct.)

All*(N Pctn*f=pct.);

run;

r103d02

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Chapter 3: Doing More with the TABULATE Procedure

3.1 Enhancing with Labels and Formats
3.2 Enhancing with Options
3.3 Adding Trafficlighting
3.4 Adding Additional Style Attributes

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

36

Objectives

Use the BOX= option to place text in the box area.

Use the MISSTEXT= option to specify a text string for missing analysis values.

Use the MISSING= option in the OPTIONS statement to specify the character to be printed for missing numeric values.

Use the MISSING option to include observations with missing classification values.

Use the PRINTMISS option to create row and column headings that are the same for all pages of a table.

37

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Business Scenario

In addition to reports on sales employees, Orion Star Country managers want tabular reports about their warehouse employees.

r103d03

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Box Area

By default, the box area in the top left area of the table is empty in a two- or three-dimensional table.

39

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

BOX= Option

The BOX= option places text in the box area. This option is specified in the TABLE statement after a slash.

40

table Job_Title=' ', Gender*N=' '

/ box='Warehouse Title';

label Job_Title='Job Title';

table Job_Title=' ', Gender*N=' '

/ box=Job_Title;

continued...

BOX = 'string'

BOX=variable

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

BOX= Option

The _PAGE_ value places the page dimension text in the box area.

41

r103d03

table Country, Job_Title,

Gender*N=' '

/ box=_page_;

BOX=_PAGE_

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Missing Analysis Values

By default, missing analysis values are displayed with a period.

42

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

MISSTEXT= Option

The MISSTEXT= option specifies a text string for missing analysis values. This option is specified in the TABLE statement after a slash.

43

r103d03

table Job_Title=' ', Gender*N=' '

/ box='Job Title'

misstext='None';

MISSTEXT='string'

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

MISSING= Option

An alternative to the MISSTEXT= option is the MISSING= option in the OPTIONS statement.

The MISSING= option specifies the character to be printed for missing numeric values but is limited to one character.

44

r103d03

options missing=0;

proc tabulate data=orion.warehouse;

class Gender Job_Title;

table Job_Title=' ', Gender*N=' '

/ box='Job Title';

run;

options missing=.;

MISSING=character

resets back to the default

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Missing Classification Values

By default, the TABULATE procedure does not include observations with a missing value for a classification variable.

45

Observations 4 and 5 are excluded from the TABULATE output.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

MISSING Option

The MISSING option includes observations with missing classification values. If the option is placed in the PROC statement, all missing classification values are included.

46

r103d03

proc tabulate data=orion.warehouse missing;

class Gender Job_Title;

table Job_Title=' ', Gender*N=' '

/ box='Job Title';

run;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

MISSING Option

The MISSING option can be placed in the CLASS statement to control which missing classification values are included.

47

proc tabulate data=orion.warehouse;

class Gender / missing;

class Job_Title;

table Job_Title=' ', Gender*N=' '

/ box='Job Title';

run;

r103d03

The MISSING option is specified in the CLASS statement after a slash.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Suppressed Rows or Columns

By default, the TABULATE procedure suppresses a row or column for which there is no data.

48

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

PRINTMISS Option

The PRINTMISS option creates row and column headings that are the same for all pages of a table. This option is specified in the TABLE statement after a slash.

49

r103d03

table Country, Job_Title, Gender*N=' '

/ box=_page_ printmiss;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

References

SAS Programming 1: Essentials- SAS Institute.

SAS: The Power to Know- Official documentation.

www.derby.ac.uk/engtech

www.derby.ac.uk/engtech

r103d08

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

52

Which statement is false?

The KEYWORD statement can be added to the TABULATE procedure to assign descriptive labels to keywords.

The LABEL statement can be added to the TABULATE procedure to assign descriptive labels to variables.

If a KEYLABEL statement specifies a blank value for a statistic, the cell is removed from the table.

A label can be specified in the TABLE statement to change the heading of a variable or keyword.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

53

Which table is produced from the following step?

. b. .

proc tabulate data=orion.sales;

class Country Gender;

table Country, Gender*N=' ';

keylabel N='Count';

run;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

54

Which statement is true concerning formats?

By default, the format specified in the row dimension is used over any other format.

A FORMAT= option can be added to the PROC statement after a forward slash to specify a format for the statistical cells.

A FORMAT_PRECEDENCE= option can be added to the PROC statement to change the order of precedence for formats.

A format modifier can be added to the TABLE statement by using the asterisk (*) to associate a format modifier to an analysis variable or keyword.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

55

Which statement correctly places the page dimension text in the box area?

table Gender, Country, Job_Title / box=page;

table Gender, Country, Job_Title / box=_page_;

table Gender, Country, Job_Title / box=Gender;

table Gender, Country, Job_Title / box=_Gender_;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

56

Which statement is false concerning missing classification values?

By default, the TABULATE procedure excludes observations with a missing classification value.

A MISSING option can be placed in the PROC statement to include all missing classification values.

A MISSING option can be placed in the VAR statement to control which missing classification values are included.

A MISSING option can be placed in the CLASS statement to control which missing classification values are included.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

57

Which syntax is correct for the PRINTMISS option?

proc tabulate data=orion.warehouse printmiss;

proc tabulate data=orion.warehouse / printmiss;

class Country Gender Job_Title / printmiss;

table Country, Job_Title, Gender / printmiss;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

58

Which TABLE statement produces the following report?

table Country, Gender*style={backgroundcolor=pink};

table Country, Gender*{style={backgroundcolor=pink}};

table Country, Gender / style={backgroundcolor=pink};

table Country, Gender / {style={backgroundcolor=pink}};

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

59

Which table is based on the following format?

b.

proc format;

value ncell . = 'beige';

run;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

60

Which statement cannot include the STYLE= option?

PROC

CLASSLEV

VAR

TABLE

LABEL

KEYWORD

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

61

Which of the following is not a style attribute that is specified with the STYLE= option?

BORDERWIDTH

COLOR

FONTFAMILY

FORMAT

TEXTALIGN

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

62

Assignment brief and tutorials/week 3/Tutorial - Doing More with Proc TABULATE.docx

Exercisesc:\Program Files\PowerServ\CourseGraphics\exer_arrow.jpg

Level 1

1. Changing PROC TABULATE Defaults

a. Retrieve the starter program r102e04. Change the statistic KEYLABEL statement, the variable LABEL statement, the BOX= option, and the format in the program to produce these results:

Desired Results

Level 2

2. Changing Labels in the TABLE Statement

a. Starter program r102e05 has a PROC SORT step for subsetting and ODS HTML statements for your output. Write a PROC TABULATE step that has the following characteristics:

1) Nest Order_Type within Customer_Gender in the row dimension.

2) Show a summary row for the Order_Type values for each gender. Label this row Total.

3) Show a summary grand total row at the bottom of the report. Label this row Grand Total.

4) Use Order_Date crossed with the sum of Total_Retail_Price in the column dimension, formatted with the YEAR4. format. Label Order_Date as Order Year. This report uses the default SUM statistic, so you do not have to explicitly put the SUM statistic in your TABLE statement.

5) Show a summary column at the right of the report table that is the summary of Total_Retail_Price for every row. Label this column as Yearly Summary.

6) Put the string Total Retail Price in the table box area.

Desired Results

Hint: To approach this solution, you might think of working from the picture and building a TABLE statement that uses all the default headers, formats, and labels. Then, once you have the right table structure, refine your program with formats and labels.

Level 3

3. Investigate TABULATE Documentation

This Level 3 exercise investigates the restrictions on the TABLE statement.

a. There are three restrictions on a TABLE statement in PROC TABULATE. What are they?

Hint: Look in the documentation for the string “Restrictions on a TABLE Statement”.

Assignment brief and tutorials/week 4/Exercises 4.docx

Exercisesc:\Program Files\PowerServ\CourseGraphics\exer_arrow.jpg

Level 1

1. Using STYLE= Overrides with PROC TABULATE

a. Retrieve the program r103e04. Make the following changes to the program:

1) Change the style of the headers for Order_Type and Customer_Country to have a background color of blue and a foreground color of white.

2) Change the class levels of Order_Type to have a background color of white and a foreground color of cx339966 (light teal green).

3) Change the class levels of Customer_Country by using the user-defined format $CTRY. For the background color (a form of trafficlighting) and a foreground color of black.

4) Change the FONT_FACE attribute for Quantity to be Courier New.

The font name contains a space, so it must be quoted in the style override.

5) Change ALL in the row dimension (in the TABLE statement) to have a background color of pink and a font_weight of bold.

6) Change the box style to have a background color of cx339966 (light teal green).

7) Change the KEYWORD statement for ALL to have a background color of pink, a foreground color of black, and a font_weight of bold.

Program r103e04

proc sort data=orion.customer_orders

out=work.orders;

where customer_country in ('DE', 'IT', 'FR', 'US');

by customer_country order_date order_type;

run;

proc format;

value $ctry 'DE' = 'cx66cc66'

'IT' = 'cxffff99'

'FR' = 'cxcc99ff'

'US' = 'cx663399';

run;

ods html file='r103e04_style.html' style=sasweb;

proc tabulate data=orders f=comma8.;

title 'Chapter 3 Exercise 4';

class customer_country order_type / style={ };

classlev order_type / s={ };

classlev customer_country / style={ };

var quantity / s={ };

table order_type*customer_country all*{s={ }},

Quantity='Quantity'*(sum='Total' max='Max')

/ box={label="Style Changes"

style={ }}

style={rules=all cellspacing=0 bordercolor=black frame=box};

keyword all / s={ };

run;

ods html close;

Desired HTML Results

Level 2

2. Using the POSTHTML Style Attribute

a. Retrieve and submit the r103e05 starter program. Review the SAS log. What error message do you get when you specify an invalid style attribute?

b. Modify the program to use this STYLE= override attribute and value in the TABLE statement:

style={posthtml='<h2>Confidential</h2>'}

c. Submit the program again and review the results. What destination shows the text string Confidential?

d. If you add a FOOTNOTE statement to the program, does the footnote text appear above or below the text placed by the POSTHTML attribute (when you rerun the program)?

footnote 'Footnote for Solution';

Level 3

3. Investigating Additional STYLE= Capability

This Level 3 exercise enables you to discover a usage of STYLE= that is unique to PROC TABULATE.

a. What does the syntax STYLE=<parent> accomplish when used in a PROC TABULATE CLASSLEV statement?

Not all style attributes are supported in all destinations. To determine which style elements and attributes are supported in the different ODS destinations, consult the ODS documentation. Style element and attribute detailed information can be found in the PROC TEMPLATE section of the ODS documentation.

Assignment brief and tutorials/week 4/Lecture 4.pptx

Data Management and Business Intelligence 5CC519

www.derby.ac.uk/engtech

Lecture 4 – Doing more with Proc TABULATE II

Dr John Panneerselvam

College of Engineering and Technology, University of Derby

Doing More with the TABULATE Procedure

Enhancing with Labels and Formats
Enhancing with Options
Adding Trafficlighting
Adding Additional Style Attributes

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

2

Objectives

Use the STYLE= option in the TABLE statement to change the background color of the data cells.

Reference a user-defined format in the STYLE= option to perform trafficlighting.

3

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Business Scenario

Orion Star Country managers want tabular reports with trafficlighting about their non-sales employees.

r103d04

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

STYLE= Option

The STYLE= option can be added to the TABLE statement by using the asterisk (*) to associate a style attribute with an analysis variable or keyword.

5

table Job_Title=' ',

Gender*Salary*Sum*f=dollar12.*

{style={backgroundcolor=LightBlue}};

r103d04

The STYLE= option in the dimension expression is impacting the data cells.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

STYLE= Option

When the STYLE= option is used in a dimension expression, the option must be enclosed within braces ({ and }) or square brackets ([ and ]).

 S= is an alias for the STYLE= option.

6

{ STYLE = { attribute-name = attribute value

< … attribute-name = attribute value> } }

table Job_Title=' ',

Gender*Salary*Sum*f=dollar12.*

{style={backgroundcolor=LightBlue}};

r103d04

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

BACKGROUNDCOLOR= Style Attribute

The BACKGROUNDCOLOR= style attribute specifies the color of the cell background.

 BACKGROUND= is an alias for the BACKGROUNDCOLOR= style attribute.

7

{ STYLE = { BACKGROUNDCOLOR = color } }

table Job_Title=' ',

Gender*Salary*Sum*f=dollar12.*

{style={backgroundcolor=LightBlue}};

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Color-Naming Schemes

Some of the color-naming schemes are as follows:

8

Color-Naming Scheme Examples
SAS Registry Colors AntiqueWhite, Beige, Chocolate, Fuchsia, Gold, LightBlue, PaleGreen, Pink, and Tan
SAS Color Naming System (CNS) VeryLightPurplishBlue, DarkGrayishGreen, and VIGY (vivid greenish yellow)
Red Green Blue (RGB) cxFF0000 (red), cx00FFFF (cyan), cxFF0080 (pink), cxFFFF00 (yellow), and cxFFFFFF (white)
Hue Lightness Saturation (HLS) Light Vivid Green, Moderate Blue, Light Yellow, H0F080FF (green), and HH000BB00 (light gray)
Gray Scale GRAYFF (white), GRAYC0 (light gray), GRAY40 (dark gray), and GRAY00 (black)

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

8

Color-Naming Schemes

Additional documentation for colors can be found in SAS/GRAPH® Reference.

9

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

http://support.sas.com/documentation/cdl/en/graphref/65389/HTML/default/viewer.htm#p0ekhb3mdqahk3n15wzt55qtctr7.htm

9

Trafficlighting

The STYLE= option can be added to the TABULATE procedure in order to incorporate trafficlighting.

Trafficlighting is the act of highlighting individual cells based on the cell’s value.

10

r103d04

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

User-Defined Formats

A color can be hardcoded in the STYLE=option.

Or the STYLE= option can refer to a user-defined format that specifies the color value based on a numeric value.

11

table Job_Title=' ',

Gender*Salary*Sum*f=dollar12.*

{style={backgroundcolor=LightBlue}};

proc format;

value traffic 50000-100000='LightBlue';

run;

table Job_Title=' ',

Gender*Salary*Sum*f=dollar12.*

{style={backgroundcolor=traffic.}};

r103d04

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

3.08 Quiz

Which output corresponds to the following code?

a. b.

proc format;

value traffic 50000-100000='LightBlue';

run;

table Job_Title=' ',

Gender*Salary*Sum*f=dollar12.*

{style={backgroundcolor=traffic.}};

r103d04

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

12

3.08 Quiz – Correct Answer

Which output corresponds to the following code?

a. b.

proc format;

value traffic 50000-100000='LightBlue';

run;

table Job_Title=' ',

Gender*Salary*Sum*f=dollar12.*

{style={backgroundcolor=traffic.}};

r103d04

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

13

Traffic Lighting Program

14

proc format;

value ncell 1 = 'Gold';

value meancell 0-30000 = 'LightBlue'

60000-high = 'Pink'

. = 'Wheat';

run;

proc tabulate data=orion.marketing;

class Gender Job_Title;

var Salary;

table Job_Title=' ',

Gender*Salary*

(N*{style={background=ncell.}}

Mean*f=dollar12.*

{style={background=meancell.}})

/ box=Job_Title;

label Job_Title='Job Title';

format Gender $genfmt.;

run;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Trafficlighting Output

The program from the previous slide creates the following tabular report:

15

A count of 1 gets a gold background.

A mean value between 0 and 30,000 gets a light blue background.

A mean value 60,000 or higher gets a pink background.

A missing mean value gets a wheat background.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Chapter 3: Doing More with the TABULATE Procedure

3.1 Enhancing with Labels and Formats
3.2 Enhancing with Options
3.3 Adding Trafficlighting
3.4 Adding Additional Style Attributes

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

16

Objectives

Use the STYLE= option in various statements in the TABULATE procedure to specify the BACKGROUNDCOLOR= style attribute.

Use additional style attributes with the STYLE= option.

17

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Business Scenario

Orion Star Country managers want a tabular employee report with additional style enhancements.

r103d05

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

STYLE= Option

The STYLE= option can be added to the TABLE statement in a dimension expression to specify the style attributes for the data cells.

19

r103d05

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

19

STYLE= Option

In addition, the STYLE= option can be added to other syntax to impact the style attributes of other cells.

20

KEYWORD

PROC or TABLE

BOX

CLASS

CLASSLEV

KEYWORD

VAR

TABLE

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

20

STYLE= Option

The following is the syntax needed if the STYLE= option is in a dimension expression or a BOX= option in the TABLE statement:

The following is the syntax needed if the STYLE= option is used outside of a dimension expression or BOX= option:

 The STYLE= option must be placed after a forward slash when used in any statement except the PROC statement or used in a table dimension.

21

{ STYLE = { attribute-name = attribute value

< … attribute-name = attribute value> } }

STYLE = { attribute-name = attribute value

< … attribute-name = attribute value> }

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

PROC Statement

The STYLE= option in the PROC TABULATE statement affects all of the data cells.

22

r103d06

proc tabulate data=orion.marketing

style={backgroundcolor=grayEE};

The BACKGROUNDCOLOR= style attribute specifies the color of the background.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

CLASS and VAR Statements

The STYLE= option in the CLASS and VAR statements impacts the classification and analysis variable headings.

23

r103d06

class Gender Job_Title /

style={backgroundcolor=green};

var Salary /

style={backgroundcolor=green};

Multiple CLASS and VAR statements are allowed.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

CLASSLEV Statement

The CLASSLEV statement specifies style attributes for the CLASS variable values.

24

classlev Gender Job_Title /

style={backgroundcolor=cxFFFF99};

r103d06

Multiple CLASSLEV statements are allowed.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

KEYWORD Statement

The KEYWORD statement specifies style attributes for keyword (statistics and ALL) headings.

25

keyword Sum Mean /

style={backgroundcolor=cxFFFF99};

keyword All /

style={backgroundcolor=green};

r103d06

Multiple KEYWORD statements are allowed.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

TABLE Statement

The STYLE= option in the TABLE statement impacts dimension data cells, the BOX cell, or the table structure.

26

table Gender*Job_Title

All*{style={backgroundcolor=green}},

Salary*(Sum Mean)*f=dollar12.

/ box={style={backgroundcolor=grayEE}}

style={backgroundcolor=black

borderspacing=2};

r103d06

The BORDERSPACING= style attribute specifies the thickness of space between cells in a table.

black interior lines (table structure)

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

STYLE= Option Placement

The STYLE= option can be placed in various statements.

27

Impacts Placement
Data cells (all the same) PROC TABULATE statement
Data cells (varying) TABLE statement in a dimension
Box cells TABLE statement in BOX= option
Overall table structure TABLE statement
Class variable headings CLASS statement
Class level value headings CLASSLEV statement
Analysis variable headings VAR statement
Keyword headings KEYWORD statement

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

3.10 Quiz

Which statement produces the orange cells?

a.

b.

classlev Job_Title

/ style={backgroundcolor=orange};

class Job_Title

/ style={backgroundcolor=orange};

r103d06

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

28

3.10 Quiz – Correct Answer

Which statement produces the orange cells?

a.

b.

classlev Job_Title

/ style={backgroundcolor=orange};

class Job_Title

/ style={backgroundcolor=orange};

r103d06

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

29

Style Attributes

The following is a partial list of style attributes:

30

Style Attribute Task Example Values
BACKGROUNDCOLOR= BACKGROUND= Specify the color of the background salmon, vlig, gray00, cxFFFF99, h000FF00
COLOR= FOREGROUND= Specify the color of the foreground (text) salmon, vlig, gray00, cxFFFF99, h000FF00
URL= Specify a link 'rep1.html‘ 'C:\temp\rep1.html'
PREIMAGE= POSTIMAGE= Specify an image 'Orion.jpg‘ 'C:\temp\Orion.jpg'
HEIGHT= CELLHEIGHT= Specify the height 1in, 4cm, 8mm, 12%
WIDTH= CELLWIDTH= Specify the width 1in, 4cm, 8mm, 12%

continued...

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Style Attribute Task Example Values
FONTFAMILY= FONT_FACE= Specify the font 'Arial', 'Courier New', 'Times New Roman'
FONTSIZE= FONT_SIZE= Specify the size of the font 12pt (printer’s point), 0.2cm, 3 (relative size)
FONTSTYLE= FONT_STYLE= Specify the style of the font ITALIC, ROMAN, SLANT
FONTWEIGHT= FONT_WEIGHT= Specify the weight of the font LIGHT, MEDIUM, BOLD, EXTRA_BOLD
TEXTALIGN= JUST= Specify the horizontal justification of text CENTER (C), DEC (D), LEFT (L), RIGHT (R)
VERTICALALIGN= VJUST= Specify the vertical justification of text BOTTOM (B), MIDDLE (M), TOP (T)

Style Attributes

31

continued...

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Style Attribute Task Example Values
BORDERCOLOR= Specify the color of the border salmon, vlig, gray00, cxFFFF99, h000FF00
BORDERWIDTH= Specify the width of the border 10, 0.1in, 0.5cm, 1mm
BORDERSPACING= CELLSPACING= Specify the thickness of space between cells in a table 0, 0.1in, 0.5cm, 1mm
RULES= Specify the rules type (interior lines) to use on a table ALL, COLS. GROUPS, NONE, ROWS
FRAME= Specify the frame type (exterior lines) to use on a table ABOVE, BELOW, BOX, HSIDES, LHS, RHS, VOID, VSIDES

Style Attributes

32

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Style Attributes

Additional documentation for style attributes can be found in Output Delivery System User’s Guide.

33

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Initial Report

The TABULATE procedure is used to create the following initial report:

34

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Desired Report

STYLE= options can be added to the TABULATE procedure in order to create the following desired report:

35

r103d07

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

Using STYLE= Options

This demonstration illustrates specifying style attributes with the STYLE= option in the TABULATE procedure.

r103d07

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

36

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

37

r103d08

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

38

Which statement is false?

The KEYWORD statement can be added to the TABULATE procedure to assign descriptive labels to keywords.

The LABEL statement can be added to the TABULATE procedure to assign descriptive labels to variables.

If a KEYLABEL statement specifies a blank value for a statistic, the cell is removed from the table.

A label can be specified in the TABLE statement to change the heading of a variable or keyword.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

39

Which table is produced from the following step?

. b. .

proc tabulate data=orion.sales;

class Country Gender;

table Country, Gender*N=' ';

keylabel N='Count';

run;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

40

Which statement is true concerning formats?

By default, the format specified in the row dimension is used over any other format.

A FORMAT= option can be added to the PROC statement after a forward slash to specify a format for the statistical cells.

A FORMAT_PRECEDENCE= option can be added to the PROC statement to change the order of precedence for formats.

A format modifier can be added to the TABLE statement by using the asterisk (*) to associate a format modifier to an analysis variable or keyword.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

41

Which statement correctly places the page dimension text in the box area?

table Gender, Country, Job_Title / box=page;

table Gender, Country, Job_Title / box=_page_;

table Gender, Country, Job_Title / box=Gender;

table Gender, Country, Job_Title / box=_Gender_;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

42

Which statement is false concerning missing classification values?

By default, the TABULATE procedure excludes observations with a missing classification value.

A MISSING option can be placed in the PROC statement to include all missing classification values.

A MISSING option can be placed in the VAR statement to control which missing classification values are included.

A MISSING option can be placed in the CLASS statement to control which missing classification values are included.

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

43

Which syntax is correct for the PRINTMISS option?

proc tabulate data=orion.warehouse printmiss;

proc tabulate data=orion.warehouse / printmiss;

class Country Gender Job_Title / printmiss;

table Country, Job_Title, Gender / printmiss;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

44

Which TABLE statement produces the following report?

table Country, Gender*style={backgroundcolor=pink};

table Country, Gender*{style={backgroundcolor=pink}};

table Country, Gender / style={backgroundcolor=pink};

table Country, Gender / {style={backgroundcolor=pink}};

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

45

Which table is based on the following format?

b.

proc format;

value ncell . = 'beige';

run;

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

46

Which statement cannot include the STYLE= option?

PROC

CLASSLEV

VAR

TABLE

LABEL

KEYWORD

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

47

Which of the following is not a style attribute that is specified with the STYLE= option?

BORDERWIDTH

COLOR

FONTFAMILY

FORMAT

TEXTALIGN

‹#›

Copyright © 2012, SAS Institute Inc. All rights reserved.

48

Assignment brief and tutorials/week 4/UML tutorial questions.docx

UML Tutorial Questions:

Read the UML Tutorial available on Udo Blackboard and answer the following questions:

1. What is UML?

2. What are the advantages of using UML?

3. Explain the types of diagrams in UML.

4. What are the three types of modelling in UML?

5. Explain the types of behavioural diagrams in UML.

6. What are Building blocks of UML.

7. What are Relationships in UML? 

8. What are the purposes of the class diagram?

9. What are the purposes of the Statechart diagram?

10. What is the difference between Activity and Sequence diagrams?