The following problems (with associated data sets) are designed to test your ability to determine the proper multivariate statistical method(s) to apply in order to answer the research question(s) of interest.
For each problem, submit a document that describes:
(1) the research question(s) of interest
(2) the method of analysis and why it is appropriate
(3) the assumptions that underlie the method
(4) the statistical tests to be conducted
(5) a discussion of results that will answer the research question
Assume you are writing the "methods" section of a research paper to be submitted to a professional journal.
Each of the 6 data sets described employs at least one of the following statistical methods:
(1) Analysis of Variance
(2) Analysis of Covariance
(3) Multivariate Analysis of Variance
(4) Multivariate Analysis of Covariance
(5) Discriminant Analysis
(6) Logistic Regression
(7) Cluster Analysis
(8) Principal Components
(9) Exploratory Factor Analysis
(10) Confirmatory Factor Analysis
Each of the graded papers is worth 25 points.
Problem 1:
Nabisco Brands, Inc. is continually developing promotional ideas for its products. The company wants to study the effects of three different types of promotions on its sales of Ritz Bitz mini-cracker snacks. In addition to regular shelf space, the three strategies under consideration include:
1) Sampling of Ritz Bitz by in-store customers
2) Additional shelf space in regular location
3) Special display shelves at ends of aisle
The most effective promotion strategy will be implemented at all supermarkets that stock Ritz Bitz mini-cracker snacks.
To determine the best strategy, 45 stores were selected from supermarkets across the country to participate in a 4-week study, with 15 stores randomly assigned to each of the three promotions. Other relevant conditions under the control of Nabisco, such as price and advertising, were kept the same for all stores during the 4-week study period. Both the number of cases of Ritz Bitz sold during the promotional period (post-sales) and the number of cases sold during the 4 weeks preceding the promotional period (pre-sales) were recorded. The variables for the study (with SAS names listed first) are described below. Nabisco's objective is to determine the promotion that yields the highest mean sales.
STRATEGY -- Promotional strategy (INSTORE, ADDSPAC, SPECIAL)
PRESALE -- Number of cases sold during the 4 weeks preceding the promotional period
POSTSALE -- Number of cases sold during the promotional period
The data are saved in the RITZ SAS file. Several observations are listed below.
ID STRATEGY PRESALE POSTSALE
1 INSTORE 21 38
2 INSTORE 26 39
3 INSTORE 22 36
16 ADDSPAC 34 43
17 ADDSPAC 26 38
18 ADDSPAC 29 38
43 SPECIAL 31 30
44 SPECIAL 26 29
45 SPECIAL 19 22
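As a quick first look at the listed observations, the following Python sketch computes the mean pre-sales and post-sales for each strategy. It is illustrative only: it uses the nine rows shown above rather than the full RITZ file, the function name group_means is hypothetical, and a descriptive summary like this does not replace the formal analysis the problem asks for.

```python
from statistics import mean

# The nine observations listed above: (ID, STRATEGY, PRESALE, POSTSALE).
rows = [
    (1, "INSTORE", 21, 38),
    (2, "INSTORE", 26, 39),
    (3, "INSTORE", 22, 36),
    (16, "ADDSPAC", 34, 43),
    (17, "ADDSPAC", 26, 38),
    (18, "ADDSPAC", 29, 38),
    (43, "SPECIAL", 31, 30),
    (44, "SPECIAL", 26, 29),
    (45, "SPECIAL", 19, 22),
]


def group_means(rows):
    """Return {strategy: (mean PRESALE, mean POSTSALE)} (hypothetical helper)."""
    groups = {}
    for _id, strategy, pre, post in rows:
        groups.setdefault(strategy, []).append((pre, post))
    return {
        strategy: (mean(p for p, _ in obs), mean(q for _, q in obs))
        for strategy, obs in groups.items()
    }


summary = group_means(rows)
for strategy, (pre, post) in summary.items():
    print(f"{strategy}: mean PRESALE={pre:.2f}, mean POSTSALE={post:.2f}")
```

Because the groups may differ in pre-sales as well as post-sales, both columns are summarized side by side.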
Problem 2:
Attribution theory is concerned with the cognitive processes that individuals use to explain their own performance in situations where causal relations are ambiguous. Empirical evidence indicates a tendency for individuals to attribute their own successful performance to internal factors, such as effort or ability, while poor performance is attributed to external factors beyond the individual's control. An experiment was conducted to examine the causal reasoning patterns of system users at the conclusion of a competitive, computer-based business game.
Eighty MBA students used what appeared to be different computer models to analyze unexpected variances in manufacturing costs. (Actually, all students utilized the same computer model.) Upon completion, students were paid an amount based on their overall performance: those who were told they performed poorly relative to their peers were paid $5, while those who were told they did well earned $20. (In actuality, the students were randomly assigned to one of the two performance groups.) At the time of payment, participants completed an evaluation form upon which five outcome variables were measured (each on a 7-point Likert scale):
Internal outcomes (SAS variable names listed first):
EFFORT -- amount of effort expended
UND -- how well they understood the cost structure
External outcomes (SAS variable names listed first):
QUALITY -- quality of the computer model used
LUCK -- level of good/bad luck
DIFF -- difficulty of the task itself
The main purpose of the study is to determine whether the means of the outcome variables described above differ depending on performance level (SAS variable defined below).
PERLEVEL = 1 if poor performance ($5),
2 if good performance ($20)
The data are saved in the ATTRIB SAS file. Several observations are listed below.
PERLEVEL EFFORT UND QUALITY LUCK DIFF
1 4 6 5 4 3
1 3 4 6 1 1
1 3 4 4 6 3
1 3 3 5 5 5
1 4 5 5 6 1
2 7 5 4 4 4
2 5 2 1 6 6
2 5 3 3 4 5
2 4 3 1 2 4
2 4 5 4 5 4
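To get a feel for the listed observations, the sketch below computes the vector of outcome means for each performance level. It is a descriptive illustration under stated assumptions: only the ten rows shown above are used (not the full ATTRIB file), and the helper name mean_vectors is hypothetical. It does not test whether the group means differ.

```python
from statistics import mean

OUTCOMES = ["EFFORT", "UND", "QUALITY", "LUCK", "DIFF"]

# The ten observations listed above: (PERLEVEL, [EFFORT, UND, QUALITY, LUCK, DIFF]).
rows = [
    (1, [4, 6, 5, 4, 3]),
    (1, [3, 4, 6, 1, 1]),
    (1, [3, 4, 4, 6, 3]),
    (1, [3, 3, 5, 5, 5]),
    (1, [4, 5, 5, 6, 1]),
    (2, [7, 5, 4, 4, 4]),
    (2, [5, 2, 1, 6, 6]),
    (2, [5, 3, 3, 4, 5]),
    (2, [4, 3, 1, 2, 4]),
    (2, [4, 5, 4, 5, 4]),
]


def mean_vectors(rows):
    """Return {PERLEVEL: list of outcome means, in OUTCOMES order}."""
    by_level = {}
    for level, scores in rows:
        by_level.setdefault(level, []).append(scores)
    # zip(*obs) turns the list of rows into a sequence of columns.
    return {
        level: [mean(column) for column in zip(*obs)]
        for level, obs in by_level.items()
    }


means = mean_vectors(rows)
for level, vector in means.items():
    labelled = ", ".join(f"{name}={value:.1f}" for name, value in zip(OUTCOMES, vector))
    print(f"PERLEVEL={level}: {labelled}")
```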
Problem 3:
Radio-frequency identification (RFID) is the wireless use of electromagnetic fields to track data. Some industries have already adopted RFID technology (e.g., an RFID tag attached to an automobile during production is used to track its progress through the assembly line), but others have yet to adopt. This study attempts to identify those factors that increase the likelihood or probability of RFID adoption for supply chain management companies.
Data were collected through a Web-based survey of managers who are members of the Institute for Supply Management (ISM). A total of 755 managers participated in the survey. A list of the variables measured for each manager is provided below (SAS variable name given first). The researchers want to use these variables to build an algorithm which accurately predicts whether or not a supply management firm will adopt RFID technology.
ADOPT -- Firm’s RFID adoption status (1=adopted, 0=not adopted)
NUMIT -- Total number of other information technology adoptions at firm
WLAN -- Level of wireless LAN adoption (HI-USE, LO-USE, or NO-USE)
WMS -- Level of warehouse management system adoption (HI-USE, LO-USE, or NO-USE)
BAR -- Level of barcode adoption (HI-USE, LO-USE, or NO-USE)
P2LS -- Level of “pick-to-light” system adoption (HI-USE, LO-USE, or NO-USE)
FIRMTYPE -- Domestic (DOM) or International (INT) firm
REVENUE -- Firm revenue status (LOW or HIGH)
CHLEADER -- “My firm is obligated to do as the channel/supply chain leader suggests”
(7-point Likert scale where 1=strongly disagree and 7=strongly agree)
QUALITY -- “My firm is concerned with product quality”
(7-point Likert scale where 1=never and 7=always)
SERVICE -- “My firm feels channel/supply chain leader provides services needed”
(7-point Likert scale where 1=strongly disagree and 7=strongly agree)
The data are saved in the RFID SAS file. Several observations are listed below.
CHLEADER QUALITY SERVICE ADOPT WLAN WMS BAR P2LS NUMIT REVENUE FIRMTYPE
4 1 2 1 LO-USE NO-USE LO-USE NO-USE 3 LOW DOM
6 2 4 0 HI-USE NO-USE HI-USE NO-USE 3 LOW DOM
2 1 6 0 NO-USE HI-USE HI-USE NO-USE 4 HIGH DOM
6 4 4 1 HI-USE HI-USE HI-USE NO-USE 5 HIGH INT
4 5 4 0 LO-USE HI-USE HI-USE NO-USE 4 HIGH INT
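Several of the variables above are categorical with three levels (HI-USE, LO-USE, NO-USE). Most prediction methods need such variables recoded as 0/1 indicator columns first. The sketch below shows one way to do that. It rests on assumptions not stated in the problem: NO-USE is taken as the reference level, and the function name dummy_code is hypothetical.

```python
LEVELS = ("NO-USE", "LO-USE", "HI-USE")


def dummy_code(value, levels=LEVELS, reference="NO-USE"):
    """Return 0/1 indicators for every level except the reference level.

    The choice of NO-USE as the reference is an assumption for illustration.
    """
    if value not in levels:
        raise ValueError(f"unexpected level: {value!r}")
    return {level: int(value == level) for level in levels if level != reference}


# WLAN values from the five observations listed above.
wlan_values = ["LO-USE", "HI-USE", "NO-USE", "HI-USE", "LO-USE"]
coded = [dummy_code(v) for v in wlan_values]
for original, indicators in zip(wlan_values, coded):
    print(original, indicators)
```

A firm at the reference level gets zeros in both indicator columns, so three levels need only two columns.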
Problem 4:
The concern over a depletion of the ozone layer due to air pollution led to a recent study of the ambient air quality at 41 major cities across the U.S. The seven variables measured for each city are described below. (Several of these variables have been found to be good indicators of a city’s air pollution.) The researchers want to use the data to segment the cities based on their demographics and ambient air quality characteristics. Conduct the appropriate analysis of the data. Describe the characteristics of each segment.
SAS variables (names listed first):
SO2 -- Sulphur dioxide level (micrograms per cubic meter)
TEMP -- Average annual temperature (degrees Fahrenheit)
FACTORIES -- Number of manufacturing enterprises employing 20 or more workers
POPLN -- Population size (thousands)
WINDSPD -- Average annual wind speed (miles per hour)
RAIN -- Average annual precipitation (inches)
DAYS -- Average number of days with precipitation per year
The data are saved in the USAIR SAS file. Several observations are listed below.
Obs CITY SO2 TEMP FACTORIES POPLN WINDSPD RAIN DAYS
1 Phoenix 10 70.3 213 582 6.0 7.05 36
2 LittleRock 13 61.0 91 132 8.2 48.52 100
3 SanFran 12 56.7 453 716 8.7 20.66 67
4 Denver 17 51.9 454 515 9.0 12.95 86
5 Hartford 56 49.1 412 158 9.0 43.37 127
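The seven variables are measured in very different units (for example, population in thousands versus wind speed in miles per hour), so variables with large numeric ranges can dominate any comparison between cities. One common preprocessing step is to convert each variable to z-scores. The sketch below does this for the five listed cities only. Whether to standardize is the analyst's decision, and the function name standardize is hypothetical.

```python
from statistics import mean, stdev

COLUMNS = ["SO2", "TEMP", "FACTORIES", "POPLN", "WINDSPD", "RAIN", "DAYS"]

# The five cities listed above; values follow the order in COLUMNS.
cities = {
    "Phoenix":    [10, 70.3, 213, 582, 6.0, 7.05, 36],
    "LittleRock": [13, 61.0, 91, 132, 8.2, 48.52, 100],
    "SanFran":    [12, 56.7, 453, 716, 8.7, 20.66, 67],
    "Denver":     [17, 51.9, 454, 515, 9.0, 12.95, 86],
    "Hartford":   [56, 49.1, 412, 158, 9.0, 43.37, 127],
}


def standardize(table):
    """Convert each column to z-scores: (value - column mean) / sample SD."""
    columns = list(zip(*table.values()))
    centers = [mean(column) for column in columns]
    spreads = [stdev(column) for column in columns]
    return {
        name: [(v - m) / s for v, m, s in zip(values, centers, spreads)]
        for name, values in table.items()
    }


z_scores = standardize(cities)
for city, z in z_scores.items():
    print(city, [round(v, 2) for v in z])
```

After standardizing, every column has mean 0 and sample standard deviation 1, so no single variable dominates because of its units.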
Problem 5:
Seishu wine, manufactured only in Japan, is one of the most popular, most expensive, and most highly rated types of wine in the world. A preliminary experiment was conducted on the sensory and chemical characteristics of Japanese Seishu wine.
Thirty different brands of Seishu wine, all of the same "age", were rated by several expert judges on "taste" and "odor". In addition, eight chemically-related variables were measured for each brand. The 10 variables studied are described below (SAS variable names listed first).
TASTE -- Taste score, ranging from -1 (poor) to 1 (excellent)
ODOR -- Odor score, ranging from -1 (poor) to 1 (excellent)
PH -- pH level of the wine
ACID1 -- Acidity of wine (judged)
ACID2 -- Acidity of wine (measured)
SAKE -- Sake meter rating
SUGARDIR -- Amount of direct reducing sugar
SUGARTOT -- Total amount of sugar
ALCOHOL -- Amount of alcohol
FNITRO -- Amount of formyl-nitrogen
Ultimately, the researchers would like to use these 10 variables to predict the price charged per bottle of wine. However, they are concerned that high-to-moderate correlations among the variables will lead to potentially confusing and invalid results. To prevent such problems the researchers want to identify and interpret, if possible, two or three variables which contain nearly as much information as the 10 listed above, but which are independent. These new variables will be used in the prediction equation for price.
The data are saved in the WINE SAS file. Several observations are listed below.
TASTE ODOR PH ACID1 ACID2 SAKE SUGARDIR SUGARTOT ALCOHOL FNITRO
1 0.8 4.05 1.68 0.85 3 3.97 5 16.9 122
0.1 0.2 3.81 1.39 0.3 0.6 3.62 4.52 15.8 62
0.5 0 4.2 1.63 0.92 -2.3 3.48 4.46 15.8 139
0.7 0.7 4.35 1.43 0.97 -1.6 3.45 3.98 15.4 150
-0.1 -1 4.35 1.53 0.87 -2 3.67 4.22 15.4 138
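The researchers' concern is correlation among the 10 variables. A natural first check is a pairwise Pearson correlation. The sketch below computes it for one pair, SUGARDIR and SUGARTOT, using only the five rows listed above, so the value it prints is illustrative and not an estimate for the full WINE file. The function name pearson is hypothetical.

```python
from math import sqrt

# SUGARDIR and SUGARTOT from the five observations listed above.
sugardir = [3.97, 3.62, 3.48, 3.45, 3.67]
sugartot = [5.0, 4.52, 4.46, 3.98, 4.22]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


r = pearson(sugardir, sugartot)
print(f"r(SUGARDIR, SUGARTOT) = {r:.3f}")
```

Repeating this for every pair of variables gives the full correlation matrix.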
Problem 6:
Certified public accounting is a highly competitive industry. Consequently, marketing success, as effected by practice development, is important to both firms and individual CPAs within those firms. Unfortunately, very little empirical information is available regarding the use of practice development techniques. To fill this void, an exploratory study was undertaken to identify those techniques that CPAs use in developing a practice.
A questionnaire was administered to a sample of 319 accountants selected from large CPA firms in different geographic regions of the U.S. The survey instrument consisted of 7-point Likert-scale type questions on practice development behaviors. Specifically, the survey asked respondents to indicate, "... the importance of the item to your own personal practice development style" (where 1="not important" and 7="extremely important"). The 14 items pertaining to "making contacts" are described below. The main goal of the study is to identify, if possible, underlying behavior dimensions produced by these variables. Use the data on the following variables (SAS name listed first) to find and identify the behavior dimensions.
SOCIAL: Purely social contacts (family, neighbors, church, etc.)
BUSINESS: Business contacts (attorneys, bankers, etc.)
COLLEAGS: Former colleagues
CLIENTS: Client referrals and recommendations
FIRM: Referrals from within your firm
SPEECHES: Speeches and articles
CLUB: Country club contacts
SPCLUB: Spouse's country club contacts
EXPERT: Reputation/name recognition for technical expertise
REGIST: Social-register organizations contacts
SPREGIST: Spouse's social-register organizations contacts
PROFORG: Professional organizations
CHARIT: Local charitable/volunteer/civic organization contacts
SPCHARIT: Spouse's local charitable/volunteer/civic organization contacts
The data are saved in the CPA SAS file. Several observations are listed below.
SOCIAL BUSINESS ... (variables in same order as listed above) SPCHARIT
4 5 2 6 7 1 1 4 5 2 1 6 6 1
5 7 5 5 3 1 2 1 6 1 1 1 1 1
3 7 7 7 7 6 1 1 6 1 1 4 1 1
3 7 5 7 7 2 0 0 5 1 1 3 4 0
5 7 6 7 5 1 3 5 4 1 1 5 2 1
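Before analyzing the items, it is worth checking that every response falls on the stated 1-to-7 scale. The sketch below flags any listed value outside that range. The problem does not say what an out-of-range value means (a missing-value code, a data-entry error, or something else), so the check only reports such values and does not change them. The function name out_of_range is hypothetical.

```python
ITEMS = [
    "SOCIAL", "BUSINESS", "COLLEAGS", "CLIENTS", "FIRM", "SPEECHES", "CLUB",
    "SPCLUB", "EXPERT", "REGIST", "SPREGIST", "PROFORG", "CHARIT", "SPCHARIT",
]

# The five observations listed above, in ITEMS order.
responses = [
    [4, 5, 2, 6, 7, 1, 1, 4, 5, 2, 1, 6, 6, 1],
    [5, 7, 5, 5, 3, 1, 2, 1, 6, 1, 1, 1, 1, 1],
    [3, 7, 7, 7, 7, 6, 1, 1, 6, 1, 1, 4, 1, 1],
    [3, 7, 5, 7, 7, 2, 0, 0, 5, 1, 1, 3, 4, 0],
    [5, 7, 6, 7, 5, 1, 3, 5, 4, 1, 1, 5, 2, 1],
]


def out_of_range(rows, low=1, high=7):
    """Return (row number, item name, value) for each value outside [low, high].

    Row numbers are 1-based positions within the listing.
    """
    flagged = []
    for row_number, row in enumerate(rows, start=1):
        for item, value in zip(ITEMS, row):
            if not low <= value <= high:
                flagged.append((row_number, item, value))
    return flagged


for row_number, item, value in out_of_range(responses):
    print(f"row {row_number}: {item} = {value} is outside 1..7")
```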