Python Programming and Data warehouse

profilebalusuyaswanth
In-ClassExercise-DimensionalModeling.docx

Exercise #1 Data Selection: Designing a Dimensional Database

The TU Hotel chain is a small chain with properties throughout Pennsylvania, Arizona, Florida, and New York. They have a central database to store and track guest reservations. In 2008, they put cafes in many of their hotels, called “Café in the Hotel.” They have an order-tracking system that relays customer orders from the wait staff to the kitchen.

TU Hotels would like to use the data they have collected to better understand the performance of their hotels and cafes. They also have access to a database from the online review site “HotelComplainer.com.”

Your task is to design two dimensional data marts using data from those three databases. You will plan the star schema for each data mart by choosing the dimensions, facts, and attributes from the data contained in those databases. The relational schemas of those databases are on the following page.

NOTE: You will not actually come up with answers to these questions since you have no data. You’ll just be designing the star schemas that can address these questions about the business.

The questions to be addressed by each data mart are listed in the table below. You should include only the data fields you need in each data mart, but the data marts can contain some of the same data (i.e., the same fields can appear in both data marts).

To complete the exercise, you’ll need to perform the following steps:

1) Identify the main business event for each data mart. This will be the fact. Ask yourself, “What is the basic business event that generates the performance metric (revenue)?”

2) Identify the attributes associated with the fact. Ask yourself, “How is the business event quantified (measured)?”

3) Identify the dimensions and their attributes. Ask yourself, “What data characterizes the various aspects of that business event?”

4) Either sketch the star schema or make a list of the dimensions and the fact and their attributes. You can use the examples in the slides as a guide.

Data Mart 1: Hotel Performance

Data Mart 2: Restaurant Performance

· During which month are the most rooms rented?

· Identify the “off season” (if any) for our hotels in Arizona, Florida, Pennsylvania, and New York

· Which hotel generates the most (non-restaurant) revenue?

· What is the average length of stay in hotels with 4.5 or more stars?

· Do smokers stay longer than non-smokers?

· For a given hotel, how many customers come from out of state?

· Which hotel restaurant generates the most revenue?

· Do the best rated hotels generate more restaurant revenue?

· What is the most frequently ordered item in the Philadelphia metropolitan area?

HotelsHotel_idCountry_codeHotel_nameHotel_addressHotel_cityHotel_zipcodeCountriesCountry_codeCountry_currencyCountry_nameHotel roomsRoom_numberHotel_idRoom_typeRoom_floorRoom typesRoom_type_codeRoom_standard_rateRoom_descriptionSmoking_YNRoom BookingsBooking_idRoom_type_codeHotel_idCheckin_dateNumber_of_daysRoom_countGuest BookingsBooking_idGuest_numberGuestsGuest_numberGuest_firstnameGuest_lastnameGuest_addressGuest_cityGuest_zipcodeGuest_emailHotel Amenities LookupCharacteristic_idCharacteristic_descriptionHotel AmenitiesCharacteristic_idHotel_id

Hotel Reservation Database

RatingsRating_idHotel_idDate_of_stayStar_ratingReview_textHotelsHotel_idCountry_codeHotel_nameHotel_addressHotel_cityHotel_zipcode

HotelComplainer Ratings Database

(totally external company)

CustomerCustomer_numberCustomer_nameCustomer_addressCustomer_cityCustomer_zipcodeCustomer_emailHotel RatingRating_idCustomer_idDate_of_rating

Café in the Hotel Database

(same company but database is not connected to the hotel)

CustomerCustomer_numberCustomer_nameCustomer_addressCustomer_cityCustomer_zipcodeOrderOrder_numberCustomer_numberHotel_idFood_item_idOrder_dateOrder_timeTable_numberFood itemOrder numberFood_item_idOrder_dateOrder_timeHotelsHotel_idCountry_codeHotel_nameHotel_addressHotel_cityHotel_zipcode