Discussion: Reading Assignment - chapter 5 and 6 (250 words)
CHAPTER 5 Database Systems and Big Data
Copyright 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. WCN 02-200-203
Did You Know?
• The amount of data in the digital universe is expected to increase to 44 zettabytes (44 trillion gigabytes) by 2020. This is 60 times the amount of all the grains of sand on all the beaches on Earth. The majority of data generated between now and 2020 will not be produced by humans, but rather by machines as they talk to each other over data networks.
• Most major U.S. wireless service providers have implemented a stolen-phone database to report and track stolen phones. So if your smartphone or tablet goes missing, report it to your carrier. If someone else tries to use it, he or she will be denied service on the carrier’s network.
• You know those banner and tile ads that pop up on your browser screen (usually for products and services you’ve recently viewed)? Criteo, one of many digital advertising organizations, automates the recommendation of ads up to 30 billion times each day, with each recommendation requiring a calculation involving some 100 variables.
Principles
• The database approach to data management has become broadly accepted.
• Data modeling is a key aspect of organizing data and information.
• A well-designed and well-managed database is an extremely valuable tool in supporting decision making.
• We have entered an era where organizations are grappling with a tremendous growth in the amount of data available and struggling to understand how to manage and make use of it.
• A number of available tools and technologies allow organizations to take advantage of the opportunities offered by big data.
Learning Objectives
• Identify and briefly describe the members of the hierarchy of data.
• Identify the advantages of the database approach to data management.
• Identify the key factors that must be considered when designing a database.
• Identify the various types of data models and explain how they are useful in planning a database.
• Describe the relational database model and its fundamental characteristics.
• Define the role of the database schema, data definition language, and data manipulation language.
• Discuss the role of a database administrator and data administrator.
• Identify the common functions performed by all database management systems.
• Define the term big data and identify its basic characteristics.
• Explain why big data represents both a challenge and an opportunity.
• Define the term data management and state its overall goal.
• Define the terms data warehouse, data mart, and data lakes and explain how they are different.
• Outline the extract, transform, load process.
• Explain how a NoSQL database is different from an SQL database.
• Discuss the whole Hadoop computing environment and its various components.
• Define the term in-memory database and explain its advantages in processing big data.
Why Learn about Database Systems and Big Data?
Organizations and individuals capture prodigious amounts of data from a myriad of sources every day. Where does all this data come from, where does it go, how is it safeguarded, and how can you use it to your advantage? In this chapter, you will learn about tools and processes that enable users to manage all this data so that it can be used to uncover new insights and make effective decisions. For example, if you become a marketing manager, you can access a vast store of data related to the Web-surfing habits, past purchases, and even social media activity of existing and potential customers. You can use this information to create highly effective marketing programs that generate consumer interest and increased sales. If you become a biologist, you may use big data to study the regulation of genes and the evolution of genomes in an attempt to understand how the genetic makeup of different cancers influences outcomes for cancer patients. If you become a human resources manager, you will be able to use data to analyze the impact of raises and changes in employee-benefit packages on employee retention and long-term costs. Regardless of your field of study in school and your future career, using database systems and big data will likely be a critical part of your job. As you read this chapter, you will see how you can use databases and big data to extract and analyze valuable information to help you succeed. This chapter starts by introducing basic concepts related to databases and data management systems. Later, the topic of big data will be discussed along with several tools and technologies used to store and analyze big data.
As you read this chapter, consider the following:
• Why is it important that the development and adoption of data management, data modeling, and business information systems be a cross-functional effort involving more than the IS organization?
• How can organizations manage their data so that it is a secure and effective resource?
A database is a well-designed, organized, and carefully managed collection of data. Like other components of an information system, a database should help an organization achieve its goals. A database can contribute to organizational success by providing managers and decision makers with timely, accurate, and relevant information built on data. Databases also help companies analyze information to reduce costs, increase profits, add new customers, track past business activities, and open new market opportunities.
A database management system (DBMS) consists of a group of programs used to access and manage a database as well as provide an interface between the database and its users and other application programs. A DBMS provides a single point of management and control over data resources, which can be critical to maintaining the integrity and security of the data. A database, a DBMS, and the application programs that use the data make up a database environment.
Databases and database management systems are becoming even more important to organizations as they deal with rapidly increasing amounts of information. Most organizations have many databases; however, without good data management, it is nearly impossible for anyone to find the right and related information for accurate and business-critical decision making.
Data Fundamentals
Without data and the ability to process it, an organization cannot successfully complete its business activities. It cannot pay employees, send out bills, order new inventory, or produce information to assist managers in decision making. Recall that data consists of raw facts, such as employee numbers and sales figures. For data to be transformed into useful information, it must first be organized in a meaningful way.
database: A well-designed, organized, and carefully managed collection of data.
database management system (DBMS): A group of programs used to access and manage a database as well as provide an interface between the database and its users and other application programs.
194 PART 2 • Information Technology Concepts
Hierarchy of Data
Data is generally organized in a hierarchy that begins with the smallest piece of data used by computers (a bit), progressing up through the hierarchy to a database. A bit is a binary digit (i.e., 0 or 1) that represents a circuit that is either on or off. Bits can be organized into units called bytes. A byte is typically eight bits. Each byte represents a character, which is the basic building block of most information. A character can be an uppercase letter (A, B, C, …, Z), a lowercase letter (a, b, c, …, z), a numeric digit (0, 1, 2, …, 9), or a special symbol (., !, +, -, /, etc.).
Characters are put together to form a field. A field is typically a name, a number, or a combination of characters that describes an aspect of a business object (such as an employee, a location, or a plant) or activity (such as a sale). In addition to being entered into a database, fields can be computed from other fields. Computed fields include the total, average, maximum, and minimum value. A collection of data fields all related to one object, activity, or individual is called a record. By combining descriptions of the characteristics of an object, activity, or individual, a record can provide a complete description of it. For instance, an employee record is a collection of fields about one employee. One field includes the employee’s name, another field contains the address, and still others the phone number, pay rate, earnings made to date, and so forth. A collection of related records is a file; for example, an employee file is a collection of all company employee records. Likewise, an inventory file is a collection of all inventory records for a particular company or organization.
At the highest level of the data hierarchy is a database, a collection of integrated and related files. Together, bits, characters, fields, records, files, and databases form the hierarchy of data. See Figure 5.1. Characters are combined to make a field, fields are combined to make a record, records are combined to make a file, and files are combined to make a database. A database houses not only all these levels of data but also the relationships among them.
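The levels of the hierarchy can be sketched in a few lines of Python. This is an illustration only, not how a DBMS stores data internally: the variable names are hypothetical, and the sample values are the ones used in Figure 5.1.

```python
# Bit level: the letter "F" stored as one 8-bit byte (ASCII code 70).
bits = format(ord("F"), "08b")

# Characters -> field: characters combine to form the last-name field.
last_name = "".join(["F", "i", "s", "k", "e"])

# Fields -> record: related fields describe one employee.
record = {
    "employee_number": "098-40-1370",
    "last_name": last_name,
    "first_name": "Steven",
    "hire_date": "01-05-2001",
}

# Records -> file: a file is a collection of related records.
personnel_file = [record]

# Files -> database: a database is a collection of related files.
# The department and payroll files are left empty in this sketch.
project_database = {
    "personnel": personnel_file,
    "department": [],
    "payroll": [],
}

print(bits)  # prints 01000110
```

The printed pattern is the seven-digit ASCII code 1000110 padded with a leading zero to fill a full byte.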
Data Entities, Attributes, and Keys
Entities, attributes, and keys are important database concepts. An entity is a person, place, or thing (object) for which data is collected, stored, and maintained. Examples of entities include employees, products, and customers. Most organizations organize and store data as entities.
FIGURE 5.1 Hierarchy of data Together, bits, characters, fields, records, files, and databases form the hierarchy of data.
[Diagram, from top to bottom: Database (Project database); Files (Personnel file, Department file, Payroll file); Records (098-40-1370 Fiske, Steven 01-05-2001, a record containing employee #, last and first name, hire date); Fields (Fiske, the last name field); Characters, each represented as 8 bits (letter F in ASCII: 1000110).]
bit: A binary digit (i.e., 0 or 1) that represents a circuit that is either on or off.
character: A basic building block of most information, consisting of uppercase letters, lowercase letters, numeric digits, or special symbols.
field: Typically a name, a number, or a combination of characters that describes an aspect of a business object or activity.
record: A collection of data fields all related to one object, activity, or individual.
file: A collection of related records.
hierarchy of data: Bits, characters, fields, records, files, and databases.
entity: A person, place, or thing for which data is collected, stored, and maintained.
CHAPTER 5 • Database Systems and Big Data 195
An attribute is a characteristic of an entity. For example, employee number, last name, first name, hire date, and department number are attributes for an employee. See Figure 5.2. The inventory number, description, number of units on hand, and location of the inventory item in the warehouse are attributes for items in inventory. Customer number, name, address, phone number, credit rating, and contact person are attributes for customers. Attributes are usually selected to reflect the relevant characteristics of entities such as employees or customers. The specific value of an attribute, called a data item, can be found in the fields of the record describing an entity. A data key is a field within a record that is used to identify the record.
Many organizations create databases of attributes and enter data items to store data needed to run their day-to-day operations. For instance, database technology is an important weapon in the fight against crime and terrorism, as discussed in the following examples:
● The Offshore Leaks Database contains the names of some 100,000 secretive offshore companies, trusts, and funds created in locations around the world. Although creating offshore accounts is legal in most countries, offshore accounts are also established to enable individuals and organizations to evade paying the taxes they would otherwise owe. The database has been used by law enforcement and tax officials to identify potential tax evaders.1
● Major U.S. wireless service providers have implemented a stolen-phone database to report and track stolen 3G and 4G/LTE phones. The providers use the database to check whether a consumer’s device was reported lost or stolen. If a device has been reported lost or stolen, it will be denied service on the carrier’s network. Once the device is returned to the rightful owner, it may be reactivated. The next step will be to tie foreign service providers and countries into the database to diminish the export of stolen devices to markets outside the United States.2
● The Global Terrorism Database (GTD) is a database including data on over 140,000 terrorist events that occurred around the world from 1970 through 2014 (with additional annual updates). For each terrorist event, information is available regarding the date and location of the event, the weapons used, the nature of the target, the number of casualties, and, when identifiable, the group or individual responsible.3
● Pawnshops are required by law to report their transactions to law enforcement by providing a description of each item pawned or sold along with any identifying numbers, such as a serial number. LEADS Online is a nationwide online database system that can be used to fulfill this reporting responsibility and enable law enforcement officers to track merchandise that is sold or pawned in shops throughout the nation. For example, if law enforcement has a serial number for a stolen computer, they can enter this into LEADS Online and determine if it has been sold or pawned, when and where the theft or transaction occurred and, in the case of an item that was pawned, who made the transaction.4
FIGURE 5.2 Keys and attributes The key field is the employee number. The attributes include last name, first name, hire date, and department number.
Employee #    Last name   First name   Hire date    Dept. number
005-10-6321   Johns       Francine     10-07-2013   257
549-77-1001   Buckley     Bill         02-17-1995   632
098-40-1370   Fiske       Steven       01-05-2001   598
KEY FIELD: Employee #. ATTRIBUTES (fields): the columns. ENTITIES (records): the rows.
attribute: A characteristic of an entity.
data item: The specific value of an attribute.
As discussed earlier, a collection of fields about a specific object is a record. A primary key is a field or set of fields that uniquely identifies the record. No other record can have the same primary key. For an employee record, such as the one shown in Figure 5.2, the employee number is an example of a primary key. The primary key is used to distinguish records so that they can be accessed, organized, and manipulated. Primary keys ensure that each record in a file is unique. For example, eBay assigns an “Item number” as its primary key for items to make sure that bids are associated with the correct item. See Figure 5.3.
In some situations, locating a particular record that meets a specific set of criteria might be easier and faster using a combination of secondary keys rather than the primary key. For example, a customer might call a mail-order company to place an order for clothes. The order clerk can easily access the customer’s mailing and billing information by entering the primary key (usually a customer number), but if the customer does not know the correct primary key, a secondary key such as last name can be used. In this case, the order clerk enters the last name, such as Adams. If several customers have a last name of Adams, the clerk can check other fields, such as address and first name, to find the correct customer record. After locating the correct record, the order can be completed and the clothing items shipped to the customer.
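The difference between a primary key and a secondary key can be sketched with Python's built-in sqlite3 module. Everything in this example is assumed for illustration: the customer table, its column names, and the sample customers are hypothetical and do not come from any real mail-order system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " customer_number INTEGER PRIMARY KEY,"
    " last_name TEXT, first_name TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (1001, "Adams", "Pat", "12 Oak St"),
        (1002, "Adams", "Lee", "98 Elm Ave"),
        (1003, "Baker", "Sam", "5 Pine Rd"),
    ],
)

# The primary key must be unique: a second customer 1001 is rejected.
try:
    conn.execute("INSERT INTO customer VALUES (1001, 'Clark', 'Jo', '7 Ash Ln')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Secondary key: the last name alone may match several records.
adams = conn.execute(
    "SELECT customer_number, first_name FROM customer"
    " WHERE last_name = ? ORDER BY customer_number",
    ("Adams",),
).fetchall()

# Checking another field narrows the search to one record.
match = conn.execute(
    "SELECT customer_number FROM customer"
    " WHERE last_name = ? AND first_name = ?",
    ("Adams", "Lee"),
).fetchone()

print(duplicate_rejected, adams, match)
```

Searching on the last name returns both Adams records, which is why the clerk in the example above has to check a second field before completing the order.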
The Database Approach
At one time, information systems referenced specific files containing relevant data. For example, a payroll system would use a payroll file. Each distinct operational system used data files dedicated to that system.
Today, most organizations use the database approach to data management, where multiple information systems share a pool of related data.
FIGURE 5.3 Primary key eBay assigns an Item number as a primary key to keep track of each item in its database.
www.ebay.com
primary key: A field or set of fields that uniquely identifies the record.
database approach to data management: An approach to data management where multiple information systems share a pool of related data.
A database offers the ability to share data and information resources. Federal databases, for example, often include the results of DNA tests as an attribute for convicted criminals. The information can be shared with law enforcement officials around the country. Often, distinct yet related databases are linked to provide enterprise-wide databases. For example, many Walgreens stores include in-store medical clinics for customers. Walgreens uses an electronic health records database that stores the information of all patients across all stores. The database provides information about customers’ interactions with the clinics and pharmacies.
To use the database approach to data management, additional software, a database management system (DBMS), is required. As previously discussed, a DBMS consists of a group of programs that can be used as an interface between a database and the user of the database. Typically, this software acts as a buffer between the application programs and the database itself. Figure 5.4 illustrates the database approach.
Critical Thinking Exercise
Vehicle Theft Database
You are a participant in an information systems project to design a vehicle theft database for a state law enforcement agency. The database will provide information about stolen vehicles (e.g., autos, golf carts, SUVs, and trucks), with details about the vehicle theft as well as the stolen vehicle itself. These details will be useful to law enforcement officers investigating the vehicle theft.
Review Questions
1. Identify 10 data attributes you would capture for each vehicle theft incident. How many bytes should you allow for each attribute?
2. Which attribute would you designate as the primary key?
FIGURE 5.4 Database approach to data management In a database approach to data management, multiple information systems share a pool of related data.
[Diagram: a database (payroll data, inventory data, invoicing data, other data) connects through the database management system (the interface) to application programs (payroll program, inventory control program, invoicing program, management inquiries), which produce reports for users.]
Critical Thinking Questions
1. Should the database include data about the status of the theft investigation? If so, what sort of data needs to be included?
2. Can you foresee any problems with keeping the data current? Explain.
Data Modeling and Database Characteristics
Because today’s businesses must keep track of and analyze so much data, they must keep the data well organized so that it can be used effectively. A database should be designed to store all data relevant to the business and to provide quick access and easy modification. Moreover, it must reflect the business processes of the organization. When building a database, an organization must carefully consider the following questions:
● Content. What data should be collected and at what cost?
● Access. What data should be provided to which users and when?
● Logical structure. How should data be arranged so that it makes sense to a given user?
● Physical organization. Where should data be physically located?
● Archiving. How long must this data be stored?
● Security. How can this data be protected from unauthorized access?
Data Modeling
When organizing a database, key considerations include determining what data to collect, what the source of the data will be, who will have access to it, how one might want to use it, and how to monitor database performance in terms of response time, availability, and other factors. AppDynamics offers its i-nexus cloud-based business execution solution to clients for use in defining the actions and plans needed to achieve business goals. The service runs on 30 Java virtual machines and eight database servers that are constantly supervised using database performance monitoring software. Use of the software has reduced the mean time to repair system problems and improved the performance and responsiveness for all its clients.5
One of the tools database designers use to show the logical relationships among data is a data model. A data model is a diagram of entities and their relationships. Data modeling usually involves developing an understanding of a specific business problem and then analyzing the data and information needed to deliver a solution. When done at the level of the entire organization, this procedure is called enterprise data modeling. Enterprise data modeling is an approach that starts by investigating the general data and information needs of the organization at the strategic level and then moves on to examine more specific data and information needs for the functional areas and departments within the organization. An enterprise data model involves analyzing the data and information needs of an entire organization and provides a roadmap for building database and information systems by creating a single definition and format for data that can ensure compatibility and the ability to exchange and integrate data among systems. See Figure 5.5.
data model: A diagram of data entities and their relationships.
enterprise data model: A data model that provides a roadmap for building database and information systems by creating a single definition and format for data that can ensure data compatibility and the ability to exchange and integrate data among systems.
The IBM Healthcare Provider Data Model is an enterprise data model that can be adopted by a healthcare provider organization to organize and integrate clinical, research, operational, and financial data.6 At one time, the University of North Carolina Health Care System had a smorgasbord of information system hardware and software that made it difficult to integrate data from its existing legacy systems. The organization used the IBM Healthcare Provider Data Model to guide its efforts to simplify its information system environment and improve the integration of its data. As a result, it was able to eliminate its dependency on outdated technologies, build an environment that supports efficient data management, and integrate data from its legacy systems to create a source of data to support future analytics requirements.7
Various models have been developed to help managers and database designers analyze data and information needs. One such data model is an entity-relationship (ER) diagram, which uses basic graphical symbols to show the organization of and relationships between data. In most cases, boxes in ER diagrams indicate data items or entities contained in data tables, and lines show relationships between entities. In other words, ER diagrams show data items in tables (entities) and the ways they are related.
ER diagrams help ensure that the relationships among the data entities in a database are correctly structured so that any application programs developed are consistent with business operations and user needs. In addition, ER diagrams can serve as reference documents after a database is in use. If changes are made to the database, ER diagrams help design them. Figure 5.6 shows an ER diagram for an order database. In this database design, one salesperson serves many customers. This is an example of a one-to-many relationship, as indicated by the one-to-many symbol (the “crow’s-foot”) shown in Figure 5.6. The ER diagram also shows that each customer can place one-to-many orders, that each order includes one-to-many line items, and that many line items can specify the same product (a many-to-one relationship). This database can also have one-to-one relationships. For example, one order generates one invoice.
FIGURE 5.5 Enterprise data model The enterprise data model provides a roadmap for building database and information systems.
[Diagram labels: The enterprise; Data model; Systems and data; Supports. Benefits shown: enables capture of business opportunities, increases business effectiveness, reduces costs, enables simpler system interfaces, reduces data redundancy, ensures compatible data.]
entity-relationship (ER) diagram: A data model that uses basic graphical symbols to show the organization of and relationships between data.
Relational Database Model
The relational database model is a simple but highly useful way to organize data into collections of two-dimensional tables called relations. Each row in the table represents an entity, and each column represents an attribute of that entity. See Figure 5.7.
FIGURE 5.6 Entity-relationship (ER) diagram for a customer order database Development of ER diagrams helps ensure that the logical structure of application programs is consistent with the data relationships in the database.
[Diagram: Salesperson serves Customer; Customer places Orders; Orders include Line items; Line items specify Product; Orders generate Invoice.]
relational database model: A simple but highly useful way to organize data into collections of two-dimensional tables called relations.
FIGURE 5.7 Relational database model In the relational model, data is placed in two-dimensional tables, or relations. As long as they share at least one common attribute, these relations can be linked to output useful information. In this example, all three tables include the Dept. number attribute.
Data Table 1: Project Table
Project number   Description    Dept. number
155              Payroll        257
498              Widgets        632
226              Sales manual   598
Data Table 2: Department Table
Dept. number   Dept. name      Manager SSN
257            Accounting      005-10-6321
632            Manufacturing   549-77-1001
598            Marketing       098-40-1370
Data Table 3: Manager Table
SSN           Last name   First name   Hire date    Dept. number
005-10-6321   Johns       Francine     10-07-2013   257
549-77-1001   Buckley     Bill         02-17-1995   632
098-40-1370   Fiske       Steven       01-05-2001   598
Each attribute can be constrained to a range of allowable values called its domain. The domain for a particular attribute indicates what values can be placed in each column of the relational table. For instance, the domain for an attribute such as type employee could be limited to either H (hourly) or S (salary). If someone tried to enter a “1” in the type employee field, the data would not be accepted. The domain for pay rate would not include negative numbers. In this way, defining a domain can increase data accuracy.
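One way a domain can be enforced in practice is with a CHECK constraint, sketched here using Python's built-in sqlite3 module. The employee table, its column names, and the helper function try_insert are hypothetical names chosen for this illustration; other database systems offer comparable but not identical mechanisms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " employee_number TEXT PRIMARY KEY,"
    # Domain of employee type: H (hourly) or S (salary) only.
    " employee_type TEXT CHECK (employee_type IN ('H', 'S')),"
    # Domain of pay rate: no negative numbers.
    " pay_rate REAL CHECK (pay_rate >= 0))"
)

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert(("E1", "H", 18.50))   # valid type and pay rate
bad_type = try_insert(("E2", "1", 18.50))   # "1" is outside the type domain
bad_rate = try_insert(("E3", "S", -5.00))   # negative pay rate

print(accepted, bad_type, bad_rate)  # prints True False False
```

Rejecting the value at entry time is what lets a domain improve data accuracy: the invalid rows never reach the table.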
Manipulating Data
After entering data into a relational database, users can make inquiries and analyze the data. Basic data manipulations include selecting, projecting, and joining. Selecting involves eliminating rows according to certain criteria. Suppose the department manager of a company wants to use a project table that contains the project number, description, and department number for all projects a company is performing. The department manager might want to find the department number for Project 226, a sales manual project. Using selection, the manager can eliminate all rows except the one for Project 226 and see that the department number for the department completing the sales manual project is 598.
Projecting involves eliminating columns in a table. For example, a department table might contain the department number, department name, and Social Security number (SSN) of the manager in charge of the project. A sales manager might want to create a new table that contains only the department number and the Social Security number of the manager in charge of the sales manual project. The sales manager can use projection to eliminate the department name column and create a new table containing only the department number and Social Security number.
Joining involves combining two or more tables. For example, you can combine the project table and the department table to create a new table with the project number, project description, department number, department name, and Social Security number for the manager in charge of the project.
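The three manipulations can be sketched in plain Python, with each table held as a list of rows. The function names select, project, and join and the column names are hypothetical, chosen to mirror the terms above; the data comes from the Project and Department tables in Figure 5.7. A real DBMS performs these operations internally, typically in response to a query.

```python
project_table = [
    {"project_number": 155, "description": "Payroll", "dept_number": 257},
    {"project_number": 498, "description": "Widgets", "dept_number": 632},
    {"project_number": 226, "description": "Sales manual", "dept_number": 598},
]
department_table = [
    {"dept_number": 257, "dept_name": "Accounting", "manager_ssn": "005-10-6321"},
    {"dept_number": 632, "dept_name": "Manufacturing", "manager_ssn": "549-77-1001"},
    {"dept_number": 598, "dept_name": "Marketing", "manager_ssn": "098-40-1370"},
]

def select(rows, predicate):
    """Selecting: keep only the rows that meet the criteria."""
    return [row for row in rows if predicate(row)]

def project(rows, columns):
    """Projecting: keep only the named columns of every row."""
    return [{col: row[col] for col in columns} for row in rows]

def join(left, right, key):
    """Joining: combine rows from two tables that share a value for key."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]

# Selection: eliminate every row except the one for Project 226.
sales_manual = select(project_table, lambda row: row["project_number"] == 226)

# Projection: eliminate the department name column.
dept_and_ssn = project(department_table, ["dept_number", "manager_ssn"])

# Join: combine the two tables on their common attribute.
combined = join(project_table, department_table, "dept_number")

print(sales_manual[0]["dept_number"])  # prints 598
```

Each row of combined holds the project number, description, department number, department name, and manager SSN, which is the new table described in the joining example above.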
As long as the tables share at least one common data attribute, the tables in a relational database can be linked to provide useful information and reports. Linking, the ability to combine two or more tables through common data attributes to form a new table with only the unique data attributes, is one of the keys to the flexibility and power of relational databases. Suppose the president of a company wants to find out the name of the manager of the sales manual project as well as the length of time the manager has been with the company. Assume that the company has Manager, Department, and Project tables as shown in Figure 5.7. These tables are related as depicted in Figure 5.8.
domain: The range of allowable values for a data attribute.
selecting: Manipulating data to eliminate rows according to certain criteria.
projecting: Manipulating data to eliminate columns in a table.
joining: Manipulating data to combine two or more tables.
linking: The ability to combine two or more tables through common data attributes to form a new table with only the unique data attributes.
FIGURE 5.8 Simplified ER diagram This diagram shows the relationship among the Manager, Department, and Project tables.
[Diagram: Manager supervises Department; Department performs Project.]
Note the crow’s-foot by the Project table. This symbol indicates that a department can have many projects. The president would make the inquiry to the database, perhaps via a personal computer. The DBMS would start with the project description and search the Project table to find out the project’s department number. It would then use the department number to search the Department table for the manager’s Social Security number. The department number is also in the Department table and is the common element that links the Project table to the Department table. The DBMS uses the manager’s Social Security number to search the Manager table for the manager’s hire date. The manager’s Social Security number is the common element between the Department table and the Manager table. The final result is that the manager’s name and hire date are presented to the president as a response to the inquiry. See Figure 5.9.
One of the primary advantages of a relational database is that it allows tables to be linked, as shown in Figure 5.9. This linkage reduces data redundancy and allows data to be organized more logically. The ability to link to the manager’s Social Security number stored once in the Manager table eliminates the need to store it multiple times in the Project table.
The relational database model is widely used. It is easier to control, more flexible, and more intuitive than other approaches because it organizes data in tables. As shown in Figure 5.10, a relational database management system, such as Microsoft Access, can be used to store data in rows and columns. In this figure, hyperlink tools available on the ribbon/toolbar can be used to create, edit, and manipulate the database. The ability to link relational tables also allows users to relate data in new ways without having to redefine complex relationships. Because of the advantages of the relational model, many companies use it for large corporate databases, such as those for marketing and accounting.
FIGURE 5.9 Linking data tables to answer an inquiry To find the name and hire date of the manager working on the sales manual project, the president needs three tables: Project, Department, and Manager. The project description (Sales manual) leads to the department number (598) in the Project table, which leads to the manager’s Social Security number (098-40-1370) in the Department table, which leads to the manager’s last name (Fiske) and hire date (01-05-2001) in the Manager table.
Data Table 1: Project Table
Project number   Description    Dept. number
155              Payroll        257
498              Widgets        632
226              Sales manual   598

Data Table 2: Department Table
Dept. number   Dept. name      Manager SSN
257            Accounting      005-10-6321
632            Manufacturing   549-77-1001
598            Marketing       098-40-1370

Data Table 3: Manager Table
SSN           Last name   First name   Hire date    Dept. number
005-10-6321   Johns       Francine     10-07-2013   257
549-77-1001   Buckley     Bill         02-17-1995   632
098-40-1370   Fiske       Steven       01-05-2001   598
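The lookup traced in Figure 5.9 can be sketched as a single query. The following is a minimal illustration, assuming Python's built-in sqlite3 module and an in-memory database. The column names (proj_num, dept_num, manager_ssn, and so on) are hypothetical names chosen for this sketch; the data values are those shown in Figure 5.9.

```python
import sqlite3

# Build the three tables from Figure 5.9 in a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Project (proj_num INTEGER PRIMARY KEY, description TEXT, dept_num INTEGER);
CREATE TABLE Department (dept_num INTEGER PRIMARY KEY, dept_name TEXT, manager_ssn TEXT);
CREATE TABLE Manager (ssn TEXT PRIMARY KEY, last_name TEXT, first_name TEXT,
                      hire_date TEXT, dept_num INTEGER);
INSERT INTO Project VALUES (155, 'Payroll', 257), (498, 'Widgets', 632),
                           (226, 'Sales manual', 598);
INSERT INTO Department VALUES (257, 'Accounting', '005-10-6321'),
                              (632, 'Manufacturing', '549-77-1001'),
                              (598, 'Marketing', '098-40-1370');
INSERT INTO Manager VALUES ('005-10-6321', 'Johns', 'Francine', '10-07-2013', 257),
                           ('549-77-1001', 'Buckley', 'Bill', '02-17-1995', 632),
                           ('098-40-1370', 'Fiske', 'Steven', '01-05-2001', 598);
""")

# Project -> Department via the department number,
# Department -> Manager via the manager's Social Security number.
row = conn.execute("""
    SELECT m.first_name, m.last_name, m.hire_date
    FROM Project AS p
    JOIN Department AS d ON d.dept_num = p.dept_num
    JOIN Manager AS m ON m.ssn = d.manager_ssn
    WHERE p.description = ?
""", ("Sales manual",)).fetchone()

print(row)  # ('Steven', 'Fiske', '01-05-2001')
```

The manager's Social Security number is stored once per department and once per manager; the Project table never needs to hold it, which is the redundancy reduction described above.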
Databases based on the relational model include Oracle, IBM DB2, Microsoft SQL Server, Microsoft Access, MySQL, Sybase, and others. The relational database model has been an outstanding success and is dominant in the commercial world today, although many organizations are beginning to use new nonrelational models to meet some of their business needs.
Data Cleansing
Data used in decision making must be accurate, complete, economical, flexible, reliable, relevant, simple, timely, verifiable, accessible, and secure. Data cleansing (data cleaning or data scrubbing) is the process of detecting and then correcting or deleting incomplete, incorrect, inaccurate, or irrelevant records that reside in a database. The goal of data cleansing is to improve the quality of the data used in decision making. The “bad data” may have been caused by user data-entry errors or by data corruption during data transmission or storage. Data cleansing is different from data validation, which involves the identification of “bad data” and its rejection at the time of data entry.
One data cleansing solution is to identify and correct data by cross-checking it against a validated data set. For example, street number, street name, city, state, and zip code entries in an organization’s database may be cross-checked against the United States Postal Zip Code database. Data cleansing may also involve standardization of data, such as the conversion of various possible abbreviations (St., St, st., st) to one standard name (Street).
Data enhancement augments the data in a database by adding related information—such as using the zip code information for a given record to append the county code or census tract code.
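The cleansing and enhancement steps described above can be sketched in a few lines. This is a minimal illustration in Python, not a production routine: the function names and the record layout are hypothetical, and the VALIDATED_ZIPS dictionary is a two-entry stand-in for a real validated postal data set.

```python
import re

# Tiny illustrative stand-in for a validated data set: ZIP -> (city, state, county).
VALIDATED_ZIPS = {
    "45202": ("Cincinnati", "OH", "Hamilton"),
    "10001": ("New York", "NY", "New York"),
}

def standardize_street(address):
    """Replace a trailing St., St, st., or st with the standard name Street."""
    return re.sub(r"\b[Ss]t\.?$", "Street", address.strip())

def cleanse_and_enhance(record):
    """Return a cleansed copy of the record, or None if it fails the cross-check."""
    reference = VALIDATED_ZIPS.get(record["zip"])
    if reference is None:
        return None  # ZIP not in the validated set: flag for review or deletion
    city, state, county = reference
    cleaned = dict(record)
    cleaned["street"] = standardize_street(record["street"])  # standardization
    cleaned["city"], cleaned["state"] = city, state           # correction by cross-check
    cleaned["county"] = county                                # data enhancement
    return cleaned

raw = {"street": "400 Vine St", "city": "Cincinati", "state": "OH", "zip": "45202"}
print(cleanse_and_enhance(raw))
```

The sketch only rewrites an abbreviation at the end of the street field, so a name such as "St. Clair Ave" is left alone; a real cleansing tool would need far more careful address parsing.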
The cost of performing data cleansing can be quite high. It is prohibitively expensive to eliminate all “bad data” to achieve 100 percent database accuracy, as shown in Figure 5.11.
FIGURE 5.10 Building and modifying a relational database Relational databases provide many tools, tips, and shortcuts to simplify the process of creating and modifying a database.
Data cleansing (data cleaning or data scrubbing): The process of detecting and then correcting or deleting incomplete, incorrect, inaccurate, or irrelevant records that reside in a database.
Microsoft product screenshots used with permission from Microsoft Corporation
Banco Popular is the largest bank in Puerto Rico. Some 3,000 bank employees in 200 branches use a customer database to obtain a complete view of 5.7 million personal and business accounts. The bank uses a data cleansing process to eliminate duplicate records and build an accurate and complete record of each customer, reflecting all of their various accounts (checking, savings, auto loan, credit card, etc.) with the bank. Part of this process includes identifying how many account holders live at the same address to eliminate duplicate mailings to the same household, thus saving over $840,000 in mailing expenses each year.8
Critical Thinking Exercise
Cleansing Weather Data
The process of weather forecasting begins with the collection of as much data as possible about the current state of the atmosphere. Weather data (barometric pressure, humidity, temperature, and wind direction and speed) is collected from a variety of sources, including aircraft, automatic weather stations, weather balloons, buoys, radar, satellites, ships, and trained observers. Due to the variety of data types taken from multiple data sources, weather data is captured in a variety of data formats, primarily Binary Universal Form for the Representation of meteorological data (BUFR) and Institute of Electrical and Electronics Engineers (IEEE) binary. These observations are then converted to a standard format and placed into a gridded 3D model space called the Global Data Assimilation System (GDAS). Once this process is complete, the gridded GDAS output data can be used to start the Global Forecast System (GFS) model.
For purposes of this exercise, imagine that the accuracy of the weather forecasts has been slipping. In your role as project manager at the National Center for Environmental Information (NCEI), you have been assigned to lead a project reviewing the processing of the initial data and placing it into the GDAS.
Review Questions
1. NCEI is responsible for hosting and providing access to one of the most significant archives on Earth, with comprehensive oceanic, atmospheric, and geophysical data. Good database design would suggest that an enterprise data model exists for the NCEI. Why?
FIGURE 5.11 Tradeoff of cost versus accuracy The cost of performing data cleansing to achieve 100 percent database accuracy can be prohibitively expensive. [Chart: Cost (vertical axis, 0 to 120) plotted against Accuracy (horizontal axis, 0% to 100%).]
2. Define the domain of acceptable values for barometric pressure, humidity, and temperature.
Critical Thinking Questions
1. What issues could cause the raw weather data received to be incomplete or inaccurate?
2. How might incomplete or inaccurate data be identified and corrected or deleted from the forecasting process? Are there risks in such data cleansing?
Relational Database Management Systems (DBMSs)
Creating and implementing the right database system ensures that the database will support both business activities and goals. But how do we actually create, implement, use, and update a database? The answer is found in the database management system (DBMS). As discussed earlier, a DBMS is a group of programs used as an interface between a database and application programs or between a database and the user. Database management systems come in a wide variety of types and capabilities, ranging from small inexpensive software packages to sophisticated systems costing hundreds of thousands of dollars.
SQL Databases
SQL is a special-purpose programming language for accessing and manipulating data stored in a relational database. SQL was originally defined by Donald D. Chamberlin and Raymond Boyce of the IBM Research Center and described in their paper “SEQUEL: A Structured English Query Language,” published in 1974. Their work was based on the relational database model described by Edgar F. Codd in his groundbreaking paper from 1970, “A Relational Model of Data for Large Shared Data Banks.”
SQL databases conform to ACID properties (atomicity, consistency, isolation, durability), defined by Jim Gray soon after Codd’s work was published. These properties guarantee database transactions are processed reliably and ensure the integrity of data in the database. Basically, these principles mean that each transaction is atomic, that is, it is treated as a single unit of work that either completes in full or has no effect on the database. Each completed transaction leaves the data consistent across the database. The data is isolated from other transactions until the current transaction is finished, and it is durable in the sense that, once a transaction is committed, its data should never be lost.9
SQL databases rely upon concurrency control by locking database records to ensure that other transactions do not modify the database until the first transaction succeeds or fails. As a result, 100 percent ACID-compliant SQL databases can suffer from slow performance.
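The idea that a transaction either succeeds or fails as a whole can be illustrated with a short sketch. It assumes Python's built-in sqlite3 module; the Account table, the transfer function, and the balances are hypothetical and exist only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint rejects any update that would overdraw an account.
conn.execute(
    "CREATE TABLE Account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO Account VALUES (?, ?)",
                 [("checking", 100), ("savings", 50)])
conn.commit()

def transfer(conn, source, target, amount):
    """Move money as one transaction: both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE Account SET balance = balance + ? WHERE name = ?",
                         (amount, target))
            conn.execute("UPDATE Account SET balance = balance - ? WHERE name = ?",
                         (amount, source))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT name, balance FROM Account"))

ok = transfer(conn, "checking", "savings", 30)       # succeeds
failed = transfer(conn, "checking", "savings", 500)  # overdraws, so it is rolled back
print(ok, failed, balances(conn))  # True False {'checking': 70, 'savings': 80}
```

In the failed transfer the credit to savings is applied first and then undone when the debit violates the constraint, so no partial update is ever left in the table.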
In 1986, the American National Standards Institute (ANSI) adopted SQL as the standard query language for relational databases. Since ANSI’s acceptance of SQL, interest in making SQL an integral part of relational databases on both mainframe and personal computers has increased. SQL has many built-in functions, such as average (AVG), the largest value (MAX), and the smallest value (MIN). Table 5.1 contains examples of SQL commands.
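As a brief illustration of the built-in functions mentioned above, the following sketch runs AVG, MAX, and MIN against a small table. It assumes Python's built-in sqlite3 module; the client names and debt amounts are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Client (ClientName TEXT, Debt REAL)")
conn.executemany("INSERT INTO Client VALUES (?, ?)",
                 [("Alvarez", 500.0), ("Baker", 1500.0), ("Chen", 2500.0)])

# One query can compute several aggregate values over the same column.
avg_debt, max_debt, min_debt = conn.execute(
    "SELECT AVG(Debt), MAX(Debt), MIN(Debt) FROM Client"
).fetchone()

print(avg_debt, max_debt, min_debt)  # 1500.0 2500.0 500.0
```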
SQL: A special-purpose programming language for accessing and manipulating data stored in a relational database.
ACID properties: Properties (atomicity, consistency, isolation, durability) that guarantee relational database transactions are processed reliably and ensure the integrity of data in the database.
SQL allows programmers to learn one powerful query language and use it on systems ranging from PCs to the largest mainframe computers. See Figure 5.12. Programmers and database users also find SQL valuable because SQL statements can be embedded into many programming languages, such as the widely used C++ and Java. Because SQL uses standardized and simplified procedures for retrieving, storing, and manipulating data, many programmers find it easy to understand and use—hence, its popularity.
Database Activities
Databases are used to provide a user view of the database, to add and modify data, to store and retrieve data, and to manipulate the data and generate
FIGURE 5.12 Structured Query Language (SQL) SQL has become an integral part of most relational databases, as shown by this example from Microsoft Access 2013.
TABLE 5.1 Examples of SQL commands

SQL Command: SELECT ClientName, Debt FROM Client WHERE Debt > 1000
Description: This query displays clients (ClientName) and the amount they owe the company (Debt) from a database table called Client; the query would only display clients who owe the company more than $1,000 (WHERE Debt > 1000).

SQL Command: SELECT ClientName, ClientNum, OrderNum FROM Client, Order WHERE Client.ClientNum=Order.ClientNum
Description: This command is an example of a join command that combines data from two tables: the Client table and the Order table (FROM Client, Order). The command creates a new table with the client name, client number, and order number (SELECT ClientName, ClientNum, OrderNum). Both tables include the client number, which allows them to be joined. This ability is indicated in the WHERE clause, which states that the client number in the Client table is the same as (equal to) the client number in the Order table (WHERE Client.ClientNum=Order.ClientNum).

SQL Command: GRANT INSERT ON Client to Guthrie
Description: This command is an example of a security command. It allows Bob Guthrie to insert new values or rows into the Client table.
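The first two commands in Table 5.1 can be tried out with a short sketch. It assumes Python's built-in sqlite3 module and made-up sample rows. Two adjustments are assumptions of this sketch rather than part of the table: the second table is named Orders because ORDER is a reserved word in SQL, and ClientNum is qualified with a table name because the column appears in both tables. The GRANT command is not shown because SQLite does not implement user permissions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Client (ClientNum INTEGER PRIMARY KEY, ClientName TEXT, Debt REAL);
CREATE TABLE Orders (OrderNum INTEGER PRIMARY KEY,
                     ClientNum INTEGER REFERENCES Client(ClientNum));
INSERT INTO Client VALUES (1, 'Alvarez', 500.0), (2, 'Baker', 1500.0);
INSERT INTO Orders VALUES (101, 2), (102, 1), (103, 2);
""")

# Selection: only clients who owe more than $1,000.
large_debts = conn.execute(
    "SELECT ClientName, Debt FROM Client WHERE Debt > 1000"
).fetchall()

# Join: match rows from the two tables on the shared client number.
client_orders = conn.execute("""
    SELECT Client.ClientName, Client.ClientNum, Orders.OrderNum
    FROM Client, Orders
    WHERE Client.ClientNum = Orders.ClientNum
    ORDER BY Orders.OrderNum
""").fetchall()

print(large_debts)    # [('Baker', 1500.0)]
print(client_orders)  # [('Baker', 2, 101), ('Alvarez', 1, 102), ('Baker', 2, 103)]
```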
reports. Each of these activities is discussed in greater detail in the following sections.
Providing a User View
Because the DBMS is responsible for providing access to a database, one of the first steps in installing and using a large relational database involves “telling” the DBMS the logical and physical structure of the data and the relationships among the data for each user. This description is called a schema (as in a schematic diagram). In a relational database, the schema defines the tables, the fields in each table, and the relationships between fields and tables. Large database systems, such as Oracle, typically use schemas to define the tables and other database features associated with a person or user. The DBMS can reference a schema to find where to access the requested data in relation to another piece of data.
Creating and Modifying the Database
Schemas are entered into the DBMS (usually by database personnel) via a data definition language. A data definition language (DDL) is a collection of instructions and commands used to define and describe data and relationships in a specific database. A DDL allows the database’s creator to describe the data and relationships that are to be contained in the schema. In general, a DDL describes logical access paths and logical records in the database. Figure 5.13 shows a simplified example of a DDL used to develop a general schema. The use of the letter X in Figure 5.13 reveals where specific information concerning the database should be entered. File description, area description, record description, and set description are terms the DDL defines and uses in this example. Other terms and commands can also be used, depending on the DBMS employed.
Another important step in creating a database is to establish a data dictionary, a detailed description of all data used in the database. Among other things, the data dictionary contains the following information for each data item:
FIGURE 5.13 Data definition language (DDL) A data definition language (DDL) is used to define a schema.
SCHEMA DESCRIPTION
SCHEMA NAME IS XXXX
AUTHOR XXXX
DATE XXXX
FILE DESCRIPTION
    FILE NAME IS XXXX
    ASSIGN XXXX
    FILE NAME IS XXXX
    ASSIGN XXXX
AREA DESCRIPTION
    AREA NAME IS XXXX
RECORD DESCRIPTION
    RECORD NAME IS XXXX
    RECORD ID IS XXXX
    LOCATION MODE IS XXXX
    WITHIN XXXX AREA FROM XXXX THRU XXXX
SET DESCRIPTION
    SET NAME IS XXXX
    ORDER IS XXXX
    MODE IS XXXX
    MEMBER IS XXXX
    . . .
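Figure 5.13 uses a generic, older style of DDL. In SQL-based systems the same job is usually done with CREATE TABLE statements. The following is a minimal sketch, assuming Python's built-in sqlite3 module; the table and column names are hypothetical and loosely follow the Department and Project tables used earlier in the chapter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# DDL: define the tables, the fields in each table, and the relationship
# between them (every project belongs to an existing department).
conn.executescript("""
CREATE TABLE Department (
    dept_num   INTEGER PRIMARY KEY,
    dept_name  TEXT NOT NULL
);
CREATE TABLE Project (
    proj_num    INTEGER PRIMARY KEY,
    description TEXT NOT NULL,
    dept_num    INTEGER NOT NULL REFERENCES Department(dept_num)
);
""")

# The DBMS stores the schema and can report it back on request.
tables = [name for (name,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
columns = [row[1] for row in conn.execute("PRAGMA table_info(Project)")]

print(tables)   # ['Department', 'Project']
print(columns)  # ['proj_num', 'description', 'dept_num']
```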
schema: A description that defines the logical and physical structure of the database by identifying the tables, the fields in each table, and the relationships between fields and tables.
data definition language (DDL): A collection of instructions and commands used to define and describe data and relationships in a specific database.
data dictionary: A detailed description of all the data used in the database.
● Name of the data item
● Aliases or other names that may be used to describe the item
● Range of values that can be used
● Type of data (such as alphanumeric or numeric)
● Amount of storage needed for the item
● Notation of the person responsible for updating it and the various users who can access it
● List of reports that use the data item
A data dictionary can also include a description of data flows, information about the way records are organized, and the data-processing requirements. Figure 5.14 shows a typical data dictionary entry.
Following the example in Figure 5.14, the information in a data dictionary for the part number of an inventory item can include the following information:
● Name of the person who made the data dictionary entry (D. Bordwell)
● Date the entry was made (August 4, 2016)
● Name of the person who approved the entry (J. Edwards)
● Approval date (October 13, 2016)
● Version number (3.1)
● Number of pages used for the entry (1)
● Data element name (PARTNO)
● A description of the element (inventory part number)
● Other names that might be used (PTNO)
● Range of values (part numbers can range from 100 to 5000)
● Type of data (numeric)
● Storage required (four positions are required for the part number)
A data dictionary is a valuable tool for maintaining an efficient database that stores reliable information with no redundancy, and it simplifies the process of modifying the database when necessary. Data dictionaries also help computer and system programmers who require a detailed description of data elements stored in a database to create the code to access the data.
Adherence to the standards defined in the data dictionary also makes it easy to share data among various organizations. For example, the U.S. Department of Energy (DOE) developed a data dictionary of terms to provide a standardized approach for the evaluation of energy data. The Building Energy Data Exchange Specification (BEDES) provides a common language of key data elements, including data formats, valid ranges, and definitions that are
FIGURE 5.14 Data dictionary entry A data dictionary provides a detailed description of all data used in the database.
NORTHWESTERN MANUFACTURING
PREPARED BY: D. BORDWELL
DATE: 04 AUGUST 2016
APPROVED BY: J. EDWARDS
DATE: 13 OCTOBER 2016
VERSION: 3.1
PAGE: 1 OF 1
DATA ELEMENT NAME: PARTNO
DESCRIPTION: INVENTORY PART NUMBER
OTHER NAMES: PTNO
VALUE RANGE: 100 TO 5000
DATA TYPE: NUMERIC
POSITIONS: 4 POSITIONS OR COLUMNS
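One way to see how programmers might use an entry like the one in Figure 5.14 is to hold it as a small data structure and check candidate values against it. The sketch below is a hypothetical illustration in Python; the class name, field names, and is_valid method are assumptions of this example, while the PARTNO values come from the figure.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataDictionaryEntry:
    """A simplified data dictionary entry for one data element."""
    name: str
    description: str
    aliases: tuple
    min_value: int
    max_value: int
    data_type: type
    positions: int  # storage width in characters or columns

    def is_valid(self, value):
        """Check a candidate value against the type, range, and storage width."""
        return (isinstance(value, self.data_type)
                and self.min_value <= value <= self.max_value
                and len(str(value)) <= self.positions)

# The entry shown in Figure 5.14.
PARTNO = DataDictionaryEntry(
    name="PARTNO", description="Inventory part number", aliases=("PTNO",),
    min_value=100, max_value=5000, data_type=int, positions=4,
)

print(PARTNO.is_valid(250))    # True
print(PARTNO.is_valid(5001))   # False: outside the value range
print(PARTNO.is_valid("250"))  # False: not numeric
```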
designed to improve communications between contractors, software vendors, finance companies, utilities, and Public Utility Commissions. Adherence to these data standards allows information to be easily shared and aggregated without the need for extensive data scrubbing and translation. All stakeholders can use this standard set of data to answer key questions related to the energy savings and usage.10
Storing and Retrieving Data
One function of a DBMS is to be an interface between an application program and the database. When an application program needs data, it requests the data through the DBMS. Suppose that to calculate the total price of a new car, a pricing program needs price data on the engine option, for example, six cylinders instead of the standard four cylinders. The application program requests this data from the DBMS. In doing so, the application program follows a logical access path (LAP). Next, the DBMS, working with various system programs, accesses a storage device, such as a disk drive or solid state storage device (SSD), where the data is stored. When the DBMS goes to this storage device to retrieve the data, it follows a path to the physical location (the physical access path) where the price of this option is stored. In the pricing example, the DBMS might go to a disk drive to retrieve the price data for six-cylinder engines. This relationship is shown in Figure 5.15.
This same process is used if a user wants to get information from the database. First, the user requests the data from the DBMS. For example, a user might give a command, such as LIST ALL OPTIONS FOR WHICH PRICE IS GREATER THAN $200. This is the logical access path. Then, the DBMS might go to the options price section of a disk to get the information for the user. This is the physical access path.
Two or more people or programs attempting to access the same record at the same time can cause a problem. For example, an inventory control program might attempt to reduce the inventory level for a product by 10 units because 10 units were just shipped to a customer. At the same time, a purchasing program might attempt to increase the inventory level for the same product by 200 units because inventory was just received. Without proper database control, one of the inventory updates might be incorrect, resulting in an inaccurate inventory level for the product. Concurrency control can be used to avoid this potential problem. One approach is to lock out all other
FIGURE 5.15 Logical and physical access paths When an application requests data from the DBMS, it follows a logical access path to the data. When the DBMS retrieves the data, it follows a physical access path to the data. [Diagram: Management inquiries, other software, and application programs reach the DBMS through the logical access path (LAP); the DBMS reaches the data on the storage device through the physical access path (PAP).]
concurrency control: A method of dealing with a situation in which two or more users or applications need to access the same record at the same time.
application programs from access to a record if the record is being updated or used by another program.
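The locking approach can be sketched with two threads standing in for the inventory control and purchasing programs. This is a simplified illustration using Python's threading module rather than a real DBMS; the InventoryRecord class and the starting level of 500 units are assumptions made for the example.

```python
import threading

class InventoryRecord:
    """A single inventory record guarded by a lock."""

    def __init__(self, units):
        self.units = units
        self._lock = threading.Lock()

    def adjust(self, delta):
        # Only one program at a time may read and then update the record,
        # so neither update can overwrite the other.
        with self._lock:
            current = self.units
            self.units = current + delta

record = InventoryRecord(units=500)

# Shipping removes 10 units while receiving adds 200 units.
shipping = threading.Thread(target=record.adjust, args=(-10,))
receiving = threading.Thread(target=record.adjust, args=(200,))
shipping.start()
receiving.start()
shipping.join()
receiving.join()

print(record.units)  # 690
```

Without the lock, both programs could read the level of 500 before either writes, and the final level would reflect only one of the two updates.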
Manipulating Data and Generating Reports
After a DBMS has been installed, employees, managers, and other authorized users can use it to review reports and obtain important information. Using a DBMS, a company can manage this requirement. Some databases use Query by Example (QBE), which is a visual approach to developing database queries or requests. With QBE, you can perform queries and other database tasks by opening windows and clicking the data or features you want, similar to the way you work with Windows and other GUI (graphical user interface) operating systems and applications. See Figure 5.16.
In other cases, database commands can be used in a programming language. For example, C++ commands can be used in simple programs that will access or manipulate certain pieces of data in the database. Here’s another example of a DBMS query:
SELECT * FROM EMPLOYEE WHERE JOB_CLASSIFICATION = 'C2';
The asterisk (*) tells the program to include all columns from the EMPLOYEE table. In general, the commands that are used to manipulate the database are part of the data manipulation language (DML). This specific language, provided with the DBMS, allows managers and other database users to access and modify the data, to make queries, and to generate reports. Again, the application programs go through schemas and the DBMS before getting to the data stored on a device such as a disk.
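A data manipulation language covers both queries and changes to the data. The sketch below runs the EMPLOYEE query and then modifies and deletes rows. It assumes Python's built-in sqlite3 module; the EMP_ID and LAST_NAME columns and the sample employees are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (EMP_ID INTEGER PRIMARY KEY, "
    "LAST_NAME TEXT, JOB_CLASSIFICATION TEXT)"
)
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
                 [(1, "Nguyen", "C2"), (2, "Okafor", "B1"), (3, "Silva", "C2")])

# Query: all columns for employees in job classification C2.
c2_employees = conn.execute(
    "SELECT * FROM EMPLOYEE WHERE JOB_CLASSIFICATION = ? ORDER BY EMP_ID",
    ("C2",),
).fetchall()

# Modify: reclassify one employee and remove another.
conn.execute("UPDATE EMPLOYEE SET JOB_CLASSIFICATION = ? WHERE EMP_ID = ?", ("C3", 3))
conn.execute("DELETE FROM EMPLOYEE WHERE EMP_ID = ?", (2,))
conn.commit()

remaining = conn.execute(
    "SELECT EMP_ID, JOB_CLASSIFICATION FROM EMPLOYEE ORDER BY EMP_ID"
).fetchall()

print(c2_employees)  # [(1, 'Nguyen', 'C2'), (3, 'Silva', 'C2')]
print(remaining)     # [(1, 'C2'), (3, 'C3')]
```

Passing values through the ? placeholders instead of pasting them into the SQL text is a common precaution when the values come from user input.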
After a database has been set up and loaded with data, it can produce desired reports, documents, and other outputs. See Figure 5.17. These outputs usually appear in screen displays or on hard copy printouts. The output-control features of a database program allow a user to select the records and fields that will appear in a report. Formatting controls and organization options (such as report headings) help users customize reports and create flexible, convenient, and powerful information-handling tools.
FIGURE 5.16 Query by Example Some databases use Query by Example (QBE) to generate reports and information.
data manipulation language (DML): A specific language, provided with a DBMS, which allows users to access and modify the data, to make queries, and to generate reports.
A DBMS can produce a wide variety of documents, reports, and other output that can help organizations make decisions and achieve their goals. Often, organizations have standard reports that are run on a regular basis. The most common reports select and organize data to present summary information about some aspect of company operations. For example, accounting reports often summarize financial data such as current and past due accounts. Many companies base their routine operating decisions on regular status reports that show the progress of specific orders toward completion and delivery.
Database Administration
Database administrators (DBAs) are skilled and trained IS professionals who hold discussions with business users to define their data needs; apply database programming languages to craft a set of databases to meet those needs; test and evaluate databases; implement changes to improve their performance; and assure that data is secure from unauthorized access. Database systems require a skilled database administrator (DBA), who must have a clear understanding of the fundamental business of the organization, be proficient in the use of selected database management systems, and stay abreast of emerging technologies and new design approaches. The role of the DBA is to plan, design, create, operate, secure, monitor, and maintain databases. Typically, a DBA has a degree in computer science or management information systems and some on-the-job training with a particular database product or more extensive experience with a range of database products. See Figure 5.18.
FIGURE 5.17 Database output A database application offers sophisticated formatting and organization options to produce the right information in the right format.
FIGURE 5.18 Database administrator The role of the database administrator (DBA) is to plan, design, create, operate, secure, monitor, and maintain databases.
database administrators (DBAs): Skilled and trained IS professionals who hold discussions with business users to define their data needs; apply database programming languages to craft a set of databases to meet those needs; test and evaluate databases; implement changes to improve the performance of databases; and assure that data is secure from unauthorized access.
Clerkenwell_Images/istockphoto.com
The DBA works with users to decide the content of the database—to determine exactly what entities are of interest and what attributes are to be recorded about those entities. Thus, not only is it important that a DBA understand the business of an organization, but personnel outside of IS must also have some idea of what the DBA does and why this function is important. The DBA can play a crucial role in the development of effective information systems to benefit the organization, employees, and managers.
The DBA also works with programmers as they build applications to ensure that their programs comply with database management system standards and conventions. After the database has been built and is operating, the DBA monitors operations logs for security violations. Database performance is also monitored to ensure that the system’s response time meets users’ needs and that it operates efficiently. If there is a problem, the DBA attempts to correct it before it becomes serious.
An important responsibility of a DBA is to protect the database from attack or other forms of failure. DBAs use security software, preventive measures, and redundant systems to keep data safe and accessible. In spite of the best efforts of DBAs, database security breaches are all too common. For example, records of more than 83 million customers of JPMorgan Chase were stolen between June 2014 and August 2014. This represents the largest theft of consumer data from a U.S. financial institution in history.11
Some organizations have also created a position called the data administrator, an individual responsible for defining and implementing consistent principles for a variety of data issues, including setting data standards and data definitions that apply across all the databases in an organization. For example, the data administrator would ensure that a term such as “customer” is defined and treated consistently in all corporate databases. The data administrator also works with business managers to identify who should have read or update access to certain databases and to selected attributes within those databases. This information is then communicated to the database administrator for implementation. The data administrator can be a high-level position reporting to top-level managers.
Popular Database Management Systems
Many popular database management systems address a wide range of individual, workgroup, and enterprise needs as shown in Table 5.2. The complete
TABLE 5.2 Popular database management systems
Open-Source Relational DBMS   Relational DBMS for Individuals and Workgroups   Relational DBMS for Workgroups and Enterprise
MySQL                         Microsoft Access                                 Oracle
PostgreSQL                    IBM Lotus Approach                               IBM DB2
MariaDB                       Google Base                                      Sybase Adaptive Server
SQLite                        OpenOffice Base                                  Teradata
CouchDB                                                                        Microsoft SQL Server
                                                                               Progress OpenEdge
data administrator: An individual responsible for defining and implementing consistent principles for a variety of data issues.
DBMS market encompasses software used by people ranging from nontechnical individuals to highly trained, professional programmers and runs on all types of computers from tablets to supercomputers. The entire market generates billions of dollars per year in revenue for companies such as IBM, Oracle, and Microsoft.
Selecting a DBMS begins by analyzing the information needs of the organization. Important characteristics of databases include the size of the database, the number of concurrent users, database performance, the ability of the DBMS to be integrated with other systems, the features of the DBMS, the vendor considerations, and the cost of the database management system.
CouchDB by Couchbase is an open-source database system used by Zynga, the developer of the popular Internet game FarmVille, to process 250 million visitors a month.
With database as a service (DaaS), the database is stored on a service provider’s servers and accessed by the service subscriber over the Internet, with the database administration handled by the service provider. More than a dozen companies are now offering DaaS services, including Amazon, Database.com, Google, Heroku, IBM, Intuit, Microsoft, MyOwnDB, Oracle, and Trackvia. Amazon Relational Database Service (Amazon RDS) is a DaaS that enables organizations to set up and operate their choice of a MySQL, Microsoft SQL, Oracle, or PostgreSQL relational database in the cloud. The service automatically backs up the database and stores those backups based on a user-defined retention period.
TinyCo is a mobile gaming firm whose games Tiny Monsters, Tiny Village, and Tiny Zoo Friends can be found at the Amazon, Google Play, and iTunes app stores.12 The company employs Amazon Web Services (AWS) to enable it to support the rapid growth in the number of its users without having to devote constant time and effort to organize and configure its information sys- tems infrastructure. This arrangement has allowed the company to focus its resources on developing and marketing its new games. TinyCo application data is stored in the Amazon Relational Database Service (Amazon RDS) for MySQL.13
Using Databases with Other Software
Database management systems are often used with other software or to interact with users over the Internet. A DBMS can act as a front-end application or a back-end application. A front-end application is one that people interact with directly. Marketing researchers often use a database as a front end to a statistical analysis program. The researchers enter the results of market questionnaires or surveys into a database. The data is then transferred to a statistical analysis program to perform analysis, such as determining the potential for a new product or the effectiveness of an advertising campaign. A back-end application interacts with other programs or applications; it only indirectly interacts with people or users. When people request information from a Web site, the site can interact with a database (the back end) that supplies the desired information. For example, you can connect to a university Web site to find out whether the university’s library has a book you want to read. The site then interacts with a database that contains a catalog of library books and articles to determine whether the book you want is available. See Figure 5.19.
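The library lookup described above can be sketched in a few lines. This is only an illustration, not how any particular university site works: it uses Python's built-in sqlite3 module as a stand-in for the back-end database, and the table name catalog, its columns, and the function is_available are hypothetical.

```python
import sqlite3

# Back end: an in-memory database standing in for the library catalog.
# Table and column names are assumptions made for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE catalog (title TEXT PRIMARY KEY, copies_available INTEGER)"
)
conn.executemany(
    "INSERT INTO catalog VALUES (?, ?)",
    [("Database Systems", 2), ("Big Data Basics", 0)],
)

def is_available(title: str) -> bool:
    """Return True if the catalog lists at least one available copy."""
    row = conn.execute(
        "SELECT copies_available FROM catalog WHERE title = ?", (title,)
    ).fetchone()
    return row is not None and row[0] > 0

# The Web site would call the back end on the user's behalf:
print(is_available("Database Systems"))  # True
print(is_available("Big Data Basics"))   # False: listed, but no copies free
print(is_available("Unknown Title"))     # False: not in the catalog
```

The user never touches the database directly; the site passes the request to the back end and presents the answer.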
database as a service (DaaS): An arrangement where the database is stored on a service provider’s servers and accessed by the service subscriber over a network, typically the Internet, with the database administration handled by the service provider.
Critical Thinking Exercise
Database as a Service
You are the database administrator for the customer database of a medium-sized manufacturing firm. The database runs on an Oracle database management system installed on a server owned and managed by your firm’s small IT organization. Recently you have been receiving a number of complaints from users of the database about extremely slow response time to their queries and report requests. Management has asked you to prepare a set of proposed solutions.
Review Questions
1. What advantages might be gained from moving to a database as a service environment?
2. Can you think of any possible disadvantages to this approach?
Critical Thinking Questions
1. What additional questions need to be answered before you can decide if the database as a service approach is right for your firm?
2. How might such a move affect you and your role?
Big Data
Big data is the term used to describe data collections that are so enormous (terabytes or more) and complex (from sensor data to social media data) that traditional data management software, hardware, and analysis processes are incapable of dealing with them.
FIGURE 5.19 Library of Congress Web site The Library of Congress (LOC) provides a back-end application that allows Web access to its databases, which include references to books and digital media in the LOC collection.
Source: www.loc.gov
Characteristics of Big Data
Computer technology analyst Doug Laney associated the three characteristics of volume, velocity, and variety with big data14:
● Volume. In 2014, it was estimated that the volume of data that exists in the digital universe was 4.4 zettabytes (one zettabyte equals one trillion gigabytes). The digital universe is expected to grow to an amazing 44 zettabytes by 2020, with perhaps one-third of that data being of value to organizations.15
● Velocity. The velocity at which data is currently coming at us exceeds 5 trillion bits per second.16 This rate is accelerating rapidly, and the volume of digital data is expected to double every two years between now and 2020.17
● Variety. Data today comes in a variety of formats. Some of the data is what computer scientists call structured data: its format is known in advance, and it fits nicely into traditional databases. For example, the data generated by the well-defined business transactions that are used to update many corporate databases containing customer, product, inventory, financial, and employee data is generally structured data. However, most of the data that an organization must deal with is unstructured data, meaning that it is not organized in any predefined manner.18 Unstructured data comes from sources such as word-processing documents, social media, email, photos, surveillance video, and phone messages.
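The volume figures above imply a growth rate that can be checked with a little arithmetic. The sketch below assumes smooth exponential growth between the two endpoints given in the text (4.4 zettabytes in 2014 and 44 zettabytes in 2020); that smoothness is an assumption, since the text gives only the endpoints.

```python
import math

start_zb, end_zb = 4.4, 44.0   # zettabytes in 2014 and 2020 (from the text)
years = 2020 - 2014            # 6 years

growth_factor = end_zb / start_zb        # tenfold growth overall
doublings = math.log2(growth_factor)     # number of doublings that requires
doubling_time = years / doublings        # years per doubling

print(f"growth factor: {growth_factor:.1f}x")
print(f"doublings needed: {doublings:.2f}")
print(f"implied doubling time: {doubling_time:.2f} years")
```

Under that assumption the doubling time comes out a little under two years, which is roughly in line with the estimate that the volume of digital data doubles every two years.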
Sources of Big Data
Organizations collect and use data from a variety of sources, including business applications, social media, sensors and controllers that are part of the manufacturing process, systems that manage the physical environment in factories and offices, media sources (including audio and video broadcasts), machine logs that record events and customer call data, public sources (such as government Web sites), and archives of historical records of transactions and communications. See Figure 5.20. Much of this collected data is unstructured and does not fit neatly into traditional relational database management systems. Table 5.3 provides a starter list of some of the many Web portals that provide access to free sources of useful big data sets.
FIGURE 5.20 Sources of an organization’s useful data. An organization has many sources of useful data. Sources shown: media (images, audio, video, live data feeds, podcasts); sensor data (process control devices, smart electric meters, packing line counters); social media (Twitter, Facebook, LinkedIn, Pinterest); data from business apps (ERP, CRM, PLM, HR); documents (email, PowerPoint, Word, Excel, PDF, HTML); archives (historical records of communications and transactions); public data (local, state, and federal government Web sites); machine log data (call detail data, event logs, business process logs, application logs).
TABLE 5.3 Portals that provide access to free sources of useful big data
Data Source | Description | URL
Amazon Web Services (AWS) public data sets | Portal to a huge repository of public data, including climate data, the million song data set, and data from the 1000 Genomes project | http://aws.amazon.com/datasets
Bureau of Labor Statistics (BLS) | Provides access to data on inflation and prices, wages and benefits, employment, spending and time use, productivity, and workplace injuries | www.bls.gov
CIA World Factbook | Portal to information on the economy, government, history, infrastructure, military, and population of 267 countries | https://cia.gov/library/publications/the-world-factbook
Data.gov | Portal providing access to over 186,000 government data sets, related to topics such as agriculture, education, health, and public safety | http://data.gov
Facebook Graph | Provides a means to query Facebook profile data not classified as private | https://developers.facebook.com/docs/graph-api
FBI Uniform Crime Reports | Portal to data on Crime in the United States, Law Enforcement Officers Killed and Assaulted, and Hate Crime Statistics | https://www.fbi.gov/about-us/cjis/ucr/ucr/
Justia Federal District Court Opinions and Orders database | A free searchable database of full-text opinions and orders from civil cases heard in U.S. Federal District Courts | http://law.justia.com/cases/federal/district-courts/
Gapminder | Portal to data from the World Health Organization and World Bank on economic, medical, and social issues | www.gapminder.org/data
Google Finance | Portal to 40 years of stock market data | http://google.com/finance
Healthdata.gov | Portal to 125 years of U.S. healthcare data, including national healthcare expenditures, claim-level Medicare data, and data related to healthcare quality, epidemiology, and population, among many other topics | www.healthdata.gov
National Centers for Environmental Information | Portal for accessing a variety of climate and weather data sets | www.ncdc.noaa.gov/data-access/quick-links#loc-clim
New York Times | Portal that provides users with access to NYT articles, book and movie reviews, data on political campaign contributions, and other material | http://developer.nytimes.com/docs
Social Institutions and Gender Index | Provides access to country profiles and data that measures the degree of cross-country discrimination against women in social institutions | http://genderindex.org
U.S. Census Bureau | Portal to a huge variety of government statistics and data relating to the U.S. economy and its population | www.census.gov/data.html
Big Data Uses
Here are just a few examples of how organizations are employing big data to improve their day-to-day operations, planning, and decision making:
● Retail organizations monitor social networks such as Facebook, Google, LinkedIn, Twitter, and Yahoo to engage brand advocates, identify brand adversaries (and attempt to reverse their negative opinions), and even enable passionate customers to sell their products.
● Advertising and marketing agencies track comments on social media to understand consumers’ responsiveness to ads, campaigns, and promotions.
● Hospitals analyze medical data and patient records to try to identify patients likely to need readmission within a few months of discharge, with the goal of engaging with those patients in the hope of preventing another expensive hospital stay.
● Consumer product companies monitor social networks to gain insight into customer behavior, likes and dislikes, and product perception to identify necessary changes to their products, services, and advertising.
● Financial services organizations use data from customer interactions to identify customers who are likely to be attracted to increasingly targeted and sophisticated offers.
● Manufacturers analyze minute vibration data from their equipment, which changes slightly as it wears down, to predict the optimal time to perform maintenance or replace the equipment to avoid expensive repairs or potentially catastrophic failure.
Challenges of Big Data
Individuals, organizations, and society in general must find a way to deal with this ever-growing data tsunami to escape the risks of information overload. The challenge is manifold, with a variety of questions that must be answered, including how to choose what subset of data to store, where and how to store the data, how to find those nuggets of data that are relevant to the decision making at hand, how to derive value from the relevant data, and how to identify which data needs to be protected from unauthorized access. With so much data available, business users can have a hard time finding the information they need to make decisions, and they may not trust the validity of the data they can access.
Trying to deal with all this data from so many different sources, much of it from outside the organization, can also increase the risk that the organization fails to comply with government regulations or internal controls (see Table 5.4). If measures to ensure compliance are not defined and followed, compliance issues can arise. Violation of these regulations can lead not only to government investigations but also to dramatic drops in stock prices, as when computer chipmaker Marvell Technologies alarmed investors by announcing that it had found problems with the way it booked revenue, resulting in a 16 percent drop in its stock price in just one day.19
TABLE 5.4 Partial list of rules, regulations, and standards with which U.S. information system organizations must comply
Rule, Regulation, or Standard | Intent
Bank Secrecy Act | Detects and prevents money laundering by requiring financial institutions to report certain transactions to government agencies and to withhold from clients that such reports were filed about them
Basel II Accord | Creates international standards that strengthen global capital and liquidity rules, with the goal of promoting a more resilient banking sector worldwide
California Senate Bill 1386 | Protects against identity theft by imposing disclosure requirements for businesses and government agencies that experience security breaches that might put the personal information of California residents at risk; the first of many state laws aimed at protecting consumers from identity theft
European Union Data Protection Directive | Protects the privacy of European Union citizens’ personal information by placing limitations on sending such data outside of the European Union to areas that are deemed to have less than adequate standards for data security
Foreign Account Tax Compliance Act | Identifies U.S. taxpayers who hold financial assets in non-U.S. financial institutions and offshore accounts, to ensure that they do not avoid their U.S. tax obligations
Optimists believe that we can conquer these challenges and that more data will lead to more accurate analyses and better decision making, which in turn will result in deliberate actions that improve matters.
Not everyone, however, is happy with big data applications. Some people have privacy concerns about the fact that corporations are harvesting huge amounts of personal data that can be shared with other organizations. With all this data, organizations can develop extensive profiles of people without their knowledge or consent. Big data also introduces security concerns. Are organizations able to keep big data secure from competitors and malicious hackers? Some experts believe companies that collect and store big data could be open to liability suits from individuals and organizations. Even with these potential disadvantages, many companies are rushing into big data due to the lure of a potential treasure trove of information and new applications.
Data Management
Data management is an integrated set of functions that defines the processes by which data is obtained, certified fit for use, stored, secured, and processed in such a way as to ensure that the accessibility, reliability, and timeliness of the data meet the needs of the data users within an organization. The Data Management Association (DAMA) International is a nonprofit, vendor-independent, international association whose members promote the understanding, development, and practice of managing data as an essential enterprise asset. This organization has identified 10 major functions of data management, as shown in Figure 5.21. Data governance is the core component of data management; it defines the roles, responsibilities, and processes for ensuring that data can be trusted and used by the entire organization, with people identified and in place who are responsible for fixing and preventing issues with data.
TABLE 5.4 Partial list of rules, regulations, and standards with which U.S. information system organizations must comply (continued)
Rule, Regulation, or Standard | Intent
Foreign Corrupt Practices Act | Prevents certain classes of persons and entities from making payments to foreign government officials in an attempt to obtain or retain business
Gramm-Leach-Bliley Act | Protects the privacy and security of individually identifiable financial information collected and processed by financial institutions
Health Insurance Portability and Accountability Act (HIPAA) | Safeguards protected health information (PHI) and electronic PHI (ePHI) data gathered in the healthcare process and standardizes certain electronic transactions within the healthcare industry
Payment Card Industry (PCI) Data Security Standard | Protects cardholder data and ensures that merchants and service providers maintain strict information security standards
Personal Information Protection and Electronic Documents Act (Canada) | Governs the collection, use, and disclosure of personally identifiable information in the course of commercial transactions; created in response to European Union data protection directives
Sarbanes-Oxley Act | Protects the interests of investors and consumers by requiring that the annual reports of public companies include an evaluation of the effectiveness of internal control over financial reporting; requires that the company’s CEO and CFO attest to and report on this assessment
USA PATRIOT Act | This wide-ranging act has many facets; one portion of the Act relating to information system compliance is called the Financial Anti-Terrorism Act and is designed to combat the financing of terrorism through money laundering and other financial crimes
data management: An integrated set of functions that defines the processes by which data is obtained, certified fit for use, stored, secured, and processed in such a way as to ensure that the accessibility, reliability, and timeliness of the data meet the needs of the data users within an organization.
data governance: The core component of data management; it defines the roles, responsibilities, and processes for ensuring that data can be trusted and used by the entire organization, with people identified and in place who are responsible for fixing and preventing issues with data.
The need for data management is driven by a variety of factors, including the need to meet external regulations designed to manage risk associated with financial misstatement, the need to avoid the inadvertent release of sensitive data, or the need to ensure that high data quality is available for key decisions. Haphazard or incomplete business processes and controls simply will not meet these requirements. Formal management processes are needed to govern data.
Effective data governance requires business leadership and active participation; it cannot be an effort that is led by the information system organization. The use of a cross-functional team is recommended because data and information systems are used by many different departments. No one individual has a complete view of the organization’s data needs. Employment of a cross-functional team is particularly important for ensuring that compliance needs are met. The data governance team should be cross-functional and multilevel, consisting of executives, project managers, line-of-business managers, and data stewards. The data steward is an individual responsible for the management of critical data elements, including identifying and acquiring new data sources; creating and maintaining consistent reference data and master data definitions; and analyzing data for quality and reconciling data issues. Data users consult with a data steward when they need to know what data to use to answer a business question, or to confirm the accuracy, completeness, or soundness of data within a business context.
The data governance team defines the owners of the data assets in the enterprise. The team also develops a policy that specifies who is accountable for various portions or aspects of the data, including its accuracy, accessibility, consistency, completeness, updating, and archiving. The team defines processes for how the data is to be stored, archived, backed up, and protected from cyberattacks, inadvertent destruction or disclosure, or theft. It also develops standards and procedures that define who is authorized to update, access, and use the data. The team also puts in place a set of controls and audit procedures to ensure ongoing compliance with organizational data policies and government regulations.
FIGURE 5.21 Data management. The Data Management Association (DAMA) International has identified 10 basic functions associated with data management: data governance; data architecture management; data development; database operations management; data security management; reference and master data management; data warehousing and business intelligence management; document and content management; metadata management; and data quality management. Source: “Body of Knowledge,” DAMA International, https://www.dama.org/content/body-knowledge. Copyright DAMA International.
data steward: An individual responsible for the management of critical data elements, including identifying and acquiring new data sources; creating and maintaining consistent reference data and master data definitions; and analyzing data for quality and reconciling data issues.
Data lifecycle management (DLM) is a policy-based approach to managing the flow of an enterprise’s data, from its initial acquisition or creation and storage to the time when it becomes outdated and is deleted. See Figure 5.22. Several vendors offer software products to support DLM, such as the IBM Information Lifecycle Governance suite of software products.
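To make "policy-based" concrete, the sketch below applies a simple age-based rule to decide what should happen to a record. It is an illustration only: the thresholds, the action names, and the function lifecycle_action are hypothetical and do not come from any regulation or vendor product.

```python
from datetime import date

# Hypothetical policy thresholds, chosen only for illustration.
ARCHIVE_AFTER_DAYS = 365       # move to cheaper storage after one year
DELETE_AFTER_DAYS = 7 * 365    # delete after roughly seven years

def lifecycle_action(created: date, today: date) -> str:
    """Return the action the policy prescribes for a record of this age."""
    age_days = (today - created).days
    if age_days >= DELETE_AFTER_DAYS:
        return "delete"
    if age_days >= ARCHIVE_AFTER_DAYS:
        return "archive"
    return "keep"

today = date(2024, 1, 1)
for created in (date(2023, 6, 1), date(2021, 1, 1), date(2015, 1, 1)):
    print(created, "->", lifecycle_action(created, today))
```

Because the rule lives in one place, changing the policy means changing two constants rather than hunting through every system that stores data.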
FIGURE 5.22 The big data life cycle. A policy-based approach to managing the flow of an enterprise’s data, from its initial acquisition or creation and storage to the time when it becomes outdated and is deleted. Stages shown: define data needs; evaluate alternate sources; acquire data; store data; publish data descriptions; access and use; evaluate; archive or discard. Image credit: Abert/Shutterstock.com
data lifecycle management (DLM): A policy-based approach to managing the flow of an enterprise’s data, from its initial acquisition or creation and storage to the time when it becomes outdated and is deleted.
Critical Thinking Exercise
Walgreens Data Assimilation
As of this writing, Walgreens is seeking to acquire Rite Aid in a deal that would combine the nation’s second- and third-largest drugstore chains by market share, behind only fierce rival CVS Health. If this acquisition is approved, Rite Aid customer data will need to be assimilated into Walgreens’ information systems. For pharmacy customers, this includes sensitive information, such as personal data, details of medications prescribed, health insurance identification codes, and doctors used. Walgreens will need this data to provide smooth and uninterrupted service to the old Rite Aid customers. In addition, Walgreens has in place a system that automatically checks each new medication prescribed against other medications the customer is taking to ensure there will be no adverse drug interactions. The data must be captured in such a way that ensures its accuracy and completeness.
Review Questions
1. Identify specific federal regulations that apply to the use and management of Walgreens and Rite Aid data.
2. Would it make sense for Walgreens to appoint a data governance team to oversee the Rite Aid data assimilation process? What might the responsibilities of such a team be?
Critical Thinking Questions
1. Do you think that Walgreens should attempt to automate the process of assimilating Rite Aid customer, insurance, and medication data into its systems? Or, should Walgreens design an efficient manual process for former Rite Aid customers to provide the necessary data prior to or on their initial visit to a Walgreens pharmacy? What are the pros and cons of each approach? Which approach would you recommend?
2. Identify several potential negative consequences resulting from poor execution of the data assimilation process.
Technologies Used to Process Big Data
Data Warehouses, Data Marts, and Data Lakes
The raw data necessary to make sound business decisions is typically stored in a variety of locations and formats. This data is initially captured, stored, and managed by transaction-processing systems that are designed to support the day-to-day operations of an organization. For decades, organizations have collected operational, sales, and financial data with their online transaction processing (OLTP) systems. These OLTP systems put data into databases very quickly, reliably, and efficiently, but they do not support the types of big data analysis that today’s businesses and organizations require. Through the use of data warehouses and data marts, organizations are now able to access the data gathered via OLTP systems and use it more effectively to support decision making.
Data Warehouses
A data warehouse is a database that holds business information from many sources in the enterprise, covering all aspects of the company’s processes, products, and customers. Data warehouses allow managers to “drill down” to get greater detail or “roll up” to generate aggregate or summary reports. The primary purpose is to relate information in innovative ways and help managers and executives make better decisions. A data warehouse stores historical data that has been extracted from operational systems and external data sources. See Figure 5.23.
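The terms "roll up" and "drill down" can be illustrated with a toy data set. The sketch below is a simplification using plain Python rather than a real warehouse query tool; the sales rows and the function names roll_up and drill_down are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts: (region, store, amount)
sales = [
    ("Midwest", "Store 1", 120.0),
    ("Midwest", "Store 2", 80.0),
    ("South", "Store 3", 200.0),
    ("Midwest", "Store 1", 50.0),
]

def roll_up(rows):
    """Aggregate detail rows into one total per region (summary view)."""
    totals = defaultdict(float)
    for region, _store, amount in rows:
        totals[region] += amount
    return dict(totals)

def drill_down(rows, region):
    """Break one region's total back down into per-store totals (detail view)."""
    totals = defaultdict(float)
    for row_region, store, amount in rows:
        if row_region == region:
            totals[store] += amount
    return dict(totals)

print(roll_up(sales))                # {'Midwest': 250.0, 'South': 200.0}
print(drill_down(sales, "Midwest"))  # {'Store 1': 170.0, 'Store 2': 80.0}
```

Both views come from the same stored detail; rolling up hides it behind a summary, and drilling down exposes it again for one slice.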
Companies use data warehouses in a variety of ways, as shown in the following examples:
● Walmart operates separate data warehouses for Walmart and Sam’s Club and allows suppliers access to almost any data they could possibly need to determine which of their products are selling, how fast, and even whether they should redesign their packaging to fit more product on store shelves.20
● UPS manages a 16-petabyte data warehouse containing data on some 16.3 million packages it ships per day for 8.8 million customers, who make an average of 39.5 million tracking requests per day.21
● Orscheln (a billion-dollar retailer that sells farm- and home-related products through its roughly 150 stores spread across the Midwest) implemented an Oracle data warehouse that is used by merchants, buyers, planners, and store managers to perform analysis on inventory management, sales performance, pricing and promotions effectiveness, vendor compliance, and loss prevention.22
● General Electric uses a data warehouse to hold data from sensors on the performance of the blades on jet engines it manufactures.23
Because data warehouses are used for decision making, maintaining a high quality of data is vital so that organizations avoid wrong conclusions. For instance, duplicated or missing information will produce incorrect or misleading statistics (“garbage in, garbage out”). Due to the wide range of possible data inconsistencies and the sheer data volume, data quality is considered one of the biggest issues in data warehousing.
Data warehouses are continuously refreshed with huge amounts of data from a variety of sources so the probability that some of the sources contain “dirty data” is high. The ETL (extract, transform, load) process takes data from a variety of sources, edits and transforms it into the format used in the data warehouse, and then loads this data into the warehouse, as shown in Figure 5.23. This process is essential in ensuring the quality of the data in the data warehouse.
FIGURE 5.23 Elements of a data warehouse. A data warehouse can help managers and executives relate information in innovative ways to make better decisions. Elements shown: source data (flat files, spreadsheets, relational databases), data extraction process, transform and load process, data warehouse, query and analysis tools, and end-user access.
● Extract. Source data for the data warehouse comes from many sources and may be represented in a variety of formats. The goal of this process is to extract the source data from all the various sources and convert it into a single format suitable for processing. During the extract step, data that fails to meet expected patterns or values may be rejected from further processing (e.g., blank or nonnumeric data in net sales field or a product code outside the defined range of valid codes).
● Transform. During this stage of the ETL process, a series of rules or algorithms are applied to the extracted data to derive the data that will be stored in the data warehouse. A common type of transformation is to convert a customer’s street address, city, state, and zip code to an organization-assigned sales district or government census tract. Also, data is often aggregated to reduce the processing time required to create anticipated reports. For example, total sales may be accumulated by store or sales district.
● Load. During this stage of the ETL process, the extracted and transformed data is loaded into the data warehouse. As the data is being loaded into the data warehouse, new indices are created and the data is checked against the constraints defined in the database schema to ensure its quality. As a result, the data load stage for a large data warehouse can take days.
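The three ETL stages can be sketched end to end on a tiny data set. This is a minimal illustration under stated assumptions, not a production ETL tool: the source rows, the zip-code-to-district mapping, and the table name district_sales are all hypothetical, and Python's built-in sqlite3 module stands in for the data warehouse.

```python
import sqlite3

# Hypothetical source rows: (store, zip_code, net_sales), all as raw strings.
source_rows = [
    ("S1", "63101", "100.50"),
    ("S2", "63101", "abc"),    # nonnumeric net sales: rejected during extract
    ("S3", "64105", "40.00"),
    ("S4", "63101", ""),       # blank net sales: rejected during extract
    ("S5", "63101", "9.50"),
]
# Hypothetical mapping from zip code to an organization-assigned sales district.
ZIP_TO_DISTRICT = {"63101": "St. Louis", "64105": "Kansas City"}

def extract(rows):
    """Convert rows to a single format; reject rows with unusable net sales."""
    clean = []
    for store, zip_code, net_sales in rows:
        try:
            clean.append((store, zip_code, float(net_sales)))
        except ValueError:
            continue  # blank or nonnumeric value
    return clean

def transform(rows):
    """Map zip codes to sales districts and total the sales per district."""
    totals = {}
    for _store, zip_code, amount in rows:
        district = ZIP_TO_DISTRICT[zip_code]
        totals[district] = totals.get(district, 0.0) + amount
    return totals

def load(totals, conn):
    """Insert into the warehouse table; schema constraints are checked on insert."""
    conn.execute(
        "CREATE TABLE district_sales ("
        "district TEXT PRIMARY KEY, total REAL NOT NULL CHECK (total >= 0))"
    )
    conn.executemany("INSERT INTO district_sales VALUES (?, ?)", totals.items())

warehouse = sqlite3.connect(":memory:")
load(transform(extract(source_rows)), warehouse)
result = warehouse.execute(
    "SELECT district, total FROM district_sales ORDER BY district"
).fetchall()
print(result)  # [('Kansas City', 40.0), ('St. Louis', 110.0)]
```

Two of the five source rows never reach the warehouse, which is the quality filtering the extract step describes; the transform step both recodes (zip code to district) and aggregates.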
A large number of software tools are available to support these ETL tasks, including Ab Initio, IBM InfoSphere DataStage, Oracle Data Integrator, and the SAP Data Integrator. Several open-source ETL tools are also available, including Apatar, CloverETL, Pentaho, and Talend. Unfortunately, much of the ETL work must be done by low-level proprietary programs that are difficult to write and maintain.
Data Marts
A data mart is a subset of a data warehouse. Data marts bring the data warehouse concept (online analysis of sales, inventory, and other vital business data that have been gathered from transaction processing systems) to small- and medium-sized businesses and to departments within larger companies. Rather than store all enterprise data in one monolithic database, data marts contain a subset of the data for a single aspect of a company’s business, for example, finance, inventory, or personnel.
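The "subset of a data warehouse" idea can be sketched with a database view. This is a deliberate simplification: a real data mart is often a separate physical database, whereas here a view over one table plays that role. The table warehouse_facts, the view finance_mart, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical warehouse table covering several business areas.
conn.execute("CREATE TABLE warehouse_facts (area TEXT, item TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [
        ("finance", "Q1 revenue", 5000.0),
        ("inventory", "widgets on hand", 320.0),
        ("personnel", "headcount", 42.0),
        ("finance", "Q1 expenses", 3500.0),
    ],
)

# The "data mart" here is a view exposing only the finance subset.
conn.execute(
    "CREATE VIEW finance_mart AS "
    "SELECT item, amount FROM warehouse_facts WHERE area = 'finance'"
)

finance_rows = conn.execute(
    "SELECT item, amount FROM finance_mart ORDER BY item"
).fetchall()
print(finance_rows)  # [('Q1 expenses', 3500.0), ('Q1 revenue', 5000.0)]
```

A finance analyst querying finance_mart sees only the rows relevant to that department and never has to filter out inventory or personnel data.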
Data Lakes
A traditional data warehouse is created by extracting (and discarding some data in the process), transforming (modifying), and loading incoming data for predetermined and specific analyses and applications. This process can be lengthy and computer intensive, taking days to complete. A data lake (also called an enterprise data hub) takes a “store everything” approach to big data, saving all the data in its raw and unaltered form. The raw data residing in a data lake is available when users decide just how they want to use the data to glean new insights. Only when the data is accessed for a specific analysis is it extracted from the data lake, classified, organized, edited, or transformed. Thus a data lake serves as the definitive source of data in its original, unaltered form. Its contents can include business transactions, clickstream data, sensor data, server logs, social media, videos, and more.
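The contrast with a warehouse is that interpretation happens at read time rather than at load time. The following sketch illustrates that idea with a plain Python list standing in for the lake; the sample records and the function clicks_by_page are hypothetical.

```python
import json

# The "lake": raw records kept exactly as received (hypothetical samples).
data_lake = [
    '{"type": "click", "page": "/home", "user": "u1"}',
    '{"type": "sensor", "device": "m7", "temp_c": 71.5}',
    "2016-03-01 12:00:01 ERROR disk full",   # a plain server log line
    '{"type": "click", "page": "/cart", "user": "u2"}',
]

def clicks_by_page(raw_records):
    """Interpret the raw data only now, for this one analysis."""
    counts = {}
    for raw in raw_records:
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not JSON; skipped by this analysis but still in the lake
        if record.get("type") == "click":
            counts[record["page"]] = counts.get(record["page"], 0) + 1
    return counts

print(clicks_by_page(data_lake))  # {'/home': 1, '/cart': 1}
print(len(data_lake))             # 4
```

The analysis ignores the sensor reading and the log line, yet both remain in the lake untouched, available to a later analysis that needs them.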
NoSQL Databases
A NoSQL database provides a means to store and retrieve data that is modeled using some means other than the simple two-dimensional tabular relations used in relational databases. Such databases are being used to deal with the variety of data found in big data and Web applications. A major advantage of NoSQL databases is the ability to spread data over multiple servers so that each server contains only a subset of the total data. This so-called horizontal scaling capability enables hundreds or even thousands of servers to operate on the data, providing faster response times for queries and updates. Most relational database management systems have problems with such horizontal scaling and instead require large, powerful, and expensive proprietary servers and large storage systems.
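One simple way to spread data so that each server holds only a subset is to hash each key and use the result to pick a server. The sketch below assumes that scheme for illustration; it is not how any specific NoSQL product works (production systems often use more elaborate approaches such as consistent hashing), and the cluster size and function names are hypothetical.

```python
import hashlib

NUM_SERVERS = 4  # hypothetical cluster size
servers = [dict() for _ in range(NUM_SERVERS)]  # each dict stands in for one server

def server_for(key: str) -> int:
    """Pick a server index from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SERVERS

def put(key: str, value) -> None:
    servers[server_for(key)][key] = value

def get(key: str):
    # Only the one server that owns the key is consulted.
    return servers[server_for(key)].get(key)

for i in range(1000):
    put(f"user:{i}", {"id": i})

print(get("user:42"))             # {'id': 42}
print([len(s) for s in servers])  # per-server key counts; they sum to 1000
```

Each lookup touches a single server, so adding servers spreads both the storage and the query load.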
data mart: A subset of a data warehouse that is used by small- and medium-sized businesses and departments within large companies to support decision making.
data lake (enterprise data hub): A “store everything” approach to big data that saves all the data in its raw and unaltered form.
NoSQL database: A way to store and retrieve data that is modeled using some means other than the simple two-dimensional tabular relations used in relational databases.
Another advantage of NoSQL databases is that they do not require a predefined schema; data entities can have attributes edited or assigned to them at any time. If a new entity or attribute is discovered, it can be added to the database dynamically, extending what is already modeled in the database.
Most NoSQL databases do not conform to true ACID properties when processing transactions. Instead they provide for “eventual consistency” in which database changes are propagated to all nodes eventually (typically within milliseconds), so it is possible that user queries for data might not return the most current data.
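A toy model may help make eventual consistency concrete. The class below is a hypothetical illustration, not the replication protocol of any real database: a write is acknowledged after reaching one replica, and a separate step stands in for background propagation.

```python
from collections import deque

class EventuallyConsistentStore:
    """Toy store: writes hit one replica first and reach the others later."""

    def __init__(self, num_replicas: int = 3):
        self.replicas = [{} for _ in range(num_replicas)]
        self.pending = deque()  # updates not yet applied everywhere

    def write(self, key, value):
        # The write is acknowledged after updating replica 0 only.
        self.replicas[0][key] = value
        for i in range(1, len(self.replicas)):
            self.pending.append((i, key, value))

    def read(self, key, replica: int):
        # A read may be served by any replica, including a stale one.
        return self.replicas[replica].get(key)

    def propagate(self):
        # Stand-in for background replication between nodes.
        while self.pending:
            i, key, value = self.pending.popleft()
            self.replicas[i][key] = value

store = EventuallyConsistentStore()
store.write("seat_12A", "reserved")
stale = store.read("seat_12A", replica=2)   # read before the update arrives
store.propagate()
fresh = store.read("seat_12A", replica=2)   # read after replication completes
```

The first read returns nothing because the change has not yet reached replica 2; the second read returns the new value. This is the window in which a user query might not see the most current data.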
The choice of a relational database management system versus a NoSQL solution depends on the problem that needs to be addressed. Often, the data structures used by NoSQL databases are more flexible than relational database tables and, in many cases, they can provide improved access speed and redundancy.
The four main categories of NoSQL databases and offerings for each category are shown in Table 5.5 and summarized below. Note that some NoSQL database products can meet the needs of more than one category.
● Key–value NoSQL databases are similar to SQL databases, but have only two columns (“key” and “value”), with more complex information sometimes stored within the “value” columns.
● Document NoSQL databases are used to store, retrieve, and manage document-oriented information, such as social media posts and multimedia, also known as semi-structured data.
● Graph NoSQL databases are used to understand the relationships among events, people, transactions, locations, and sensor readings and are well-suited for analyzing interconnections such as when extracting data from social media.
● Column NoSQL databases store data in columns, rather than in rows, and are able to deliver fast response times for large volumes of data.
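The four categories above differ mainly in the shape of the data they hold. The following sketch uses plain Python structures as stand-ins for each shape; all names and values are hypothetical, and real products add indexing, distribution, and query languages on top.

```python
import json

# Key-value: a value (possibly complex) looked up by a single key.
key_value = {"user:1001": '{"name": "Ana", "city": "Lisbon"}'}
profile = json.loads(key_value["user:1001"])

# Document: semi-structured records; fields may differ between documents.
documents = [
    {"_id": 1001, "name": "Ana", "city": "Lisbon", "posts": ["Hello!"]},
    {"_id": 1002, "name": "Raj", "followers": 250},
]

# Graph: nodes plus edges that describe relationships.
edges = [("Ana", "follows", "Raj"), ("Raj", "follows", "Li"), ("Ana", "follows", "Li")]

def followed_by(person):
    """Return everyone the given person follows, in alphabetical order."""
    return sorted(dst for src, rel, dst in edges if src == person and rel == "follows")

# Column: each attribute is stored together, so an aggregate scans one column.
columns = {
    "order_id": [1, 2, 3, 4],
    "product_id": ["A", "B", "A", "C"],
    "amount": [20.0, 35.5, 20.0, 12.25],
}
total_sales = sum(columns["amount"])
```

Note that the two documents do not share the same fields, which reflects the absence of a predefined schema, and that summing sales in the column layout reads only the `amount` column.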
Criteo is a digital-advertising organization serving up ads to over one billion unique Internet users around the world every month. The firm automates the recommendation of ads and the selection of products from advertiser catalogs up to 30 billion times each day. A recommendation can require a calculation involving some 100 variables, and it must be completed quickly, within 100 milliseconds or less. Criteo has deployed a Couchbase Server NoSQL database across 1,000 servers grouped into 24 clusters, providing access to a total of 107 terabytes of database storage to meet these demanding processing requirements.24
TABLE 5.5 Popular NoSQL database products, by category

| Key–Value | Document | Graph | Column |
|---|---|---|---|
| HyperDEX | Lotus Notes | Allegro | Accumulo |
| Couchbase Server | Couchbase Server | Neo4J | Cassandra |
| Oracle NoSQL Database | Oracle NoSQL Database | InfiniteGraph | Druid |
| OrientDB | OrientDB | OrientDB | Vertica |
| | MongoDB | Virtuoso | HBase |

The National Security Agency (NSA), through its controversial PRISM program, uses NoSQL technology to analyze email messages, phone conversations, video chats, and social media interactions gleaned from the servers of major service providers, including Apple, Facebook, Google, Microsoft, Skype, Yahoo, and YouTube. The Accumulo NoSQL database enables its users to assign each piece of data a security tag that defines how people can access that data and who can access that data. This feature makes it possible for NSA agents to interrogate certain details while blocking access to personally identifiable information.25
Amazon DynamoDB is a NoSQL database that supports both document and key–value store models. MLB Advanced Media (MLBAM) uses DynamoDB to power its revolutionary Player Tracking System, which reveals detailed information about the nuances and athleticism of the game. Fans, broadcasters, and teams are finding this new data entertaining and useful. The system takes in data from ballparks across North America and provides enough computing power to support real-time analytics and produce results in seconds.26
Hadoop
Hadoop is an open-source software framework that includes several software modules that provide a means for storing and processing extremely large data sets, as shown in Figure 5.24. Hadoop has two primary components: a data processing component (a Java-based system called MapReduce, which is discussed in the next section) and a distributed file system (Hadoop Distributed File System, HDFS) for data storage. Hadoop divides data into subsets and distributes the subsets onto different servers for processing. A Hadoop cluster may consist of thousands of servers. In a Hadoop cluster, a subset of the data within the HDFS and the MapReduce system are housed on every server in the cluster. This places the data processing software on the same servers where the data is stored, thus speeding up data retrieval. This approach creates a highly redundant computing environment that allows the application to keep running even if individual servers fail.
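The way a distributed file system divides data into subsets and keeps redundant copies can be sketched in a few lines. This is a simplified illustration under stated assumptions: the block size, the replication factor of three, and the round-robin placement rule are chosen for clarity, and actual HDFS placement policies are more involved.

```python
def split_into_blocks(data: bytes, block_size: int) -> list:
    """Divide the data into fixed-size subsets (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_blocks(num_blocks: int, num_servers: int, replicas: int = 3) -> dict:
    """Assign each block to `replicas` distinct servers, round-robin."""
    placement = {}
    for b in range(num_blocks):
        placement[b] = [(b + r) % num_servers for r in range(replicas)]
    return placement

def readable_after_failure(placement: dict, failed: set) -> bool:
    """True if every block still has at least one copy on a live server."""
    return all(any(s not in failed for s in servers) for servers in placement.values())

data = b"x" * 1000                      # stand-in for a very large file
blocks = split_into_blocks(data, block_size=128)
placement = place_blocks(len(blocks), num_servers=5, replicas=3)
```

With three copies of every block, the data remains readable if any two of the five servers fail, which is the kind of redundancy that lets an application keep running when individual servers go down.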
FIGURE 5.24 Hadoop environment
Hadoop can be used as a staging area for data to be loaded into a data warehouse or data mart.
[Diagram: data sources (historical data from a legacy system, external sources #1 and #2, Facebook, the ERP system, the CRM system, sensors on the production floor, and visitors to the organization’s Web site) feed a Hadoop cluster running on 100s of servers (Server #1 through Server #n), which in turn loads a data warehouse and data marts.]
Hadoop: An open-source software framework including several software modules that provide a means for storing and processing extremely large data sets.
Hadoop Distributed File System (HDFS): A system used for data storage that divides the data into subsets and distributes the subsets onto different servers for processing.
A MapReduce program is composed of a Map procedure that performs filtering and sorting (such as sorting customer orders by product ID into queues, with one queue for each product ID) and a Reduce method that performs a summary operation (such as counting the number of orders in each queue, thus determining product ID frequencies). MapReduce employs a JobTracker that resides on the Hadoop master server as well as TaskTrackers that sit on each server within the Hadoop cluster of servers. The JobTracker divides the computing job up into well-defined tasks and moves those tasks out to the individual TaskTrackers on the servers in the Hadoop cluster where the needed data resides. These servers operate in parallel to complete the necessary computing. Once their work is complete, the resulting subset of data is reduced back to the central node of the Hadoop cluster.
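The order-counting example in the paragraph above can be sketched on a single machine. The code below only imitates the Map and Reduce steps with ordinary Python functions; it does not use Hadoop, and the order records and function names are hypothetical.

```python
from collections import defaultdict

orders = [  # hypothetical customer orders
    {"order_id": 1, "product_id": "P-100"},
    {"order_id": 2, "product_id": "P-200"},
    {"order_id": 3, "product_id": "P-100"},
    {"order_id": 4, "product_id": "P-300"},
    {"order_id": 5, "product_id": "P-100"},
]

def map_phase(records):
    """Map: emit a (product_id, 1) pair for every order."""
    for record in records:
        yield record["product_id"], 1

def shuffle(pairs):
    """Group the emitted values by key: one queue per product ID."""
    queues = defaultdict(list)
    for key, value in pairs:
        queues[key].append(value)
    return queues

def reduce_phase(queues):
    """Reduce: summarize each queue by counting its entries."""
    return {key: sum(values) for key, values in queues.items()}

product_counts = reduce_phase(shuffle(map_phase(orders)))
```

In a real cluster, the map work would run in parallel on the servers that hold each subset of the orders, and only the small per-product summaries would travel back to be combined.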
For years, Yahoo! used Hadoop to better personalize the ads and articles that its visitors see. Now Hadoop is used by many popular Web sites and services (such as eBay, Etsy, Twitter, and Yelp). Verizon Wireless uses big data to perform customer churn analysis to get a better sense of when a customer becomes dissatisfied. Hadoop allows Verizon to include more detailed data about each customer, including clickstream data, chats, and even social media searches, to predict when a customer might switch to a new carrier.
Hadoop has a limitation in that it can only perform batch processing; it cannot process real-time streaming data such as stock prices as they flow into the various stock exchanges. However, Apache Storm and Apache Spark are often integrated with Hadoop to provide real-time data processing. Apache Storm is a free and open source distributed real-time computation system. Storm makes it easy to reliably process unbounded streams of data. Apache Spark is a framework for performing general data analytics in a distributed computing cluster environment like Hadoop. It provides in-memory computations for increased speed of data processing. Both Storm and Spark run on top of an existing Hadoop cluster and access data in a Hadoop data store (HDFS).
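The difference between batch and stream processing can be shown with a small, framework-free sketch. It does not use Storm or Spark; the stock prices and function names are hypothetical and serve only to contrast the two styles.

```python
def batch_average(prices):
    """Batch: needs the complete data set before it can answer."""
    return sum(prices) / len(prices)

def streaming_average(price_stream):
    """Streaming: yields an updated average as each price arrives."""
    count, total = 0, 0.0
    for price in price_stream:
        count += 1
        total += price
        yield total / count

ticks = [10.0, 12.0, 11.0, 13.0]        # hypothetical stock prices
running = list(streaming_average(iter(ticks)))
```

The batch function produces one answer after all the data is in hand, whereas the streaming function produces a result after every new price and never needs the stream to end.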
Medscape MedPulse is a medical news app for iPhone and iPad users that enables healthcare professionals to stay up-to-date on the latest medical news and expert perspectives. The app uses Apache Storm to include an automatic Twitter feed (about 500 million tweets are posted on Twitter each day) to help users stay informed about important medical trends being shared in real time by physicians and other leading medical commentators.27,28
In-Memory Databases
An in-memory database (IMDB) is a database management system that stores the entire database in random access memory (RAM). This approach provides access to data at rates much faster than storing data on some form of secondary storage (e.g., a hard drive or flash drive) as is done with traditional database management systems. IMDBs enable the analysis of big data and other challenging data-processing applications, and they have become feasible because of the increase in RAM capacities and a corresponding decrease in RAM costs. In-memory databases perform best on multiple multicore CPUs that can process parallel requests to the data, further speeding access to and processing of large amounts of data.29 Furthermore, the advent of 64-bit processors enabled the direct addressing of larger amounts of main memory. Some of the leading providers of IMDBs are shown in Table 5.6.
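As a small-scale illustration of the concept, the SQLite library that ships with Python can hold an entire database in RAM. SQLite is used here only as a convenient stand-in and is not one of the IMDB products discussed in this chapter; the subscriber table and its rows are hypothetical.

```python
import sqlite3

# ":memory:" asks SQLite to keep the whole database in RAM.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscriber (id INTEGER PRIMARY KEY, phone TEXT, active INTEGER)"
)
conn.executemany(
    "INSERT INTO subscriber (id, phone, active) VALUES (?, ?, ?)",
    [(1, "555-0100", 1), (2, "555-0101", 0), (3, "555-0102", 1)],
)
active_count = conn.execute(
    "SELECT COUNT(*) FROM subscriber WHERE active = 1"
).fetchone()[0]
conn.close()  # the in-memory database disappears when the connection closes
```

Because nothing is written to secondary storage, the data is lost when the connection closes, which is why commercial in-memory products add their own mechanisms for durability.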
KDDI Corporation is a Japanese telecommunications company that provides mobile cellular services for some 40 million customers. The company consolidated 40 existing servers into a single Oracle SuperCluster running the Oracle Times Ten in-memory database to make its authentication system that manages subscriber and connectivity data run faster and more efficiently. This move reduced its data center footprint by 83 percent and power consumption by 70 percent while improving the overall performance and availability of the system. As a result, system costs were reduced and customer service improved.30
MapReduce program: A composite program that consists of a Map procedure that performs filtering and sorting and a Reduce method that performs a summary operation.
in-memory database (IMDB): A database management system that stores the entire database in random access memory (RAM).
Critical Thinking Exercise
Telefonica Brasil
Telefonica Brasil is one of the largest telecommunications companies in Brazil, and it provides landline and mobile services under the brand name Vivo for millions of consumers. The company is considering using big data to perform customer churn analysis in order to anticipate when a customer is unhappy and likely to drop its service for that of a competitor.
Review Questions
1. What sources of data might Telefonica Brasil use to perform customer churn analysis?
2. What database technology options might the firm elect to use?
Critical Thinking Questions
1. Why is it unlikely that a traditional SQL database would be able to meet the firm’s needs?
2. In addition to a database management system, what other information system technology and resources are likely needed for this type of project?
Summary
Principle: The database approach to data management has become broadly accepted.
Data is one of the most valuable resources that a firm possesses. It is organized into a hierarchy that builds from the smallest element to the largest. The smallest element is the bit, a binary digit. A byte (a character such as a letter or numeric digit) is made up of eight bits. A group of characters, such as a name or number, is called a field (an object). A collection of related fields is a record; a collection of related records is called a file. The database, at the top of the hierarchy, is an integrated collection of records and files.
An entity is a generalized class of objects (such as a person, place, or thing) for which data is collected, stored, and maintained. An attribute is a characteristic of an entity. Specific values of attributes, called data items, can be found in the fields of the record describing an entity. A data key is a field within a record that is used to identify the record. A primary key uniquely identifies a record, while a secondary key is a field in a record that does not uniquely identify the record.
TABLE 5.6 IMDB providers

| Database Software Manufacturer | Product Name | Major Customers |
|---|---|---|
| Altibase | HDB | E*Trade, China Telecom |
| Oracle | Times Ten | Lockheed Martin, Verizon Wireless |
| SAP | High-Performance Analytic Appliance (HANA) | eBay, Colgate |
| Software AG | Terracotta Big Memory | AdJuggler |
Principle: Data modeling is a key aspect of organizing data and information.
When building a database, an organization must consider content, access, logical structure, physical organization, archiving, and security of the database. One of the tools that database designers use to show the logical structure and relationships among data is a data model. A data model is a map or diagram of entities and their relationships. Enterprise data modeling involves analyzing the data and information needs of an entire organization and provides a roadmap for building databases and information systems by creating a single definition and format for data that can ensure compatibility and the ability to exchange and integrate data among systems. Entity-relationship (ER) diagrams can be used to show the relationships among entities in the organization.
The relational database model places data in two-dimensional tables. Tables can be linked by common data elements, which are used to access data when the database is queried. Each row in a relational database table represents a record, and each column represents an attribute (or field). The allowable values for each attribute are called the attribute’s domain. Basic data manipulations include selecting, projecting, joining, and linking. The relational model is easier to control, more flexible, and more intuitive than other database models because it organizes data in tables.
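Three of the basic manipulations named above can be tried out with SQL in a few lines. The sketch uses Python’s built-in SQLite module; the department and employee tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    INSERT INTO department VALUES (10, 'Finance'), (20, 'Inventory');
    INSERT INTO employee VALUES (1, 'Ortiz', 10), (2, 'Chen', 20), (3, 'Baker', 10);
""")

# Selecting: keep only the rows that meet a criterion.
selected = conn.execute(
    "SELECT * FROM employee WHERE dept_id = 10 ORDER BY emp_id"
).fetchall()

# Projecting: keep only some of the columns.
projected = conn.execute("SELECT name FROM employee ORDER BY emp_id").fetchall()

# Joining: combine two tables; the common data element dept_id links them.
joined = conn.execute("""
    SELECT e.name, d.dept_name
    FROM employee AS e JOIN department AS d ON e.dept_id = d.dept_id
    ORDER BY e.emp_id
""").fetchall()
conn.close()
```

Selecting narrows the rows, projecting narrows the columns, and joining produces a new table that draws on both source tables through the shared `dept_id` column.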
Data cleansing is the process of detecting and then correcting or deleting incomplete, incorrect, inaccurate, or irrelevant records that reside in the database. The goal of data cleansing is to improve the quality of the data used in decision making.
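A minimal sketch of data cleansing follows. The cleansing rules (trim whitespace, lowercase email addresses, reject records with a missing name, a malformed email address, or a duplicate ID) are assumptions chosen for illustration, as are the customer records; a real project would define its rules from the organization’s own data standards.

```python
import re

EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # deliberately simple pattern

raw_customers = [  # hypothetical records
    {"id": 1, "name": " Ana Silva ", "email": "ANA@EXAMPLE.COM"},
    {"id": 2, "name": "Raj Patel", "email": "raj@example"},       # malformed email
    {"id": 3, "name": "", "email": "li@example.com"},             # missing name
    {"id": 1, "name": "Ana Silva", "email": "ana@example.com"},   # duplicate id
]

def cleanse(records):
    """Correct what can be corrected; set aside records that cannot be trusted."""
    clean, rejected, seen = [], [], set()
    for rec in records:
        fixed = {
            "id": rec["id"],
            "name": rec["name"].strip(),
            "email": rec["email"].strip().lower(),
        }
        if not fixed["name"] or not EMAIL.match(fixed["email"]) or fixed["id"] in seen:
            rejected.append(rec)
            continue
        seen.add(fixed["id"])
        clean.append(fixed)
    return clean, rejected

clean, rejected = cleanse(raw_customers)
```

Keeping the rejected records, rather than silently discarding them, lets someone review them later and decide whether they can be repaired.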
Principle: A well-designed and well-managed database is an extremely valuable tool in supporting decision making.
A database management system (DBMS) is a group of programs used as an interface between a database and its users and between a database and other application programs. When an application program requests data from the database, it follows a logical access path. The actual retrieval of the data follows a physical access path. Records can be considered in the same way: A logical record is what the record contains; a physical record is where the record is stored on storage devices. Schemas are used to describe the entire database, its record types, and its relationships to the DBMS. Schemas are entered into the computer via a data definition language, which describes the data and relationships in a specific database. Another tool used in database management is the data dictionary, which contains detailed descriptions of all data in the database.
A DBMS provides four basic functions: offering user views, creating and modifying the database, storing and retrieving data, and manipulating data and generating reports. After a DBMS has been installed, the database can be accessed, modified, and queried via a data manipulation language. A type of specialized data manipulation language is the query language, the most common being Structured Query Language (SQL). SQL is used in several popular database packages today and can be installed in PCs and mainframes.
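The distinction between defining a database and manipulating its data can be seen in a short SQL session. The sketch below again uses Python’s built-in SQLite module as a convenient DBMS; the product table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition language (DDL): describe the structure of the data.
conn.execute("""
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        price      REAL NOT NULL
    )
""")

# Data manipulation language (DML): add, change, and remove data.
conn.execute("INSERT INTO product VALUES (1, 'Widget', 9.5)")
conn.execute("INSERT INTO product VALUES (2, 'Gadget', 20.0)")
conn.execute("UPDATE product SET price = 10.0 WHERE product_id = 1")
conn.execute("DELETE FROM product WHERE product_id = 2")

# Query: retrieve the data that remains.
rows = conn.execute("SELECT product_id, name, price FROM product").fetchall()
conn.close()
```

The `CREATE TABLE` statement tells the DBMS what a product record looks like, while the remaining statements work with the records themselves.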
A database administrator (DBA) plans, designs, creates, operates, secures, monitors, and maintains databases. A data administrator is the person responsible for defining and implementing consistent principles for a variety of data issues, including setting data standards and data definitions that apply across all the databases in an organization.
Selecting a DBMS begins by analyzing the information needs of the organization. Important characteristics of databases include the size of the database, the number of concurrent users, the performance of the database, the ability of the DBMS to be integrated with other systems, the features of the DBMS, the vendor considerations, and the cost of the database management system.
In a database as a service (DaaS) arrangement, the database is stored on a service provider’s servers and accessed by the subscriber over a network, typically the Internet. In DaaS, database administration is provided by the service provider.
Principle: We have entered an era where organizations are grappling with a tremendous growth in the amount of data available and struggling to understand how to manage and make use of it.
“Big data” is the term used to describe data collections that are so enormous and complex that traditional data management software, hardware, and analysis processes are incapable of dealing with them.
There are many challenges associated with big data, including how to choose what subset of data to store, where and how to store the data, how to find those nuggets of data that are relevant to the decision making at hand, how to derive value from the relevant data, and how to identify which data needs to be protected from unauthorized access.
Data management is an integrated set of 10 functions that defines the processes by which data is obtained, certified fit for use, stored, secured, and processed in such a way as to ensure that the accessibility, reliability, and timeliness of the data meet the needs of the data users within an organization. Data governance is the core component of data management; it defines the roles, responsibilities, and processes for ensuring that data can be trusted and used by the entire organization with people identified and in place who are responsible for fixing and preventing issues with data.
Principle: A number of available tools and technologies allow organizations to take advantage of the opportunities offered by big data.
Traditional online transaction processing (OLTP) systems put data into databases very quickly, reliably, and efficiently, but they do not support the types of data analysis that today’s businesses and organizations require. To address this need, organizations are building data warehouses specifically designed to support management decision making.
An extract, transform, load process takes data from a variety of sources, edits and transforms it into the format to be used in the data warehouse, and then loads the data into the warehouse.
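The three ETL steps can be sketched as three small functions. Everything here is a hypothetical stand-in: the two source systems are hard-coded lists, the transformation rules are invented for the example, and an in-memory SQLite database plays the role of the data warehouse.

```python
import sqlite3

def extract():
    """Extract: pull rows from two hypothetical source systems."""
    crm = [{"cust": "ana", "amount": "19.99", "date": "2017-03-01"}]
    erp = [
        {"cust": "RAJ", "amount": "n/a", "date": "2017-03-02"},
        {"cust": "Li", "amount": "5", "date": "2017-03-02"},
    ]
    return crm + erp

def transform(rows):
    """Transform: edit rows into the warehouse format; drop rows that fail."""
    out = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # unusable amount, so the row is discarded
        out.append((row["cust"].title(), amount, row["date"]))
    return out

def load(rows, conn):
    """Load: write the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact (customer TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", rows)

warehouse = sqlite3.connect(":memory:")
load(transform(extract()), warehouse)
loaded = warehouse.execute(
    "SELECT customer, amount FROM sales_fact ORDER BY customer"
).fetchall()
```

The row with the unusable amount never reaches the warehouse, which illustrates how some data is discarded during the process.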
Data marts are subdivisions of data warehouses and are commonly devoted to specific purposes or functional business areas.
A data lake (also called an enterprise data hub) takes a “store everything” approach to big data, saving all the data in its raw and unaltered form.
A NoSQL database provides a means to store and retrieve data that is modeled using some means other than the simple two-dimensional tabular relations used in relational databases. There are four types of NoSQL databases: key-value, document, graph, and column.
Hadoop is an open-source software framework that includes several software modules that provide a means for storing and processing extremely large data sets. Hadoop has two primary components: a data processing component (MapReduce) and a distributed file system (Hadoop Distributed File System or HDFS) for data storage. Hadoop divides data into subsets and distributes the subsets onto different servers for processing. A Hadoop cluster may consist of thousands of servers. A subset of the data within the HDFS and the MapReduce system are housed on every server in the cluster.
An in-memory database (IMDB) is a database management system that stores the entire database in random access memory to improve storage and retrieval speed.
Key Terms
ACID properties
attribute
bit
character
concurrency control
data administrator
data cleansing (data cleaning or data scrubbing)
data definition language (DDL)
data dictionary
data governance
data item
data lake (enterprise data hub)
data lifecycle management (DLM)
data management
data manipulation language (DML)
data mart
data model
data steward
database
database administrators (DBAs)
database approach to data management
database as a service (DaaS)
database management system (DBMS)
domain
enterprise data model
entity
entity-relationship (ER) diagram
field
file
Hadoop
Hadoop Distributed File System (HDFS)
hierarchy of data
in-memory database (IMDB)
joining
linking
MapReduce program
NoSQL database
primary key
projecting
record
relational database model
schema
selecting
SQL
Chapter 5: Self-Assessment Test
The database approach to data management has become broadly accepted.
1. A field or set of fields that uniquely identifies a record in a database is called a(n) ________.
a. attribute
b. data item
c. record
d. primary key
2. The key concept of the database approach to data management is that ________.
a. all records in the database are stored in a two-dimensional table
b. multiple information systems share access to a pool of related data
c. only authorized users can access the data
d. a database administrator “owns” the data
Data modeling is a key aspect of organizing data and information.
3. A(n) ________ provides an organizational-level roadmap for building databases and information systems by creating a single definition and format for data.
a. database
b. enterprise data model
c. entity relationship diagram
d. database management system
4. The ________ model is a simple but highly useful way to organize data into collections of two-dimensional tables called relations.
5. The ability to combine two or more tables through common data attributes to form a new table with only the unique data attributes is called ________.
6. SQL databases conform to ACID properties, which include atomicity, consistency, isolation, and ________.
A well-designed and well-managed database is an extremely valuable tool in supporting decision making.
7. The process of detecting and then correcting or deleting incomplete, incorrect, inaccurate, or irrelevant records that reside in a database is called ________.
8. Because the DBMS is responsible for providing access to a database, one of the first steps in installing and using a relational database involves “telling” the DBMS the logical and physical structure of the data and relationships among the data in the database. This description of an entire database is called a(n) ________.
9. A(n) ________ is an individual responsible for the management of critical data elements, including identifying and acquiring new data sources; creating and maintaining consistent reference data and master data definitions; and analyzing data for quality and reconciling data issues.
10. Data administrators are skilled and trained IS professionals who hold discussions with users to define their data needs; apply database programming languages to craft a set of databases to meet those needs; and assure that data is secure from unauthorized access. True or False?
11. With ________, the database is stored on a service provider’s servers and accessed by the service subscriber over the Internet, with the database administration handled by the service provider.
We have entered an era where organizations are grappling with a tremendous growth in the amount of data available and struggling to understand how to manage and make use of it.
12. Three characteristics associated with big data include volume, velocity, and ________.
13. The Data Management Association has defined 10 major functions of data management, with the core component being ________.
a. data quality management
b. data security management
c. data governance
d. data architecture management
A number of available tools and technologies allow organizations to take advantage of the opportunities offered by big data.
14. A(n) ________ database provides a means to store and retrieve data that is modeled using some means other than simple two-dimensional relations used in relational databases.
15. Hadoop has two primary components, a data processing component and a distributed file system, called ________.
a. MapReduce and HDFS
b. TaskTracker and JobTracker
c. Key-value and graph
d. SQL and NoSQL
16. An ________ is a database management system that stores the entire database in random access memory to provide fast access.
Chapter 5: Self-Assessment Test Answers
1. d 2. b 3. b 4. relational database 5. linking 6. durability 7. data cleansing, data cleaning, or data scrubbing 8. schema
9. data steward 10. False 11. database as a service (DaaS) 12. variety 13. c 14. NoSQL 15. a 16. in-memory database
Review Questions
1. How would you define the term “database”? How would you define the term “database management system”?
2. In the hierarchy of data, what is the difference between a data attribute and a data item? What is the domain of an attribute?
3. What is meant by the database approach to data management?
4. What is meant by data archiving? Why is this an important consideration when operating a database?
5. What is an entity-relationship diagram, and what is its purpose?
6. Identify four basic data manipulations performed on a relational database using SQL.
7. What is data scrubbing?
8. What is database as a service (DaaS)? What are the advantages and disadvantages of using the DaaS approach?
9. What is Hadoop? What are its primary components, and what does each do?
10. What is a schema, and how is it used?
11. What is concurrency control? Why is it important?
12. What is in-memory database processing, and what advantages does it provide?
13. What is the difference between projecting and joining?
14. What is big data? Identify three characteristics associated with big data.
15. What is a data warehouse, and how is it different from a traditional database used to support OLTP?
16. What is a data lake, and how is it different from a data warehouse?
17. How does an in-memory database provide fast access to data?
Discussion Questions
1. What concerns might be raised by performing data cleansing on a large set of raw data before it is used for analysis? How might these concerns be addressed?
2. Outline some specific steps an organization might take to perform data cleansing to ensure the accuracy and completeness of its customer database before adding this data to a data warehouse. How would you decide when the data is accurate enough?
3. SQL databases conform to ACID properties. Briefly describe the ACID properties, and state the purpose of each. How does conformance to ACID properties affect the performance of SQL databases?
4. Describe how a NoSQL database differs from a relational database. Identify and briefly discuss the four types of NoSQL databases.
5. Review Table 5.4, which provides a list of rules, regulations, and standards with which U.S. information systems organizations must comply. Which of these standards do you think has the most impact on safeguarding the security of personal information? Which of these standards have minimal impact on you personally?
6. Identify and briefly describe the steps in the ETL process. What is the goal of the ETL process?
7. Consider three organizations that have databases that likely store information about you: the Federal Internal Revenue Service, your state’s Bureau of Motor Vehicles, and Equifax, the consumer reporting agency. Go to the home page of each of these organizations, and find answers to the following questions. How is the data in each database captured? Is it possible for you to request a printout of the contents of your data record from each database? Is it possible for you to correct errors you find in your data record? What data privacy concerns do you have concerning how these databases are managed?
8. Identity theft, where people steal personal information, continues to be a problem for consumers and businesses. Assume that you are the database administrator for a corporation with a large database that is accessible from the Web. What steps would you implement to prevent people from stealing personal information from the corporate database?
9. Read the article “Why ‘Big Data’ Is a Big Deal” by Jonathan Shaw in the March-April 2014 Harvard Magazine. What does Shaw think is the revolution in big data? Which of the many big data applications that he mentions do you find to be the most interesting? Why?
Problem-Solving Exercises
1. Develop a simple data model for a student database that includes student contact data, student demographic data, student grades data, and student financial data. Determine the data attributes that should be present in each table, and identify the primary key for each table. Develop a complete ER diagram that shows how these tables are related to one another.
2. A company that provides a movie-streaming subscription service uses a relational database to store information on movies to answer customer questions. Each entry in the database contains the following items: Movie ID (the primary key), movie title, year made, movie type, MPAA rating, starring actor #1, starring actor #2, starring actor #3, and director. Movie types are action, comedy, family, drama, horror, science fiction, and western. MPAA ratings are G, PG, PG-13, R, NC-17, and NR (not rated). Using a graphics program, develop an entity-relationship diagram for a database application for this database.
3. Use a database management system to build a data-entry screen to enter this data. Build a small database with at least a dozen entries.
4. To improve service to their customers, the employees of the movie-streaming company have proposed several changes that are being consid- ered for the database in the previous exercise. From this list, choose two database modifications, and then modify the data-entry screen to capture
CHAPTER 5 • Database Systems and Big Data 233
Copyright 2018 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. WCN 02-200-203
and store this new information. The proposed changes are as follows:
a. Add the date that the movie was first released to the theaters.
b. Add the executive producer’s name.
c. Add a customer rating of one, two, three, four, or five stars, based on the number of rentals.
d. Add the number of Academy Award nominations.
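As a starting point for the movie database exercises above, the record described in Exercise 2 could be sketched as a single relational table. The following is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and helper functions are illustrative assumptions rather than part of the exercise, and the sample rows are placeholders, not real catalog data.

```python
import sqlite3

# Allowed values taken from the exercise text.
MOVIE_TYPES = ("action", "comedy", "family", "drama", "horror",
               "science fiction", "western")
MPAA_RATINGS = ("G", "PG", "PG-13", "R", "NC-17", "NR")


def create_movie_table(conn):
    """Create a movie table whose columns mirror the items listed in Exercise 2."""
    types = ", ".join(f"'{t}'" for t in MOVIE_TYPES)
    ratings = ", ".join(f"'{r}'" for r in MPAA_RATINGS)
    conn.execute(f"""
        CREATE TABLE movie (
            movie_id    INTEGER PRIMARY KEY,   -- the primary key
            title       TEXT NOT NULL,
            year_made   INTEGER,
            movie_type  TEXT CHECK (movie_type IN ({types})),
            mpaa_rating TEXT CHECK (mpaa_rating IN ({ratings})),
            actor_1     TEXT,
            actor_2     TEXT,
            actor_3     TEXT,
            director    TEXT
        )
    """)


def try_insert(conn, row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO movie VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
create_movie_table(conn)

# Placeholder values only (hypothetical titles and names).
ok = try_insert(conn, (1, "Example Title", 2015, "comedy", "PG",
                       "Actor A", "Actor B", "Actor C", "Director D"))
# "musical" is not one of the listed movie types, so the CHECK constraint rejects it.
bad = try_insert(conn, (2, "Another Title", 2015, "musical", "PG",
                        None, None, None, None))
print(ok, bad)
```

Keeping the allowed movie types and ratings in the table definition means that a data-entry screen built on top of it cannot store a value outside the lists given in the exercise. Any of the proposed changes in Exercise 4 would appear here as an additional column.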
Team Activities
1. Imagine that you and your team have been hired to develop an improved process for evaluating which students should be accepted to your college and, of those, which should be awarded academic scholarships. What data besides college entrance scores and high school transcripts might you consider using to make these determinations? Where might you get this data? Develop an ER diagram showing the various tables of data that might be used.
2. You and your team have been selected to represent the student body in defining the user requirements for a new student database for your school. What actions would you take to ensure that the student reporting needs and data privacy concerns of the students are fully identified? What other resources might you enlist to help you in defining these requirements?
3. As a team of three or four classmates, interview managers from three different organizations that have implemented a customer database. What data entities and data attributes are contained in each database? What database management system did each company select to implement its database, and why? How does each organization access its database to perform analysis? Have the managers and their staff received training in any query or reporting tools? What do they like about their databases, and what could be improved? Weighing the information obtained, identify which company has implemented the best customer database.
Web Exercises
1. Do research to find out more about the controversial NSA PRISM program. What is the source of data for this program? What is the purpose of the program? Are you a supporter of the PRISM program? Why or why not?
2. Do research to find an example of an organization struggling to deal with the rapid growth of the big data it needs for decision making. What are the primary issues it is facing? What is the organization doing to get a good grip on data management and data governance?
3. Do research to find three different estimates of the rate at which the amount of data in our digital universe is growing. Discuss why these estimates differ.
Career Exercises
1. Describe the role of a database administrator. What skills, training, and experiences are necessary to fulfill this role? How does this differ from the role of a data administrator? What about the role of a data steward? Is any one of these roles of interest to you? Why or why not?
2. How could you use big data to do a better job at work? Give some specific examples of how you might use big data to gain valuable new insights.
Case Studies
Case One
WholeWorldBand: Digital Recording Studio
WholeWorldBand is a collaborative online music and video platform that enables anyone to collaborate with others to create music videos. The service was founded by Kevin Godley, a musician and music video director, and is accessible via a Web-based app available on the iPhone and iPad and on Windows and MacOS computers. Anyone can contribute to WholeWorldBand using just the camera and microphone in their computer or mobile device. The service enables users, whatever their level of musical ability, to record and perform with music legends and friends. Using WholeWorldBand, you can start a video-recording session that others may join, create your own personal video mix with up to six performers, and then share the results with your friends and fans via Facebook, Twitter, or YouTube. Users can also pay to collaborate with other musicians who have posted their own content. Collaborating on a project might mean providing new audio or video components or remixing existing ones.
WholeWorldBand uses a sophisticated digital rights management system to ensure that artists earn revenue for the work they contribute: if your work gets used, you get paid. WholeWorldBand provides users the opportunity to perform and record with popular artists. A number of major recording artists have already uploaded tracks, including The Edge (U2), Ronnie Wood (Rolling Stones), Taylor Hawkins (Foo Fighters), Stewart Copeland (The Police), Liam Ó Maonlaí (Hothouse Flowers), Michael Bublé, Phil Manzanera (Roxy Music), Dave Stewart (Eurythmics), and Danny O’Reilly (The Coronas).
The platform generates revenue from registered users who purchase subscriptions (or sessions) and from royalties paid by third parties in situations where users have shared and distributed content using the app or the Web site. Each session artist is entitled to receive a share of the revenue generated when other registered users purchase sessions for the purpose of creating contributions and/or mixes in relation to their original track. Keeping track of contributing artists, royalty payments, and the necessary revenue splits among artists, third parties, and WholeWorldBand can become quite detailed and tedious.
Critical Thinking Questions:
1. Identify some of the challenges associated with building an information system infrastructure to support this new service. Would cloud computing be an appropriate solution to address these challenges? Why or why not?
2. Would WholeWorldBand be likely to employ SQL, NoSQL, or a mix of both kinds of databases? Explain your answer.
3. Go to the WholeWorldBand Web site at www.wholeworldband.com/about, and find its Terms of Use. Summarize the measures outlined to prevent the unauthorized use of copyrighted material. Do you think these measures are adequate? Why or why not?
SOURCES: “WholeWorldBand” YouTube video, 0:33, www.youtube.com/user/WholeWorldBand, accessed October 7, 2015; “EnterpriseDB’s Postgres Plus Cloud Database Strikes a Chord with WholeWorldBand,” EnterpriseDB, www.enterprisedb.com/success-stories/enterprisedb-s-postgres-plus-cloud-database-strikes-chord-wholeworldband, accessed October 7, 2015; John, “WholeWorldBand Wins ‘Buma Music Meets Tech’ Award at Eurosonic Noorsderslag in Holland,” Irish Tech News, January 18, 2014, http://irishtechnews.net/ITN3/wholeworldband-wins-buma-music-meets-tech-award-at-eurosonic-noorsderslag-in-holland; “WholeWorldBand Terms of Use,” WholeWorldBand, www.wholeworldband.com/about, accessed October 7, 2015.
Case Two
Mercy’s Big Data Project Aims to Boost Operations
Making the most of the data it collects is a challenge for any organization, and those in the healthcare industry are no exception. Based in St. Louis, Missouri, Mercy health system includes 46 acute care and specialty hospitals, with more than 700 outpatient facilities and physician practices in Arkansas, Kansas, Missouri, and Oklahoma. With more than 40,000 employees, including over 2,000 physicians, Mercy’s vision is to deliver a “transformative health experience” through a new model of care. With such ambitious goals, Mercy has a compelling interest in harnessing the power of the data it collects. To do so, the health system needed to overhaul its data-management infrastructure and move into the world of big data.
To make that move, Mercy partnered with software provider Hortonworks to create the Mercy Data Library, a Hadoop-based data lake that contains batch data as well as real-time data (stored in HBase, a distributed nonrelational database structure) from sources such as Mercy’s ERP and electronic health record (EHR) systems. According to Paul Boal, director of data engineering and analytics at Mercy, “The blending of base batch data and real-time updates happens on demand when a query is run against the system.” Mercy’s new Hadoop environment, which contains information on more than 8 million patients, holds over 40 terabytes of data housed on 41 servers spread out over four clusters.
Outside of improving patient care, a primary motive for the move to Hadoop was to improve Mercy’s administrative efficiency, particularly in the areas of medical documentation and claims generation. Ensuring that physicians, nurses, and lab staff complete the necessary documentation for a patient prior to discharge improves the chances that the hospital will generate an accurate and complete claim-reimbursement request. Prior to its Hadoop implementation, the health system had already initiated an automatic-documentation-review process. Now, Mercy plans to make use of real-time data along with the power of Hadoop to further improve upon this process. For instance, documentation specialists can generate reports that help them follow up with physicians regarding missing documentation during each morning’s clinical rounds. The hospital expects the new system will generate more than $1 million annually in new revenue based on claims that accurately reflect hospital patients’ diagnoses and treatment.
Mercy is also focusing the power of its new technology on areas directly related to clinical care. “What we’re building out is a real-time clinical applications platform, so we’re looking for other opportunities to turn that into decision support,” says Boal. One such project involves leveraging the Hadoop environment to make better use of data generated by the electronic monitors in the intensive care units (ICUs) across the health system. Mercy now gathers 900 times more detailed data from its ICUs than it did before its implementation of Hadoop. The previous database system was only capable of pulling vital sign information for Mercy’s most critically ill patients every 15 minutes; the new system can do it once every second. The goal is to use the real-time data for better analysis, such as refining the health system’s predictive models on the early-warning signs of life-threatening medical problems in the ICU setting.
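The 900-fold figure follows directly from the two sampling intervals quoted in the case. As a quick check (a sketch only; the variable names are illustrative, and the daily count assumes exactly one reading per poll):

```python
# Sampling intervals quoted in the case, expressed in seconds.
old_interval_s = 15 * 60   # previous system: one reading every 15 minutes
new_interval_s = 1         # new system: one reading every second

# Ratio of readings collected over the same period of time.
increase = old_interval_s // new_interval_s

# Assumption: one reading per poll, so this is readings per monitored signal per day.
readings_per_day = 24 * 60 * 60 // new_interval_s

print(increase)          # 900
print(readings_per_day)  # 86400
```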
Like all healthcare providers, Mercy is required to maintain an audit trail for its EHR system. The audit trail keeps track of everyone who accesses any piece of patient information via the EHR. In addition to satisfying this regulatory requirement, Mercy expects Hadoop will help it put that audit trail data to a new use: analyzing staff behavior patterns and developing a better understanding of how processes actually get done. And in another Hadoop-related project, lab staff are now able to quickly search through terabytes’ worth of lab notes that were previously inaccessible.
Critical Thinking Questions:
1. One of the advantages of a Hadoop implementation is that it provides a high level of computing redundancy. Why is that particularly important in a healthcare setting?
2. Explain how the three characteristics of big data (volume, velocity, and variety) apply to the data being collected by healthcare providers such as Mercy.
3. How might Mercy benefit from an enterprise data model? Does Mercy’s move into big data make it more or less important that it have a clearly developed model that states the organization’s data needs and priorities?
SOURCES: “Transforming the Health of Our Communities,” Mercy, www.mercy.net/about/transforming-the-health-of-our-communities, accessed December 16, 2015; “Towards a Healthcare Data Lake: Hadoop at Mercy,” Hortonworks, http://hortonworks.com/customers, accessed December 16, 2015; Wilson, Linda, “Mercy’s Big Data Project Aims to Boost Operations,” Information Management, October 7, 2015, www.information-management.com/news/big-data-analytics/mercy-health-big-data-project-aims-to-boost-operations-10027562-1.html; Perna, Gabriel, “Moving Data Down I-44 and Making it Actionable,” Healthcare Informatics, May 13, 2015, www.healthcare-informatics.com/article/moving-data-down-i-44-and-making-it-actionable; Henschen, Doug, “Hadoop’s Growing Enterprise Presence Demonstrated by Three Innovative Use Cases,” ZDNet, September 10, 2015, www.zdnet.com/article/hadoop-growing-enterprise-presence-demonstrated-by-three-innovative-use-cases; “Handling Electronic Health Records (EHR) Access Logs with Hadoop,” StampedeCon, http://stampedecon.com/sessions/handling-access-logs-with-hadoop, accessed December 17, 2015.
Notes
1. Zettel, Jonathan, “Offshore Leaks Database Allows Public to Search Offshore Tax Haven Info,” CTV News, June 15, 2013, www.ctvnews.ca/business/offshore-leaks-database-allows-public-to-search-offshore-tax-haven-info-1.1327088.
2. “FAQ on Lost/Stolen Devices,” CTIA Wireless Association, www.ctia.org/your-wireless-life/consumer-tips/how-to-deter-smartphone-thefts-and-protect-your-data/faq-on-lost-stolen-devices, accessed September 4, 2015.
3. “Overview of the GTD,” Global Terrorism Database, www.start.umd.edu/gtd/about, accessed September 4, 2015.
4. Hibbard, Katharine, “New Resource Available in Recovering Stolen Property,” Leads Online, August 14, 2015, www.leadsonline.com/main/news/2015-news-archive/new-resource-available-in-recovering-stolen-property.php.
5. “I-Nexus Selects AppDynamics to Drive Continuous Performance Improvement,” AppDynamics, www.appdynamics.com/case-study/inexus, accessed September 6, 2015.
6. “IBM Healthcare Provider Data Model,” IBM, www-03.ibm.com/software/products/en/healthcare-provider-data-model, accessed September 6, 2015.
7. “Complete Data Integration from Legacy Systems and Epic, In Half the Time,” Perficient, www.perficient.com/About/Case-Studies/2014/UNC-Health-Care-System-Completes-Data-Integration-from-Legacy-Systems-and-Epic-In-Half-the-Time, accessed September 6, 2015.
8. “Banco Popular,” Trillium Software, www.trilliumsoftware.com/uploadedFiles/Banco_CS_2010_screen.pdf, accessed September 7, 2015.
9. Proffitt, Brian, “FoundationDB’s NoSQL Breakthrough Challenges Relational Database Dominance,” Read Write, March 8, 2013, http://readwrite.com/2013/03/08/foundationdbs-nosql-breakthrough-challenges-relational-database-dominance#awesm=~oncfIkqw3jiMOJ.
10. Golden, Matt, “New DOE Effort to Standardize the Energy Efficiency Data Dictionary,” EDF Blogs, August 8, 2013, http://blogs.edf.org/energyexchange/2013/08/08/new-doe-effort-to-standardize-the-energy-efficiency-data-dictionary.
11. “Three Charged for Largest-Ever Bank Data Breach,” CBS News, November 10, 2015, www.cbsnews.com/news/three-charged-for-jpmorgan-data-breach-the-largest-ever.
12. “About TinyCo,” TinyCo, www.tinyco.com/about-us, accessed September 8, 2015.
13. “AWS Case Study: TinyCo,” Amazon Web Services, http://aws.amazon.com/solutions/case-studies/tinyco, accessed September 8, 2015.
14. Laney, Doug, “3D Data Management: Controlling Data Volume, Velocity, and Variety,” META Group, February 6, 2001, http://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf.
15. Turner, Vernon, David Reinsel, John F. Gantz, and Stephen Minton, “The Digital Universe of Opportunities: Rich Data and the Increasing Value of the Internet of Things,” EMC2, April 2014, www.emc.com/collateral/analyst-reports/idc-digital-universe-2014.pdf.
16. “Seminars about Long-Term Thinking,” The Long Now Foundation, http://longnow.org/seminars/02013/mar/19/no-time-there-digital-universe-and-why-things-appear-be-speeding, accessed November 8, 2013.
17. Rosenbaum, Steven, “Is It Possible to Analyze Digital Data If It’s Growing Exponentially?” Fast Company, January 13, 2013, www.fastcompany.com/3005128/it-possible-analyze-digital-data-if-its-growing-exponentially.
18. Ibid.
19. Krantz, Matt, “Lack of Accuracy Can Wreak Havoc on Stock Market,” USA Today-The Enquirer, September 12, 2015, p. 6B.
20. Harris, Derrick, “Why Apple, eBay, and Walmart Have Some of the Biggest Data Warehouses You’ve Ever Seen,” GIGAOM, March 27, 2013, https://gigaom.com/2013/03/27/why-apple-ebay-and-walmart-have-some-of-the-biggest-data-warehouses-youve-ever-seen/.
21. Davenport, Thomas H. and Jill Dyché, “Big Data in Big Companies,” International Institute for Analytics, www.sas.com/reg/gen/corp/2266746, accessed April 1, 2015.
22. “Yomari Customer Success Stories: Orscheln Farm & Home,” Yomari, www.yomari.com/clients/orscheln-farm-home-case-study.php, accessed November 12, 2015.
23. Davenport, Thomas H. and Jill Dyché, “Big Data in Big Companies,” International Institute for Analytics, www.sas.com/reg/gen/corp/2266746, accessed April 1, 2015.
24. “Customer Story: Criteo Boosts Performance, Scale of Digital Ad Platform with Couchbase Server,” Couchbase, www.couchbase.com/case-studies/criteo.html, accessed September 19, 2015.
25. Henschen, Doug, “Defending NSA Prism’s Big Data Tools,” InformationWeek, June 11, 2013, www.informationweek.com/big-data/big-data-analytics/defending-nsa-prisms-big-data-tools/d/d-id/1110318?.
26. “AWS Case Study: MLB Advanced Media,” Amazon Web Services, aws.amazon.com/solutions/case-studies/major-league-baseball-mlbam, accessed November 12, 2015.
27. Dvorkin, Eugene, “Scalable Big Data Stream Processing with Storm and Groovy,” November 4, 2014, www.slideshare.net/SpringCentral/storm-twtterwebmd.
28. “Press Release: WebMD Medscape,” Newswire, April 24, 2014, www.multivu.com/mnr/7040259-medscape-launches-new-medpulse-app-for-iphone-and-ipad.
29. Brocke, Jan vom, “In-Memory Database Business Value,” Business Innovation, July 25, 2013, www.business2community.com/business-innovation/in-memory-database-business-value-0564636.
30. “Oracle Press Release: KDDI Selects Oracle SuperCluster to Strengthen Authentication System for Mobile Core Network and Support Rapid Data Growth,” Oracle, January 22, 2014, www.oracle.com/us/corporate/press/2111600.
CHAPTER
6 Networks and Cloud Computing
Macrovector/Shutterstock.com
Did You Know?
• Auto insurers are testing usage-based insurance programs in which premiums are based on data gathered from a device installed in the insured auto. Instead of being set according to traditional factors such as the driver’s age and gender, the premium is based on miles driven per year, where and when the insured drives, and how safely he or she drives. The goal is to charge a premium that is commensurate with the risk associated with the auto owner’s driving habits.
• Sensors embedded in General Electric (GE) aircraft engines collect some 5,000 individual data points per second. This data is analyzed while the aircraft is in flight to adjust the way the aircraft performs, thereby reducing fuel consumption. The data is also used to plan predictive maintenance on the engines based on engine component wear and tear. In 2013, this technology helped GE earn $1 billion in incremental income by delivering performance improvements, less downtime, and more flying miles.
• It is estimated that in just one year, mobile operators lost $23 billion in revenue as teens shifted away from texting over cellular networks in favor of communicating with their friends over the Internet using instant messaging apps.1
Principles Learning Objectives
• A network has many fundamental components, which, when carefully selected and effectively integrated, enable people to meet personal and organizational objectives.
• Identify and briefly describe three network topologies and four different network types, including the uses and limitations of each.
• Identify and briefly discuss several types of both guided and wireless communications.
• Identify several network hardware devices and define their functions.
• Together, the Internet and the World Wide Web provide a highly effective infrastructure for delivering and accessing information and services.
• Briefly describe how the Internet and the Web work, including various methods for connecting to the Internet.
• Outline the process and tools used in developing Web content and applications.
• List and briefly describe several Internet and Web applications.
• Explain how intranets and extranets use Internet technologies, and describe how the two differ.
• Organizations are using the Internet of Things (IoT) to capture and analyze streams of sensor data to detect patterns and anomalies—not after the fact, but while they are occurring—in order to have a considerable impact on the event outcome.
• Define what is meant by the Internet of Things (IoT), and explain how it works.
• Identify and briefly discuss several practical applications of the Internet of Things (IoT).
• Categorize and summarize several potential issues and barriers associated with the expansion of the Internet of Things (IoT).
• Cloud computing provides access to state-of-the-art technology at a fraction of the cost of ownership and without the lengthy delays that can occur when an organization tries to acquire its own resources.
• Discuss how cloud computing can increase the speed and reduce the costs of new product and service launches.
• Summarize three common problems organizations encounter in moving to the cloud.
• Discuss the pros and cons of private and hybrid cloud computing compared to public cloud computing.
Why Learn about Networks and Cloud Computing?
Today’s decision makers need to access data wherever it resides. They must be able to establish fast, reliable connections to exchange messages, upload and download data and software, route business transactions to processors, connect to databases and network services, and send output to wherever it is needed. Regardless of your chosen major or future career field, you will make use of the communications capabilities provided by networks, including the Internet, intranets, and extranets. This is especially true for those whose role is connected to the supply chain and who rely heavily on networks to support cooperation and communication among workers in inbound logistics, warehouse and storage, production, finished product storage, outbound logistics, and, most importantly, with customers, suppliers, and shippers. Many supply chain organizations make use of the Internet to purchase raw materials, parts, and supplies at competitive prices. All members of the supply chain must work together effectively to increase the value perceived by the customer, so partners must communicate well. Other employees in human resources, finance, research and development, marketing, manufacturing, and sales positions must also use communications technology to communicate with people inside and outside the organization. To be a successful member of any organization, you must be able to take advantage of the capabilities that these technologies offer you. This chapter begins by discussing the importance of effective communications.
As you read this chapter, consider the following:
• How are organizations using networks to support their business strategies and achieve organizational objectives?
• What benefits do search engines, social networks, and other Internet services provide to make organizations successful?
In today’s high-speed global business world, organizations need always-on, always-connected computing for traveling employees and for network connections to their key business partners and customers. Forward-thinking organizations strive to increase revenue, reduce time to market, and enable collaboration with their suppliers, customers, and business partners by using networks. Here are just a few examples of organizations using networks to move ahead:
• Many retail organizations are launching their own mobile payment systems in the hope of reducing payments to financial services organizations while also increasing customer loyalty. Some of these new systems include Android Pay, Apple Pay, Chase Pay, PayPal, Paydiant, Samsung Pay, Urban Airship, and Walmart Pay.2
• Networks make it possible for you to access a wealth of educational material and earn certifications or an online degree. A wide range of courses are available online from such leading educational institutions as Cornell, Carnegie Mellon, Harvard, MIT, and Yale. Many educational organizations such as Coursera, ed2go, and Khan Academy offer continuing education, certification programs, and professional development courses. Hundreds of schools such as DeVry, Kaplan University, University of Phoenix, and Strayer University enable students to earn online degrees.
• Levi’s Stadium, home of the San Francisco 49ers, is deploying new wireless technology to make it easier for fans to use a special stadium navigation app on their smartphones and other devices. With the app, fans can watch instant replays and order food directly from their mobile devices.3
• Telemedicine provides remote access to a physician via a network (typically via a phone or videoconference) to address a healthcare issue. Its use has become well established in rural areas for specialty consultations and even many primary care practices like pediatrics. There are currently about 200 telemedicine networks, with 3,500 service sites in the United States alone.4
Advances in network technology allow us to communicate in real time with customers, clients, business partners, and coworkers almost anywhere in the world. Networks also reduce the amount of time required to transmit information necessary for driving and concluding business transactions.
Network Fundamentals
A computer network consists of communications media, devices, and software connecting two or more computer systems or devices. A communications medium is any material substance that carries an electronic signal to support communications between a sending and a receiving device. The computers and devices on the networks are also sometimes called network nodes. Organizations can use networks to share hardware, programs, and databases and to transmit and receive information, allowing for improved organizational effectiveness and efficiency. Networks enable geographically separated workgroups to share documents and opinions, which fosters teamwork, innovative ideas, and new business strategies. Effective use of networks can help a company grow into an agile, powerful, and creative organization, giving it a long-term competitive advantage.
Network Topology
Network topology is the shape or structure of a network, including the arrangement of the communication links and hardware devices on the network. The transmission rates, distances between devices, signal types, and physical interconnection may differ between networks, but they may all have the same topology. The three most common network topologies in use today are the star, bus, and mesh.
In a star network, all network devices connect to one another through a single central device called the hub node. See Figure 6.1. Many home networks employ the star topology. A failure in any link of the star network will isolate only the device connected to that link. However, should the hub fail, all devices on the entire network will be unable to communicate.
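The failure behavior of a star network can be illustrated with a toy model. The sketch below is an assumption-laden simplification: the StarNetwork class, its attributes, and the device names are hypothetical, and real hubs and links fail in more varied ways.

```python
class StarNetwork:
    """Toy model of a star topology: every device has one link to the hub."""

    def __init__(self, devices):
        self.hub_up = True
        # True means the link between that device and the hub is working.
        self.link_up = {name: True for name in devices}

    def can_communicate(self, a, b):
        """Two devices can talk only if the hub and both of their links work."""
        return self.hub_up and self.link_up[a] and self.link_up[b]


net = StarNetwork(["laptop", "printer", "tablet"])

net.link_up["printer"] = False   # a single link fails
laptop_tablet = net.can_communicate("laptop", "tablet")     # other devices unaffected
laptop_printer = net.can_communicate("laptop", "printer")   # only the printer is isolated

net.hub_up = False               # the hub itself fails
after_hub_failure = net.can_communicate("laptop", "tablet")

print(laptop_tablet, laptop_printer, after_hub_failure)  # True False False
```

The model makes the trade-off visible: a link failure is contained to one device, while the hub is a single point of failure for the whole network.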
In a bus network, all network devices are connected to a common backbone that serves as a shared communications medium. See Figure 6.2. To communicate with any other device on the network, a device sends a broadcast message onto the communications medium. All devices on the network can “see” the message, but only the intended recipient actually accepts and processes the message.
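The broadcast-and-filter behavior of a bus network can be sketched in a few lines. The Bus and BusDevice classes and the single-letter addresses below are hypothetical names chosen for illustration. The sketch ignores collisions and other details that real bus networks must handle.

```python
class BusDevice:
    """A device attached to the shared backbone."""

    def __init__(self, address):
        self.address = address
        self.seen = []       # every message observed on the shared medium
        self.accepted = []   # only the messages addressed to this device

    def on_broadcast(self, destination, payload):
        self.seen.append(payload)
        if destination == self.address:
            self.accepted.append(payload)


class Bus:
    """Toy shared backbone: each message reaches every other attached device."""

    def __init__(self):
        self.devices = []

    def attach(self, device):
        self.devices.append(device)

    def send(self, sender, destination, payload):
        for device in self.devices:
            if device is not sender:
                device.on_broadcast(destination, payload)


bus = Bus()
a, b, c = BusDevice("A"), BusDevice("B"), BusDevice("C")
for device in (a, b, c):
    bus.attach(device)

bus.send(a, "C", "hello")

print(b.seen, b.accepted)  # ['hello'] []
print(c.seen, c.accepted)  # ['hello'] ['hello']
```

Device B sees the message on the medium but discards it, while device C, the intended recipient, accepts it.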
FIGURE 6.1 Star network In a star network, all network devices connect to one another through a single central hub node.
computer network: The communications media, devices, and software connecting two or more computer systems or devices.
communications medium: Any material substance that carries an electronic signal to support communications between a sending and a receiving device.
network topology: The shape or structure of a network, including the arrangement of the communication links and hardware devices on the network.
star network: A network in which all network devices connect to one another through a single central device called the hub node.
bus network: A network in which all network devices are connected to a common backbone that serves as a shared communications medium.
Vladru/Shutterstock.com
Mesh networks use multiple access points to link a series of devices that speak to each other to form a network connection across a large area. See Figure 6.3. Communications are routed among network nodes by allowing for continuous connections and by bypassing blocked paths by “hopping” from node to node until a connection can be established. Mesh networks are very robust: if one node fails, all the other nodes can still communicate with each other, directly or through one or more intermediate nodes.
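The idea of “hopping” from node to node around a failed node can be sketched as a path search over the network's links. The example below uses breadth-first search, which is one simple way to find a route and is not necessarily how any particular mesh product routes traffic. The find_route function, the five-node layout, and the node names are all hypothetical.

```python
from collections import deque


def find_route(links, source, target, failed=frozenset()):
    """Breadth-first search for a hop-by-hop route that avoids failed nodes.

    links maps each node to the set of nodes it can reach directly.
    Returns the list of nodes on a shortest route, or None if none exists.
    """
    if source in failed or target in failed:
        return None
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            route = []
            while node is not None:
                route.append(node)
                node = previous[node]
            return route[::-1]
        for neighbor in sorted(links[node]):
            if neighbor not in failed and neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical five-node mesh in which every node has at least two links.
mesh = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D", "E"},
    "D": {"B", "C", "E"},
    "E": {"C", "D"},
}

normal = find_route(mesh, "A", "E")
rerouted = find_route(mesh, "A", "E", failed={"C"})
cut_off = find_route(mesh, "A", "E", failed={"C", "D"})

print(normal)    # ['A', 'C', 'E']
print(rerouted)  # ['A', 'B', 'D', 'E']
print(cut_off)   # None
```

With one failed node the traffic simply takes a longer route through intermediate nodes. Only when enough nodes fail to remove every path, as in the last call, do two nodes lose contact.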
Saudi Telecom Company (STC) is the largest communications services provider in the Middle East and North Africa. STC recently deployed a mesh network that ensures network survivability even in the case of multiple outages with restoration of service within 50 milliseconds. The mesh network offers STC customers faster transmission speeds and improved network reliability.5
Network Types
A network can be classified as a personal area, local area, metropolitan area, or wide area network depending on the physical distance between the nodes on the network and the communications and services it provides.
FIGURE 6.2 Bus network In a bus network, all network devices are connected to a common backbone that serves as a shared communications medium.
FIGURE 6.3 Mesh network Mesh networks use multiple access points to link a series of devices that speak to each other to form a network connection across a large area.
mesh network: A network that uses multiple access points to link a series of devices that speak to each other to form a network connection across a large area.
Mamanamsai/Shutterstock.com
Vladru/Shutterstock.com
Personal Area Networks
A personal area network (PAN) is a wireless network that connects information technology devices close to one person. With a PAN, you can connect a laptop, digital camera, and portable printer without cables. You can download digital image data from the camera to the laptop and then print it on a high-quality printer, all wirelessly. A PAN could also be used to enable data captured by sensors placed on your body to be transmitted to your smartphone as input to applications that can serve as calorie trackers, heart monitors, glucose monitors, and pedometers.
Local Area Networks
A network that connects computer systems and devices within a small area, such as an office, home, or several floors in a building, is a local area network (LAN). Typically, LANs are wired into office buildings and factories, as shown in Figure 6.4. Although LANs often use unshielded twisted-pair copper wire, other media, including fiber-optic cable, are also popular. Increasingly, LANs use some form of wireless communications. You can build LANs to connect personal computers, laptop computers, or powerful mainframe computers.
A basic type of LAN is a simple peer-to-peer network that a small business might use to share files and hardware devices, such as printers. In a peer-to-peer network, you set up each computer as an independent computer, but you let other computers access specific files on its hard drive or share its printer. These types of networks have no server. Instead, the computers connect directly to one another. Examples of peer-to-peer networks include ANts, BitTorrent, StealthNet, Tixati, and Windows 10 HomeGroup. Performance of the computers on a peer-to-peer network is usually slower because one computer is actually sharing the resources of another computer.
Increasingly, home and small business networks are being set up to connect computers, printers, scanners, and other devices. A person working on one computer on a home network, for example, can use data and programs stored on another computer’s hard disk. In addition, several computers on the network can share a single printer.
FIGURE 6.4 Typical LAN All network users within an office building can connect to each other’s devices for rapid communication. For instance, a user in research and development could send a document from her computer to be printed at a printer located in the desktop publishing center. Most computer labs employ a LAN to enable the users to share the use of high-speed and/or color printers and plotters as well as to download software applications and save files.
[Figure: executive; production center; marketing and sales; research and development; finance and accounting; and copy center, printing, and desktop publishing computers and devices, all connected by the LAN.]
personal area network (PAN): A network that supports the interconnection of information technology devices close to one person.
local area network (LAN): A network that connects computer systems and devices within a small area, such as an office, home, or several floors in a building.
Metropolitan Area Networks A metropolitan area network (MAN) is a network that connects users and their computers in a geographical area that spans a campus or city. A MAN might redefine the many networks within a city into a single larger network or connect several LANs into a single campus MAN. Often, the MAN is owned either by a consortium of users or by a single network provider who sells the service to users. PIONIER is a Polish national research and education network created to provide high-speed Internet access and to conduct network-based research. The network connects 21 MANs and 5 high-performance computing centers using 6,467 km of fiber optic transmission media.6
Wide Area Networks A wide area network (WAN) is a network that connects large geographic regions. A WAN might be privately owned or rented and includes public (shared-users) networks. When you make a long-distance phone call or access the Internet, you are using a WAN. WANs usually consist of computer equipment owned by the user, together with data communications equipment and network links provided by various carriers and service providers.
WANs often provide communications across national borders, which involves national and international laws regulating the electronic flow of data across international boundaries, often called transborder data flow. Some countries have strict laws limiting the use of networks and databases, making normal business transactions such as payroll processing costly, slow, or extremely difficult.
Client/Server Systems In client/server architecture, multiple computer platforms are dedicated to special functions, such as database management, printing, communications, and program execution. These platforms are called servers. Each server is accessible by all computers on the network. Servers can be computers of all sizes; they store both application programs and data files and are equipped with operating system software to manage the activities of the network. The server distributes programs and data to the other computers (clients) on the network as they request them. An application server holds the programs and data files for a particular application, such as an inventory database. The client or the server might do the actual data processing.
A client is any computer (often a user’s personal computer) that sends messages requesting services from the servers on the network. A client can converse with many servers concurrently. Consider the example of a user at a personal computer who initiates a request to extract data that resides in a database somewhere on the network. A data request server intercepts the request and determines on which database server the data resides. The server then formats the user’s request into a message that the database server will understand. When it receives the message, the database server extracts and formats the requested data and sends the results to the client. The database server sends only the data that satisfies a specific query—not the entire file. When the downloaded data is on the user’s machine, it can then be analyzed, manipulated, formatted, and displayed by a program that runs on the user’s personal computer.
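The request flow just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names DatabaseServer and DataRequestServer, the inventory records, and the dictionary-based request format are all hypothetical and do not come from any real product.

```python
# Illustrative sketch only: DatabaseServer, DataRequestServer, and the
# request format are hypothetical names invented for this example.

class DatabaseServer:
    """Holds records and answers queries with the matching rows only."""

    def __init__(self, name, records):
        self.name = name
        self.records = records

    def query(self, field, value):
        # Return only the rows that satisfy the query, never the whole file.
        return [row for row in self.records if row.get(field) == value]


class DataRequestServer:
    """Intercepts client requests and routes them to the right database server."""

    def __init__(self):
        self.catalog = {}  # table name -> DatabaseServer

    def register(self, table, server):
        self.catalog[table] = server

    def handle(self, request):
        # Determine which database server holds the requested data.
        server = self.catalog[request["table"]]
        # Reformat the request into the call the database server understands.
        return server.query(request["field"], request["value"])


inventory = DatabaseServer("inventory-db", [
    {"item": "router", "qty": 4},
    {"item": "switch", "qty": 0},
    {"item": "modem", "qty": 4},
])
front = DataRequestServer()
front.register("inventory", inventory)

# The client receives only the matching rows and processes them locally.
rows = front.handle({"table": "inventory", "field": "qty", "value": 4})
items = sorted(row["item"] for row in rows)
print(items)
```

The final two lines stand in for the work done on the user's personal computer: the downloaded rows are sorted and displayed by a program running on the client.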
Channel Bandwidth Network professionals consider the capacity of the communications path or channel when they recommend transmission media for a network. Channel bandwidth refers to the rate at which data is exchanged, usually measured in bits per second (bps)—the broader the bandwidth, the more information can
metropolitan area network (MAN): A network that connects users and their computers in a geographical area that spans a campus or city.
wide area network (WAN): A network that connects large geographic regions.
client/server architecture: An approach to computing wherein multiple computer platforms are dedicated to special functions, such as database management, printing, communications, and program execution.
channel bandwidth: The rate at which data is exchanged, usually measured in bits per second (bps).
be exchanged at one time. In the context of Internet access, the term broadband communications refers to any high-speed Internet access that is always on and that is faster than traditional dial-up access. Most organizations need high bandwidth to accommodate the transaction volume and transmission speed required to carry out their daily functions.
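A short calculation can make the effect of bandwidth concrete. The sketch below computes the ideal time to move a file over channels of different capacity. The 100 MB file size and the three sample rates are assumptions chosen for illustration, and the result ignores protocol overhead, congestion, and latency, so real transfers take longer.

```python
def transfer_time_seconds(size_bytes: float, bandwidth_bps: float) -> float:
    """Ideal time to move size_bytes over a channel of bandwidth_bps bits per second."""
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth must be positive")
    return size_bytes * 8 / bandwidth_bps  # 8 bits per byte


FILE_SIZE = 100 * 10**6  # a hypothetical 100 MB file

# Sample rates are illustrative assumptions, not figures from the text.
SAMPLE_CHANNELS = [
    ("dial-up at 56 kbps", 56_000),
    ("broadband at 25 Mbps", 25_000_000),
    ("link at 10 Gbps", 10_000_000_000),
]

for label, bps in SAMPLE_CHANNELS:
    print(f"{label}: {transfer_time_seconds(FILE_SIZE, bps):,.2f} s")
```

Because the time is inversely proportional to the bandwidth, each tenfold increase in channel capacity cuts the ideal transfer time by a factor of ten.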
Communications Media The communications media selected for a network depend on the amount of information to be exchanged, the speed at which data must be exchanged, the level of concern about data privacy, whether the users are stationary or mobile, and a variety of business requirements. Transmission media can be divided into two broad categories: guided (also called wired) transmission media, in which communications signals are guided along a solid medium, and wireless, in which the communications signal is broadcast over airwaves as a form of electromagnetic radiation.
Guided Transmission Media Types There are many different guided transmission media types. Table 6.1 summarizes the guided media types by physical media form. The three most common guided transmission media types are shown in Figure 6.5.
10-Gigabit Ethernet is a standard for transmitting data at the speed of 10 billion bps for limited distances over high-quality twisted-pair wire. The 10-Gigabit Ethernet cable can be used for the high-speed links that connect groups of computers or to move data stored in large databases on large computers to stand-alone storage devices.
Chi-X Japan provides investors with an alternative venue for trading in Tokyo-listed stocks. Its goal is to attract new international investors, in turn,
TABLE 6.1 Guided transmission media types
Media Form | Description | Advantages | Disadvantages
Twisted-pair wire | Twisted pairs of copper wire, shielded or unshielded; used for telephone service | Widely available | Limitations on transmission speed and distance
Coaxial cable | Inner conductor wire surrounded by insulation | Cleaner and faster data transmission than twisted-pair wire | More expensive than twisted-pair wire
Fiber-optic cable | Many extremely thin strands of glass bound together in a sheathing; uses light beams to transmit signals | Diameter of cable is much smaller than coaxial cable; less distortion of signal; capable of high transmission rates | Expensive to purchase and install
FIGURE 6.5 Types of guided transmission media Common guided transmission media include twisted-pair wire, coaxial cable, and fiber-optic cable.
broadband communications: High-speed Internet access that is always on and that is faster than traditional dial-up access.
increasing overall Japanese market volumes, reducing transaction costs, and improving investment performance.7 The firm implemented 10 Gbps Ethernet network adapters to upgrade its network and ensure customers minimal transaction processing delays.
Wireless Technologies Wireless communications coupled with the Internet are revolutionizing how and where we gather and share information, collaborate in teams, listen to music or watch video, and stay in touch with our families and coworkers while on the road. With wireless capability, a coffee shop can become our living room and the bleachers at a ballpark can become our office. The many advantages and freedom provided by wireless communications are causing many organizations to consider moving to an all-wireless environment.
Wireless communication is the transfer of information between two or more points that are not connected by an electrical conductor. All wireless communications signals are sent within a range of frequencies of the electromagnetic spectrum that represents the entire range of light that exists from long waves to gamma rays as shown in Figure 6.6.
The propagation of light is similar to waves crossing an ocean. Like any other wave, light has two fundamental properties that describe it. One is its frequency, measured in hertz (Hz), which counts the number of waves that pass by a stationary point in one second. The second fundamental property is wavelength, which is the distance from the peak of one wave to the peak of the next. These two attributes are inversely related so the higher the frequency, the shorter the wavelength.
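The inverse relationship between frequency and wavelength can be checked numerically. The sketch below relies on the standard physics relation that wavelength equals the speed of light divided by frequency, which the text does not state explicitly. The two sample frequencies (an FM radio station near 100 MHz and a 2.4 GHz wireless signal) are illustrative choices.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # speed of light in a vacuum


def wavelength_m(frequency_hz: float) -> float:
    """Wavelength in meters of an electromagnetic wave at the given frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return SPEED_OF_LIGHT_M_PER_S / frequency_hz


fm_radio = wavelength_m(100e6)   # about 3 meters
wireless = wavelength_m(2.4e9)   # about 12.5 centimeters

print(f"100 MHz: {fm_radio:.2f} m")
print(f"2.4 GHz: {wireless * 100:.1f} cm")
```

Raising the frequency by a factor of 24 shortens the wavelength by the same factor, which is what "inversely related" means here.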
All wireless communication devices operate in a similar way. A transmitter generates a signal, which contains encoded voice, video, or data at a specific frequency, that is broadcast into the environment by an antenna. This signal spreads out in the environment, with only a very small portion being captured by the antenna of the receiving device, which then decodes the information. Depending on the distance involved, the frequency of the transmitted signal, and other conditions, the received signal can be incredibly weak, perhaps one trillionth of the original signal strength.
The signals used in wireless networks are broadcast in one of three frequency ranges: microwave, radio, and infrared, as shown in Table 6.2.
Because there are so many competing uses for wireless communication, strict rules are necessary to prevent one type of transmission from interfering with the next. And because the spectrum is limited (there are only so many frequency bands), governments must oversee appropriate licensing of this valuable resource to facilitate use in all bands. In the United States, the Federal Communications Commission (FCC) decides which frequencies of the communications spectrum can be used for which purposes. For example, the portion of the electromagnetic spectrum between 700 MHz and 2.6 GHz has been allocated for use by mobile phones. Most of the spectrum in this range has already been allocated for use. This means that when a wireless company wants to add more spectrum to its service to boost its capacity, it may have problems obtaining the necessary licenses because other companies are already using the available frequencies.
Some of the more widely used wireless communications options are discussed next.
Near field communication (NFC) is a very short-range wireless connectivity technology that enables two devices placed within a few inches of each other to exchange data. With NFC, consumers can swipe their credit cards, or even their smartphones, within a few inches of NFC point-of-sale
wireless communication: The transfer of information between two or more points that are not connected by an electrical conductor.
near field communication (NFC): A very short-range wireless connectivity technology that enables two devices placed within a few inches of each other to exchange data.
terminals to pay for purchases. Apple Pay, the mobile payment and digital wallet service that lets users make payments using an iPhone, an iPad, or an Apple Watch–compatible device, uses NFC to communicate between the user’s device and the point-of-sale terminal.
Many retailers, including Target, Macy’s, and Walgreens, already have NFC-based contactless pay terminals in place. Shoppers in these stores can also use their smartphones and NFC to gain access to loyalty programs to earn points, view marketing information, and share content and interact with brands via social media.
Bluetooth is a wireless communications specification that describes how cell phones, computers, printers, and other electronic devices can be interconnected over distances of 10 to 30 feet at a transmission rate of about 2 Mbps.
FIGURE 6.6 The electromagnetic spectrum The range of all possible frequencies of electromagnetic radiation. Source: https://upload.wikimedia.org/wikipedia/commons/2/25/Electromagnetic-Spectrum.svg
[Figure: frequency (Hz) and wavelength axes; bands run from long waves, AM, FM, VHF, and UHF radio and TV through microwaves and radar, infrared (far, thermal, near), visible light, ultraviolet, and X-rays to gamma rays.]
Bluetooth: A wireless communications specification that describes how cell phones, computers, faxes, printers, and other electronic devices can be interconnected over distances of 10 to 30 feet at a rate of about 2 Mbps.
Using Bluetooth technology, users of multifunctional devices can synchronize data on their device with information stored in a desktop computer, send or receive faxes, and print. The Bluetooth G-Shock watch enables you to make a connection between your watch and your smartphone. With a G-Shock watch, you can control your phone’s music player from the watch and the watch’s timekeeping functions from your phone.
Wi-Fi is a wireless network brand owned by the Wi-Fi Alliance, which consists of about 300 technology companies, including AT&T, Dell, Microsoft, Nokia, and Qualcomm. The alliance exists to improve the interoperability of wireless local area network products based on the IEEE 802.11 series of communications standards. IEEE stands for the Institute of Electrical and Electronics Engineers, a nonprofit organization and one of the leading standards-setting organizations. Table 6.3 summarizes several variations of the IEEE 802.11 standard.
In a Wi-Fi wireless network, the user’s computer, smartphone, or other mobile device has a wireless adapter that translates data into a radio signal and transmits it using an antenna. A wireless access point, which consists of a transmitter with an antenna, receives the signal and decodes it. The access point then sends the information to the Internet over a wired connection.
TABLE 6.2 Frequency ranges used for wireless communications
Technology | Description | Advantages | Disadvantages
Radio frequency range | Operates in the 3 kHz–300 MHz range | Supports mobile users; costs are dropping | Signal is highly susceptible to interception
Microwave (terrestrial and satellite) frequency range | High-frequency radio signal (300 MHz–300 GHz) sent through the atmosphere and space (often involves communications satellites) | Avoids cost and effort to lay cable or wires; capable of high-speed transmission | Must have unobstructed line of sight between sender and receiver; signal is highly susceptible to interception
Infrared frequency range | Signals in the 300 GHz–400 THz frequency range | Lets you move, remove, and install devices without expensive wiring | Must have unobstructed line of sight between sender and receiver; transmission is effective only for short distances
TABLE 6.3 IEEE 802.11 wireless local area networking standards
Wireless Networking Protocol | Maximum Data Rate per Data Stream | Comments
IEEE 802.11a | 54 Mbps | Transmits at 5 GHz, which means it is incompatible with 802.11b and 802.11g
IEEE 802.11b | 11 Mbps | First widely accepted wireless network standard and transmits at 2.4 GHz; equipment using this protocol may occasionally suffer from interference from microwave ovens, cordless telephones, and Bluetooth devices
IEEE 802.11g | 54 Mbps | Equipment using this protocol transmits at 2.4 GHz and may occasionally suffer from interference from microwave ovens, cordless telephones, and Bluetooth devices
IEEE 802.11n | 300 Mbps | Employs multiple input, multiple output (MIMO) technology, which allows multiple data streams to be transmitted over the same channel using the same bandwidth that is used for only a single data stream in 802.11a/b/g
IEEE 802.11ac | 400 Mbps–1.3 Gbps | An 802.11 standard that provides higher data transmission speeds and more stable connections; it can transmit at either 2.4 GHz or 5 GHz
Wi-Fi: A medium-range wireless communications technology brand owned by the Wi-Fi Alliance.
See Figure 6.7. When receiving data, the wireless access point takes the information from the Internet, translates it into a radio signal, and sends it to the device’s wireless adapter. These devices typically come with built-in wireless transmitters and software to enable them to alert the user to the existence of a Wi-Fi network. The area covered by one or more interconnected wireless access points is called a “hot spot.” Wi-Fi has proven so popular that hot spots are popping up in places such as airports, coffee shops, college campuses, libraries, and restaurants. The availability of free Wi-Fi within a hotel’s premises has become very popular with business travelers. Meanwhile, hundreds of cities in the United States have implemented municipal Wi-Fi networks for use by meter readers and other municipal workers and to provide Internet access to their citizens and visitors.
Microwave Transmission Microwave is a high-frequency (300 MHz to 300 GHz) signal sent through the air. Terrestrial (Earth-bound) microwaves are transmitted by line-of-sight devices, so the line of sight between the transmitter and receiver must be unobstructed. Typically, microwave stations are placed in a series: one station receives a signal, amplifies it, and retransmits it to the next microwave transmission tower. Such stations can be located roughly 30 miles apart before the curvature of the Earth makes it impossible for the towers to “see” one another. Because they are line-of-sight transmission devices, microwave dishes are frequently placed in relatively high locations, such as mountains, towers, or tall buildings.
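The 30-mile figure can be related to tower height with simple geometry. The sketch below is an approximation under stated assumptions: it treats the Earth as a smooth sphere with a mean radius of 6,371 km, places both antennas at the same height, and ignores atmospheric refraction and terrain, all of which affect real link planning. The 46-meter tower height is a hypothetical value chosen to show the order of magnitude.

```python
import math

EARTH_RADIUS_M = 6_371_000   # mean Earth radius (assumption: smooth sphere)
METERS_PER_MILE = 1609.344


def horizon_distance_m(antenna_height_m: float) -> float:
    """Geometric line-of-sight distance from an antenna to the horizon."""
    h = antenna_height_m
    return math.sqrt(2 * EARTH_RADIUS_M * h + h * h)


def max_tower_spacing_miles(antenna_height_m: float) -> float:
    """Largest spacing at which two equal-height towers can still see each other."""
    return 2 * horizon_distance_m(antenna_height_m) / METERS_PER_MILE


spacing = max_tower_spacing_miles(46)  # hypothetical 46 m towers
print(f"Two 46 m towers: about {spacing:.0f} miles apart")
```

Under these assumptions, towers roughly 46 meters tall reach a spacing close to 30 miles, and placing dishes on mountains or tall buildings extends the distance because the horizon distance grows with the square root of the height.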
A communications satellite also operates in the microwave frequency range. See Figure 6.8. The satellite receives the signal from the Earth station, amplifies the relatively weak signal, and then rebroadcasts it at a different frequency. The advantage of satellite communications is that satellites can receive and broadcast over large geographic regions. Problems such as the curvature of the Earth, mountains, and other structures that block the line-of-sight microwave transmission make satellites an attractive alternative. Geostationary, low earth orbit, and small mobile satellite stations are the most common forms of satellite communications.
A geostationary satellite orbits the Earth directly over the equator, approximately 22,300 miles above the Earth, so that it appears stationary. The U.S.
FIGURE 6.7 Wi-Fi network In a Wi-Fi network, the user’s computer, smartphone, or cell phone has a wireless adapter that translates data into a radio signal and transmits it using an antenna.
[Figure: an existing wired network (Internet and cable modem/router) linked to a wireless access point; data is transmitted and received through airwaves across the wireless network.]
National Weather Service relies on the Geostationary Operational Environmental Satellite program for weather imagery and quantitative data to support weather forecasting, severe storm tracking, and meteorological research.
A low earth orbit (LEO) satellite system employs many satellites, each in an orbit at an altitude of less than 1,000 miles. The satellites are spaced so that, from any point on the Earth at any time, at least one satellite is in a line of sight. Iridium Communications provides a global communications network that spans the entire Earth, using 66 satellites in a near-polar orbit at an altitude of 485 miles. Calls are routed among the satellites to create a reliable connection between call participants that cannot be disrupted by natural disasters such as earthquakes, tsunamis, or hurricanes that may knock out ground-based wireless towers and wire- or cable-based networks.8 Every day, thousands of vessels and tankers traveling the world’s seas and oceans use Iridium’s network to establish reliable global communications to optimize their daily activities.
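The two altitudes quoted above follow from basic orbital mechanics. The sketch below assumes circular orbits and uses Kepler's third law with standard values for the Earth's gravitational parameter, equatorial radius, and sidereal day; none of these constants or formulas appear in the text. It estimates the altitude at which a satellite circles the Earth once per rotation, and the time a satellite at 485 miles takes to complete one orbit.

```python
import math

GM_EARTH = 3.986004418e14     # Earth's gravitational parameter, m^3/s^2
EARTH_RADIUS_M = 6_378_137    # equatorial radius, m
SIDEREAL_DAY_S = 86_164.0905  # one rotation of the Earth, s
METERS_PER_MILE = 1609.344


def orbital_period_s(altitude_m: float) -> float:
    """Period of a circular orbit at the given altitude above the surface."""
    r = EARTH_RADIUS_M + altitude_m
    return 2 * math.pi * math.sqrt(r**3 / GM_EARTH)


def altitude_for_period_m(period_s: float) -> float:
    """Altitude of the circular orbit that has the given period."""
    r = (GM_EARTH * period_s**2 / (4 * math.pi**2)) ** (1 / 3)
    return r - EARTH_RADIUS_M


geo_altitude_miles = altitude_for_period_m(SIDEREAL_DAY_S) / METERS_PER_MILE
leo_period_min = orbital_period_s(485 * METERS_PER_MILE) / 60

print(f"Geostationary altitude: about {geo_altitude_miles:,.0f} miles")
print(f"Orbit time at 485 miles: about {leo_period_min:.0f} minutes")
```

The first result lands close to the 22,300 miles cited for geostationary satellites. The second shows why a LEO system needs many satellites: each one sweeps around the Earth in well under two hours, so no single satellite stays in view of a given point for long.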
4G Wireless Communications Wireless communications has evolved through four generations of technology and services. The first generation (1G) of wireless communications standards originated in the 1980s and was based on analog communications. The second-generation (2G) networks were fully digital, superseding 1G networks in the early 1990s. With 2G networks, phone conversations were encrypted, mobile phone usage was expanded, and short message services (SMS)—or texting—was introduced. 3G wireless communications supports wireless voice and broadband speed data communications in a mobile environment at speeds of 2 to 4 Mbps. Additional capabilities include mobile video, mobile e-commerce, location-based services, mobile gaming, and the downloading and playing of music.
4G broadband mobile wireless delivers more advanced versions of enhanced multimedia, smooth streaming video, universal access, and portability across all types of devices; eventually 4G will also make possible worldwide roaming. 4G can deliver 3 to 20 times the speed of 3G networks for mobile devices such as smartphones, tablets, and laptops.
Each of the four major U.S. wireless network operators (AT&T, Verizon, Sprint, and T-Mobile) is rapidly expanding its 4G networks based on the Long Term Evolution (LTE) standard. Long Term Evolution (LTE) is a standard for wireless communications for mobile phones based on packet switching, which is an entirely different approach from the circuit-switching approach employed
FIGURE 6.8 Satellite transmission Communications satellites are relay stations that receive signals from one Earth station and rebroadcast them to another.
[Figure: a communications satellite approximately 22,300 miles above the Earth relaying signals between two microwave stations.]
Long Term Evolution (LTE): A standard for wireless communications for mobile phones based on packet switching.
in 3G communications networks. To convert to the LTE standard, carriers must reengineer their voice call networks.
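The idea behind packet switching can be illustrated with a toy example. The sketch below is not LTE itself, and the function names packetize and reassemble are hypothetical. It only shows the general principle: a message is cut into numbered packets that may travel independently and arrive out of order, and the receiver uses the sequence numbers to rebuild the original. A circuit-switched call, by contrast, would reserve one dedicated path for the entire conversation.

```python
import random


def packetize(message: bytes, payload_size: int):
    """Split a message into (sequence_number, payload) packets."""
    return [
        (seq, message[start:start + payload_size])
        for seq, start in enumerate(range(0, len(message), payload_size))
    ]


def reassemble(packets):
    """Rebuild the message by ordering packets on their sequence numbers."""
    return b"".join(payload for _, payload in sorted(packets))


message = b"LTE carries voice and data as packets."
packets = packetize(message, payload_size=8)

random.shuffle(packets)          # simulate out-of-order arrival
restored = reassemble(packets)

print(restored == message)
```

Because every packet carries its own sequence number, the network is free to send each one over whatever route is available at that moment.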
The biggest benefit of LTE is how quickly a mobile device can connect to the Internet and how much data it can download or upload in a given amount of time. LTE makes it reasonable to stream video to your phone, using services such as Amazon Prime Instant Video, Hulu Plus, Netflix, or YouTube. It also speeds up Web browsing, with most pages loading in seconds. LTE enables video calling using services such as Skype or Google Hangouts. LTE’s faster speed also makes sharing photos and videos from your phone quick and easy.
5G Wireless Communications A new mobile communications generation has come on the scene about every 10 years since the first 1G system. 5G is a term used to identify the next major phase of mobile communications standards beyond 4G. No 5G mobile standard has been formally defined yet, but 5G will bring with it higher data transmission rates, lower power consumption, higher connect reliability with fewer dropped calls, increased geographic coverage, and lower infrastructure costs. If 5G networks meet the goal of a 50 times faster data rate than the most advanced Wi-Fi networks today, they will be able to stream a two-hour movie in less than three seconds. Verizon plans to start field trials of 5G technology by late 2016, with some level of commercial deployment to start by 2017, far sooner than the 2020 time frame that many industry observers anticipate for the initial adoption of 5G technology.9
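The movie-streaming claim can be checked with rough arithmetic. The sketch below assumes that "the most advanced Wi-Fi networks today" means the 1.3 Gbps upper rate listed for IEEE 802.11ac in Table 6.3; that interpretation is an assumption, not something the text states. It then asks how large a movie file could be delivered in three seconds at 50 times that rate.

```python
WIFI_MAX_BPS = 1.3e9   # assumed baseline: top 802.11ac rate from Table 6.3
SPEEDUP = 50           # the 5G goal quoted in the text
STREAM_SECONDS = 3     # time allowed for the two-hour movie

target_5g_bps = WIFI_MAX_BPS * SPEEDUP

# Bits delivered in three seconds, converted to gigabytes (8 bits per byte).
max_movie_gigabytes = target_5g_bps * STREAM_SECONDS / 8 / 1e9

print(f"Target rate: {target_5g_bps / 1e9:.0f} Gbps")
print(f"Largest movie deliverable in 3 s: {max_movie_gigabytes:.1f} GB")
```

Under this assumption the target rate works out to 65 Gbps, enough to deliver a file of about 24 GB in three seconds, which is consistent with the claim for a two-hour movie of any common size.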
Communications Hardware Networks require various communications hardware devices to operate, including modems, fax modems, multiplexers, private branch exchanges, front-end processors, switches, bridges, routers, and gateways. These devices are summarized in Table 6.4.
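As one illustration of these devices, the sketch below models how a switch might use physical device addresses to choose an output port. It is a simplified toy under stated assumptions: the class name LearningSwitch is hypothetical, the addresses are shortened placeholders, and the address-learning and flooding behavior shown is a common design that goes beyond the brief description in Table 6.4.

```python
class LearningSwitch:
    """Toy switch that maps physical device addresses to ports."""

    def __init__(self, port_count: int):
        self.ports = list(range(port_count))
        self.table = {}  # physical address -> port where it was last seen

    def receive(self, in_port: int, src_addr: str, dst_addr: str):
        """Return the list of ports the incoming message is forwarded to."""
        self.table[src_addr] = in_port  # remember where the sender is attached
        if dst_addr in self.table:
            out_port = self.table[dst_addr]
            # Nothing to forward if the destination sits on the incoming port.
            return [] if out_port == in_port else [out_port]
        # Unknown destination: send to every port except the incoming one.
        return [port for port in self.ports if port != in_port]


switch = LearningSwitch(port_count=4)
first = switch.receive(0, "aa:aa", "bb:bb")    # bb:bb not yet known
reply = switch.receive(2, "bb:bb", "aa:aa")    # aa:aa was seen on port 0
second = switch.receive(0, "aa:aa", "bb:bb")   # bb:bb was seen on port 2

print(first, reply, second)
```

Once the switch has seen a message from each device, later messages go only to the single port where the destination is attached, which keeps traffic off the rest of the network.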
Communications Software A network operating system (NOS) is systems software that controls the computer systems and devices on a network and allows them to communicate
TABLE 6.4 Common communications devices
Device | Function
Modem | Translates data from a digital form (as it is stored in the computer) into an analog signal that can be transmitted over ordinary telephone lines
Fax modem | Combines a fax with a modem; facsimile devices, commonly called fax devices, allow businesses to transmit text, graphs, photographs, and other digital files via standard telephone lines
Multiplexer | Allows several communications signals to be transmitted over a single communications medium at the same time, thus saving expensive long-distance communications costs
PBX (private branch exchange) | Manages both voice and data transfer within a building and to outside lines; PBXs can be used to connect hundreds of internal phone lines to a few outside phone company lines
Front-end processor | Manages communications to and from a computer system serving many people
Switch | Uses the physical device address in each incoming message on the network to determine which output port it should forward the message to reach another device on the same network
Bridge | Connects one LAN to another LAN where both LANs use the same communications protocol
Router | Forwards data packets across two or more distinct networks toward their destinations through a process known as routing; often, an Internet service provider (ISP) installs a router in a subscriber’s home that connects the ISP’s network to the network within the home
Gateway | Serves as an entrance to another network, such as the Internet
network operating system (NOS): Systems software that controls the computer systems and devices on a network and allows them to communicate with each other.
with each other. The NOS performs for the network functions similar to those that operating system software performs for a computer, such as memory and task management and coordination of hardware. When network equipment (such as printers, plotters, and disk drives) is required, the NOS makes sure that these resources are used correctly. Linux (used on workstations), OS X (used on Apple Macs), UNIX (used on servers), and Windows Server (used on workstations and servers) are common network operating systems.
Because companies use networks to communicate with customers, business partners, and employees, network outages or slow performance can mean a loss of business. Network management includes a wide range of technologies and processes that monitor the network and help identify and address problems before they can create a serious impact.
Software tools and utilities are available for managing networks. With network-management software, a manager on a networked personal computer can monitor the use of individual computers and shared hardware (such as printers), scan for viruses, and ensure compliance with software licenses. Network-management software also simplifies the process of updating files and programs on computers on the network: a manager can make changes through a communications server instead of having to visit each individual computer. In addition, network-management software protects software from being copied, modified, or downloaded illegally. It can also locate communications errors and potential network problems. Some of the many benefits of network-management software include fewer hours spent on routine tasks (such as installing new software), faster response to problems, and greater overall network control.
Banks use a special form of network-management software to monitor the performance of their automated teller machines (ATMs). Status messages can be sent over the network to a central monitoring location to inform support people about situations such as low cash or receipt paper levels, card reader problems, and printer paper jams. Once a status message is received, a service provider or branch location employee can be dispatched to fix the ATM problem.
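The ATM monitoring flow can be sketched as a small program running at the central monitoring location. Everything specific here is hypothetical: the AtmStatus fields, the threshold values, and the machine identifiers are invented for illustration and do not describe any real bank's software.

```python
from dataclasses import dataclass

# Hypothetical thresholds; real values would come from the bank's own policy.
LOW_CASH_THRESHOLD = 2_000
LOW_PAPER_THRESHOLD_PCT = 10


@dataclass
class AtmStatus:
    """One status message sent by an ATM over the network."""
    atm_id: str
    cash_on_hand: int
    receipt_paper_pct: int
    card_reader_ok: bool = True
    printer_jammed: bool = False


def problems(status: AtmStatus):
    """List the conditions in a status message that need attention."""
    found = []
    if status.cash_on_hand < LOW_CASH_THRESHOLD:
        found.append("low cash")
    if status.receipt_paper_pct < LOW_PAPER_THRESHOLD_PCT:
        found.append("low receipt paper")
    if not status.card_reader_ok:
        found.append("card reader problem")
    if status.printer_jammed:
        found.append("printer paper jam")
    return found


def dispatch_list(statuses):
    """Map each ATM that needs a visit to its list of problems."""
    result = {}
    for status in statuses:
        found = problems(status)
        if found:
            result[status.atm_id] = found
    return result


messages = [
    AtmStatus("ATM-001", cash_on_hand=15_000, receipt_paper_pct=80),
    AtmStatus("ATM-002", cash_on_hand=1_200, receipt_paper_pct=5),
    AtmStatus("ATM-003", cash_on_hand=9_000, receipt_paper_pct=60, printer_jammed=True),
]
to_visit = dispatch_list(messages)
print(to_visit)
```

Only the machines with at least one reported problem appear in the dispatch list, so support staff can be sent where they are needed.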
Today, most IS organizations use network-management software to ensure that their network remains up and running and that every network component and application is performing acceptably. The software enables IS staff to identify and resolve fault and performance issues before they affect end users. The latest network-management technology even incorporates automatic fixes: The network-management system identifies a problem, notifies the IS manager, and automatically corrects the problem before anyone outside the IS department notices it.
The Covell Group is a small IT consulting group in San Diego that provides server and Web site monitoring for small- and medium-sized companies. The firm uses network-monitoring software to watch sensors and remote probes that track CPU, disk space, and Windows services. Constant monitoring enables the firm to detect if a communications line is down or if there is a power failure overnight so that everything is up and ready by the start of the next work day.10
Mobile device management (MDM) software manages and troubleshoots mobile devices remotely, pushing out applications, data, patches, and settings. With the software, a central control group can maintain group policies for security, control system settings, ensure malware protection is in place for mobile devices used across the network, and make it mandatory to use passwords to access the network. In addition to smartphones and tablets, laptops and desktops are sometimes supported using MDM software as mobile device management becomes more about basic device management and less about a specific mobile platform.
network-management software: Software that enables a manager on a networked desktop to monitor the use of individual computers and shared hardware (such as prin- ters), scan for viruses, and ensure compliance with software licenses.
mobile device management (MDM) software: Software that manages and troubleshoots mobile devices remotely, pushing out applica- tions, data, patches, and settings while enforcing group policies for security.
Software-Defined Networking (SDN)
A typical network comprises hundreds or thousands of network devices that perform such tasks as routing and switching of data through the network, providing network access and control, and enabling access to a variety of applications and services. In today’s network environment, each network device must be configured individually, usually via manual keyboard input. For a network of any size, this becomes a labor-intensive and error-prone effort, making it difficult to change the network so it can meet the changing needs of the organization. Software-defined networking (SDN) is an emerging approach to networking that allows network administrators to manage a network via a controller that does not require physical access to all the network devices. This approach automates tasks such as configuration and policy management and enables the network to dynamically respond to application requirements. As a result, new applications can be made available sooner, the risk of human error (a major contributor to network downtime) is reduced, and overall network support and operations costs are reduced.
Google is implementing Andromeda, the underlying software-defined networking architecture that will enable Google’s cloud computing services to scale better, more cheaply, and more quickly. With software-defined networking, even though many customers are sharing the same network, they can be configured and managed independently with their own address management, firewalls, and access control lists. Google competitors in cloud services like Microsoft and Amazon also employ software-defined networks.11
Critical Thinking Exercise
Network-Management Software for a University
The Ohio State University has over 58,000 undergraduate students spread across several major campuses and research centers located around Ohio. Its information system administrators are considering the use of network-management software and are evaluating the use of mobile device management software from various vendors.
Review Questions
1. What features should the administrators look for in choosing their network-management software?
2. What specific benefits would be gained by installing network-management software?
Critical Thinking Questions
1. Should a goal of a mobile device management software implementation be to reduce the number of information systems support staff dedicated to support the university’s students, administrators, and faculty? Or should any productivity gains be applied to providing new services and superior support?
2. Identify common issues that students may have with the use of their devices that could be addressed through the use of mobile device management software.
The Internet and World Wide Web
The Internet has grown rapidly (see Figure 6.9) and is truly international in scope, with users on every continent, including Antarctica. Although the United States has high Internet penetration among its population, it does not constitute the majority of people online. As of November 2015, citizens of Asian countries make up about 48 percent, Europeans about 18 percent, Latin America/Caribbean about 10 percent, and North Americans about 9 percent of all Internet users. China is the country with the most Internet users, with 674 million, which is more users than the next two countries combined (India 354 million and United States 280 million).12 Being connected to the Internet provides global economic opportunity to individuals, businesses, and countries.
software-defined networking (SDN): An emerging approach to networking that allows network administrators to have programmable central control of the network via a controller without requiring physical access to all the network devices.
The Internet and social media Web sites have emerged as important new channels for learning about world events, protesting the actions of organizations and governments, and urging others to support one’s favorite causes or candidates. For example, some believe that Barack Obama’s effective use of the Internet and social media provided him with a distinct advantage over his opponents in the presidential elections of 2008 and 2012.13 In another example, Syrian rebels used the Internet to communicate about events within the country and to provide a useful link to others around the world.14
On the other hand, Internet censorship, the control or suppression of the publishing or accessing of information on the Internet, is a growing problem. For example, in May 2015, the Chinese-language version of Wikipedia was blocked in China.15 The organizations Human Rights Watch and Amnesty International allege that the Saudi Arabian government uses malicious spyware to target activists and that it hunts down, silences, and flogs bloggers who criticize the government.16 In Hungary, the government employs fines, licensing, and taxes to coerce critical media, and it directs state advertising to friendly outlets.17
The ancestor of the Internet was the ARPANET, a project started by the U.S. Department of Defense (DoD) in 1969. The ARPANET was both an experiment in reliable networking and a means to link DoD and military research contractors, including many universities doing military-funded research. (ARPA stands for the Advanced Research Projects Agency, the branch of the DoD in charge of awarding grant money. The agency is now known as DARPA; the added D is for Defense.) The ARPANET was highly successful, and every university in the country wanted to use it. This wildfire growth made it difficult to manage the ARPANET, particularly the rapidly growing number of university sites. So, the ARPANET was broken into two networks: MILNET, which included all military sites, and a new, smaller ARPANET, which included all the nonmilitary sites. The two networks remained connected, however, through use of the Internet protocol (IP), which enables traffic to be routed from one network to another as needed. All the networks connected to the Internet use IP, so they all can exchange messages.
FIGURE 6.9 Internet growth: Number of Internet hosts. The number of worldwide Internet users is expected to continue growing. (Chart: vertical axis, number of Internet domain servers, 0 to 1,200,000,000; horizontal axis, date, 1994 to 2015.) Source: Data from “ISC Domain Survey,” https://www.isc.org/network/survey/.
How the Internet Works
In the early days of the Internet, the major communications companies around the world agreed to connect their networks so that users on all the networks could share information over the Internet. These large communications companies, called network service providers (NSPs), include Verizon, Sprint, British Telecom, and AT&T. The cables, routers, switching stations, communication towers, and satellites that make up these networks are the hardware over which Internet traffic flows. The combined hardware of these and other NSPs, including the fiber-optic cables that span the globe over land and under sea, makes up the Internet backbone.
The Internet transmits data from one computer (called a host) to another. See Figure 6.10. If the receiving computer is on a network to which the first computer is directly connected, it can send the message directly. If the receiving and sending computers are not directly connected to the same network, the sending computer relays the message to another computer that can forward it. The message is typically sent through one or more routers to reach its destination. It is not unusual for a message to pass through several routers on its way from one part of the Internet to another.
The various communications networks that are linked to form the Internet work much the same way: they pass data around in chunks called packets, each of which carries the addresses of its sender and receiver along with other technical information. The set of rules used to pass packets from one host to another is the Internet Protocol (IP). Many other communications protocols are used in connection with IP. The best known is the Transmission Control Protocol (TCP). Many people use “TCP/IP” as an abbreviation for the combination of TCP and IP used by most Internet applications. After a network following these standards links to the Internet’s backbone, it becomes part of the worldwide Internet community.
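The idea of packets that each carry sender and receiver addresses can be sketched as follows. This is a simplified illustration, not the actual IP or TCP packet format: the Packet class, the eight-byte chunk size, and the two addresses (taken from ranges reserved for documentation) are assumptions made for the example. In practice, putting chunks back in order is a job handled by TCP rather than by IP itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    source: str       # address of the sending host
    destination: str  # address of the receiving host
    sequence: int     # position of this chunk in the original message
    payload: bytes


def to_packets(message: bytes, source: str, destination: str, size: int = 8) -> List[Packet]:
    """Split a message into fixed-size chunks, each labeled with both addresses."""
    return [
        Packet(source, destination, seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(packets: List[Packet]) -> bytes:
    """Rebuild the message even if the packets arrived out of order."""
    ordered = sorted(packets, key=lambda p: p.sequence)
    return b"".join(p.payload for p in ordered)


message = b"Hello from host 1 to host 4"
packets = to_packets(message, "192.0.2.1", "198.51.100.4")
shuffled = list(reversed(packets))  # simulate out-of-order arrival
print(len(packets))
print(reassemble(shuffled))
```

Because every packet carries its own addresses and sequence number, each one can travel through a different set of routers and the receiver can still rebuild the original message.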
FIGURE 6.10 Routing messages over the Internet. Data is transmitted from one host computer to another on the Internet. (Diagram: four host computers connected through five routers/gateways.)
Internet Protocol (IP): A communication standard that enables computers to route communications traffic from one network to another as needed.
Internet backbone: One of the Internet’s high-speed, long-distance communications links.
IP address: A 32-bit number that identifies a computer on the Internet.
Each computer on the Internet has an assigned address, called its IP address, that identifies it on the Internet. In version 4 of the Internet Protocol (IPv4), an IP address is a 32-bit number that is typically divided into four bytes and translated to decimal; for example, 69.32.133.79. The Internet is migrating to Internet Protocol version 6 (IPv6), which uses 128-bit addresses to provide for many more devices; however, this change is expected to take years.
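The relationship between the dotted decimal form and the underlying number can be checked with a short sketch that uses the ipaddress module from Python's standard library. The address is the example from the text; the variable names are only illustrative.

```python
import ipaddress

addr = ipaddress.IPv4Address("69.32.133.79")  # example address from the text

# The dotted form is four bytes, each written in decimal.
octets = [int(part) for part in str(addr).split(".")]

# Combining the four bytes gives the single 32-bit number the address stands for.
as_int = (octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3]

print(octets)
print(as_int == int(addr))   # the library agrees with the manual calculation
print(f"{as_int:032b}")      # the same address written as 32 binary digits

# Size of the address space: 32-bit addresses versus 128-bit IPv6 addresses.
print(2 ** 32)
print(2 ** 128)
```

The last two lines show why IPv6 provides for many more devices: each added bit doubles the number of possible addresses.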
Because people prefer to work with words rather than numbers, a system called the Domain Name System (DNS) was created. Domain names such as www.cengage.com are mapped to IP addresses such as 69.32.133.79 using the DNS. To make room for more Web addresses, efforts are underway to increase the number of available domain names.
A Uniform Resource Locator (URL) is a Web address that specifies the exact location of a Web page using letters and words that map to an IP address and a location on the host. The URL gives those who provide information over the Internet a standard way to designate where Internet resources such as servers and documents are located. Consider the URL for Cengage Learning, http://www.cengage.com/us/.
The “http” specifies the access method and tells your software to access a file using the Hypertext Transfer Protocol. This is the primary method for interacting with the Web. In many cases, you don’t need to include http:// in a URL because it is the default protocol. The “www” part of the address signifies that the address is associated with the World Wide Web service. The www.cengage.com part of the URL is the domain name that identifies the Internet host site. The part of the address following the domain name, /us/, specifies an exact location on the host site.
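The parts of a URL described above can be pulled apart with the urlsplit function from Python's standard urllib.parse module. This is a small sketch using the Cengage URL from the text; the variable names are only illustrative.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.cengage.com/us/")

print(parts.scheme)    # the access method
print(parts.hostname)  # the domain name that identifies the host site
print(parts.path)      # the location on the host site

# The domain name itself is a series of parts separated by dots.
labels = parts.hostname.split(".")
print(labels)
```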
Domain names must adhere to strict rules. They always have at least two parts, with each part separated by a dot (period). For some Internet addresses, the far right part of the domain name is the country code, such as au for Australia, ca for Canada, dk for Denmark, fr for France, de (Deutschland) for Germany, and jp for Japan. Many Internet addresses have a code denoting affiliation categories, such as com for business sites and edu for education sites. Table 6.5 contains a few popular domain affiliation categories. The far left part of the domain name identifies the host network or host provider, which might be the name of a university or business. Other countries use different top-level domain affiliations from the U.S. ones described in the table.
The Internet Corporation for Assigned Names and Numbers (ICANN) is responsible for managing IP addresses and Internet domain names. One of ICANN’s primary concerns is to make sure that each domain name represents only one individual or entity: the one that legally registers it. For example, if your teacher wanted to use www.cengage.com for a course Web site, he or she would discover that domain name has already been registered by Cengage Learning and is not available. ICANN uses companies called accredited domain name registrars to handle the business of registering domain names.
TABLE 6.5 Number of domains in U.S. top-level domain affiliations, Winter 2015
Affiliation ID | Affiliation | Number of Hosts
Biz | Business sites | 2,428,269
Com | All types of entities including nonprofits, schools, and private individuals | 123,743,892
Edu | Post-secondary educational sites | 7,446
Gov | Government sites | 5,503
Net | Networking sites | 15,805,152
Org | Nonprofit organization sites | 10,984,293
Source: Domain Count Statistics for TLDs, http://research.domaintools.com/statistics/tld-counts/.
Uniform Resource Locator (URL): A Web address that specifies the exact location of a Web page using letters and words that map to an IP address and a location on the host.
For example, you can visit www.namecheap.com, an accredited registrar, to find out if a particular name has already been registered. If not, you can register the name for around $9 per year. Once you do so, ICANN will not allow anyone else to use that domain name as long as you pay the yearly fee.
Accessing the Internet
You can connect to the Internet in numerous ways. See Figure 6.11. Which access method you choose is determined by the size and capability of your organization or system, your budget, and the services available to you.
Connecting via a LAN Server
This approach is used by businesses and organizations that manage a local area network (LAN). By connecting a server on the LAN to the Internet using a router, all users on the LAN are provided access to the Internet. Business LAN servers are typically connected to the Internet at very fast data rates, sometimes in the hundreds of Mbps.
Connecting via Internet Service Providers
Companies and residences unable to connect directly to the Internet through a LAN server must access the Internet through an Internet service provider. An Internet service provider (ISP) is any organization that provides Internet access to people. Thousands of organizations serve as ISPs, ranging from universities that make the Internet available to students and faculty to small Internet businesses to major communications giants such as AT&T and Comcast. To connect to the Internet through an ISP, you must have an account with the service provider (for which you usually pay) along with software (such as a browser) and devices (such as a computer or smartphone) that support a connection via TCP/IP.
FIGURE 6.11 Several ways to access the Internet. Users can access the Internet in several ways, including using a LAN server, telephone lines, a high-speed service, or a wireless network. (Diagram labels: 1. Connect via a LAN server; 2. Connect via dial-up; 3. Connect via high-speed service; 4. Cell phone.)
Internet service provider (ISP): Any organization that provides Internet access to people.
Perhaps the least expensive but also slowest connection provided by ISPs is a dial-up connection. A dial-up Internet connection uses a modem and standard phone line to “dial up” and connect to the ISP server. A dial-up connection also ties up the phone line so that it is unavailable for voice calls. While dial-up was originally the only way to connect to the Internet from home, it is rapidly being replaced by high-speed services.
Several high-speed Internet services are available for home and business. They include cable modem connections from cable television companies, DSL connections from phone companies, and satellite connections from satellite television companies.
Wireless Connection
In addition to connecting to the Internet through wired systems such as phone lines and fiber optic cables, wireless Internet service over cellular and Wi-Fi networks has become common. Thousands of public Wi-Fi services are available in coffee shops, airports, hotels, and elsewhere, where Internet access is provided free, for an hourly rate, or for a monthly subscription fee. Wi-Fi has even made its way into aircraft, allowing business travelers to be productive during air travel by accessing email and corporate networks.
Cell phone carriers also provide Internet access for smartphones, notebooks, and tablets. The 4G mobile phone services rival wired high-speed connections enjoyed at home and work. Sprint, Verizon, AT&T, and other popular carriers are working to bring 4G service to subscribers, beginning in large metropolitan areas as shown in Figure 6.12.
FIGURE 6.12 Verizon 4G LTE Coverage. While Verizon’s 4G LTE coverage is extensive, there are still vast expanses where there is no coverage. Source: Verizon 4G LTE Coverage, http://www.bing.com/images/search?q=verizon+4g+coverage+2016&view=detailv2&&id=C8AA6A1F887C24A96743E0CC64307E5D5E5BF96A&selectedIndex=7&ccid=SAETURq2&simid=608035862426485414&thid=OIP.M480113511ab6e162480a92e07aeff074o0&ajaxhist=0.
How the Web Works
The World Wide Web was developed by Tim Berners-Lee at CERN, the European Organization for Nuclear Research in Geneva. He originally conceived of it as an internal document-management system. From this modest beginning, the Web has grown to become a primary source of news and information, an indispensable conduit for commerce, and a popular hub for social interaction, entertainment, and communication.
While the terms Internet and Web are often used interchangeably, technically, the two are different technologies. The Internet is the infrastructure on which the Web exists. The Internet is made up of computers, network hardware such as routers and fiber-optic cables, software, and the TCP/IP protocols. The World Wide Web (Web), on the other hand, consists of server and client software, the hypertext transfer protocol (http), standards, and markup languages that combine to deliver information and services over the Internet.
The Web was designed to make information easy to find and organize. It connects billions of documents, called Web pages, stored on millions of servers around the world. Web pages are connected to each other using hyperlinks, specially denoted text or graphics on a Web page, that, when clicked, open a new Web page containing related content. Using hyperlinks, users can jump between Web pages stored on various Web servers, creating the illusion of interacting with one big computer. Because of the vast amount of information available on the Web and the wide variety of media, the Web has become the most popular means of accessing information in the world today.
In short, the Web is a hyperlink-based system that uses the client/server model. It organizes Internet resources throughout the world into a series of linked files, called pages, which are accessed and viewed using Web client software called a Web browser. Google Chrome, Mozilla Firefox, Microsoft Edge, Internet Explorer, Apple Safari, and Opera are popular Web browsers. See Figure 6.13. A collection of pages on one particular topic, accessed under one Web domain, is called a Web site. The Web was originally designed to support formatted text and pictures on a page. It has evolved to support many more types of information and communication including user interactivity, animation, and video. Web plug-ins help provide additional features to standard Web sites. Adobe Flash and Real Player are examples of Web plug-ins.
FIGURE 6.13 Google Chrome. Web browsers such as Google Chrome let you access Internet resources such as email and other online applications. Courtesy of Google.
hyperlink: Highlighted text or graphics in a Web document that, when clicked, opens a new Web page containing related content.
Web browser: Web client software (such as Chrome, Edge, Firefox, Internet Explorer, and Safari) used to view Web pages.
Hypertext Markup Language (HTML): The standard page description language for Web pages.
Hypertext Markup Language (HTML) is the standard page description language for Web pages. HTML is defined by the World Wide Web Consortium (referred to as “W3C”) and has developed through numerous revisions. It is currently in its fifth revision, HTML5. HTML tells the browser how to display font characteristics, paragraph formatting, page layout, image placement, hyperlinks, and the content of a Web page. HTML uses tags, which are codes that tell the browser how to format the text or graphics as a heading, list, or body text, for example. Web site creators “mark up” a page by placing HTML tags before and after one or more words. For example, to have the browser display a sentence as a heading, you place the <h1> tag at the start of the sentence and an </h1> tag at the end of the sentence. When that page is viewed in a browser, the sentence is displayed as a heading. HTML also provides tags to import objects stored in files (such as photos, graphics, audio, and movies) into a Web page. In short, a Web page is made up of three components: text, tags, and references to files. The text is your Web page content, the tags are codes that mark the way words will be displayed, and the references to files insert photos and media into the Web page at specific locations. All HTML tags are enclosed in a set of angle brackets (< and >), such as <h2>. The closing tag has a forward slash in it, such as </b> for closing bold. Consider the following text and tags.
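As one illustration of text, tags, and references to files, the sketch below feeds a short, hypothetical HTML fragment to the HTMLParser class from Python's standard library and sorts what it finds into those three components. The fragment, the file name map.png, and the TagLister class are assumptions made for this example.

```python
from html.parser import HTMLParser


class TagLister(HTMLParser):
    """Collects the three components of a page: text, tags, and file references."""

    def __init__(self):
        super().__init__()
        self.tags, self.text, self.files = [], [], []

    def handle_starttag(self, tag, attrs):
        self.tags.append(f"<{tag}>")
        for name, value in attrs:
            if name == "src":          # a reference to a file, such as an image
                self.files.append(value)

    def handle_endtag(self, tag):
        self.tags.append(f"</{tag}>")  # closing tags carry a forward slash

    def handle_data(self, data):
        if data.strip():
            self.text.append(data.strip())


page = '<h1>Networks</h1><p>The Internet is <b>global</b>.</p><img src="map.png">'
lister = TagLister()
lister.feed(page)
lister.close()

print(lister.tags)
print(lister.text)
print(lister.files)
```

A browser performs a similar separation before deciding how to display each piece: the h1 text becomes a heading, the b text becomes bold, and the img tag pulls in the referenced picture.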
Extensible Markup Language (XML) is a markup language for Web documents containing structured information, including words and pictures. XML does not have a predefined tag set; document authors define their own tags. With HTML, by contrast, the <h1> tag always means a first-level heading, and the content and formatting are contained in the same HTML document. XML Web documents contain only the content of a Web page; the formatting of the content is contained in a style sheet. A few typical instructions in XML follow:
<book>
  <chapter>Hardware</chapter>
  <topic>Input Devices</topic>
  <topic>Processing and Storage Devices</topic>
  <topic>Output Devices</topic>
</book>
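Because the tags in an XML document describe the content rather than its appearance, a program can locate data by tag name. The following sketch reads the book example above with the xml.etree.ElementTree module from Python's standard library; the variable names are only illustrative.

```python
import xml.etree.ElementTree as ET

document = """
<book>
  <chapter>Hardware</chapter>
  <topic>Input Devices</topic>
  <topic>Processing and Storage Devices</topic>
  <topic>Output Devices</topic>
</book>
""".strip()

book = ET.fromstring(document)

chapter = book.find("chapter").text                       # the first <chapter> element
topics = [topic.text for topic in book.findall("topic")]  # every <topic> element

print(chapter)
print(topics)
```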
A Cascading Style Sheet (CSS) is a file or portion of an HTML file that defines the visual appearance of content in a Web page. Using CSS is convenient because you only need to define the technical details of the page’s appearance once, rather than in each HTML tag. CSS uses rules, each pairing a selector (such as a tag name) with a list of formatting properties, to globally define characteristics for a variety of page elements as well as how those elements are laid out on the Web page. Rather than having to specify a font for each occurrence of an element throughout a document, formatting can be specified once and applied to all occurrences. CSS styles are often defined in a separate file and then can be applied to many pages on a Web site.
For example, the visual appearance of the preceding XML content could be contained in the following style sheet:
chapter { font-size: 18pt; color: blue; font-weight: bold; display: block; font-family: Arial; margin-top: 10pt; margin-left: 5pt; }
topic { font-size: 12pt; color: red; font-style: italic; display: block; font-family: Arial; margin-left: 12pt; }
This style sheet specifies that the chapter title “Hardware” is displayed on the Web page in a large Arial font (18 points). “Hardware” will also appear in bold blue text. The “Input Devices” title will appear in a smaller Arial font (12 points) and italic red text.
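The principle of specifying a format once and applying it to every occurrence of an element can be sketched as follows. The rules are written here in standard CSS syntax (a selector followed by property: value pairs inside braces) and shortened to three properties each. The parse_rules function is a hypothetical, deliberately tiny parser that handles only this simple form; it is not how a browser processes style sheets.

```python
import re
import xml.etree.ElementTree as ET

style_sheet = """
chapter { font-size: 18pt; color: blue; font-weight: bold; }
topic   { font-size: 12pt; color: red; font-style: italic; }
"""


def parse_rules(css: str) -> dict:
    """Tiny parser for 'selector { property: value; ... }' rules only."""
    rules = {}
    for selector, body in re.findall(r"([\w-]+)\s*\{([^}]*)\}", css):
        declarations = {}
        for declaration in body.split(";"):
            if ":" in declaration:
                name, value = declaration.split(":", 1)
                declarations[name.strip()] = value.strip()
        rules[selector] = declarations
    return rules


rules = parse_rules(style_sheet)
book = ET.fromstring(
    "<book><chapter>Hardware</chapter>"
    "<topic>Input Devices</topic><topic>Output Devices</topic></book>"
)

# Each element looks up the one rule that matches its tag name.
styled = [(el.text, rules[el.tag]["color"], rules[el.tag]["font-size"]) for el in book]
print(styled)
```

Both topic elements pick up the same red, 12-point style from a single rule, which is the convenience the text describes.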
XML is extremely useful for organizing Web content and making data easy to find. Many Web sites use CSS to define the design and layout of Web pages, XML to define the content, and HTML to join the design (CSS) with the content (XML). See Figure 6.14. This modular approach to Web design allows Web site developers to change the visual design without affecting the content and to change the content without affecting the visual design.
tag: A code that tells the Web browser how to format text—as a heading, as a list, or as body text—and whether images, sound, and other elements should be inserted.
Extensible Markup Language (XML): The markup language designed to transport and store data on the Web.
Cascading Style Sheet (CSS): A style sheet language for defining the visual design of a Web page or group of pages.
Web Programming Languages
Many of the services offered on the Web are delivered through the use of programs and scripts. A Web program may be something as simple as a menu that expands when you click it or as complicated as a full-blown spreadsheet application. Web applications may run on a Web server, delivering the results of the processing to the user, or they may run directly on a client, such as a user’s PC. These two categories are commonly referred to as server-side and client-side software.
JavaScript is a popular programming language for client-side applications. Using JavaScript, you can create interactive Web pages that respond to user actions. JavaScript can be used to validate data entry in a Web form, to display photos in a slideshow style, to embed simple computer games in a Web page, and to provide a currency conversion calculator. Java is a programming language from Sun Microsystems based on the C++ programming language, which allows small programs, called applets, to be embedded within an HTML document. When the user clicks the appropriate part of an HTML page to retrieve an applet from a Web server, the applet is downloaded onto the client workstation where it begins executing. Because Java programs run on the Java virtual machine, Java software can run on any type of computer that supports it. It can be used to develop client-side or server-side applications. Programmers use Java to make Web pages come alive, adding splashy graphics, animation, and real-time updates. ASP.NET, C, C++, Perl, PHP, and Python are among other widely used server-side programming languages.
FIGURE 6.14 XML, CSS, and HTML. Today’s Web sites are created using XML to define content, CSS to define the visual style, and HTML to put it all together. (Diagram: a CSS file supplying fonts, colors, and layout and an XML file supplying content are combined in an HTML file.)
Web Services
Web services consist of standards and tools that streamline and simplify communication among Web sites and make it simpler to develop and use the Web for business and personal purposes. The key to Web services is XML. Just as HTML was developed as a standard for formatting Web content into Web pages, XML is used within a Web page to describe and transfer data between Web service applications.
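The role of XML in describing and transferring data between applications can be sketched with two small functions, one for the application that provides the data and one for the application that consumes it. The element names (item, title, price), the function names, and the sample product are hypothetical and do not represent any real Web service.

```python
import xml.etree.ElementTree as ET


def build_price_response(item_id: str, title: str, price: str, currency: str) -> bytes:
    """Provider side: describe the data with XML tags and serialize it for transfer."""
    root = ET.Element("item", attrib={"id": item_id})
    ET.SubElement(root, "title").text = title
    price_el = ET.SubElement(root, "price", attrib={"currency": currency})
    price_el.text = price
    return ET.tostring(root, encoding="utf-8")


def read_price_response(payload: bytes) -> dict:
    """Consumer side: parse the XML and pull out the fields it needs by tag name."""
    root = ET.fromstring(payload)
    price_el = root.find("price")
    return {
        "id": root.get("id"),
        "title": root.findtext("title"),
        "price": float(price_el.text),
        "currency": price_el.get("currency"),
    }


payload = build_price_response("B-1001", "Network Cable", "7.50", "USD")
order_line = read_price_response(payload)
print(payload.decode("utf-8"))
print(order_line)
```

The two sides only need to agree on the tag names; neither needs to know how the other stores or displays the data, which is what lets separately built Web sites and applications exchange information.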
Internet companies, including Amazon, eBay, and Google, are now using Web services.
Amazon Web Services (AWS) is the basic infrastructure that Amazon employs to make the contents of its huge online catalog available to other Web sites or software applications. Airbnb is an online marketplace that enables property owners and travelers to interact for the purpose of renting distinctive vacation spaces in more than 34,000 cities in 190 countries. Shortly after Airbnb began operations, it migrated its cloud computing functions to AWS, which distributes incoming traffic to ensure high availability and fast response time. AWS also allows Airbnb to store backups and static files, including 10 TB of user pictures, and to monitor all of its server resources.18
Developing Web Content and Applications
If you need to create a Web site, you have lots of options. You can hire someone to design and build it, or you can do it yourself. If you do it yourself, you can use an online service to create the Web pages, use a Web page creation software tool, or use a plain text editor to create the site. Web page creation software includes features that allow the developer to work directly with the HTML code or to use auto-generated code. Web development software also helps the designer keep track of all files in a Web site and the hyperlinks that connect them.
Popular tools for creating Web pages and managing Web sites include Adobe Dreamweaver, RapidWeaver (for Mac developers), and Nvu (pronounced n-view). See Figure 6.15.
FIGURE 6.15 Creating Web pages. Nvu makes Web design nearly as easy as using a word processor. Source: Nvu Tutorial: by Tim VanSlyke at http://faculty.chemeketa.edu/tvanslyk/computerskills/tutorials/nvu_tutorial.pdf.
Many products make it easy to develop Web content and interconnect Web services. Microsoft, for example, provides a development and Web services platform called .NET, which allows developers to use various programming languages to create and run programs, including those for the Web. The .NET platform also includes a rich library of programming code to help build XML Web applications. Other popular Web development platforms include JavaServer Pages, Microsoft ASP.NET, and Adobe ColdFusion.
After you create Web pages, your next step is to place or publish the content on a Web server. Popular publishing options include using ISPs, free sites, and Web hosting services. Web hosting services provide space on their Web servers for people and businesses that don’t have the financial resources, time, or skills to host their own Web sites. A Web host can charge $15 or more per month, depending on services. Some Web hosting sites include domain name registration, Web authoring software, activity reporting, and Web site monitoring. Some ISPs also provide limited storage space, typically 1 to 6 megabytes, as part of their monthly fee. If more disk space is needed, additional fees are charged. Free sites offer limited space for a Web site. In return, free sites often require the user to view advertising or agree to other terms and conditions.
Some Web developers are creating programs and procedures to combine two or more Web applications into a new service, called a mashup—named after the process of mixing two or more hip-hop songs into one song. Map applications such as Google Maps provide tool kits that allow them to be combined with other Web applications. For example, Google Maps can be used with Twitter to display the location where various tweets were posted. Likewise, Google Maps combined with Flickr can overlay photos of specific geographic locations.
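To make the mashup idea concrete, the following minimal Python sketch converts a handful of location-tagged posts into GeoJSON, an open format for geographic data that many mapping tool kits can display as markers. The sample posts, the field names, and the function name posts_to_geojson are hypothetical and chosen only for illustration; a real mashup would retrieve the posts from one service's API and pass the result to a map service.

```python
import json

# Hypothetical sample data standing in for posts fetched from a social service.
posts = [
    {"user": "alice", "text": "Great coffee here", "lat": 40.7580, "lon": -73.9855},
    {"user": "bob", "text": "No location on this one", "lat": None, "lon": None},
]

def posts_to_geojson(posts):
    """Convert location-tagged posts into a GeoJSON FeatureCollection of points."""
    features = []
    for post in posts:
        if post["lat"] is None or post["lon"] is None:
            continue  # skip posts that carry no location
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [post["lon"], post["lat"]]},
            "properties": {"user": post["user"], "text": post["text"]},
        })
    return {"type": "FeatureCollection", "features": features}

layer = posts_to_geojson(posts)
print(json.dumps(layer, indent=2))
```

The output is plain data, so the same layer could be combined with photos, business listings, or any other source that carries a latitude and longitude.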
Internet and Web Applications
The variety of Internet and Web applications available to individuals and organizations around the world is vast and ever expanding. Using the Internet, entrepreneurs can start online companies and thrive. For example, Aaron Goldstein and Colin Hill met at the University of Pennsylvania’s Wharton School. At the time, Hill was battling Hodgkin’s lymphoma and undergoing chemotherapy. It was necessary for him to monitor his temperature constantly to avoid infections while his immune system was weakened. Although his temperature would be normal when he fell asleep, Hill often woke up during the night with a high fever and had to be rushed to intensive care. He became exasperated with his inability to track his temperature continuously and frustrated that his doctor couldn’t monitor him remotely. So Goldstein and Hill set out to find a solution. After two years and an investment of a few hundred thousand dollars, they developed Fever Smart, a small electronic monitor worn under the armpit that sends temperature readings to a relay device, which forwards the data to Fever Smart’s servers and, finally, to a smartphone or other device. Using a smartphone or any Internet-connected device, a Fever Smart user, be it a parent or healthcare provider, can constantly monitor the patient’s temperature in real time and even receive alerts when the patient’s temperature begins to rise or reaches unsafe levels.19
Web 2.0 and the Social Web
Over the years, the Web has evolved from a one-directional resource where users only obtain information to a two-directional resource where users obtain and contribute information. Consider Web sites such as YouTube, Wikipedia, and Facebook as just a few examples. The Web has also grown
in power to support full-blown software applications such as Google Docs and is becoming a computing platform itself. These two major trends in how the Web is used and perceived have created dramatic changes in how people, businesses, and organizations use the Web, creating a paradigm shift to Web 2.0.
The original Web, Web 1.0, provided a platform for technology-savvy developers and the businesses and organizations that hired them to publish information for the general public to view. By contrast, Web 2.0 sites such as YouTube and Flickr allow users to share video and photos with other people, groups, and the world. Microblogging sites such as Twitter allow people to post thoughts and ideas throughout the day for friends to read. See Figure 6.16.
Social networking Web sites provide Web-based tools for users to share information about themselves and to find, meet, and converse with other members. Instagram is a popular social networking service through which users can share photos and videos, either publicly or with a set group of friends. Another social network, LinkedIn, is designed for professional use to assist its members with creating and maintaining valuable professional connections. Ning provides tools for Web users to create their own social networks dedicated to a topic or interest.
Social networks have become very popular for finding old friends, staying in touch with current friends and family, and making new friends. Besides their personal value, these networks provide a wealth of consumer information and opportunities for businesses as well. Some businesses are including social networking features in their workplaces.
The use of social media in business is called Enterprise 2.0. Enterprise 2.0 applications, such as Salesforce’s Chatter, Jive Software’s Engage Dialog, and Yammer, enable employees to create business wikis, support social networking, perform blogging, and create social bookmarks to quickly find information. Tyco, a fire protection and security company, recently went through a major restructuring, changing from a conglomerate of holding companies to a united global enterprise with more than 69,000 employees in 50 countries. Throughout its transition, Tyco relied on Yammer rather than email to educate its workforce on the differences between the old Tyco and the new Tyco and to increase employee engagement across the company.20
Not everyone is happy with social networking sites, however. Employers might use social networking sites to get personal information about you.
FIGURE 6.16 Flickr Flickr allows users to share photos with other people around the world. Source: www.flickr.com
Web 2.0: The Web as a computing platform that supports software applications and the sharing of information among users.
Some people worry that their privacy will be invaded or their personal information used without their knowledge or consent.
News
The Web is a powerful tool for keeping informed about local, state, national, and global news. It has an abundance of special-interest coverage and provides the capacity to deliver deeper analysis of the subject matter. Text and photos are supported by the HTML standard. Video (sometimes called a Webcast) and audio are provided in a browser through plug-in technology and in podcasts.
As traditional news sources migrate to the Web, new sources are emerging from online companies. News Web sites from Google, Yahoo!, Digg, and Newsvine provide popular or interesting stories from a variety of news sources. In a trend some refer to as social journalism or citizen journalism, ordinary citizens are more involved in reporting the news than ever before. Although social journalism provides important news not available elsewhere, its sources may not be as reliable as mainstream media sources. It is also sometimes difficult to discern news from opinion.
Education and Training
Today, institutions and organizations at all levels provide online education and training, which can be accessed via PCs, tablets, and smartphones. Khan Academy, for example, provides free online training and learning in economics, math, banking and money, biology, chemistry, history, and many other subjects.21 NPower helps nonprofit organizations, schools, and individuals develop information system skills. The nonprofit organization provides training to hundreds of disadvantaged young adults through a 22-week training program that can result in certification from companies such as Microsoft and Cisco.22
High school and college students are using mobile devices to read electronic textbooks instead of carrying heavy printed textbooks to class. And educational support products, such as Blackboard, provide an integrated Web environment that includes virtual chat for class members; a discussion group for posting questions and comments; access to the class syllabus and agenda, student grades, and class announcements; and links to class-related material. Conducting classes over the Web with no physical class meetings is called distance learning.
Job Information
The Web is also an excellent source of job-related information. People looking for their first jobs or seeking information about new job opportunities can find a wealth of information online. Search engines, such as Google or Bing (discussed next), can be a good starting point for searching for specific companies or industries. You can use a directory on Yahoo’s home page, for example, to explore industries and careers. Most medium and large companies have Web sites that list open positions, salaries, benefits, and people to contact for further information. The IBM Web site, www.ibm.com, has a link to “Careers.” When you click this link, you can find information on jobs with IBM around the world. In addition, several sites specialize in helping you find job information and even apply for jobs online, including www.linkedin.com (see Figure 6.17), www.monster.com, and www.careerbuilder.com.
Search Engines and Web Research
A search engine is a valuable tool that enables you to find information on the Web by specifying words or phrases known as keywords, which are related to a topic of interest. You can also use operators such as AND, OR, and NOT for more precise search results.
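The effect of the AND, OR, and NOT operators can be illustrated with a small Python sketch. It builds a simple keyword index over three made-up pages and answers each kind of query with a set operation. The page contents and the names build_index and lookup are hypothetical; commercial search engines are far more elaborate, so this is only a toy model of the idea.

```python
# Hypothetical mini-collection of Web pages (page id -> page text).
pages = {
    "page1": "jaguar speed in the wild",
    "page2": "jaguar car top speed",
    "page3": "classic car auctions",
}

def build_index(pages):
    """Map each keyword to the set of pages that contain it."""
    index = {}
    for page_id, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(page_id)
    return index

index = build_index(pages)

def lookup(word):
    """Return the set of pages containing the keyword (empty if none)."""
    return index.get(word.lower(), set())

jaguar_and_speed = lookup("jaguar") & lookup("speed")  # both keywords
jaguar_or_car = lookup("jaguar") | lookup("car")       # either keyword
jaguar_not_car = lookup("jaguar") - lookup("car")      # first but not second

print(sorted(jaguar_and_speed))  # ['page1', 'page2']
print(sorted(jaguar_or_car))     # ['page1', 'page2', 'page3']
print(sorted(jaguar_not_car))    # ['page1']
```

In this toy collection, NOT is what separates the animal from the automobile: excluding pages that mention "car" leaves only the page about jaguars in the wild.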
search engine: A valuable tool that enables you to find information on the Web by specifying words that are key to a topic of interest, known as keywords.
The search engine market is dominated by Google. Other popular search engines include Yahoo! Search, Bing, Ask, Dogpile, and China’s Baidu. Google has taken advantage of its market dominance to expand into other Web-based services, most notably email, scheduling, maps, social networking, Web-based applications, and mobile device software. Search engines like Google often have to modify how they display search results, depending on pending litigation from other Internet companies and government scrutiny, such as antitrust investigations.
The Bing search engine has attempted to innovate with its design. Bing refers to itself as a decision engine because it attempts to minimize the amount of information that it returns in its searches that is not useful or pertinent. Bing also includes media (music, videos, and games) in its search results. See Figure 6.18.
Savvy Web site operators know that the search engine results are tools that can draw visitors to certain Web sites. Many businesses invest in search engine optimization (SEO)—a process for driving traffic to a Web site by using techniques that improve the site’s ranking in search results. Normally, when a user gets a list of results from a Web search, the links listed highest on the first page of search results have a far greater chance of being clicked. SEO professionals, therefore, try to get the Web sites of their businesses to be listed with as many appropriate keywords as possible. They study the algorithms that search engines use, and then they alter the contents of their
FIGURE 6.17 LinkedIn jobs listing LinkedIn and many other Web sites specialize in helping people get information about jobs and apply for jobs online. Source: LinkedIn
FIGURE 6.18 Microsoft Bing decision engine Microsoft calls its search engine a decision engine to distinguish it from other search software.
search engine optimization (SEO): A process for driving traffic to a Web site by using techniques that improve the site’s ranking in search results.
Microsoft product screenshots used with permission from Microsoft Corporation
Web pages to improve the page’s chance of being ranked number one. SEO professionals use Web analytics software to study detailed statistics about visitors to their sites.
Search engines offer just one option for performing research on the Web. Libraries typically provide access to online catalogs as well as links to public and sometimes private research databases on the Web. Online research databases allow visitors to search for information in thousands of journal, magazine, and newspaper articles. Information database services are valuable because they offer the best in quality and convenience. They conveniently provide full-text articles from reputable sources over the Web. College and public libraries typically subscribe to many databases to support research. One of the most popular private databases is LexisNexis Academic Universe. See Figure 6.19.
Instant Messaging
Instant messaging is online, real-time communication between two or more people who are connected via the Internet. With instant messaging, participants build contact lists of people they want to chat with. Some applications allow you to see which of your contacts are currently logged on to the Internet and available to chat. If you send a message to one of your contacts, that message appears within the messaging app on a smartphone or other mobile device, or, for those working on PCs, the message opens in a small dialog box on the recipient’s computer. Although chat typically involves exchanging text messages with one other person, many messaging apps allow for group chats. And today’s instant messaging software supports not only text messages but also the sharing of images, videos, files, and voice communications. Popular instant messaging services include Facebook Messenger, KIK, Instagram, Skype, Snapchat, WhatsApp, and WeChat. It is estimated that mobile operators lost $23 billion in 2012 alone as teens shifted away from texting over cellular networks in favor of communicating with their friends over the Internet using instant messaging apps.23
Instant messaging: The online, real-time communication between two or more people who are connected via the Internet.
FIGURE 6.19 LexisNexis At LexisNexis Academic Universe, you can search the news, legal cases, company information, people, or a combination of categories. Source: www.lexisnexis.com
Microblogging, Status Updates, and News Feeds
Referred to as a microblogging service, Twitter is a Web application that allows users to send short text updates (up to 140 characters) from a smartphone or a Web browser to their Twitter followers. While Twitter has been hugely successful for personal use, many businesses are finding value in the service as well. Business people use Twitter to stay in touch with associates by sharing their location and activities throughout the day. Businesses also find Twitter to be a rich source of consumer sentiment that can be tapped to improve marketing, customer relations, and product development. Many businesses have a presence on Twitter, dedicating personnel to communicate with customers by posting announcements and reaching out to individual users. Village Books, an independent bookstore in Bellingham, Washington, uses Twitter to build relationships with its customers and to make them feel part of their community.
The popularity of Twitter has caused social networks, such as Facebook, LinkedIn, and Tumblr, to include Twitter-like news or blog post feeds. Facebook users share their thoughts and activities with their friends by posting messages, previously referred to as Status Updates, to Facebook’s News Feed.
Conferencing
Some Internet technologies support real-time online conferencing. Participants dial into a common phone number to share a multiparty phone conversation and, in many cases, live video of the participants. The Internet has made it possible for those involved in teleconferences to share computer desktops. Using services such as WebEx or GoToMeeting, conference participants log on to common software that allows them to broadcast their computer display to the group. This ability is quite useful for presenting with PowerPoint, demonstrating software, training, or collaborating on documents. Participants verbally communicate by phone or PC microphone. Some conferencing software uses Webcams to broadcast video of the presenter and group participants. The Addison Fire Protection District provides professional fire protection and paramedic services to the 35,000 residents of Addison, Illinois. The district uses GoToMeeting to enable its employees to attend training and to support chief-to-chief meetings without requiring personnel to leave their assigned stations.24
Telepresence takes videoconferencing to the ultimate level. Telepresence systems, such as those from Cisco and Polycom, use high-resolution video and audio with high-definition displays to make it appear that conference participants are actually sitting around a table. Participants enter a telepresence studio where they sit at a table facing display screens that show other participants in other locations. Cameras and microphones collect high-quality video and audio at all locations and transmit them over high-speed network connections to provide an environment that replicates actual physical presence. Document cameras and computer software are used to share views of computer screens and documents with all participants.
You don’t need to be a big business to enjoy the benefits of video conversations. Free software is available to make video chat easy to use for anyone with a computer, a Webcam, and a high-speed Internet connection. Online applications such as Google Voice support video connections between Web users. For spontaneous, random video chat with strangers, you can go to the Chatroulette Web site. Software such as FaceTime and Skype provides computer-to-computer video chat so users can speak to each other face-to-face. In addition to offering text, audio, and video chat on computers and mobile devices, FaceTime and Skype offer video phone service over Internet-connected TVs. Recent Internet-connected sets from Panasonic and Samsung
ship with the Skype software preloaded. You attach a Webcam to your TV to have a video chat from your sofa.
Blogging and Podcasting
A Web log, typically called a blog, is a Web site that people and businesses use to share their observations, experiences, and opinions on a wide range of topics. The community of blogs and bloggers is often called the blogosphere. A blogger is a person who creates a blog, whereas blogging refers to the process of placing entries on a blog site. A blog is like a journal. When people post information to a blog, it is placed at the top of the blog page. Blogs can include links to external information and an area for comments submitted by visitors. Many organizations launch blogs as a way to communicate with customers and generate new business. Video content can also be placed on the Internet using the same approach as a blog. This is often called a video log or vlog.
A podcast is an audio broadcast you can listen to over the Internet. The name podcast originated from Apple’s iPod combined with the word broadcast. A podcast is like an audio blog. Using PCs, recording software, and microphones, you can record podcast programs and place them on the Internet. Apple’s iTunes provides free access to tens of thousands of podcasts, which are sorted by topic and searchable by keyword. See Figure 6.20. After you find a podcast, you can download it to your PC (Windows or Mac), to an MP3 player such as an iPod, or to any smartphone or tablet. You can also subscribe to podcasts using RSS software included in iTunes and other digital audio software.
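Subscribing through RSS works because a podcast publishes a feed, an XML document that lists each episode along with a link to its audio file. The short Python sketch below reads a made-up feed and lists the episodes it finds. The feed contents, the example.com addresses, and the function name list_episodes are assumptions for illustration; podcast software would download the feed from the publisher's Web server and check it periodically for new items.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed; real feeds are fetched from the publisher's server.
FEED = """<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 1</title>
      <enclosure url="https://example.com/ep1.mp3" type="audio/mpeg" length="12345"/>
    </item>
    <item>
      <title>Episode 2</title>
      <enclosure url="https://example.com/ep2.mp3" type="audio/mpeg" length="67890"/>
    </item>
  </channel>
</rss>"""

def list_episodes(feed_xml):
    """Return (title, audio URL) pairs for every item in an RSS feed."""
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        title = item.findtext("title")
        enclosure = item.find("enclosure")  # the element that links the audio file
        url = enclosure.get("url") if enclosure is not None else None
        episodes.append((title, url))
    return episodes

for title, url in list_episodes(FEED):
    print(title, url)
```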
Online Media and Entertainment
Like news and information, all forms of media and entertainment have followed their audiences online. Music, movies, television program episodes,
FIGURE 6.20 Podcasts iTunes and other sites provide free access to tens of thousands of podcasts. Source: www.learnoutloud.com
Web log (blog): A Web site that people and businesses use to share their observations, experiences, and opinions on a wide range of topics.
podcast: An audio broadcast you can listen to over the Internet.
user-generated videos, e-books, and audio books are all available online to download and purchase or stream.
Content streaming is a method of transferring large media files over the Internet so that the data stream of voice and pictures plays more or less continuously as the file is being downloaded. For example, rather than wait for an entire 5 MB video clip to download before they can play it, users can begin viewing a streamed video as it is being received. Content streaming works best when the transmission of a file can keep up with the playback of the file.
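A little arithmetic shows what "keeping up with playback" means. The Python sketch below estimates how long a player would have to wait before starting a clip so that playback never pauses. The 40-second running time and the connection speeds are assumed values chosen for illustration, and the model is deliberately simplified: it treats both the video data rate and the connection speed as constant and counts 1 MB as 8 megabits.

```python
def startup_delay_seconds(file_megabytes, duration_s, link_mbps):
    """Seconds to wait before playback so a constant-rate stream never stalls."""
    file_megabits = file_megabytes * 8       # 1 byte = 8 bits
    download_s = file_megabits / link_mbps   # time to receive the whole file
    # If the download finishes no later than playback would, no wait is needed.
    return max(0.0, download_s - duration_s)

# A 5 MB clip that runs 40 seconds plays at about 1 megabit per second.
print(startup_delay_seconds(5, 40, 2))    # 0.0  -> a 2 Mbps link keeps up
print(startup_delay_seconds(5, 40, 0.5))  # 40.0 -> a 0.5 Mbps link falls behind
```

On the faster connection the whole clip arrives in 20 seconds, so viewing can begin right away. On the slower one the download takes 80 seconds, twice the running time, so the player must either buffer first or pause partway through.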
Music
The Internet and the Web have made music more accessible than ever, with artists distributing their songs through online radio, subscription services, and download services. Spotify, Pandora, Napster, and Google Play Music are just a few examples of Internet music sites. Rhapsody International has more than 3 million subscribers globally for its premium music services, including Napster, Rhapsody, and its Internet radio service, Rhapsody unRadio.25 See Figure 6.21. Internet music has even helped sales of classical music by Mozart, Beethoven, and others. Internet companies, including Facebook, are starting to make music, movies, and other digital content available on their Web sites. Facebook, for example, allows online music companies, such as Spotify and Rdio, to post music-related news on its Web site.
Apple’s iTunes was one of the first online music services to find success. Microsoft, Amazon, Walmart, and other retailers also sell music online. Downloaded music may include digital rights management (DRM) technology that prevents or limits the user’s ability to make copies or to play the music on multiple players.
Podcasts are yet another way to access music on the Web. Many independent artists provide samples of their music through podcasts. Podcast Alley includes podcasts from unsigned artists.
Movies, Video, and Television
Television and movies are expanding to the Web in leaps and bounds. Online services such as Amazon Instant Video, Hulu, and Netflix provide television programming from hundreds of providers, including most mainstream television networks. Walmart’s acquisition of Vudu has allowed the big discount retailer to successfully get into the Internet movie business. Increasingly, TV networks offer apps for streaming TV content to tablets and other mobile devices. Some TV networks charge viewers to watch episodes of their favorite shows online. The Roku LT Streaming Media Box connects wirelessly to your TV and streams TV shows and movies from
FIGURE 6.21 Rhapsody Rhapsody provides streaming music by subscription. Source: rhapsody.com
content streaming: A method for transferring large media files over the Internet so that the data stream of voice and pictures plays more or less continuously as the file is being downloaded.
online sources such as Amazon Instant, Crackle, Disney, Hulu, Netflix, Pandora, and Xfinity TV.
Popcorn Time is a free program that uses peer-to-peer networking to download movies and TV programs. However, the software explicitly states that its users may be violating copyright law in their country. And indeed, Voltage Pictures has filed mass lawsuits against people who downloaded The Hurt Locker and Dallas Buyers Club. While these lawsuits aren’t always successful, they do create a risk for users who don’t anonymize their activity through a VPN service.26
No discussion of Internet video would be complete without mentioning YouTube. YouTube supports the online sharing of user-created videos. YouTube videos tend to be relatively short and cover a wide range of categories from the nonsensical to college lectures. See Figure 6.22. It is estimated that 100 hours of video are uploaded to YouTube every minute and that over 6 billion hours of video are watched each month on YouTube. YouTube reaches more U.S. adults in the 18 to 34 age category than any cable network.27 Other video-streaming sites include AOL Video, Metacafe, and Vimeo. As more companies create and post videos to Web sites like YouTube, some IS departments are creating a new position: video content manager.
Online Games and Entertainment
Video games have become a huge industry with worldwide annual revenue projected to exceed $100 billion by 2017.28
Zynga, a fast-growing Internet company, sells virtual animals and other virtual items for games, such as FarmVille. The company, for example, sells a clown pony with colorful clothes for about $5. Zynga has a VIP club for people who spend a lot on virtual items it offers for sale. Some Internet companies also sell food for virtual animals. People can feed and breed virtual animals and sell their offspring. The market for online gaming is very competitive and constantly changing. After Google included online games on its Web site, Facebook updated its online gaming offerings. Many video games are available online. They include single-user, multiuser, and massively multiuser games. The Web offers a multitude of games for all ages, including role-playing games, strategy games, and simulation games.
FIGURE 6.22 YouTube EDU YouTube EDU provides thousands of educational videos from hundreds of universities. Source: youtube.com/edu
Game consoles such as the PlayStation, Wii, and Xbox provide multiplayer options for online gaming over the Internet. Subscribers can play with or against other subscribers in 3D virtual environments. They can even talk to each other using a microphone headset.
Shopping Online
Shopping on the Web can be convenient, easy, and cost effective. You can buy almost anything online, from books and clothing to cars and sports equipment. Groupon, for example, offers discounts at restaurants, spas, auto repair shops, music performances, and almost any other product or service offered in your area or city. Revenues for Groupon exceeded $3.1 billion in 2015.29
Other online companies offer different services. Dell and many other computer retailers provide tools that allow shoppers to specify every aspect and component of a computer system to purchase. ResumePlanet.com would be happy to create your professional résumé. AmazonFresh, Instacart, and Peapod are all willing to deliver groceries to your doorstep. Products and services abound online.
Many online shopping options are available to Web users. Online versions of retail stores often provide access to products that may be unavailable in local stores. JCPenney, Target, Walmart, and many others carry only a percentage of their inventory in their retail stores; the other inventory is available online. To add to their other conveniences, many Web sites offer free shipping and pickup for returned items that don’t fit or otherwise meet a customer’s needs.
Web sites such as www.mySimon.com, www.DealTime.com, www.PriceSCAN.com, www.PriceGrabber.com, and www.NexTag.com provide product price quotations from numerous online retailers to help you find the best deal. Apps such as BuyVia, Purchx, RedLaser, and Shop Savvy enable users to compare prices at national and local outlets and to set up alerts (including location-based alerts) for products. At a store and unsure if the price on the shelf is the lowest you can find? Use the UPC barcode scanner to get an answer on the spot.
Online clearinghouses, Web auctions, and marketplaces offer a platform for businesses and individuals to sell their products and belongings. Online clearinghouses, such as www.uBid.com, provide a method for manufacturers to liquidate stock and for consumers to find a good deal. Outdated or overstocked items are put on the virtual auction block and users bid on the items. The highest bidder when the auction closes gets the merchandise, often for less than 50 percent of the advertised retail price.
The most popular online auction or marketplace is eBay, shown in Figure 6.23. The site provides a public platform for global trading where anyone can buy, sell, or trade practically anything. It offers a wide variety of
FIGURE 6.23 eBay eBay provides an online marketplace where anyone can buy, sell, or trade practically anything. Source: www.ebay.com
features and services that enable members to buy and sell on the site quickly and conveniently. Buyers have the option to purchase items at a fixed price or in an auction-style format, where the highest bid wins the product.
Auction houses such as eBay accept limited liability for problems that buyers or sellers may experience in their transactions. Transactions that make use of the PayPal service are protected on eBay. Others, however, may be more risky. Participants should be aware that auction fraud is the most prevalent type of fraud on the Internet.
Craigslist is a network of online communities that provides free online classified advertisements. It is a popular online marketplace for purchasing items from local individuals. Many shoppers turn to Craigslist rather than going to the classifieds in the local paper.
Businesses benefit from shopping online as well. Global supply management online services provide methods for businesses to find the best deals on the global market for raw materials and supplies needed to manufacture their products. Electronic exchanges provide an industry-specific Web resource created to deliver a convenient centralized platform for B2B e-commerce among manufacturers, suppliers, and customers.
Travel, Geolocation, and Navigation
The Web has had a profound effect on the travel industry and the way people plan and prepare for trips. From getting assistance with short trips across town to planning long holidays abroad, travelers are turning to the Web to save time and money and to overcome much of the risk involved in visiting unknown places.
Travel Web sites such as Travelocity, Expedia, Kayak, and Priceline help travelers find the best deals on flights, hotels, car rentals, vacation packages, and cruises. Priceline offers a slightly different approach from the other Web sites. It allows shoppers to name a price they’re willing to pay for an airline ticket or a hotel room and then works to find an airline or hotel that can meet that price.
Mapping and geolocation tools are among the most popular and successful Web applications. MapQuest, Google Maps, and Bing Maps are examples. See Figure 6.24. By offering free street maps for locations around the world,
FIGURE 6.24 Google Maps Mapping software, such as Google Maps, provides streetside views of Times Square. Source: Google
these tools help travelers find their way. Provide your departure location and destination, and these online applications produce a map that displays the fastest route. Using GPS technologies, these tools can detect your current loca- tion and provide directions from where you are.
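One classic way to find a fastest route is Dijkstra's shortest-path algorithm, sketched below in Python. This is a generic textbook technique and is not a description of how any particular mapping service works internally. The road network, the place names, and the travel times in minutes are invented for the example.

```python
import heapq

# Hypothetical road network: travel times in minutes between places.
roads = {
    "Home": {"A": 5, "B": 10},
    "A": {"B": 3, "Office": 12},
    "B": {"Office": 4},
    "Office": {},
}

def fastest_route(graph, start, goal):
    """Dijkstra's algorithm: return (total minutes, path) or None if unreachable."""
    queue = [(0, start, [start])]  # entries are (minutes so far, place, path taken)
    visited = set()
    while queue:
        minutes, place, path = heapq.heappop(queue)  # cheapest entry first
        if place == goal:
            return minutes, path
        if place in visited:
            continue  # a faster way to this place was already processed
        visited.add(place)
        for neighbor, cost in graph[place].items():
            if neighbor not in visited:
                heapq.heappush(queue, (minutes + cost, neighbor, path + [neighbor]))
    return None

print(fastest_route(roads, "Home", "Office"))
# (12, ['Home', 'A', 'B', 'Office'])
```

Note that the fastest route here is not the one with the fewest stops: going through both A and B takes 12 minutes, while the two-stop routes take 14 and 17 minutes.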
Google Maps also provides extensive location-specific business information, satellite imagery, up-to-the-minute traffic reports, and Street View. The latter is the result of Google employees driving the streets of the world’s cities in vehicles with high-tech camera gear, taking 360-degree images. These images are integrated into Google Maps to allow users to get a “street view” of an area that can be manipulated as if the viewer were actually walking down the street looking around. Bing Maps and Google Maps both offer high-resolution aerial photos and street-level 3D photographs.
Geographic information systems (GISs) provide geographic information layered over a map. For example, Google Earth provides options for viewing traffic, weather, local photos and videos, underwater features such as shipwrecks and marine life, local attractions, businesses, and places of interest. Apps such as Connect, Find My Friends, Phone Tracker, and Tracker allow you to find your friends on a map (with their permission) and will automatically notify you if a friend is near.
Geo-tagging is technology that allows for tagging information with an associated location. For example, Flickr and other photo software and services allow photos to be tagged with the location where they were taken. Once photos are tagged, it becomes easy to search for those taken, for example, in Florida. Geo-tagging also makes it easy to overlay photos on a map, as Google Maps and Bing Maps have done. Facebook, Instagram, Snapchat, Twitter, and many other social networks have also made it possible for users to geo-tag photos, comments, tweets, and posts.
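To make the idea of searching by location concrete, the following is a minimal sketch of filtering geo-tagged photos with a latitude/longitude bounding box. The `GeoTaggedPhoto` type, the file names, and the `FLORIDA_BOX` coordinates are illustrative assumptions, not part of any photo service's API; a rectangular box is only a coarse approximation of a state's outline, so a real service would likely use more precise boundaries.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class GeoTaggedPhoto:
    """A photo plus the location tag stored with it (hypothetical type)."""
    filename: str
    latitude: float   # degrees north
    longitude: float  # degrees east (negative values are west)


# Rough bounding box around Florida: (south, north, west, east).
# Assumed values for illustration only.
FLORIDA_BOX: Tuple[float, float, float, float] = (24.4, 31.0, -87.7, -79.9)


def photos_in_box(photos: Iterable[GeoTaggedPhoto],
                  box: Tuple[float, float, float, float]) -> List[GeoTaggedPhoto]:
    """Return the photos whose location tag falls inside the bounding box."""
    south, north, west, east = box
    return [
        p for p in photos
        if south <= p.latitude <= north and west <= p.longitude <= east
    ]


photos = [
    GeoTaggedPhoto("beach.jpg", 25.76, -80.19),         # Miami
    GeoTaggedPhoto("times_square.jpg", 40.76, -73.99),  # New York City
]
florida_photos = photos_in_box(photos, FLORIDA_BOX)
```

Because the location travels with each photo as plain coordinates, the same data can also be used to place the photos on a map.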
Geolocation information does pose a risk to privacy and security. Many people prefer that their location remain unknown, at least to strangers and often to acquaintances and even friends. Recently, criminals have made use of location information to determine when people are away from their residences so that they can burglarize without fear of interruption.
Intranets and Extranets
An intranet is an internal corporate network built using Internet and World Wide Web standards and products. Employees of an organization can use an intranet to gain access to corporate information. After getting their feet wet with public Web sites that promote company products and services, corporations are seizing the Web as a swift way to streamline, even transform, their organizations. These private networks use the infrastructure and standards of the Internet and the World Wide Web. Using an intranet offers one considerable advantage: many people are already familiar with Internet technology, so they need little training to make effective use of their corporate intranet.
An intranet is an inexpensive yet powerful alternative to other forms of internal communication, including conventional computer setups. One of an intranet’s most obvious virtues is its ability to reduce the need for paper. Because Web browsers run on all types of computers, the same electronic information can be viewed by any employee. That means that all sorts of documents (such as internal phone books, procedure manuals, training manuals, and requisition forms) can be inexpensively converted to electronic form, posted online, and easily updated. An intranet provides employees with an easy and intuitive approach to accessing information that was previously difficult to obtain. For example, it is an ideal solution to providing information to a mobile salesforce that needs access to rapidly changing information.
A growing number of companies offer limited network access to selected customers and suppliers. Such networks are referred to as extranets, which connect people who are external to the company. An extranet is a network built using Web technologies that links selected resources of the intranet of a company with its customers, suppliers, or other business partners.
Corporate executives at a well-known global fast food chain wanted to improve their understanding of what was happening at each restaurant location and needed to communicate with franchisees to better serve their customers. The firm implemented an extranet, enabling individual franchisees to fine-tune their location-specific advertising and get it approved quickly by corporate-level staff. In addition, with the extranet, corporate employees now have a much better understanding of customers, both by location and in aggregate, based on information they are receiving from franchisees.30
Security and performance concerns are different for an extranet than for a Web site or network-based intranet. User authentication and privacy are critical on an extranet so that information is protected. Obviously, the network must also be reliable and provide quick response to customers and suppliers. Table 6.6 summarizes the differences between users of the Internet, intranets, and extranets.
Secure intranet and extranet access applications usually require the use of a virtual private network (VPN), a secure connection between two points on the Internet. VPNs transfer information by encapsulating traffic in IP packets and sending the packets over the Internet, a practice called tunneling. Most VPNs are built and run by ISPs. Companies that use a VPN from an ISP have essentially outsourced their networks to save money on wide area network equipment and personnel. To limit access to the VPN to just individuals authorized to use it, authorized users may be issued a logon ID and a security token assigned to that logon ID. The security token displays a 10- to 12-digit password that changes every 30 seconds or so. A user must enter their logon ID and the security password valid for that logon ID at that moment in time.
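The text does not say how a security token computes its changing password. One widely used scheme for this kind of code is the time-based one-time password (TOTP) algorithm defined in RFC 6238, sketched below as an illustration rather than as a description of any particular vendor's token. The function names `totp` and `verify` are hypothetical, and the 6-digit default follows common TOTP practice rather than the 10- to 12-digit length mentioned above.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at_time: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 time-based one-time password (HMAC-SHA1 variant)."""
    if at_time is None:
        at_time = time.time()
    # Both the token and the server count how many 30-second steps have elapsed.
    counter = int(at_time // step)
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): pick 4 bytes using the low nibble of the last byte.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def verify(secret: bytes, submitted: str, at_time: Optional[float] = None) -> bool:
    """Check a submitted code against the code valid at this moment."""
    return hmac.compare_digest(totp(secret, at_time), submitted)


# The secret is shared between the token assigned to a logon ID and the server.
shared_secret = b"12345678901234567890"
code_now = totp(shared_secret)
```

Because the code depends on both the shared secret and the current time step, a password observed by an attacker stops working once the step changes. Real deployments usually also accept the code from an adjacent time step to tolerate clock drift, which this sketch omits.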
TABLE 6.6 Summary of Internet, intranet, and extranet users
Type | User | Need User ID and Password?
Internet | Anyone | No
Intranet | Employees | Yes
Extranet | Business partners | Yes
virtual private network (VPN): A secure connection between two points on the Internet; VPNs transfer information by encapsulating traffic in IP packets and sending the packets over the Internet.
Critical Thinking Exercise
Extranet to Support Craft Brewers
There are currently more than 3,000 breweries in the United States, double the number a decade ago. Much of this growth has come from the popularity of craft brewers who, by definition, produce no more than 6 million barrels annually; many produce much less. Within the craft brewing industry, there is a strong trend toward packaging beer in four packs of 16-ounce cans rather than the six packs of 12-ounce cans, bottles, or jugs associated with the major U.S. brewers. One practical reason for cans is that glass bottles typically cost more, which can make a big impact on the bottom line of many small brewers. In addition, craft brewers use 16-ounce cans as a means to set themselves apart from traditional beverage companies, and many have built their identities around the distinct look of their 16-ounce can. However, most craft brewers are too small to afford their own canning lines.
Ball, Crown, and Rexam are major can manufacturers who work with many craft brewers to fill their cans. But these companies recently raised their minimum can order to the industry-standard truckload, which can range from roughly 155,000 to 200,000, depending on the size of the can. Typically the smaller breweries and their distributors need only a few thousand cans at a time. The new minimum can orders translate into a lot of cash up front as well as an increased amount of storage space for breweries. Many small breweries are struggling as a result.
A new type of company, mobile canners, has emerged to address this problem. These firms haul their equipment to breweries, spend less than a day filling and labeling a few thousand cans, and then move on to the next customer. Over the past three years, about two dozen mobile canner companies have started offering mobile canning across the United States.
You are the owner of one of these mobile canners serving craft breweries across the Midwest, a very competitive market. One of your employees has approached you with an idea to set up an extranet that will allow your craft brewery customers to communicate their production schedules to you electronically. Their individual production schedules would be fed into a master schedule that would enable you to see and plan three to six months into the future. This way you could commit the people, equipment, and other resources to ensure that your customers’ needs will be met. Running out of cans is catastrophic for the brewers. If you let down a customer who is depending on you, you’ve lost a customer for life.
Review Questions
1. What advantages does use of an extranet provide versus more conventional methods of communication (over the phone, via fax, etc.)?
2. What measures can you take to control access to the master production schedule so that only authorized customers may enter their data?
Critical Thinking Questions
1. What potential start-up issues may be involved in preparing your craft brewery customers to use this new system? How will you overcome these issues?
2. Can you identify any other purposes for the extranet in addition to one-way communication of production schedules? Briefly elaborate.
The Internet of Things
The Internet of Things (IoT) is a network of physical objects or “things” embedded with sensors, processors, software, and network connectivity capability to enable them to exchange data with the manufacturer of the device, device operators, and other connected devices. See Figure 6.25.
Sensors are being installed in a variety of machines and products, ranging from home appliances and parking garages to clothing and grocery products. A sensor is a device that is capable of sensing something about its surroundings such as pressure, temperature, humidity, pH level, motion, vibration, or level of light. The sensor detects an event or a change in a quantity and produces a corresponding output, usually an electrical or optical signal. To be truly part of the IoT, these networked devices need IP addresses and a connection to the public Internet. The data is then transmitted over the Internet to an operational historical database containing data from many sensors. The database may be on a data storage device in a local control room, in an enterprise data center in another state, or hundreds of miles away in the cloud. The operational data can be accessed via the Internet and analyzed by users with personal computers or portable devices including smartphones. Updates, alerts, or even automatic adjustments may be sent to the devices on the IoT based on this analysis. According to Don DeLoach, CEO and president of Infobright Inc., “manufacturing has been automated at various levels for many years, but IoT brings automation to a deep, broad level, one where interconnectivity between various elements in manufacturing exists in a way it did not before.”31
Enlightened organizations apply analytics to these streams of data, even before the data is stored for post-event analysis. This enables workers to detect patterns and potential problems as they are occurring and to make appropriate adjustments in the operation of the devices being measured. For example, sensors embedded in General Electric (GE) aircraft engines collect some 5,000 individual data points per second. This data is analyzed while the aircraft is in flight to adjust the way the aircraft performs, thereby reducing fuel consumption. The data is also used to plan predictive maintenance on the engines based on engine component wear and tear. In 2013, this technology helped GE earn $1 billion in incremental income by delivering performance improvements, less downtime, and more flying miles.32
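As a rough illustration of analyzing a stream before it is stored, the sketch below compares each incoming reading with the average of the readings just before it and flags large deviations as they arrive. The function name `stream_alerts`, the window size, the tolerance, and the sample readings are all assumptions chosen for the example; production systems typically use far more sophisticated models.

```python
from collections import deque
from statistics import mean
from typing import Iterable, Iterator, Tuple


def stream_alerts(readings: Iterable[float], window: int = 5,
                  tolerance: float = 10.0) -> Iterator[Tuple[int, float, float]]:
    """Yield (index, value, baseline) for readings that stray from the rolling mean.

    The baseline is the mean of the previous `window` readings, so each value
    is checked as it arrives, without waiting for the full data set.
    """
    recent = deque(maxlen=window)
    for index, value in enumerate(readings):
        if len(recent) == window:
            baseline = mean(recent)
            if abs(value - baseline) > tolerance:
                yield index, value, baseline
        recent.append(value)


# Hypothetical temperature readings with one sudden spike.
temperatures = [100, 101, 99, 100, 100, 130, 101]
alerts = list(stream_alerts(temperatures))
```

Writing the check as a generator means it can sit between the sensor feed and the database, passing alerts on immediately while the raw data continues to storage.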
FIGURE 6.25 The Internet of Things The IoT is a network of physical objects or “things” embedded with sensors, processors, software, and network connectivity capability to enable them to exchange data with the manufacturer of the device, device operators, and other connected devices.
1. Sensors gather data
2. Data passes over network
3. Data from across the IoT is gathered and stored, often in the cloud
4. Data is combined with other data from other systems
5. Data is analyzed to gain insights into operation of devices on IoT
6. Alerts sent to people, enterprise systems, or IoT devices based on these insights
Ruslan Kudrin/Shutterstock.com, Maxx-Studio/Shutterstock.com, Rawpixel.com/Shutterstock.com, 06photo/Shutterstock.com, robuart/Shutterstock.com
Here are additional examples of organizations using sensors and the IoT to monitor and control key operational activities:
● Asset monitoring. Food and drug manufacturers can monitor shipping containers for changes in temperatures that could affect product quality and safety using cheap battery-powered sensors and 4G LTE connectivity.
● Construction. SK Solutions is using IoT technology to prevent cranes from colliding on a crowded construction site with 37 cranes and 5,000 workers near the world’s tallest building in the United Arab Emirates (UAE) city of Dubai. The Internet-connected system collects data from sensors mounted to the cranes and other equipment to detect if construction cranes are swinging too close to each other, and, if so, halts them from moving further.33
● Agriculture. Farmers are using IoT technology to collect data about water moisture and nitrogen levels to improve yields while conserving water, a precious commodity in many places.
● Manufacturing. IoT-enabled sensors on plant-floor equipment, such as a conveyor line, can alert plant-floor personnel to problems in real time. The data can also be analyzed to uncover patterns to allow technicians to predict potential failures or redeploy resources in a more optimal fashion.
● Monitoring parking spaces. San Francisco uses connected sensors and meters to determine the demand for parking on certain streets, periodically adjusting hourly rates so drivers are more likely to find a space when they arrive. Rates go up on more-crowded blocks and down on less-crowded ones. The city has deployed a low-power wide area network, similar to a cellular network but designed for low-power IoT equipment such as parking meters, to give these devices a low-energy connection that is slower and cheaper than the typical LTE cellular network.
● Predictive maintenance. Sensors are used extensively in the utilities industry to capture operational data to achieve 24/7 uptime. Sensor data is carefully analyzed to predict when critical pieces of equipment or power lines are about to fail so that quick, anticipatory corrective action can take place before any failure.
● Retailing. Retailers use in-store sensors to detect in-store behavior and optimize the shopping experience in order to increase revenue and market share. Streaming data from sensors is analyzed, along with other information (like inventory, social media chatter, and online-shop user profiles), to send customized and personal offers while the shopper is in the process of making a purchase decision.
● Traffic monitoring. The Aegean Motorway is the oldest and most important motorway of Greece, connecting the country’s largest cities, Athens and Thessaloniki. More than 5,000 devices are deployed along a 200-km (124-mile) stretch of the highway to keep drivers safe and the roadway running efficiently. All these devices must work in a smooth and coordinated fashion to monitor traffic, detect traffic incidents using traffic cameras, warn travelers of road conditions via electronic billboards, and operate toll booths. The devices are connected to a central control system using Cisco’s Internet of Everything system to connect data, people, processes, and things.34
IoT applications can be classified into one of four types as shown in Table 6.7.
Unfortunately, there can be many issues with simply receiving and recognizing usable sensor data. Sometimes a faulty sensor or bad network connection results in missing data or sensor data lacking time stamps indicating when the reading occurred. As a result, sensor data can be incomplete or contain inconsistent values indicating a potential sensor failure or a dropped network connection. Developers of IoT systems must be prepared for and be able to detect faulty sensor data.
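The kinds of faults described above can be checked for mechanically. The following is a minimal sketch, assuming readings arrive as simple records; the `Reading` type, the 60-second reporting interval, and the valid range are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Reading:
    """One sensor reading as received (hypothetical record layout)."""
    sensor_id: str
    timestamp: Optional[float]  # seconds since the epoch; None if the time stamp is missing
    value: Optional[float]      # None if the measurement itself is missing


def find_problems(readings: List[Reading], expected_interval: float = 60.0,
                  valid_range: Tuple[float, float] = (-40.0, 125.0)) -> List[Tuple[int, str]]:
    """Return (index, description) pairs for readings that look faulty."""
    problems = []
    low, high = valid_range
    last_time = None
    for index, reading in enumerate(readings):
        if reading.timestamp is None:
            problems.append((index, "missing timestamp"))
        else:
            # A long silence suggests a dropped connection or a failed sensor.
            if last_time is not None and reading.timestamp - last_time > 1.5 * expected_interval:
                problems.append((index, "gap before reading"))
            last_time = reading.timestamp
        if reading.value is None:
            problems.append((index, "missing value"))
        elif not low <= reading.value <= high:
            problems.append((index, "value out of range"))
    return problems


sample = [
    Reading("t1", 0.0, 20.0),
    Reading("t1", 60.0, 20.5),
    Reading("t1", None, 21.0),    # no time stamp
    Reading("t1", 300.0, 999.0),  # arrives late and is implausible
    Reading("t1", 360.0, None),   # no measurement
]
problems = find_problems(sample)
```

Flagging a reading rather than silently discarding it leaves the decision about how to handle it (repair, ignore, or raise an alert) to the rest of the system.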
TABLE 6.7 Types of IoT applications
Type of IoT Application | Degree of Sensing | Degree of Action
Connect and monitor | Individual devices each gathering a small amount of data | Enables manual monitoring using simple threshold-based exception alerting
Control and react | Individual devices each gathering a small amount of data | Automatic monitoring combined with remote control with trend analysis and reporting
Predict and adapt | External data is used to augment sensor data | Data used to perform predictive analysis and initiate preemptive action
Transform and explore | Sensor and external data used to provide new insights | New business models, products, and services are created
Security is a major issue with IoT applications. In today’s manufacturing environment, the factory network is a closed environment designed to communicate with plant sensors and devices but not typically with the outside world. So a key decision organizations must make when considering implementation of an IoT is as follows: Are the benefits of doing so sufficient to overcome the risk of making detailed company information accessible through the Internet and exposing internal systems to hacking, viruses, and destructive malware? Hackers who gain access to an organization’s IoT can steal data, transfer money out of accounts, and shut down Web sites, and they can also wreak physical havoc by tampering with critical infrastructure like air traffic control systems, healthcare devices, power grids, and supervisory control and data acquisition (SCADA) systems. One of the first things developers of IoT applications should focus on is building in security from the start. This needs to include ways of updating the system in a secure manner.
Critical Thinking Exercise
Manufacturer Weighs Converting to Internet of Things
You are a member of the plant information systems group for a small manufacturer of all-natural ingredient cosmetics. Your firm promotes itself as adhering to the highest standards of compliance and quality. Manufacturing is rigorously monitored via sensors and computer controls throughout the entire process, and automated temperature controls ensure complete stability in the manufacturing environment. Sensor tracking is performed from the moment that raw materials enter your facility, throughout the manufacturing process, packaging, and on to distribution. The sensors and computer controls were installed when the plant was built in the 1990s; they use proprietary communications protocols and are not Internet enabled. Data from these sensors is monitored by a group of three technicians in the computer control room. Twelve workers are required to staff the control room 24/7, including weekends and most holidays.
Your company has just purchased a plant previously owned by one of your competitors in a nearby state. Your group has been asked to look at the feasibility of upgrading the sensors used in both plants to Internet-enabled sensors connected to the Internet of Things. This would make it possible for technicians in one control room to monitor the operation of both plants. Plant staffing could be reduced by 12 workers, saving $1.2 million in labor expenses per year. It is estimated that the cost of replacing the existing sensors and converting to the Internet of Things is in the vicinity of $1.5 million.
Review Questions
1. Why is it necessary to replace the existing sensors to implement an IoT network?
2. What additional benefits may arise from converting the plants to the Internet of Things?
Critical Thinking Questions
1. What new risks are raised by placing the new system of sensors on the Internet of Things?
2. What actions could be taken to reduce these risks?
Cloud Computing
Cloud computing refers to a computing environment in which software and storage are provided as an Internet service and accessed by users with their Web browser. See Figure 6.26. Google and Yahoo!, for example, store the email of many users, along with calendars, contacts, and to-do lists. Apple developed its iCloud service to allow people to store their documents, music, photos, apps, and other content on its servers.35 In addition to its social networking features, Facebook offers users the ability to store personal photos in the cloud, as do Flickr and a dozen other photo sites. Pandora delivers music, and Hulu and YouTube deliver movies via the cloud. Apache OpenOffice, Google Apps, Microsoft Office 365, Zoho, and others provide Web-delivered productivity and information management software. Communications, contacts, photos, documents, music, and media are available to you from any Internet-connected device with cloud computing.
cloud computing: A computing environment where software and storage are provided as an Internet service and are accessed with a Web browser.
Cloud computing offers many advantages to businesses. With cloud computing, organizations can avoid large, up-front investments in hardware as well as the ongoing investment in the resources that would be required to manage that hardware. Instead, they can provision just the right type and size of information system resources from their cloud computing provider, pay for it on an ongoing basis, and let the service provider handle the system support and maintenance. In most cases, the cloud computing service provider provides access to state-of-the-art technology at a fraction of the cost of owning it and without the lengthy delays that can occur when an organization tries to acquire its own resources. This can increase the speed and reduce the costs of new product and service launches. For example, Spotify offers its users instant access to over 16 million licensed songs. The company faces an ongoing struggle to keep pace with the rapid release of new music, adding over 20,000 tracks to its catalog each day. Emil Fredriksson, operations director for Spotify, explains why the company employs cloud computing: “Spotify needed a storage solution that could scale very quickly without incurring long lead times for upgrades. This led us to cloud storage.” While establishing new storage previously required several months of preparation, it can now be obtained instantly through cloud computing.36
Cloud computing can be deployed in several different ways. The methods discussed thus far in this chapter are considered public cloud services. Public cloud computing refers to deployments in which service providers offer their cloud-based services to the general public, whether that is an individual using Google Calendar or a corporation using the Salesforce.com application. In a private cloud deployment, cloud technology is used within the confines of a private network.
Since 1992, The College Network and its partner universities have provided accessible educational programs for individuals seeking degrees or professional certificates, entirely through distance learning. The College Network chose EarthLink to provide a customized private cloud with dedicated servers. Conversion to the private network reduced the capital required for computer hardware and software, increased systems availability and avoided outages, and allowed the organization to reallocate its valuable IT resources while EarthLink resources troubleshoot any systems issues.37
FIGURE 6.26 Cloud computing Cloud computing uses applications and resources delivered via the Web. Helder Almeida/Shutterstock.com
Many organizations are turning to cloud computing as an approach to outsource some or all of their IT operations. This section defines cloud computing and its variations and points out some of its advantages as well as some potential issues, including problems associated with cost, scalability, security, and regulatory compliance.
Public Cloud Computing
In a public cloud computing environment, a service provider organization owns and manages the infrastructure (including computing, networking, and storage devices) with cloud user organizations (called tenants) accessing slices of shared hardware resources via the Internet. The service provider can deliver increasing amounts of computing, network, and storage capacity on demand and without requiring any capital investment on the part of the cloud users. Thus, public cloud computing is a great solution for organizations whose computing needs vary greatly depending on changes in demand. Amazon, Cisco Systems, IBM, Microsoft, Rackspace, Verizon Communications Inc., and VMware are among the largest cloud computing service providers. These firms typically offer a monthly or annual subscription service model; they may also provide training, support, and data integration services.38
Public cloud computing can be a faster, cheaper, and more agile approach than building and managing your own IT infrastructure. However, since cloud users are using someone else’s data center, potential issues with service levels, loss of control, disaster recovery, and data security should not be overlooked. Data security in particular is a key concern because when using a public cloud computing service, you are relying on someone else to safeguard your data. In addition, your organization’s data may reside on the same storage device as another organization’s (perhaps even a competitor’s) data. All of the potential issues of concern must be investigated fully before entering into a public cloud computing arrangement. Organizations subject to tight regulation and complex regulatory requirements (e.g., financial, healthcare, and public utility organizations) must ensure that their own processes and applications as well as those of the cloud provider are compliant.
A major start-up issue is the effort of getting your organization’s data moved to the cloud in the first place. That introduces an issue of vendor lock-in, meaning that once an organization’s servers and data are hosted with one cloud provider, it is not likely to be willing to go through the time-consuming migration process a second time to move to a different provider in the future. So choose your cloud provider wisely, as it is a business relationship that you and your business will likely need to live with for the foreseeable future.
Cloud computing can be divided into three main types of services (see Figure 6.27):
● Infrastructure as a service (IaaS) is an information systems strategy in which an organization outsources the equipment used to support its data processing operations, including servers, storage devices, and networking components. The service provider owns the equipment and is responsible for housing, running, and maintaining it. The outsourcing organization may pay on a per-use or monthly basis.
● Software as a service (SaaS) is a software delivery approach that provides users with access to software remotely as a Web-based service. SaaS pricing is based on a monthly fee per user and typically results in lower costs than a licensed application. Another advantage of SaaS is that because the software is hosted remotely, users do not need to purchase and install additional hardware to provide increased capacity. Furthermore, the service provider handles necessary software maintenance and upgrades.
● Platform as a service (PaaS) provides users with a computing platform, typically including operating system, programming language execution environment, database services, and a Web server. The user can create an application or service using tools and/or libraries from the provider. The user also controls software deployment and configuration settings. The PaaS provider provides the networks, servers, storage, and other services required to host the consumer’s application. PaaS enables application developers to develop, test, and run their software solutions on a cloud platform without the cost and complexity of buying and managing the underlying hardware and software.
infrastructure as a service (IaaS): An information systems strategy in which an organization outsources the equipment used to support its data processing operations, including servers, storage devices, and networking components.
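One way to compare the three service types is by how much of the computing stack the provider takes over. The sketch below encodes that comparison as a small lookup table. The three-layer split (`hardware`, `platform`, `application`) is a simplification introduced here for illustration, and the assumption that the customer handles whatever the provider does not is an inference from the descriptions above rather than something the text states.

```python
from typing import Dict, List, Set

# Simplified layers of the stack, from bottom to top (assumed for illustration).
LAYERS = ("hardware", "platform", "application")

# Which layers the service provider supplies and maintains under each model.
PROVIDER_MANAGED: Dict[str, Set[str]] = {
    "IaaS": {"hardware"},                             # servers, storage, networking
    "PaaS": {"hardware", "platform"},                 # plus OS, runtime, database, Web server
    "SaaS": {"hardware", "platform", "application"},  # plus the software itself
}


def customer_managed(model: str) -> List[str]:
    """Return the layers left to the customer under the given service model."""
    return [layer for layer in LAYERS if layer not in PROVIDER_MANAGED[model]]


for model in ("IaaS", "PaaS", "SaaS"):
    print(model, "customer manages:", customer_managed(model) or "nothing")
```

Read from IaaS to SaaS, the customer's share shrinks: with IaaS the organization still installs and runs its own platform and applications, while with SaaS it simply uses the software.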
Organizations contemplating moving to the cloud are advised to proceed carefully, as almost one in three organizations encounter major challenges during the transition. Frequent problems include complex pricing arrangements and hidden costs that reduce expected cost savings, performance issues that cause wide variations in performance over time, poor user support, and greater than expected downtime.39
Condé Nast, publisher of Vogue, The New Yorker, and Wired magazines, among many others, decommissioned its 67,000-square-foot data center and migrated its data and processing capacity to Amazon Web Services (AWS). Over a period of just three months in 2014, the firm migrated 500 servers; 1 petabyte of storage; 100 database servers; 100 switches, routers, and firewalls; and all of its mission-critical applications to AWS. According to Condé Nast, operating costs have been cut by 40 percent and performance has improved by 30 percent to 40 percent since the transition, which created a dynamic environment that can adjust as the company needs it to. The old data center facilities were eventually put on the market and sold.40
FIGURE 6.27 The cloud computing environment Cloud computing can be divided into three main types of services: infrastructure as a service (IaaS), software as a service (SaaS), and platform as a service (PaaS).
Cloud users: smartphones, tablets, laptops, desktops
IaaS: Virtual machines, servers, storage devices, network devices
SaaS: Customer relationship management, email, collaboration
PaaS: Operating system, programming language, database, Web server
Helder Almeida/Shutterstock.com
platform as a service (PaaS): An approach that provides users with a computing platform, typically including operating system, programming language execution environment, database services, and Web server.
Critical Thinking Exercise
Private Cloud Computing
A private cloud environment is a single-tenant cloud. Organizations that implement a private cloud often do so because they are concerned that their data will not be secure in a public cloud. Private clouds can be divided into two distinct types. Some organizations build their own on-premises private cloud, and others elect to have a service provider build and manage their private cloud (sometimes called a virtual private cloud). A general rule of thumb is that companies that spend $1 million or more per month on outsourced computing are better off implementing an on-premises private cloud.41 Many complications must be overcome, and deep technical skills and sophisticated software are needed, to build and manage a successful private cloud. An organization might establish several private clouds with one for finance, another one for product development, and a third for sales, for example. Each private cloud has a defined set of available resources and users, with predefined quotas that limit how much capacity users of that cloud can consume.
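The idea of a private cloud with a fixed pool of resources, a defined set of users, and per-user quotas can be sketched in a few lines. The class below is a toy model, not how any cloud management product works; the names `PrivateCloud` and `QuotaExceeded`, the use of gigabytes of storage as the only resource, and the sample users are all assumptions for illustration.

```python
from typing import Dict


class QuotaExceeded(Exception):
    """Raised when a request would exceed a user's quota or the cloud's capacity."""


class PrivateCloud:
    """A toy private cloud with a fixed capacity and predefined per-user quotas."""

    def __init__(self, name: str, capacity_gb: int, user_quotas_gb: Dict[str, int]):
        self.name = name
        self.capacity_gb = capacity_gb
        self.user_quotas_gb = dict(user_quotas_gb)
        self.used_gb = {user: 0 for user in self.user_quotas_gb}

    def allocate(self, user: str, amount_gb: int) -> int:
        """Grant storage to a user if both the quota and total capacity allow it."""
        if user not in self.user_quotas_gb:
            raise PermissionError(f"{user} is not a user of the {self.name} cloud")
        if self.used_gb[user] + amount_gb > self.user_quotas_gb[user]:
            raise QuotaExceeded(f"{user} would exceed a quota of {self.user_quotas_gb[user]} GB")
        if sum(self.used_gb.values()) + amount_gb > self.capacity_gb:
            raise QuotaExceeded(f"the {self.name} cloud has insufficient capacity")
        self.used_gb[user] += amount_gb
        return self.used_gb[user]


# One cloud per department, each with its own users and limits.
finance = PrivateCloud("finance", capacity_gb=100, user_quotas_gb={"ana": 60, "raj": 60})
finance.allocate("ana", 50)
```

Note that the individual quotas may add up to more than the total capacity, as they do here, so the sketch checks both limits before granting a request.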
Revlon is a global cosmetics, hair color, fragrance, and skin-care company with recent annual sales exceeding $1.9 billion.42 The firm implemented an on-premises private cloud that includes 531 applications and makes up 97 percent of the company’s computing power. The private cloud has helped reduce application deployment time by 70 percent and, as a result of virtualization and consolidation, reduced data center power consumption by 72 percent. In addition, the company achieved a net dollar savings of $70 million over a two-year period.43
Hybrid Cloud Computing
Many IT industry observers believe that the desire for both agility and security will eventually lead organizations to adopt a hybrid cloud approach.44 A hybrid cloud is composed of both private and public clouds integrated through networking. Organizations typically use the public cloud to run applications with less sensitive security requirements and highly fluctuating capacity needs, but run more critical applications, such as those with significant compliance requirements, on the private portion of their hybrid cloud. So a hospital may run its Web conferencing and email applications on a public cloud while running its applications that access patient records on a private cloud to meet Health Insurance Portability and Accountability Act (HIPAA) and other compliance requirements.
Autonomic Computing
An enabling technology for cloud computing is autonomic computing, or the ability of IT systems to manage themselves and adapt to changes in the computing environment, business policies, and operating objectives. The goal of autonomic computing is to create complex systems that run themselves, while keeping the system’s complexity invisible to the end user. Autonomic computing addresses four key functions: self-configuring, self-healing, self-optimizing, and self-protecting.45 As cloud computing environments become increasingly complex, the number of skilled people required to manage these environments also increases. Software and hardware that implement autonomic computing are needed to reduce the overall cost of operating and managing complex cloud computing environments. While this is an emerging area, software products such as Tivoli from IBM are partially filling the need.
Should Heel Swaps Move to the Cloud?
Heel Swaps is a Chicago-based start-up that sells a stretchable high-heel shoe cover that has a slip-resistant outsole, comes in a variety of sizes, colors, and patterns, and slips on in seconds. The product enables you to transform your heels to match different outfits and is sold online at www.heelswaps.com.
private cloud environment: A single-tenant cloud.
hybrid cloud: A cloud computing environment composed of both private and public clouds integrated through networking.
autonomic computing: The ability of IT systems to manage themselves and adapt to changes in the computing environment, business policies, and operating objectives.
The firm is just three weeks away from debuting its advertising on the popular Steve Harvey show, which has some 3 million viewers per episode. Sales and product demand are expected to skyrocket. Unfortunately, management at Heel Swaps has just realized that the firm’s Web site does not have the processing capacity to serve the expected increase in shoppers. An IT consulting firm was hired to confirm the need for additional capacity and to make recommendations on how to proceed. Its recommendation is that the current Web site platform be moved to Amazon Web Services (AWS) with Elastic Load Balancing. This service is capable of automatically scaling its request-handling capacity to meet the demands of application traffic.
Review Questions
1. What advantages will moving the Web site to the cloud provide for Heel Swaps?
2. What form of cloud computing is best for Heel Swaps: public, private, or hybrid? Why?
Critical Thinking Questions
1. What common start-up problems should the IT consulting firm advise Heel Swaps to avoid?
2. What future changes and developments should be planned for the Heel Swaps Web site as the volume of business grows?
Summary
Principle: A network has many fundamental components, which—when carefully selected and effectively integrated—enable people to meet personal and organizational objectives.
A computer network consists of communications media, devices, and software connecting two or more computer systems or devices. A communications medium is any material substance that carries an electronic signal to support communications between a sending and a receiving device. The computers and devices on a network are also sometimes called network nodes.
The effective use of networks can help a company grow into an agile, powerful, and creative organization, giving it a long-term competitive advantage. Networks let users share hardware, programs, and databases across the organization. They can transmit and receive information to improve organizational effectiveness and efficiency. They enable geographically separated workgroups to share documents and opinions, which fosters teamwork, innovative ideas, and new business strategies.
Network topology indicates how the communications links and hardware devices of the network are arranged. The three most common network topologies are the star, bus, and mesh.
A network can be classified as personal area, local area, metropolitan, or wide area network depending on the physical distance between nodes on the network and the communications and services it provides.
The electronic flow of data across international and global boundaries is often called transborder data flow.
A client/server system is a network that connects a user’s computer (a cli- ent) to one or more server computers (servers). A client is often a PC that requests services from the server, shares processing tasks with the server, and displays the results.
Channel bandwidth refers to the rate at which data can be exchanged, measured in bits per second.
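To make the bits-per-second definition concrete, the following minimal sketch estimates how long a file takes to cross a channel of a given bandwidth. The function name and the sample figures are illustrative assumptions, and the calculation ignores latency and protocol overhead, so real transfers would take somewhat longer.

```python
def transfer_time_seconds(file_size_bytes: int, bandwidth_bps: float) -> float:
    """Ideal transfer time: total bits divided by the channel rate in bits per second."""
    if bandwidth_bps <= 0:
        raise ValueError("bandwidth must be positive")
    return file_size_bytes * 8 / bandwidth_bps  # 8 bits per byte


# Hypothetical example: a 25,000,000-byte file over a 100 Mbps channel.
seconds = transfer_time_seconds(25_000_000, 100_000_000)
print(f"{seconds:.1f} s")  # prints "2.0 s"
```

Doubling the bandwidth halves the ideal transfer time, which is why bandwidth is a key characteristic when comparing communications channels.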
Communications media can be divided into two broad categories: guided transmission media, in which a communications signal travels along a solid medium, and wireless media, in which the communications signal is sent over airwaves. Guided transmission media include twisted-pair wire cable, coaxial cable, and fiber-optic cable.
Wireless communication is the transfer of information between two or more points that are not connected by an electrical conductor. Wireless communications involves the broadcast of communications in one of three frequency ranges: microwave, radio, and infrared. Wireless communications options include near field communications, Bluetooth, Wi-Fi, and a variety of 3G and 4G communications options.
Networks require various communications hardware devices to operate, including modems, fax modems, multiplexers, private branch exchanges, front-end processors, switches, bridges, routers, and gateways.
Network management includes a wide range of technologies and processes that monitor the network and help identify and address problems before they can create a serious impact.
A network operating system (NOS) controls the computer systems and devices on a network, allowing them to communicate with one another. Network-management software enables a manager to monitor the use of individual computers and shared hardware, scan for viruses, and ensure compliance with software licenses.
Mobile device management (MDM) software manages and troubleshoots mobile devices remotely, pushing out applications, data, patches, and settings.
Software-defined networking (SDN) is an emerging approach to networking that allows network administrators to manage a network via a controller that does not require physical access to all the network devices.
Principle: Together, the Internet and the World Wide Web provide a highly effective infrastructure for delivering and accessing information and services.
The Internet is truly international in scope, with users on every continent. It is the world’s largest computer network. Actually, it is a collection of interconnected networks, all freely exchanging information.
The Internet transmits data from one computer (called a host) to another. The set of conventions used to pass packets from one host to another is known as the Internet Protocol (IP). Many other protocols are used with IP. The best known is the Transmission Control Protocol (TCP). TCP is so widely used that many people refer to the Internet protocol as TCP/IP, the combination of TCP and IP used by most Internet applications.
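As a rough illustration of two hosts exchanging data over TCP/IP, the sketch below runs a tiny server and client on the same machine using the loopback address. The function names are hypothetical, and the example assumes the short message arrives in a single read, which is a simplification that real applications should not rely on.

```python
import socket
import threading


def serve_once(server_sock: socket.socket) -> None:
    """Accept one TCP connection and send back the received bytes in uppercase."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def echo_upper(message: bytes) -> bytes:
    """Send a message to a local TCP server and return its reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
        server.listen(1)
        host, port = server.getsockname()
        worker = threading.Thread(target=serve_once, args=(server,))
        worker.start()
        with socket.create_connection((host, port)) as client:
            client.sendall(message)
            reply = client.recv(1024)
        worker.join()
    return reply


print(echo_upper(b"hello over tcp/ip"))
```

Here IP identifies the host (127.0.0.1) while TCP provides the reliable, ordered connection between the two programs.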
Each computer on the Internet has an assigned IP address for easy identification. A Uniform Resource Locator (URL) is a Web address that specifies the exact location of a Web page using letters and words that map to an IP address and a location on the host.
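The parts of a URL can be separated programmatically. The sketch below uses Python's standard urllib.parse module on a made-up address under the reserved example.com domain; the helper name describe_url is an assumption for illustration. Translating the host name into an IP address is a separate lookup step handled by the Domain Name System and is not performed here.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into its protocol, the host name, and the location on that host."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,  # protocol, such as http or https
        "host": parts.hostname,  # name that maps to an IP address
        "path": parts.path,      # location of the page on the host
    }


# Hypothetical URL used only for illustration.
info = describe_url("http://www.example.com/catalog/shoes.html")
print(info)
```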
People can connect to the Internet backbone in several ways: via a LAN whose server is an Internet host, or via a dial-up connection, high-speed service, or wireless service. An Internet service provider (ISP) is any company that provides access to the Internet. To connect to the Internet through an ISP, you must have an account with the service provider and software that allows a direct link via TCP/IP.
The Internet and social media Web sites have emerged as important new channels for learning about world events, protesting the actions of organizations and governments, and urging others to support one’s favorite causes or candidates. On the other hand, Internet censorship, the control or suppression of the publishing or accessing of information on the Internet, is a growing problem.
The Web was designed to make information easy to find and organize. It connects billions of documents, which are now called Web pages, stored on millions of servers around the world. Web pages are connected to each other using hyperlinks, specially denoted text or graphics on a Web page, that, when clicked, open a new Web page containing related content. The pages are accessed and viewed using Web client software called a Web browser.
Many Web sites use CSS to define the design and layout of Web pages, XML to define the content, and HTML to join the content (XML) with the design (CSS).
Internet companies, including Amazon, eBay, and Google, use Web services to streamline and simplify communication among Web sites.
XML is also used within a Web page to describe and transfer data between Web service applications.
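A small sketch may help show how XML describes data so that another application can read it. The order document, its tag names, and the parse_order helper below are invented for illustration and do not come from any particular Web service.

```python
import xml.etree.ElementTree as ET

# Hypothetical order data as one application might send it to another.
ORDER_XML = """<order id="1001">
  <customer>Pat Lee</customer>
  <item sku="HS-RED-7" quantity="2">Heel cover</item>
</order>"""


def parse_order(xml_text: str) -> dict:
    """Read the tagged fields out of an order document."""
    root = ET.fromstring(xml_text)
    item = root.find("item")
    return {
        "order_id": root.get("id"),
        "customer": root.findtext("customer"),
        "sku": item.get("sku"),
        "quantity": int(item.get("quantity")),
    }


print(parse_order(ORDER_XML))
```

Because the tags name each piece of data, the receiving application does not need to know how the sender stores the order internally.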
Today’s Web development applications allow developers to create Web sites using software that resembles a word processor. The software includes features that allow the developer to work directly with the HTML code or to use auto-generated code.
The use of social media in business is called Enterprise 2.0. Enterprise 2.0 applications, such as Salesforce’s Chatter, Jive Software’s Engage Dialog, and Yammer, enable employees to create business wikis, support social networking, perform blogging, and create social bookmarks to quickly find information.
Social journalism provides important news not available elsewhere; however, its sources may not be as reliable as mainstream media sources.
Today, schools at all levels provide online education and training. The Web is also an excellent source of job-related information.
A search engine is a valuable tool that enables you to find information on the Web by specifying words or phrases known as keywords, which are related to a topic of interest. Search engine optimization (SEO) is a process for driving traffic to a Web site by using techniques that improve the site’s ranking in search results.
Instant messaging is online, real-time communication between two or more people who are connected via the Internet.
Twitter is a Web application that allows users to send short text updates (up to 140 characters) from a smartphone or a Web browser to their Twitter followers.
Internet technologies support real-time online conferencing where participants dial into a common phone number to share a multiparty phone conversation and, in many cases, live video of the participants.
A Web log, typically called a blog, is a Web site that people and businesses use to share their observations, experiences, and opinions on a wide range of topics.
A podcast is an audio broadcast you can listen to over the Internet.
Content streaming is a method of transferring large media files over the Internet so that the data stream of voice and pictures plays more or less continuously as the file is being downloaded.
The Internet and the Web have made music more accessible than ever, with artists distributing their songs through online radio, subscription services, and download services.
Television and movies are expanding to the Web in leaps and bounds. Online services such as Amazon Instant Video, Hulu, and Netflix provide television programming from hundreds of providers, including most mainstream television networks.
Video games have become a huge industry with worldwide annual revenue projected to exceed $100 billion by 2017.
You can buy almost anything via the Web, from books and clothing to cars and sports equipment.
Travel Web sites help travelers find the best deals on flights, hotels, car rentals, vacation packages, and cruises. They have profoundly changed the travel industry and the way people plan trips and vacations.
An intranet is an internal corporate network built using Internet and World Wide Web standards and products. Employees of an organization can use an intranet to access corporate information.
A growing number of companies offer limited network access to selected customers and suppliers. Such networks are referred to as extranets, which connect people who are external to the company.
Principle: Organizations are using the Internet of Things (IoT) to capture and analyze streams of sensor data to detect patterns and anomalies, not after the fact but while they are occurring, in order to have a considerable impact on the event outcome.
The Internet of Things (IoT) is a network of physical objects or “things” embedded with sensors, processors, software, and network connectivity capability to enable them to exchange data with the manufacturer of the device, device operators, and other connected devices.
There can be many issues with simply receiving and recognizing usable sensor data, resulting in missing data or in sensor data that lacks time stamps indicating when the reading occurred.
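The two problems named above, missing data and missing time stamps, can be screened for before any analysis begins. The following is a minimal sketch under assumed names: the Reading record and the split_usable function are hypothetical, and a real IoT pipeline would likely apply further checks, such as value ranges and duplicate detection.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Reading:
    sensor_id: str
    value: Optional[float]     # None when the measurement itself is missing
    timestamp: Optional[str]   # None when the device sent no time stamp


def split_usable(readings: List[Reading]) -> Tuple[List[Reading], List[Reading]]:
    """Separate readings that have both a value and a time stamp from those that do not."""
    usable, rejected = [], []
    for reading in readings:
        if reading.value is None or reading.timestamp is None:
            rejected.append(reading)
        else:
            usable.append(reading)
    return usable, rejected


# Hypothetical sample data.
sample = [
    Reading("pump-1", 71.5, "2017-03-01T08:00:00Z"),
    Reading("pump-1", None, "2017-03-01T08:01:00Z"),  # missing data
    Reading("pump-2", 69.8, None),                    # missing time stamp
]
usable, rejected = split_usable(sample)
print(len(usable), "usable;", len(rejected), "rejected")
```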
One of the first things developers of IoT applications should focus on is building in security from the start.
Principle: Cloud computing provides access to state-of-the-art technology at a fraction of the cost of ownership and without the lengthy delays that can occur when an organization tries to acquire its own resources.
Cloud computing refers to a computing environment in which software and storage are provided as an Internet service and can be accessed by users with their Web browser. Computing activities are increasingly being delivered over the Internet rather than from installed software on PCs.
Cloud computing offers many advantages to businesses. By outsourcing business information systems to the cloud, a business saves on system design, installation, and maintenance. Employees can also access corporate systems from any Internet-connected computer using a standard Web browser.
Cloud computing can be deployed in several different ways, including public cloud computing, private cloud computing, and hybrid cloud computing.
Public cloud computing refers to deployments in which service providers offer their cloud-based services to the general public, whether that is an individual using Google Calendar or a corporation using the Salesforce.com application. In a private cloud deployment, cloud technology is used within the confines of a private network. Organizations that implement a private cloud often do so because they are concerned that their data will not be secure in a public cloud.
A hybrid cloud is composed of both private and public clouds integrated through networking. Organizations typically use the public cloud to run applications with less sensitive security requirements and highly fluctuating capacity needs, but run more critical applications, such as those with significant compliance requirements, on the private portion of their hybrid cloud.
Autonomic computing is an enabling technology for cloud computing that enables systems to manage themselves and adapt to changes in the computing environment, business policies, and operating objectives.
Cloud computing can be divided into three main types of services: infrastructure as a service (IaaS), software as a service (SaaS), and platform as a service (PaaS).
Organizations contemplating moving to the cloud are advised to proceed carefully, as almost one in three organizations encounter major challenges in their move. Frequent problems include complex pricing arrangements and hidden costs that reduce expected cost savings, performance issues that cause wide variations in performance over time, poor user support, and greater than expected downtime.
Key Terms
autonomic computing
Bluetooth
broadband communications
bus network
Cascading Style Sheet (CSS)
channel bandwidth
client/server architecture
cloud computing
communications medium
computer network
content streaming
Extensible Markup Language (XML)
hybrid cloud
hyperlink
Hypertext Markup Language (HTML)
infrastructure as a service (IaaS)
instant messaging
Internet backbone
Internet Protocol (IP)
Internet service provider (ISP)
IP address
local area network (LAN)
Long Term Evolution (LTE)
mesh network
metropolitan area network (MAN)
mobile device management (MDM) software
near field communication (NFC)
network operating system (NOS)
network topology
network-management software
personal area network (PAN)
platform as a service (PaaS)
podcast
private cloud environment
search engine
search engine optimization (SEO)
software-defined networking (SDN)
star network
tag
Uniform Resource Locator (URL)
virtual private network (VPN)
Web 2.0
Web browser
Web log (blog)
wide area network (WAN)
Wi-Fi
wireless communication
Chapter 6: Self-Assessment Test
A network has many fundamental components, which, when carefully selected and effectively integrated, enable people to meet personal and organizational objectives.
1. Communications media can be divided into two broad categories: ________.
   a. infrared and microwave
   b. fiber optic and cable
   c. packet switching and circuit switching
   d. guided and wireless
2. ________ refers to the rate at which data can be exchanged and is measured in bits per second.
   a. Communications frequency
   b. Channel bandwidth
   c. Communications wavelength
   d. Broadband
3. ________ indicates how the communications links and hardware devices of the network are arranged.
   a. Communications protocol
   b. Transmission media
   c. Network topology
   d. None of the above
4. Twisted-pair wire cable, coaxial cable, and fiber-optic cable are all examples of guided communications media. True/False
5. Systems software that controls the computer systems and devices on a network and allows them to communicate with one another is called ________.
   a. network operating system
   b. mobile device management software
   c. network-management software
   d. software-defined networking
Together, the Internet and the World Wide Web provide a highly effective infrastructure for delivering and accessing information and services.
6. The Internet transmits data in packets from one computer to another using a set of communications conventions called the ________.
7. Every computer on the Internet has an assigned IP address for easy identification. True/False
8. A ________ is a Web address that specifies the exact location of a Web page using letters and words that map to an IP address and the location on the host.
   a. Universal Resource Locator
   b. Uniform Reference Locator
   c. Universal Web address
   d. Uniform Resource Locator
9. Many Web sites use CSS to define the design and layout of Web pages, XML to define the content, and HTML to join the content with the design. True/False
10. The use of social media in business is called ________.
   a. social journalism
   b. blogging
   c. business wikis
   d. Enterprise 2.0
11. A(n) ________ is an internal corporate network built using Internet and World Wide Web standards and products.
Organizations are using the Internet of Things (IoT) to capture and analyze streams of sensor data to detect patterns and anomalies, not after the fact but while they are occurring, in order to have a considerable impact on the event outcome.
12. There can be many issues with simply receiving and recognizing usable sensor data, resulting in sensor data lacking time stamps indicating when the reading occurred or in ________ data.
13. One of the first things developers of IoT applications should focus on is building in ________ from the start.
   a. redundancy and backup
   b. cost controls
   c. security
   d. disaster recovery
Cloud computing provides access to state-of-the-art technology at a fraction of the cost of ownership and without the lengthy delays that can occur when an organization tries to acquire its own resources.
14. Cloud computing is a computing environment in which software and storage are provided as an Internet service and accessed by users with their ________.
   a. Web browser
   b. mobile computing device such as a smartphone or tablet
   c. search engine
   d. Virtual Private Network (VPN)
15. ________ is an enabling technology for cloud computing that enables systems to manage themselves and adapt to changes in the computing environment, business policies, and operating objectives.
Chapter 6: Self-Assessment Test Answers
1. d 2. b 3. c 4. True 5. a 6. Internet protocol or TCP/IP 7. True 8. d
9. True 10. d 11. intranet 12. missing 13. c 14. a 15. Autonomic computing
Review Questions
1. Define the term “computer network.”
2. Define the term “network topology,” and identify three common network topologies in use today.
3. What is meant by client/server architecture? Describe how this architecture works.
4. Define the term “channel bandwidth.” Why is this an important characteristic of a communication channel?
5. Identify the names of the three primary frequency ranges used for wireless communications.
6. What is Bluetooth wireless communication? Give an example of the use of this technology.
7. What advantage does a communications satellite have over a terrestrial microwave system?
8. What role does a network operating system play?
9. What is software-defined networking (SDN), and what advantages does it offer?
10. What is Internet censorship? Identify some countries in which this is a major issue.
11. What comprises the Internet backbone?
12. What is an IP address? What is a Uniform Resource Locator, and how is it used?
13. What is XML, and how is it used?
14. What is CSS, and how is it used?
15. What are Web services? Give an example of a Web service.
16. What is Enterprise 2.0, and how is it used?
17. What is the Internet of Things (IoT), and how is it used?
18. What is cloud computing? Identify three approaches to deploying cloud computing.
19. What is autonomic computing, and how does it benefit cloud computing?
Discussion Questions
1. Briefly discuss the differences between the star, bus, peer-to-peer, and mesh network topologies.
2. Briefly discuss the differences between a personal area network, a local area network, a metropolitan area network, and a wide area network.
3. Identify and briefly discuss three common guided transmission media types.
4. Describe how near field communications works, and give an example of the use of this technology.
5. Describe how a Wi-Fi network works.
6. Describe how a terrestrial microwave system works.
7. Summarize the differences among 1G, 2G, 3G, and 4G wireless communications systems.
8. Discuss the role of network-management software, including mobile device management software.
9. Provide a brief history of the Internet.
10. Briefly describe how the Internet works.
11. Identify and briefly describe five different ways to access the Internet.
12. Briefly describe how the World Wide Web works.
13. Discuss the role of Hypertext Markup Language and HTML tags.
14. What is search engine optimization, and how is it accomplished?
15. Identify some of the issues and concerns associated with connecting devices to the Internet of Things (IoT).
16. Identify and briefly discuss four problems frequently encountered by organizations moving to the cloud.
17. One of the key issues associated with the development of a Web site is getting people to visit it. If you were developing a Web site, how would you inform others about it and make it interesting enough that they would return and tell others about it?
18. Keep track of the amount of time you spend on social networking sites for one week. Do you think that this is time well spent? Why or why not?
19. Briefly summarize the differences in how the Internet, a company intranet, and an extranet are accessed and used.
Problem-Solving Exercises
1. Develop a spreadsheet to track the amount of time you spend each day on Twitter, Instagram, Facebook, and other social networks. Record your times on each network for a two-week period. How much of this time would you consider informative and worthwhile? How much time is just entertainment?
2. Do research to learn about the Amazon Web Services, Google Compute Engine, and Windows Azure cloud computing services. Write a paragraph summarizing each service. Prepare a spreadsheet to compare the three services based on ease of use, cost, and other key criteria of your choosing.
3. Think of a business that you might like to establish. Use a word processor to define the business in terms of what product(s) or service(s) it provides, where it is located, and its name. Go to www.godaddy.com, and find an appropriate domain name for your business that is not yet taken. Shop around online for the best deal on Web site hosting. Write a paragraph about your experience finding a name, why you chose the name that you did, and how much it would cost you to register the name and host a site.
Team Activities
1. Form a team to identify IoT sensors in high demand in the medical device/pharma/bio-med industry. How are these sensors being used? What companies manufacture them? What do they cost if purchased in large quantities? Write a summary of your team’s findings.
2. Plan, set up, and execute a meeting with another team wherein you meet via the use of a Web service such as GoToMeeting or WebEx. What are some of the problems you encountered in setting up and executing the meeting? How would you evaluate the effectiveness of the meeting? What could have been done to make the meeting more effective?
3. Try using the Chinese search engine Baidu to find information on several politically sensitive topics or people. Write a brief summary of your experience.
Web Exercises
1. Do research on the Web to identify the three to five countries that exercise the greatest amount of Internet censorship on their citizens. Briefly document each country’s censorship practices.
2. Net neutrality is the principle that Internet service providers should be required to treat all Internet traffic running over their wired and wireless networks the same, without favoring content from some sources and/or blocking or slowing (also known as throttling) content from others. The debate over net neutrality raises questions about how best to keep the Internet open and impartial while still offering Internet service providers sufficient incentive to expand their networks to serve more customers and to support new services. Do research to find out the current status of net neutrality in the United States. Write a report summarizing your findings.
3. Do research to identify the top ten social networks in terms of number of worldwide active accounts. Which of these networks appears to be the fastest growing, and which the slowest growing? Can you find a reason for the difference in growth rates? Write a report summarizing what you found.
Career Exercises
1. View the movie The Social Network or read the book The Boy Billionaire, which offers insights into Mark Zuckerberg, the founder of Facebook. How did Zuckerberg recognize the potential of social networking? How was he able to turn this basic idea into a billion-dollar organization? What background, education, and experiences did he have that helped him in this endeavor?
2. Identify a social networking organization that interests you. Do research to identify current job openings and the qualifications needed to fill open positions at the firm. Do any of these positions appeal to you? Why or why not?
3. Explore LinkedIn, a social media network for professional networking. Use some of its features to find former students of your school or coworkers at your place of employment. What are some of the advantages of using such a Web site? What are some of the potential problems? Would you consider joining LinkedIn? Why or why not?
Case Studies
Case One
Cloud Helps Fight Cancer
Each minute one person in the United States dies from cancer, which amounts to over half a million deaths per year. Thousands of scientists and physicians are working around the clock to fight cancer where it starts: in our DNA.
DNA is a molecule present in our cells that carries most of the genetic instructions used in the development, functioning, and reproduction of all known living organisms.
The information in DNA is stored as a code made up of four chemical bases: adenine (A), guanine (G), cytosine (C), and thymine (T). Human DNA consists of about 3 billion bases, and more than 99 percent of those bases are the same in all people. The complete set of DNA instructions is called your genome, and it comes packaged into two sets of chromosomes, one set from your mother and one set from your father. Sometimes those instructions are miscoded or misread, which can cause cells to malfunction and grow out of control, resulting in cancer.
Doctors now routinely use patient genetic data along with personal data and health factors to design highly personalized treatments for cancer patients. However, genome sequencing is a highly complex effort—it takes about 100 gigabytes of data to represent just a single human genome. Only a few years ago, it was not even feasible to analyze an entire human genome. The Human Genome Project (HGP) was the international, collaborative research program whose goal was the complete mapping and understanding of all the genes of human beings. The HGP took over 15 years and cost in the neighborhood of $3 billion, but the result was the ability to read the complete genetic blueprint for humans.
It takes a computer with substantial processing power and prodigious storage capacity to process all the data required to sequence a patient’s genome. Most researchers simply do not have in-house computing facilities equal to the challenge. As a result, they turn to cloud computing solutions, such as the Amazon Web Services public cloud system. Thanks to cloud computing and other technical advances, sequencing of a human genome can now be done in about 40 hours at a cost of under $5,000.
Researchers at Nationwide Children’s Hospital in Columbus, Ohio, invented Churchill, a software application that analyzes gene sequences very efficiently. Using cloud computing and this new algorithm, researchers at the hospital are now able to analyze a thousand individual genomes over the period of a week. This technology not only enables the hospital to help individual patients but also supports large-scale research efforts exploring the genetic mutations that cause diseases.
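As a rough illustration of why in-house facilities often fall short, the figures quoted in this case can be combined into a back-of-the-envelope estimate. The sketch below is not from the source. It assumes that each genome contributes the full 100 gigabytes, that decimal units apply (1 TB = 1,000 GB), and that the work is spread evenly over the week; the function name cohort_storage_gb is hypothetical.

```python
# Back-of-the-envelope data volume estimate using figures quoted in the case.
GB_PER_GENOME = 100                  # "about 100 gigabytes" per human genome
GENOMES_PER_WEEK = 1_000             # a thousand genomes analyzed in a week
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds


def cohort_storage_gb(genomes: int, gb_per_genome: float = GB_PER_GENOME) -> float:
    """Return the raw data volume, in gigabytes, for a number of genomes."""
    return genomes * gb_per_genome


weekly_gb = cohort_storage_gb(GENOMES_PER_WEEK)
weekly_tb = weekly_gb / 1_000                       # assumes 1 TB = 1,000 GB
sustained_mb_per_s = weekly_gb * 1_000 / SECONDS_PER_WEEK  # assumes 1 GB = 1,000 MB

print(f"{weekly_tb:.0f} TB per week, about {sustained_mb_per_s:.0f} MB/s sustained")
```

Under these assumptions, a week of analysis touches roughly 100 TB of data, which suggests why elastic cloud storage and computing capacity are attractive for this kind of workload.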
Using the cloud also enables doctors and researchers worldwide to share information and collaborate more easily. The Cancer Genome Atlas (TCGA) is a research program supported by the National Cancer Institute and the National Human Genome Research Institute, whose goal is to identify genomic changes in more than 20 different types of human cancer. TCGA researchers compare the DNA samples of normal tissue with cancer tissue taken from the same patient to identify changes specific to that cancer. The researchers hope to analyze hundreds of samples for each type of cancer from many different patients to better understand what makes one cancer different from another cancer. This is critical because two patients with the same type of cancer can experience very different outcomes and respond very differently to the same treatment. Researchers hope to develop more effective, individualized treatments for each patient by connecting specific genomic changes with specific outcomes.
Critical Thinking Questions:
1. What advantages does cloud computing offer physicians and researchers in their fight against cancer?
2. Estimate the amount of data required to analyze the human genome of 100 patients for each of 20 different types of cancer.
3. Physicians must abide by HIPAA regulations when transmitting data back and forth to the cloud. The penalties for noncompliance are based on the level of negligence and can range from $100 to $50,000 per violation (or per record). Violations can also carry criminal charges, resulting in jail time. What measures can be taken when using cloud computing to ensure that patient confidentiality will not be violated?
SOURCES: Gaudin, Sharon, “How The Cloud Helps Fight Cancer,” Computerworld, May 20, 2015, www.computerworld.com/article/2923753/cloud-computing/how-the-cloud-helps-fight-cancer.html; “Deoxyribonucleic Acid Fact Sheet,” www.genome.gov/25520880, accessed December 7, 2015; “Cancer Genomics What Does It Mean to You?,” The Cancer Genome Atlas, http://cancergenome.nih.gov/PublishedContent/Files/pdfs/1.1.0_CancerGenomics_TCGA-Genomics-Brochure-508.pdf; “TCGA on AWS,” http://aws.amazon.com/public-data-sets/tcga, accessed December 7, 2015; “An Overview of the Human Genome Project,” National Human Genome Research Institute, www.genome.gov/12011238, accessed December 10, 2015.
Case Two
Globacom Invests in Its Mobile Network Infrastructure in Africa
Approximately 46 percent of the world’s population now has access to the Internet, a key factor in encouraging economic activity and expanding educational opportunities. However, Internet access in Africa continues to trail that of the rest of the world. The continent contains 16 percent of the world’s population, but represents only about 9.8 percent of the world’s Internet users. Affordability and logistical barriers still prevent the vast majority of Africa’s population from accessing the wealth of information and services available online.
Increasingly, however, people in Africa—and around the globe—are breaking down those barriers by using mobile devices to gain access to the Internet. In 2015, there were an estimated 7 billion mobile broadband subscriptions worldwide, and that number is growing by almost 25 percent each year. As the world’s dependence on mobile technologies grows, telecommunications companies are increasing their investment in the networks that support those technologies.
Globacom Limited is one of the fastest-growing mobile communications companies in the world, operating mobile networks in Nigeria and several other West African countries under the GLO brand. The company is the second largest mobile network operator in Nigeria, where mobile devices account for over 76 percent of Web traffic (more than double the world average of 33 percent).
In order to provide reliable Internet access to its mobile subscribers, Globacom invests heavily in its network infrastructure. In 2011, the company became the first to lay a high-capacity, fiber-optic submarine cable from the United Kingdom to Nigeria. The large-scale project, which also connects points in Ghana, Senegal, Mauritania, Morocco, Portugal, and Spain, cost the company more than $800 million. The underwater cable system allowed Globacom to
expand its network, boost capacity, and increase Internet upload and download speeds. For Globacom’s mobile subscribers in Nigeria, the new cable also represented a significant jump in international connectivity—considered to be one of the critical requirements for the development of the Internet in any country.
Globacom also makes use of big data capabilities to improve its network performance and enhance the quality of the customer service it provides its subscribers. The company recently implemented Oracle’s Big Data Appliance platform, a hardware and software package that allows the company to analyze both structured and unstructured data related to issues such as terminated networks, event times and durations, event cost, quality of service, and overall network performance. Globacom’s IT staff uses the Oracle platform to capture and analyze more than 1 billion call-data records per day—the equivalent of 5 gigabytes of user-traffic information per second. According to Jameel Mohammed, Globacom’s group chief operating officer, the Oracle platform “enables us to capture, store, and analyze data that we were unable to access before. We now analyze network events 40 times faster.”
With a reduction in average query response time for network events from three minutes to five seconds, Globacom’s call center agents are better able to provide subscribers fast and reliable information regarding network performance. The company has significantly increased its “first-call resolution rate,” saving the company more than 13 million call center minutes, or the equivalent of 80 full-time customer service employees, annually.
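The figures quoted in this case can be related to one another with simple arithmetic. The following sketch is illustrative only and not from the source. The variable names are hypothetical, and it assumes that call-data records arrive evenly across a 24-hour day.

```python
# Arithmetic check on the figures quoted in the Globacom case.
SECONDS_BEFORE = 3 * 60       # average query response time before: three minutes
SECONDS_AFTER = 5             # average query response time after: five seconds
speedup = SECONDS_BEFORE / SECONDS_AFTER

MINUTES_SAVED_PER_YEAR = 13_000_000   # call center minutes saved annually
EQUIVALENT_EMPLOYEES = 80             # full-time employees, per the case
hours_per_employee = MINUTES_SAVED_PER_YEAR / EQUIVALENT_EMPLOYEES / 60

RECORDS_PER_DAY = 1_000_000_000       # call-data records captured per day
SECONDS_PER_DAY = 24 * 60 * 60        # assumes an evenly loaded 24-hour day
records_per_second = RECORDS_PER_DAY / SECONDS_PER_DAY

print(f"Query speedup: {speedup:.0f}x")
print(f"Hours per equivalent employee per year: {hours_per_employee:,.0f}")
print(f"Average call-data records per second: {records_per_second:,.0f}")
```

Under these assumptions, the query speedup works out to 36 times, in the same range as the 40 times quoted for network event analysis, and the platform handles an average of roughly 11,574 call-data records per second.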
For subscribers, Globacom’s investment in its network infrastructure along with its big data initiatives translate into improved network coverage and reliability, better customer service, and, perhaps most importantly, easier and more consistent access to the Internet—including a wide range of modern communications services, such as online banking and payment services, teleconferencing, distance learning, and telemedicine.
Critical Thinking Questions
1. What incentives does a mobile network operator have to make ongoing, expensive investments in its network infrastructure? What have been some of the benefits to Globacom’s subscribers of the company’s investment in its mobile network?
2. Big data applications and techniques allow network operators to make use of large quantities of network-related data that was previously discarded because of the time and resources required to analyze it effectively. What are some data points that you think would be most useful for a communications company to analyze when looking for ways to improve its network performance? What data points related to network activity and performance might be useful from a customer service or marketing standpoint?
3. Do research online to learn about some of the other factors that have impeded Internet access for most of the population in Africa and other parts of the world. What factors besides the level of network infrastructure investment might affect Internet access rates in a given country?
SOURCES: “Ericsson Mobility Report,” Ericsson, www.ericsson.com/res/docs/2016/mobility-report/ericsson-mobility-report-feb-2016-interim.pdf, accessed February 20, 2016; “ICT Facts & Figures: The World in 2015,” International Telecommunication Union, www.itu.int/en/ITU-D/Statistics/Documents/facts/ICTFactsFigures2015.pdf, May 2015; “Internet Users in the World by Region, November 2015,” Internet World Stats, www.internetworldstats.com/stats.htm, accessed February 20, 2016; “Internet Goes Mobile: Country Report Nigeria,” Ericsson, www.ericsson.com/res/docs/2015/consumerlab/ericsson-consumerlab-internet-goes-mobile-nigeria.pdf, accessed February 21, 2016; “Globacom Saves Over 35,000 Call-Processing Minutes Daily and Improves Data for Decision-Making and Customer Service,” Oracle, www.oracle.com/us/corporate/customers/customersearch/globacom-1-big-data-ss-2207715.html, accessed February 20, 2016; Banks, Roland, “There Are Now 3 Billion Internet Users Worldwide in 2015,” Mobile Industry Review, January 26, 2015, www.mobileindustryreview.com/2015/01/3-billion-internet-users-2015.html; “Africa’s 50 Richest: #7 Mike Adenuga,” Forbes, accessed February 21, 2016.
Notes
1. Ignatescu, Adrian, “Most Popular Instant Messaging Apps in 2014-Review & Infographic,” Techchangers (blog), March 30, 2014, www.techchangers.com/instant-messaging-apps-review-most-popular-2014-top10.
2. Claburn, Thomas, “Walmart Jumps into Crowded Mobile Payment Market,” Information Week, December 10, 2015, www.informationweek.com/mobile/mobile-applications/walmart-jumps-into-crowded-mobile-payment-market/d/d-id/1323518.
3. Hamblen, Matt, “Levi’s Stadium App Makes Use of Aruba Beacons to Help 49ers Fans Get Around,” Computerworld, November 4, 2014, www.computerworld.com/article/2842829/levis-stadium-app-makes-use-of-aruba-beacons-to-help-49ers-fans-get-around.html.
4. Wohnoutak, Bill, “Childhood Burn Care: A Telemedicine Success Story,” Dermatology Times, February 18, 2015, dermatologytimes.modernmedicine.com/dermatology-times/news/telemedicines-role-childhood-burn-care?page=full.
5. “Saudi Telecom Company Deploys High-Capacity International Mesh Network Powered by Ciena,” Investors.com, October 21, 2015, news.investors.com/newsfeed-business-wire/102115-141851937-saudi-telecom-company-deploys-high-capacity-international-mesh-network-powered-by-ciena.aspx.
6. “PIONIER-Polish Optical Internet,” PIONIER, http://blog.pionier.net.pl/sc2013/pionier, accessed January 6, 2014.
7. “About Chi-X Japan,” Chi-X Japan, www.chi-x.jp/ABOUTUS.aspx, accessed February 19, 2016.
8. “Iridium Everywhere,” Iridium, www.iridium.com, accessed December 11, 2015.
9. Cheng, Roger, “Verizon to Be First to Field-Test Crazy-Fast 5G Wireless,” CNET, September 8, 2015, www.cnet.com/news/verizon-to-hold-worlds-first-crazy-fast-5g-wireless-field-tests-next-year.
10. “PRTG Network Monitor Helps Small, Family-Owned IT Consulting Business Provide World-Class Reliability,” Paessler, www.paessler.com/company/casestudies/covell_group_uses_prtg, accessed December 16, 2015.
11. Higginbotham, Stacy, “Google Launches Andromeda, a Software Defined Network Underlying Its Cloud,” Gigaom, April 2, 2014, https://gigaom.com/2014/04/02/google-launches-andromeda-a-software-defined-network-underlying-its-cloud.
12. “Internet World Stats,” Internet World Stats, www.internetworldstats.com/stats.htm, accessed December 13, 2015.
13. Alexandrova, Katerina, “Using New Media Effectively: An Analysis of Barack Obama’s Election Campaign Aimed at Young Americans,” Thesis, New York 2010, www.academia.edu/1526998/Using_New_Media_Effectively_an_Analysis_of_Barack_Obamas_Election_Campaign_Aimed_at_Young_Americans.
14. “Electronic Weapons: Syria Shows the Way,” StrategyPage, January 13, 2014, www.strategypage.com/htmw/htecm/articles/20140113.aspx.
15. Smith, Charlie, “Jimmy Wales on Censorship in China,” The WorldPost, September 4, 2015, www.huffingtonpost.com/charlie-smith/jimmy-wales-on-censorship_b_8087400.html.
16. Howard, Alexander, “In Saudi Arabia, Embracing New ‘Freedom’ on Social Media May Come with Serious Risks,” Huffington Post, May 26, 2015, www.huffingtonpost.com/2015/05/26/saudi-arabia-social-media_n_7444742.html.
17. Naím, Moisés and Bennet, Phillip, “The Anti-Information Age,” The Atlantic, February 16, 2015, www.theatlantic.com/international/archive/2015/02/government-censorship-21st-century-internet/385528.
18. “AWS Case Study: Airbnb,” Amazon Web Services, http://aws.amazon.com/solutions/case-studies/airbnb, accessed December 13, 2015.
19. Schnuer, Jenna, “Meet the Winners of Our Entrepreneur of 2014 Awards,” Entrepreneur, January 20, 2015, www.entrepreneur.com/slideshow/240844.
20. “Transforming Tyco with Yammer,” Yammer, https://about.yammer.com/customers/tyco, accessed January 13, 2014.
21. “About Khan Academy,” Khan Academy, www.khanacademy.org/about, accessed December 13, 2015.
22. “About Us,” NPower, www.npower.org/Our-Purpose/Our-Purpose.aspx, accessed December 13, 2015.
23. Ignatescu, “Most Popular Instant Messaging Apps in 2014.”
24. “Addison Fire Saves $5K Yearly Using GoToMeeting with HDFaces Video Conferencing,” Citrix, http://news.citrixonline.com/wp-content/uploads/2013/07/Addision-Fire-District_G2M_ss.pdf, accessed January 30, 2014.
25. “Rhapsody Music Service Now Has 3 Million Paying Subscribers,” Variety, July 22, 2015, http://variety.com/2015/digital/news/rhapsody-music-service-now-has-3-million-paying-subscribers-1201545576.
26. Newman, Jared, “Popcorn Time Users Sued again, This Time for Streaming 2015’s Survivor,” PC World, September 2, 2015, www.pcworld.com/article/2979681/software-entertainment/popcorn-time-users-sued-again-this-time-for-streaming-2015s-survivor.html.
27. “YouTube Statistics,” YouTube, www.youtube.com/yt/press/statistics.html, accessed January 21, 2014.
28. Takahashi, Dean, “Mobile Gaming Could Drive Entire Video Game Industry to $100 Billion in Revenue by 2017,” Gamesbeat, January 14, 2014, http://venturebeat.com/2014/01/14/mobile-gaming-could-drive-entire-game-industry-to-100b-in-revenue-by-2017.
29. “Groupon Announces Fourth Quarter and Fiscal Year 2015 Results,” Groupon, February 11, 2016, http://investor.groupon.com/releasedetail.cfm?releaseid=954580.
30. “4 Real-World Stories about the Interactive Intranet,” Jive (blog), June 18, 2015, www.jivesoftware.com/blog/real-world-stories-about-interactive-intranet.
31. Romeo, Jim, et al, “A Practical Guide to the Internet of Things,” Tech Target, (c) 2015.
32. Kepes, Ben, “The Internet of Things, Coming Soon to an Airline near You,” Runway Girl Network, March 14, 2015, www.runwaygirlnetwork.com/2015/03/14/the-internet-of-things-coming-soon-to-an-airline-near-you.
33. van Zyl, Gareth, “Internet of Everything Helps Prevent Dubai Crane Collisions,” Web Africa, June 4, 2014, www.itwebafrica.com/cloud/516-africa/233009-internet-of-everything-helps-prevent-dubai-crane-collisions.
34. “Reducing Costs with a Converged Roadway Network,” Cisco, www.cisco.com/c/dam/en/us/solutions/collateral/industry-solutions/Aegean-Motorway-voc-case-study.pdf, accessed January 4, 2015.
35. “iCloud,” Apple, www.apple.com/icloud, accessed January 8, 2014.
36. “AWS Case Study: Spotify,” Amazon Web Services, aws.amazon.com/solutions/case-studies/spotify, accessed December 17, 2015.
37. “The College Network,” www.slideshare.net/EarthLinkBusiness/private-cloud-case-study-the-college-network-earth-link-business, accessed January 9, 2014.
38. “Cloud Computing Options,” PC Today, June 2014.
39. Ramel, David, “New Research Shows ‘Staggering’ Failure Rates for Cloud Projects,” Enterprise Systems, June 26, 2014, http://esj.com/articles/2014/06/26/cloud-projects-fail.aspx.
40. Olavsrud, Thor, “Why a Media Giant Sold Its Data Center and Headed to the Cloud,” CIO, July 15, 2014, www.cio.com/article/2453894/data-center/why-a-media-giant-sold-its-data-center-and-headed-to-the-cloud.html.
41. Ovide, Shira and Boulton, Clint, “Flood of Rivals Could Burst Amazon’s Cloud,” The Wall Street Journal, July 25, 2014, www.wsj.com/articles/storm-clouds-over-amazon-business-1406328539?mg=id=wsj.
42. “Revlon Fact Sheet,” Revlon, www.revlon.com/about/fact-sheet, accessed February 21, 2016.
43. “Revlon, Inc. Moves to the Cloud with Juniper Networks to Increase Global Business Agility,” Juniper Networks, www.juniper.net/assets/us/en/local/pdf/case-studies/3520444-en.pdf, accessed October 6, 2014.
44. “Cloud Computing Options,” PC Today, June 2014.
45. “Autonomic Computing,” IBM, www.ibm.com/developerworks/tivoli/autonomic.html, accessed October 7, 2014.