
School of Computer & Information Sciences

ITS-532 Cloud Computing

Chapter 6 – Data Storage in the Cloud

Learning Objectives

• Discuss the role of storage-area networks.

• Discuss the role of network-attached storage.

• Describe cloud-based storage solutions.

• List the pros and cons of cloud-based storage.

• Describe cloud-based database solutions.

• List the pros and cons of cloud-based databases.

• Describe specific cloud-based data storage solutions such as backups and encrypted file storage.

• Provide an example of an industry-specific cloud-based storage solution.

Network Storage Began with File Servers

• Years ago, local-area networks used special servers, called file servers, to support file sharing, file replication, and storage for large files.

Storage Area Networks

• Make one or more storage devices appear to be directly connected to the network.

• Behind the scenes, the devices are actually connected to SAN hardware through network cables.

• Software running within the SAN device makes the devices appear directly accessible to the rest of the network.

Network-Attached Storage (NAS)

• Plug directly into the network.

Advantages of NAS

• Compatibility: NAS devices normally support common file systems, which, in turn, make them fully compatible with common operating systems.

• Ease of performing backups: NAS devices are commonly used as backup devices. Within a home, for example, all devices can easily access and back up files to a NAS device.

• Reliability: A NAS device typically stripes data, together with redundant information, across multiple volumes within the device. If one volume fails (or more, depending on the configuration), the redundant information allows the file contents to be reconstructed.

• Performance: Because a NAS device does not run a complete operating system, the hardware has less system overhead, which allows it to outperform a file server.
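The reliability point can be illustrated with a minimal sketch. It assumes a RAID-5-style layout in which one XOR parity stripe is stored alongside the data stripes; the slides do not name a specific scheme, and the function names below are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_with_parity(data, volumes):
    """Split data into equal-size stripes (zero-padded) and append one parity stripe."""
    size = -(-len(data) // volumes)  # ceiling division
    stripes = [data[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(volumes)]
    return stripes + [xor_blocks(stripes)]


def rebuild(stripes, lost):
    """Recover the stripe at index `lost` by XOR-ing all surviving stripes."""
    return xor_blocks([s for i, s in enumerate(stripes) if i != lost])


stripes = stripe_with_parity(b"cloud storage data", volumes=3)
recovered = rebuild(stripes, lost=1)  # pretend the volume holding stripe 1 failed
```

With a single parity stripe, this layout survives the loss of exactly one volume; tolerating more failures requires additional redundancy.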

Cloud-Based Storage

• Cloud-based data storage is the next step in the evolution of NAS devices.

• Data storage resides in the cloud.

• Many providers across the web offer data storage that resides in the cloud.

Cloud-Based Storage Continued

• The cloud storage device mechanism represents storage devices that are designed specifically for cloud-based provisioning. Instances of these devices can be virtualized, similar to how physical servers can spawn virtual server images.

• Cloud storage device mechanisms provide common logical units of data storage, such as:

– Files: Collections of data are grouped into files that are located in folders.

– Blocks: The lowest level of storage and the closest to the hardware, a block is the smallest unit of data that is still individually accessible.

– Datasets: Sets of data are organized into a table-based, delimited, or record format.

– Objects: Data and its associated metadata are organized as Web-based resources.
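As a rough illustration of how the same bytes look through three of these logical units, the sketch below models an object (data plus metadata), a block view, and a dataset view. The class, the function names, and the tiny block size are assumptions chosen for readability, not part of any provider's API.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # bytes; deliberately tiny, an assumed value for illustration


@dataclass
class StoredObject:
    """Object storage: data plus its metadata, addressed by a key."""
    key: str
    data: bytes
    metadata: dict = field(default_factory=dict)


def to_blocks(data, block_size=BLOCK_SIZE):
    """Block storage: the same bytes cut into fixed-size, individually addressable units."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def to_dataset(text, delimiter=","):
    """Dataset storage: delimited text parsed into records (rows of fields)."""
    return [line.split(delimiter) for line in text.splitlines()]


raw = "id,name\n1,Ada\n2,Lin"
obj = StoredObject("reports/users.csv", raw.encode(), {"content-type": "text/csv"})
blocks = to_blocks(obj.data)
records = to_dataset(raw)
```

The block view carries no structure of its own, while the dataset and object views attach structure (records) or description (metadata) to the same content.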

Cloud-Based Storage Illustration

Figure 7.9 Different cloud service consumers utilize different technologies to interface with virtualized cloud storage devices

Real World: HomePipe

• Many users now rely on cloud-based storage to provide them with access to files from anywhere at any time, often with any device.

• Despite that, users still encounter situations when the file they need resides on a computer at their home or office, often because they made a last-minute change and forgot to upload the file to the cloud. That's where HomePipe comes to the rescue.

• HomePipe is a program that lets users access files on their own system from anywhere on the web.

Real World: ZumoDrive

• ZumoDrive provides cloud-based storage that is scalable to meet customer needs, initially at no charge.

• The files that are stored on ZumoDrive are accessible from a variety of devices.

• From their own PC, customers can use a web interface or map a drive letter to the ZumoDrive storage and then access the cloud-based files as they would access files from their local system.

Advantages of Cloud-Based Storage

• Scalability: Most cloud-based data storage providers let users scale their storage capacity (up or down) to align with their storage needs.

• Pay for use: With most cloud-based data storage facilities, users pay only for the storage (within a range) that they need.

• Reliability: Many cloud-based data storage facilities provide transparent data replication.

Advantages of Cloud-Based Storage Continued

• Ease of access: Most cloud-based data storage facilities support web-based access to files from any place, at any time, using a variety of devices.

• Ease of use: Many cloud-based data storage solutions let users map a drive letter to the remote file storage area and then access the files through the use of a logical drive.

Disadvantages of Cloud-Based Storage

• Performance: Because cloud-based disk storage devices are accessed over the Internet, network latency and bandwidth limits typically make them slower than local drives.

• Security: Some users will never feel comfortable with their data in the cloud.

• Data orphans: Users may abandon data in cloud storage facilities, leaving confidential private or company data at risk.
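The performance disadvantage can be made concrete with a back-of-the-envelope estimate: transfer time is roughly a fixed latency plus file size divided by throughput. The figures below are assumed for illustration only and are not measurements of any particular drive or provider.

```python
def transfer_seconds(size_mb, bandwidth_mb_per_s, latency_s):
    """Estimate transfer time as fixed latency plus size divided by throughput."""
    return latency_s + size_mb / bandwidth_mb_per_s


# Assumed figures: a local SSD versus a 100 Mbit/s (12.5 MB/s) Internet link.
local = transfer_seconds(size_mb=500, bandwidth_mb_per_s=500, latency_s=0.0001)
cloud = transfer_seconds(size_mb=500, bandwidth_mb_per_s=12.5, latency_s=0.05)
```

Under these assumptions the bandwidth term dominates for large files, while for many small files the per-request latency becomes the larger cost.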

Real World: Dropbox

• Dropbox is a cloud-based storage facility for photos, documents, and other digital content.

• After you download and install Dropbox, your system will have a user-level Dropbox folder.

• When you place a file into the Dropbox folder (either by cutting and pasting, dragging and dropping, or saving), a copy of the file is automatically saved to the Dropbox cloud storage facility.

• Later, if you need to access the file from another computer, you can simply log in to your Dropbox account on the web.

Real World: Dropbox Continued

• Dropbox also makes it very easy for users to share files. If, for example, you place a file within the Dropbox Public folder, you can then send a link to other users that they can use to access the file.

• Dropbox supports a variety of devices.

• Dropbox lets users try the software free of charge and provides them with ample storage space to get started.

Real World: Microsoft SkyDrive

• Cloud-based data storage systems allow users to access their documents from any place at any time.

• Microsoft SkyDrive provides cloud-based data storage. Through the SkyDrive web interface, you can drag and drop files to and from the cloud.

• What makes SkyDrive special is that if the PC from which you are accessing the files does not have Microsoft Office installed, SkyDrive lets you launch Word, Excel, and PowerPoint documents within Microsoft Office Web Apps.

Real World: Gladinet

• Most cloud-based data storage facilities provide a drag-and-drop user interface that you can use to move files to and from the cloud.

• Some cloud storage systems also let you access your files using a logical disk drive letter, treating the files as if they reside on a local disk drive.

• Gladinet provides software to mount many cloud-based data storage services as a drive letter. In this way, you can access the files on the drive just as you would any files on your system.

Real World: BoxCryptor

• BoxCryptor is a software tool that encrypts and decrypts cloud-based files on a file-by-file basis.

• When you install BoxCryptor, the installation will create a folder within your cloud-based folder on your system and will map a drive letter to that folder.

• When you use the drive letter to store a file, BoxCryptor will encrypt the file and place the encrypted contents on the cloud.

• When you retrieve the file, BoxCryptor will decrypt the file. If a hacker gains access to your cloud storage, the encrypted files will be unusable.
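The encrypt-on-store, decrypt-on-retrieve flow can be sketched as follows. This is a toy stream cipher built from SHA-256 purely to show the idea of client-side, file-by-file encryption; it is not BoxCryptor's algorithm, it provides no integrity protection, and it should not be used to protect real data. All function names are hypothetical.

```python
import hashlib
import os


def keystream(key, nonce, length):
    """Derive `length` pseudo-random bytes by hashing key + nonce + a counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def encrypt_file_bytes(key, plaintext):
    """Return nonce + ciphertext; a fresh random nonce is used for every file."""
    nonce = os.urandom(16)
    stream = keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt_file_bytes(key, blob):
    """Split off the nonce, regenerate the keystream, and undo the XOR."""
    nonce, ciphertext = blob[:16], blob[16:]
    stream = keystream(key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


key = hashlib.sha256(b"a passphrase known only to the user").digest()
uploaded = encrypt_file_bytes(key, b"quarterly results")  # what the cloud would hold
restored = decrypt_file_bytes(key, uploaded)              # what the user reads back
```

Because the key never leaves the user's machine in this model, someone who obtains only the uploaded bytes cannot read the file without also obtaining the key.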

Cloud-Based Backups

• Files are backed up in an encrypted format.

• Users can schedule when backup operations are to occur.

• Users can easily retrieve backup files from the cloud.

• Most systems support Windows, Linux, and Mac OS.

Real World: Mozy Backups

• Mozy provides cloud-based backups for personal and business users. Mozy provides an encrypted backup and runs without the need for user intervention on Windows- and Mac-based systems.

• Mozy has existed as a company since 2005 and has millions of customers worldwide.

File Systems

• Operating systems exist to allow users to run programs and to store and retrieve data (files) from one user session to the next.

• Within the operating system, special software, called the file system, oversees the storage and retrieval of files to and from a disk.

• When you copy a file, delete a file, or create and move files between folders, the file system is performing the work.

• Initially, file systems allowed users to manipulate only local files that reside on one of the PC's disk drives.

• As networks became more prevalent, so too did network operating systems, which allow users and programs to manipulate files residing on a device across the network.

Cloud-Based File System

• A cloud file system (CFS) allows users or applications to directly manipulate files that reside on the cloud.

• Today several cloud file systems are emerging that allow users and programs to manipulate files residing in the cloud.

Real World: Oracle Cloud File System

• Oracle is one of the world’s leading database solution providers. Oracle has on-site and cloud-based database solutions.

• Oracle offers a cloud-based file system that users can use to store and retrieve files that will reside outside of the database.

• The Oracle Cloud File System resides above cloud-based storage devices and supports Windows- and Linux-based applications.

Real World: Oracle Cloud File System Continued

• Advantages of Oracle's Cloud File System include:

– Snapshot-based file recovery: Files can be recovered to a specific data snapshot that allows simpler fallback.

– File group by tagging: Users can associate one or more files via a tag name grouping for subsequent group-based file operations, such as replication.

– File replication: Key files can be replicated across multiple volumes.

– Access-control-based security: Administrators can control access to specific files via access control lists.

– Encryption: The Oracle Cloud File System supports file-by-file, directory, or file system encryption.

Real World: Hadoop Distributed File System

• Apache Hadoop is an open source project, the goal of which is to support reliable, scalable distributed computing. Part of the project includes the Hadoop Distributed File System (HDFS), a Java-based file system that is well suited for cloud-based storage.

• HDFS is designed to be highly fault tolerant and robust, so that it continues to operate in the event of a device failure.
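One way a distributed file system achieves this kind of fault tolerance is by keeping several replicas of each data block on different nodes. The sketch below assumes a replication factor of 3 (the HDFS default) and a simple round-robin placement; real HDFS placement is rack-aware and more involved, and the node and block names here are hypothetical.

```python
REPLICATION = 3  # assumed replication factor for this sketch


def place_blocks(block_ids, nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes, round-robin."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    placement = {}
    for i, block in enumerate(block_ids):
        placement[block] = [nodes[(i + k) % len(nodes)] for k in range(replication)]
    return placement


def readable_after_failure(placement, failed):
    """Every block stays readable if at least one of its replicas is on a live node."""
    return all(any(node not in failed for node in replicas)
               for replicas in placement.values())


nodes = ["node-a", "node-b", "node-c", "node-d"]
placement = place_blocks(["blk-1", "blk-2", "blk-3"], nodes)
survives_two = readable_after_failure(placement, {"node-a", "node-b"})
survives_three = readable_after_failure(placement, {"node-a", "node-b", "node-c"})
```

With three replicas per block, any two node failures leave every block readable in this layout, whereas losing all three nodes that hold a block makes that block unavailable.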

Industry-Specific Data Storage

• In the future, healthcare data will be accessible in real time to a wide range of medical facilities, some on-ground and some mobile.

Real World: Microsoft HealthVault

• Microsoft HealthVault provides a secure storage facility within which people can store their medical records, prescriptions, and even measurements from a variety of medical devices.

• People can use Microsoft HealthVault to track their own medical records or those of family members whose medical care they help manage.

• After you store records within Microsoft HealthVault, you can e-mail a link to a physician, other healthcare personnel, or a family member to grant access to all or specific records.

Cloud-Based Database Solutions

• A cloud-based database can be used not only by applications that reside (are hosted) in the cloud, but also by applications that reside within the customer's on-site data center.

Advantages of Cloud-Based Databases

• Cost-effective database scalability: Cloud-based databases scale dynamically to meet customer needs on a pay-as-you-go basis.

• High availability: Cloud-based databases normally reside on redundant hardware, which results in high system uptime.

• High data redundancy: Cloud-based databases are normally replicated behind the scenes to increase data availability.

• Reduced administration: The cloud-based database provider maintains the database version updates and patches.

Disadvantages of Cloud-Based Databases

• Data security concerns: Some users still do not feel comfortable storing a database system in the cloud.

• Performance: Because data queries may travel across the Internet, access to a cloud-based database is typically slower than access to a local database solution.

Real World: SQL Azure

• Microsoft SQL Azure is a cloud-based database solution that supports not only the Windows Azure PaaS, but on-site applications as well.

• As you would expect, SQL Azure provides scalability, database replication, load balancing, and automatic server failover.

Real World: Database.com

• Database.com provides applications with access to a cloud-based database through a library of API calls.

• All access to the underlying database is via developer-written code.

• Database.com does not provide a user interface to the database—instead, its focus is on the database itself.

Real World: Database.com Continued

• Administration: Database.com administers all aspects of the database.

• Performance tuning: Database.com monitors and manages the overall database performance.

• Scalability: Database.com can scale a solution up or down dynamically to meet user demands.

• Backups: Database.com manages data backups and redundancy.

• Disaster recovery: Database.com provides redundant hardware and storage to reduce the risk of a disaster.

Cloud-Based Block Storage

• In the simplest sense, a block of data storage is a fixed-size sequence of bits. The size of the block normally corresponds to an underlying unit of storage on the cloud-based block storage device.

• Some applications work with very large blocks of data, the format of which has meaning only to the application itself—meaning that the data may not map well to storage within a file system or database.
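The idea of addressing storage as fixed-size blocks whose contents only the application understands can be sketched with a toy in-memory device. The class name, method names, and the 512-byte block size are assumptions for illustration and do not correspond to any cloud provider's API.

```python
import io

BLOCK_SIZE = 512  # bytes; an assumed block size for illustration


class BlockDevice:
    """A toy block device backed by an in-memory buffer."""

    def __init__(self, block_count, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.block_count = block_count
        self._store = io.BytesIO(bytes(block_count * block_size))

    def _check(self, index):
        if not 0 <= index < self.block_count:
            raise IndexError("block index out of range")

    def write_block(self, index, data):
        """Overwrite one whole block; short data is zero-padded to the block size."""
        self._check(index)
        if len(data) > self.block_size:
            raise ValueError("data larger than one block")
        self._store.seek(index * self.block_size)
        self._store.write(data.ljust(self.block_size, b"\0"))

    def read_block(self, index):
        """Return the raw bytes of one block; interpreting them is the caller's job."""
        self._check(index)
        self._store.seek(index * self.block_size)
        return self._store.read(self.block_size)


device = BlockDevice(block_count=8)
device.write_block(3, b"application-defined record")
payload = device.read_block(3)
```

The device knows nothing about files, folders, or tables; it only maps a block index to a byte offset, which is why block storage suits applications that manage their own data format.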

Real World: Amazon EBS

• To support applications with large data block needs, Amazon provides the Amazon Elastic Block Store (EBS), a highly reliable, scalable, and available block storage solution.

• At the time the textbook was written, EBS supported storage volumes of up to a terabyte.


References

Primary:

Jamsa, K. A. (2013). Cloud computing: SaaS, PaaS, IaaS, virtualization, business models, mobile, security and more. Burlington, MA: Jones & Bartlett Learning.

Secondary:

Erl, T., Mahmood, Z., & Puttini, R. (2014). Cloud computing: Concepts, technology, & architecture. Upper Saddle River, NJ: Prentice Hall.