Cloud
School of Computer & Information Sciences
ITS-532 Cloud Computing
Chapter 6 – Data Storage in the Cloud
Learning Objectives
• Discuss the role of storage-area networks.
• Discuss the role of network-attached storage.
• Describe cloud-based storage solutions.
• List the pros and cons of cloud-based storage.
• Describe cloud-based database solutions.
• List the pros and cons of cloud-based databases.
• Describe specific cloud-based data storage solutions such as backups and encrypted file storage.
• Provide an example of an industry-specific cloud-based storage solution.
Network Storage Began with File Servers
• Years ago, local-area networks used special servers, called file servers, to support file sharing, file replication, and storage for large files.
Storage Area Networks
• A storage-area network (SAN) makes one or more storage devices appear to be directly connected to the network.
• Behind the scenes, the devices were actually connected to SAN hardware through the use of network cables.
• Software running within the SAN device made the devices appear directly accessible to the rest of the network.
Network-Attached Storage (NAS)
• NAS devices plug directly into the network.
Advantages of NAS
• Compatibility: NAS devices normally support common file systems, which, in turn, make them fully compatible with common operating systems.
• Ease of performing backups: NAS devices are commonly used as backup devices. Within a home, for example, all devices can easily access and back up files to a NAS device.
• Reliability: A NAS device typically stripes data across multiple volumes within the device and stores redundant information (for example, parity or a mirror copy) alongside it. If one volume fails (or more, depending on the redundancy scheme), the device can reconstruct the file contents from the surviving volumes.
• Performance: Because a NAS device does not run a complete operating system, the hardware has less system overhead, which can allow it to outperform a file server.
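The reliability point above depends on redundancy stored next to the striped data. The following is a minimal sketch, assuming a simple scheme with one XOR parity chunk per stripe (similar in spirit to RAID 4); the function names, chunk size, and sample data are illustrative assumptions, not the behavior of any particular NAS product.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def stripe_with_parity(data: bytes, n_data: int, chunk: int) -> list:
    """Split data into stripes of n_data chunks plus one XOR parity chunk."""
    stripes = []
    stripe_bytes = n_data * chunk
    for start in range(0, len(data), stripe_bytes):
        # Pad the final stripe with zero bytes so every chunk has equal length.
        block = data[start:start + stripe_bytes].ljust(stripe_bytes, b"\0")
        chunks = [block[i * chunk:(i + 1) * chunk] for i in range(n_data)]
        parity = reduce(xor_bytes, chunks)
        stripes.append(chunks + [parity])
    return stripes


def rebuild_chunk(stripe: list, lost: int) -> bytes:
    """Reconstruct the chunk at index `lost` from the other chunks in the stripe."""
    survivors = [c for i, c in enumerate(stripe) if i != lost]
    return reduce(xor_bytes, survivors)


stripes = stripe_with_parity(b"cloud storage!", n_data=3, chunk=4)
# Pretend the volume holding chunk 1 of the first stripe has failed.
recovered = rebuild_chunk(stripes[0], lost=1)
```

A single parity chunk tolerates the loss of only one volume per stripe; surviving more than one failure requires additional redundancy, such as a second parity chunk or mirroring.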
Cloud-Based Storage
• Cloud-based data storage is the next step in the evolution of NAS devices.
• Data storage resides in the cloud.
• Across the web (the cloud), many providers offer data storage that resides in the cloud.
Cloud-Based Storage Continued
• The cloud storage device mechanism represents storage devices that are designed specifically for cloud-based provisioning. Instances of these devices can be virtualized, similar to how physical servers can spawn virtual server images.
• Cloud storage device mechanisms provide common logical units of data storage, such as:
– Files – Collections of data are grouped into files that are located in folders.
– Blocks – The lowest level of storage and the closest to the hardware, a block is the smallest unit of data that is still individually accessible.
– Datasets – Sets of data are organized into a table-based, delimited, or record format.
– Objects – Data and its associated metadata are organized as Web-based resources.
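To make the four logical units concrete, the sketch below represents the same small payload as a file, as blocks, as a dataset, and as an object. It is an illustration only: the path, object key, metadata fields, and the 8-byte block size are all hypothetical and far smaller than anything a real storage device would use.

```python
import csv
import io
from dataclasses import dataclass, field

BLOCK_SIZE = 8  # hypothetical block size, chosen small for readability


@dataclass
class StoredObject:
    """An object couples the data with descriptive metadata."""
    data: bytes
    metadata: dict = field(default_factory=dict)


payload = b"id,name\n1,alpha\n2,beta\n"

# File: data addressed by a name inside a folder hierarchy.
files = {"/reports/items.csv": payload}

# Blocks: the same bytes cut into fixed-size, individually addressable units.
blocks = [payload[i:i + BLOCK_SIZE] for i in range(0, len(payload), BLOCK_SIZE)]

# Dataset: the bytes interpreted as delimited records.
dataset = list(csv.DictReader(io.StringIO(payload.decode())))

# Object: data plus metadata, addressed by a key (for example, part of a URL).
objects = {"items-v1": StoredObject(payload, {"content-type": "text/csv"})}
```

The four views differ only in how the data is addressed and described, which is why different consumers can interface with the same underlying storage through different technologies.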
Cloud-Based Storage Illustration
Figure 7.9 Different cloud service consumers utilize different technologies to interface with virtualized cloud storage devices
Real World: HomePipe
• Many users now rely on cloud-based storage to provide them with access to files from anywhere at any time, often with any device.
• Despite that, users still encounter situations when the file they need resides on a computer at their home or office, often because they made a last-minute change and forgot to upload the file to the cloud. That is where HomePipe comes to the rescue.
• HomePipe is a program that lets users access files on their own system from anywhere on the web.
Real World: ZumoDrive
• ZumoDrive provides cloud-based storage that is scalable to meet customer needs, initially at no charge.
• The files that are stored on ZumoDrive are accessible from a variety of devices.
• From their own PC, customers can use a web interface or map a drive letter to the ZumoDrive storage and then access the cloud-based files as they would access files on their local system.
Advantages of Cloud-Based Storage
• Scalability: Most cloud-based data storage providers let users scale their storage capacity (up or down) to align with their storage needs.
• Pay for use: With most cloud-based data storage facilities, users pay only for the storage (within a range) that they need.
• Reliability: Many cloud-based data storage facilities provide transparent data replication.
Advantages of Cloud-Based Storage Continued
• Ease of access: Most cloud-based data storage facilities support web-based access to files from any place, at any time, using a variety of devices.
• Ease of use: Many cloud-based data storage solutions let users map a drive letter to the remote file storage area and then access the files through the use of a logical drive.
Disadvantages of Cloud-Based Storage
• Performance: Because cloud-based disk storage devices are accessed over the Internet, network latency and bandwidth limits typically make them slower than local drives.
• Security: Some users will never feel comfortable with their data in the cloud.
• Data orphans: Users may abandon data in cloud storage facilities, leaving confidential private or company data at risk.
Real World: Dropbox
• Dropbox is a cloud-based storage facility for photos, documents, and other digital content.
• After you download and install Dropbox, your system will have a user-level Dropbox folder.
• When you place a file into the Dropbox folder (either by cutting and pasting, dragging and dropping, or saving), a copy of the file is automatically saved to the Dropbox cloud storage facility.
• Later, if you need to access the file from another computer, you can simply log in to your Dropbox account on the web.
Real World: Dropbox Continued
• Dropbox also makes it very easy for users to share files. If, for example, you place a file within the Dropbox Public folder, you can then send other users a link that they can use to access the file.
• Dropbox supports a variety of devices.
• Dropbox lets users try the software free of charge and provides them with ample storage space to get started.
Real World: Microsoft SkyDrive
• Cloud-based data storage systems allow users to access their documents from any place at any time.
• Microsoft SkyDrive provides cloud-based data storage. Through the SkyDrive web interface, you can drag and drop files to and from the cloud.
• What makes SkyDrive special is that if the PC from which you are accessing the files does not have Microsoft Office installed, SkyDrive lets you launch Word, Excel, and PowerPoint documents within Microsoft Office Web Apps.
Real World: Gladinet
• Most cloud-based data storage facilities provide a drag-and-drop user interface that you can use to move files to and from the cloud.
• Some cloud storage systems also let you access your files using a logical disk drive letter, treating the files as if they reside on a local disk drive.
• Gladinet provides software to mount many cloud-based data storage services as a drive letter. In this way, you can access the files on the drive just as you would any files on your system.
Real World: BoxCryptor
• BoxCryptor is a software tool that encrypts and decrypts cloud-based files on a file-by-file basis.
• When you install BoxCryptor, the installation will create a folder within your cloud-based folder on your system and will map a drive letter to that folder.
• When you use the drive letter to store a file, BoxCryptor will encrypt the file and place the encrypted contents on the cloud.
• When you retrieve the file, BoxCryptor will decrypt the file. If a hacker gains access to your cloud storage, the encrypted files will be unusable.
Cloud-Based Backups
• Files are backed up in an encrypted format.
• Users can schedule when backup operations are to occur.
• Users can easily retrieve backup files from the cloud.
• Most systems support Windows, Linux, and Mac OS.
Real World: Mozy Backups
• Mozy provides cloud-based backups for personal and business users. Mozy provides an encrypted backup and runs without the need for user intervention on Windows- and Mac-based systems.
• Mozy has existed as a company since 2005 and has millions of customers worldwide.
File Systems
• Operating systems exist to allow users to run programs and to store and retrieve data (files) from one user session to the next.
• Within the operating system, special software, called the file system, oversees the storage and retrieval of files to and from a disk.
• When you copy a file, delete a file, or create and move files between folders, the file system is performing the work.
• Initially, file systems allowed users to manipulate only local files that reside on one of the PC's disk drives.
• As networks became more prevalent, so too did network operating systems, which allow users and programs to manipulate files residing on a device across the network.
Cloud-Based File System
• A cloud file system (CFS) allows users or applications to directly manipulate files that reside on the cloud.
• Today several cloud file systems are emerging that allow users and programs to manipulate files residing in the cloud.
Real World: Oracle Cloud File System
• Oracle is one of the world’s leading database solution providers. Oracle has on-site and cloud-based database solutions.
• Oracle offers a cloud-based file system that users can use to store and retrieve files that will reside outside of the database.
• The Oracle Cloud File System resides above cloud-based storage devices and supports Windows- and Linux-based applications.
Real World: Oracle Cloud File System Continued
• Advantages of Oracle's Cloud File System include:
– Snapshot-based file recovery: Files can be recovered to a specific data snapshot that allows simpler fallback.
– File group by tagging: Users can associate one or more files via a tag name grouping for subsequent group-based file operations, such as replication.
– File replication: Key files can be replicated across multiple volumes.
– Access-control-based security: Administrators can control access to specific files via access control lists.
– Encryption: The Oracle Cloud File System supports file-by-file, directory, or file system encryption.
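Tag-based grouping and access control lists can be pictured with a small in-memory model. The sketch below is a generic illustration only: `TaggedStore`, its methods, and the sample paths and users are hypothetical and are not the Oracle Cloud File System interface.

```python
from collections import defaultdict


class TaggedStore:
    """Hypothetical model: files grouped by tag, access checked against an ACL."""

    def __init__(self):
        self._tags = defaultdict(set)  # tag name -> set of file paths
        self._acl = defaultdict(set)   # file path -> set of (user, permission)

    def tag(self, path, tag_name):
        self._tags[tag_name].add(path)

    def grant(self, path, user, permission):
        self._acl[path].add((user, permission))

    def allowed(self, path, user, permission):
        # .get avoids creating empty ACL entries for unknown paths.
        return (user, permission) in self._acl.get(path, set())

    def files_for(self, tag_name, user, permission="read"):
        """Return the tagged files on which the user holds the permission."""
        tagged = self._tags.get(tag_name, set())
        return sorted(p for p in tagged if self.allowed(p, user, permission))


store = TaggedStore()
store.tag("/hr/salaries.xlsx", "payroll")
store.tag("/hr/bonus.xlsx", "payroll")
store.grant("/hr/salaries.xlsx", "alice", "read")
visible = store.files_for("payroll", "alice")
```

A group operation such as replication would then iterate over the files returned for a tag, so the administrator manages the tag rather than each file individually.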
Real World: Hadoop Distributed File System
• Apache Hadoop is an open source project, the goal of which is to support reliable, scalable distributed computing. Part of the project includes the Hadoop Distributed File System (HDFS), a Java-based file system that is well suited for cloud-based storage.
• HDFS is designed to be highly fault tolerant and robust to maintain operation in the event of a device failure.
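HDFS achieves fault tolerance by storing several replicas of each file block on different machines (its default replication factor is 3). The sketch below shows the idea with a simplified round-robin placement; real HDFS placement is rack-aware and more involved, and the function names and node names here are assumptions made for illustration.

```python
def place_blocks(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


def readable_blocks(placement, failed):
    """Return the blocks that still have at least one replica on a live node."""
    return {
        block
        for block, replicas in placement.items()
        if any(node not in failed for node in replicas)
    }


nodes = ["n1", "n2", "n3", "n4"]
placement = place_blocks(6, nodes)
# With three replicas per block, any two node failures leave every block readable.
still_ok = readable_blocks(placement, failed={"n1", "n2"})
```

In this toy layout, a third simultaneous failure starts to lose blocks, which is why a real system re-replicates blocks to healthy nodes soon after it detects a failure.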
Industry-Specific Data Storage
• In the future, healthcare data will be accessible in real time to a wide range of medical facilities, some on-ground and some mobile.
Real World: Microsoft HealthVault
• Microsoft HealthVault provides a secure storage facility within which people can store their medical records, prescriptions, and even measurements from a variety of medical devices.
• People can use Microsoft HealthVault to track their own medical records or those of family members whom they assist with medical care.
• After you store records within Microsoft HealthVault, you can e-mail a link to a physician, other healthcare personnel, or a family member to grant access to all or specific records.
Cloud-Based Database Solutions
• A cloud-based database can be used not only by applications that reside (are hosted) in the cloud, but also by applications that reside within the customer's on-site data center.
Advantages of Cloud-Based Databases
• Cost-effective database scalability: Scale dynamically to meet customer needs on a pay-as-you-go basis.
• High availability: Normally reside on redundant hardware, which results in high system uptime.
• High data redundancy: Normally replicated behind the scenes to increase data availability.
• Reduced administration: The cloud-based database provider maintains the database version updates and patches.
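The availability and redundancy points above can be quantified with a short calculation. The sketch assumes that replicas fail independently and that the database is available whenever at least one replica is up; both are simplifying assumptions, and the 99% figure is an example value rather than any provider's measured uptime.

```python
def combined_availability(single: float, replicas: int) -> float:
    """Availability of a service that is up while at least one replica is up."""
    return 1 - (1 - single) ** replicas


def downtime_hours_per_year(availability: float) -> float:
    """Expected yearly downtime for a given availability (365-day year)."""
    return (1 - availability) * 365 * 24


one_server = downtime_hours_per_year(0.99)
two_servers = downtime_hours_per_year(combined_availability(0.99, 2))
```

Under these assumptions, a second replica reduces the expected downtime by a factor of one hundred. Correlated failures, such as a shared power or network outage, make the real improvement smaller.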
Disadvantages of Cloud-Based Databases
• Data security concerns: Some users still do not feel comfortable storing a database system in the cloud.
• Performance: Because data queries may travel across the Internet, access to a cloud-based database is typically slower than access to a local database solution.
Real World: SQL Azure
• Microsoft SQL Azure is a cloud-based database solution that supports not only the Windows Azure PaaS, but on-site applications as well.
• As you would expect, SQL Azure provides scalability, database replication, load balancing, and automatic server failover.
Real World: Database.com
• Database.com provides applications with access to a cloud-based database through a library of API calls.
• All access to the underlying database is via developer-written code.
• Database.com does not provide a user interface to the database; instead, its focus is on the database itself.
Real World: Database.com Continued
• Administration: Database.com administers all aspects of the database.
• Performance tuning: Database.com monitors and manages the overall database performance.
• Scalability: Database.com can scale a solution up or down dynamically to meet user demands.
• Backups: Database.com manages data backups and redundancy.
• Disaster recovery: Database.com provides redundant hardware and storage to reduce the risk of a disaster.
Cloud-Based Block Storage
• In the simplest sense, a block of data storage is a fixed-sized sequence of bits. The size of the block normally corresponds to an underlying unit of storage on the cloud-based block storage device.
• Some applications work with very large blocks of data, the format of which has meaning only to the application itself, so the data may not map well to storage within a file system or database.
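Block-level access can be sketched as reading and writing fixed-size units by index, with no notion of files or records. The example below is a minimal model that uses an in-memory buffer as the device; the `BlockDevice` class and the 16-byte block size are hypothetical and are not tied to any provider's block storage API.

```python
import io


class BlockDevice:
    """Hypothetical block device: fixed-size blocks addressed by index."""

    def __init__(self, backing, block_size=512):
        self.backing = backing        # any seekable binary stream
        self.block_size = block_size

    def write_block(self, index, data):
        if len(data) > self.block_size:
            raise ValueError("data does not fit in one block")
        # A block's byte offset is simply index * block_size.
        self.backing.seek(index * self.block_size)
        # Pad short writes with zero bytes so the block keeps its fixed size.
        self.backing.write(data.ljust(self.block_size, b"\0"))

    def read_block(self, index):
        self.backing.seek(index * self.block_size)
        return self.backing.read(self.block_size)


device = BlockDevice(io.BytesIO(), block_size=16)
device.write_block(2, b"app-specific")
raw = device.read_block(2)
```

The device returns raw bytes and leaves their interpretation entirely to the application, which is the property that makes block storage suitable for data formats that have meaning only to the application.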
Real World: Amazon EBS
• To support applications with large data block needs, Amazon provides the Amazon Elastic Block Store (EBS), a highly reliable, scalable, and available block storage solution.
• EBS supports storage volumes of up to a terabyte (the limit at the time the source text was written).
Key Terms
References
Primary:
Jamsa, K. A. (2013). Cloud computing: SaaS, PaaS, IaaS, virtualization, business models, mobile, security and more. Burlington, MA: Jones & Bartlett Learning.
Secondary:
Erl, T., Mahmood, Z., & Puttini, R. (2014). Cloud computing: concepts, technology, & architecture. Upper Saddle River, NJ: Prentice Hall.